date:20230919

[jira] [Commented] (IMPALA-12399) Pass eventTypeSkipList with OPEN_TXN in NotificationEventRequest to avoid receiving OPEN_TXN events from HMS

2023-09-19 Thread Quanlong Huang (Jira)



[ 
https://issues.apache.org/jira/browse/IMPALA-12399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766955#comment-17766955
 ] 

Quanlong Huang commented on IMPALA-12399:
-

The original patch has some issues, such as not handling the updates to 
lastSyncEventId_ and latestEventTimeMs_. [~VenuReddy] might need some time 
to address them. Details are in the review comments at 
https://gerrit.cloudera.org/c/20487/

I suggest we revert the original patch in the 4.3.0 branch (but leave it in the 
master branch) so this won't block the release.

> Pass eventTypeSkipList with OPEN_TXN in NotificationEventRequest to avoid 
> receiving OPEN_TXN events from HMS
> 
>
> Key: IMPALA-12399
> URL: https://issues.apache.org/jira/browse/IMPALA-12399
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Venugopal Reddy K
>Assignee: Venugopal Reddy K
>Priority: Major
> Fix For: Impala 4.3.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Notification events like OPEN_TXN are ignored by catalogd's 
> {{MetastoreEventsProcessor}}. So we can pass eventTypeSkipList with 
> OPEN_TXN in NotificationEventRequest when invoking get_next_notification() 
> to avoid reading such notification messages from HMS only to ignore them on 
> catalogd. Because OPEN_TXN events are frequent (received even for a describe 
> table operation from beeline), this can significantly reduce unwanted 
> processing on both HMS and catalogd. Catalogd reads events in batches of 
> EVENTS_BATCH_SIZE_PER_RPC, so skipping such unnecessary events can help it 
> catch up on events faster.
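
To illustrate why server-side skipping helps with batched reads, here is a minimal toy model in Python. It is only a sketch: the `Event` class, the in-memory `log`, and the `drain` helper are hypothetical, and `get_next_notification` below is a stand-in that mimics the idea of a skip list, not the real HMS client API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_id: int
    event_type: str


def get_next_notification(log, last_event_id, max_events, skip_types=frozenset()):
    """Hypothetical stand-in for the HMS RPC: return up to max_events events
    newer than last_event_id, omitting events whose type is in skip_types."""
    batch = []
    for ev in log:
        if ev.event_id <= last_event_id or ev.event_type in skip_types:
            continue
        batch.append(ev)
        if len(batch) == max_events:
            break
    return batch


def drain(log, batch_size, skip_types=frozenset()):
    """Fetch until the log is exhausted.
    Returns (number of non-empty RPC batches, number of events fetched)."""
    last_id, rpcs, fetched = 0, 0, 0
    while True:
        batch = get_next_notification(log, last_id, batch_size, skip_types)
        if not batch:
            return rpcs, fetched
        rpcs += 1
        fetched += len(batch)
        last_id = batch[-1].event_id


if __name__ == "__main__":
    # Made-up log: every third event is an ALTER_TABLE, the rest are OPEN_TXN.
    log = [Event(i, "ALTER_TABLE" if i % 3 == 0 else "OPEN_TXN") for i in range(1, 10)]
    print(drain(log, 3))
    print(drain(log, 3, frozenset({"OPEN_TXN"})))
```

With this made-up log and a batch size of 3, the unfiltered drain needs 3 batches for 9 events, while the filtered drain needs 1 batch for 3 events. Note that in this toy model the client's `last_id` only advances to the last non-skipped event, so any bookkeeping keyed on the latest event ID has to account for skipped events separately.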



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

[jira] [Commented] (IMPALA-12390) Enable performance related clang-tidy checks

2023-09-19 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/IMPALA-12390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766937#comment-17766937
 ] 

ASF subversion and git services commented on IMPALA-12390:
--

Commit 4d15558b5eaa69e872917c8bbf69dc1dc2146bc5 in impala's branch 
refs/heads/master from Michael Smith
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=4d15558b5 ]

IMPALA-12390 (part3): Enable unnecessary-copy-initialization

Enables the clang-tidy performance-unnecessary-copy-initialization check
and fixes any issues found with run_clang_tidy.sh.

Change-Id: I217df2598b21551fe21099c2caa5a39865010c20
Reviewed-on: http://gerrit.cloudera.org:8080/20492
Reviewed-by: Joe McDonnell 
Tested-by: Michael Smith 


> Enable performance related clang-tidy checks
> 
>
> Key: IMPALA-12390
> URL: https://issues.apache.org/jira/browse/IMPALA-12390
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 4.3.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
> Fix For: Impala 4.3.0
>
>
> clang-tidy has several performance-related checks that seem like they would 
> be useful to enforce. Here are some examples:
> {noformat}
> /home/joemcdonnell/upstream/Impala/be/src/runtime/types.h:313:25: warning: 
> loop variable is copied but only used as const reference; consider making it 
> a const reference [performance-for-range-copy]
>         for (ColumnType child_type : col_type.children) {
>              ~~ ^
>              const &
> /home/joemcdonnell/upstream/Impala/be/src/catalog/catalog-util.cc:168:34: 
> warning: 'find' called with a string literal consisting of a single 
> character; consider using the more effective overload accepting a character 
> [performance-faster-string-find]
>       int pos = object_name.find(".");
>                                  ^~~~
>                                  '.'
> /home/joemcdonnell/upstream/Impala/be/src/util/decimal-util.h:55:53: warning: 
> the parameter 'b' is copied for each invocation but only used as a const 
> reference; consider making it a const reference 
> [performance-unnecessary-value-param]
>   static int256_t SafeMultiply(int256_t a, int256_t b, bool may_overflow) {
>                                             ^
>                                            const &
> /home/joemcdonnell/upstream/Impala/be/src/codegen/llvm-codegen.cc:847:5: 
> warning: 'push_back' is called inside a loop; consider pre-allocating the 
> vector capacity before the loop [performance-inefficient-vector-operation]
>     arguments.push_back(args_[i].type);
>     ^{noformat}
> In all, they seem to flag things that developers wouldn't ordinarily notice, 
> and it doesn't seem to have too many false positives. We should look into 
> enabling these.




[jira] [Commented] (IMPALA-12454) CompoudPredicate with AND operator can result in very low selectivity.

2023-09-19 Thread Riza Suminto (Jira)



[ 
https://issues.apache.org/jira/browse/IMPALA-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766930#comment-17766930
 ] 

Riza Suminto commented on IMPALA-12454:
---

I tried hacking this myself and found that using exponential backoff causes 
cardinality overestimation on low-scale TPC-DS, and even changes the query plan 
shape for some queries. These planner tests changed:
{code:java}
        modified:   testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
        modified:   testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q41.test
        modified:   testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q47.test
        modified:   testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q48.test
        modified:   testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q53.test
        modified:   testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q57.test
        modified:   testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q63.test
        modified:   testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q85.test
        modified:   testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q89.test
        modified:   testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q91.test
{code}
It is probably better to solve this with column correlation or histogram stats 
in the future.

> CompoudPredicate with AND operator can result in very low selectivity.
> --
>
> Key: IMPALA-12454
> URL: https://issues.apache.org/jira/browse/IMPALA-12454
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 4.2.0
>Reporter: Riza Suminto
>Priority: Major
>
> A CompoundPredicate with the AND operator estimates its selectivity by simply 
> multiplying the selectivities of its child expressions.
> [https://github.com/apache/impala/blob/3614a6a776819a1e918ce7fe833cd9e916d6002a/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java#L174-L176]
>  
>  
> This can lead to a very low estimate, as happens in TPC-DS Q53.
> {code:java}
> |  F01:PLAN FRAGMENT [RANDOM] hosts=4 instances=4
> |  Per-Instance Resources: mem-estimate=24.30MB mem-reservation=1.00MB thread-reservation=1
> |  00:SCAN S3 [tpcds_3000_string_parquet_managed.item, RANDOM]
> |     S3 partitions=1/1 files=4 size=33.54MB
> |     predicates: ((i_category IN ('Books', 'Children', 'Electronics') AND i_class IN ('personal', 'portable', 'reference', 'self-help') AND i_brand IN ('scholaramalgamalg #14', 'scholaramalgamalg #7', 'exportiunivamalg #9', 'scholaramalgamalg #9')) OR (i_category IN ('Women', 'Music', 'Men') AND i_class IN ('accessories', 'classical', 'fragrances', 'pants') AND i_brand IN ('amalgimporto #1', 'edu packscholar #1', 'exportiimporto #1', 'importoamalg #1')))
> |     stored statistics:
> |       table: rows=360.00K size=33.54MB
> |       columns: all
> |     extrapolated-rows=disabled max-scan-range-rows=117.77K
> |     mem-estimate=24.00MB mem-reservation=1.00MB thread-reservation=0
> |     tuple-ids=0 row-size=74B cardinality=51
> |     in pipelines: 00(GETNEXT) {code}
> The CompoundPredicate in 00:SCAN is estimated to be far too selective, 
> reducing 360K rows to just 51, while in reality the scan returns 18.53K rows.
> {code:java}
> |  00:SCAN S3                 4      4   18.000ms   24.000ms   18.53K          51    2.31 MB       24.00 MB  tpcds_3000_string_parquet_managed.item {code}
> Selectivity estimation for this CompoundPredicate case should use an 
> exponential backoff algorithm similar to the one in 
> PlanNode.computeCombinedSelectivity().
> [https://github.com/apache/impala/blob/3614a6a776819a1e918ce7fe833cd9e916d6002a/fe/src/main/java/org/apache/impala/planner/PlanNode.java#L730-L733]
>  




[jira] [Comment Edited] (IMPALA-12018) Consider runtime filters in resource estimates

2023-09-19 Thread Riza Suminto (Jira)



[ 
https://issues.apache.org/jira/browse/IMPALA-12018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766904#comment-17766904
 ] 

Riza Suminto edited comment on IMPALA-12018 at 9/19/23 9:29 PM:


I have an idea: if a scan node receives a runtime filter from a join node, 
then that join node's selectivity can be applied to reduce the cardinality of 
the scan node. However, this comes with 2 requirements:
 # The runtime filter arrives on time, or it is guaranteed that the scan node 
will wait for the runtime filter to arrive (i.e., the join node right above the 
scan will not start pulling rows before its join build completes).
 # The runtime filter itself is effective and has accurate selectivity (not 
just quickly disabled by the scan node).

The second point can be tricky given that Impala's default 
RUNTIME_FILTER_ERROR_RATE == max_filter_error_rate == 0.75, and the join build 
cardinality itself can be underestimated, leading to an undersized bloom filter 
(IMPALA-12451, IMPALA-12454).
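
As a rough illustration of why an underestimated build cardinality hurts, the sketch below uses the textbook false-positive formula for a classic Bloom filter, (1 - e^(-k*n/m))^k. This is an assumption for illustration only: Impala's actual filter implementation and sizing logic may differ, and the sizes and key counts below are made up.

```python
import math


def bloom_fpp(num_bits, num_hashes, num_keys):
    """False-positive probability of a classic Bloom filter with m bits,
    k hash functions, and n inserted keys: (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-num_hashes * num_keys / num_bits)) ** num_hashes


if __name__ == "__main__":
    # Hypothetical numbers: the filter is sized for an estimated 1,000 build
    # keys at 10 bits per key, but the build side really has 20,000 keys.
    num_bits, num_hashes = 10_000, 7
    print("fpp at estimated cardinality:", bloom_fpp(num_bits, num_hashes, 1_000))
    print("fpp at actual cardinality:   ", bloom_fpp(num_bits, num_hashes, 20_000))
```

Under these assumptions the filter is quite selective at the estimated cardinality but passes almost every probe at the actual cardinality, which is the kind of filter a scan node would quickly disable.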


was (Author: rizaon):
I have an idea: if a scan node receives a runtime filter from a join node, 
then that join node's selectivity can be applied to reduce the cardinality of 
the scan node. However, this comes with 2 requirements:
 # The runtime filter arrives on time, or it is guaranteed that the scan node 
will wait for the runtime filter to arrive (i.e., the join node right above the 
scan will not start pulling rows before its join build completes).
 # The runtime filter itself has accurate selectivity.

The second point can be tricky given that Impala's default 
RUNTIME_FILTER_ERROR_RATE == max_filter_error_rate == 0.75, and the join build 
cardinality itself can be underestimated, leading to an undersized bloom filter 
(IMPALA-12451, IMPALA-12454).

> Consider runtime filters in resource estimates
> --
>
> Key: IMPALA-12018
> URL: https://issues.apache.org/jira/browse/IMPALA-12018
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Csaba Ringhofer
>Assignee: Riza Suminto
>Priority: Major
>
> Currently Impala creates a plan first and looks for runtime filters based on 
> the complete plan.
> IMPALA-3573 is about considering runtime filters during join ordering, which 
> would be a major change. Meanwhile, it could also be useful to consider 
> selective-looking runtime filters in resource estimates without changing the 
> plan topology.




[jira] [Comment Edited] (IMPALA-12018) Consider runtime filters in resource estimates

2023-09-19 Thread Riza Suminto (Jira)



[ 
https://issues.apache.org/jira/browse/IMPALA-12018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766904#comment-17766904
 ] 

Riza Suminto edited comment on IMPALA-12018 at 9/19/23 6:47 PM:


I have an idea: if a scan node receives a runtime filter from a join node, 
then that join node's selectivity can be applied to reduce the cardinality of 
the scan node. However, this comes with 2 requirements:
 # The runtime filter arrives on time, or it is guaranteed that the scan node 
will wait for the runtime filter to arrive (i.e., the join node right above the 
scan will not start pulling rows before its join build completes).
 # The runtime filter itself has accurate selectivity.

The second point can be tricky given that Impala's default 
RUNTIME_FILTER_ERROR_RATE == max_filter_error_rate == 0.75, and the join build 
cardinality itself can be underestimated, leading to an undersized bloom filter 
(IMPALA-12451, IMPALA-12454).


was (Author: rizaon):
I have an idea: if a scan node receives a runtime filter from a join node, 
then that join node's selectivity can be applied to reduce the cardinality of 
the scan node. However, this comes with 2 requirements:
 # The runtime filter arrives on time, or it is guaranteed that the scan node 
will wait for the runtime filter to arrive (i.e., the join node right above the 
scan will not start pulling rows before its join build completes).
 # The runtime filter itself has accurate selectivity.

The second point can be tricky given that Impala's default 
RUNTIME_FILTER_ERROR_RATE == max_filter_error_rate == 0.75, and the join build 
cardinality itself can be underestimated, leading to an undersized bloom filter 
(IMPALA-12451).

> Consider runtime filters in resource estimates
> --
>
> Key: IMPALA-12018
> URL: https://issues.apache.org/jira/browse/IMPALA-12018
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Csaba Ringhofer
>Assignee: Riza Suminto
>Priority: Major
>
> Currently Impala creates a plan first and looks for runtime filters based on 
> the complete plan.
> IMPALA-3573 is about considering runtime filters during join ordering, which 
> would be a major change. Meanwhile, it could also be useful to consider 
> selective-looking runtime filters in resource estimates without changing the 
> plan topology.




[jira] [Created] (IMPALA-12454) CompoudPredicate with AND operator can result in very low selectivity.

2023-09-19 Thread Riza Suminto (Jira)

Riza Suminto created IMPALA-12454:
-

 Summary: CompoudPredicate with AND operator can result in very low 
selectivity.
 Key: IMPALA-12454
 URL: https://issues.apache.org/jira/browse/IMPALA-12454
 Project: IMPALA
  Issue Type: Improvement
  Components: Frontend
Affects Versions: Impala 4.2.0
Reporter: Riza Suminto


A CompoundPredicate with the AND operator estimates its selectivity by simply 
multiplying the selectivities of its child expressions.
[https://github.com/apache/impala/blob/3614a6a776819a1e918ce7fe833cd9e916d6002a/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java#L174-L176]
 

 

This can lead to a very low estimate, as happens in TPC-DS Q53.
{code:java}
|  F01:PLAN FRAGMENT [RANDOM] hosts=4 instances=4
|  Per-Instance Resources: mem-estimate=24.30MB mem-reservation=1.00MB thread-reservation=1
|  00:SCAN S3 [tpcds_3000_string_parquet_managed.item, RANDOM]
|     S3 partitions=1/1 files=4 size=33.54MB
|     predicates: ((i_category IN ('Books', 'Children', 'Electronics') AND i_class IN ('personal', 'portable', 'reference', 'self-help') AND i_brand IN ('scholaramalgamalg #14', 'scholaramalgamalg #7', 'exportiunivamalg #9', 'scholaramalgamalg #9')) OR (i_category IN ('Women', 'Music', 'Men') AND i_class IN ('accessories', 'classical', 'fragrances', 'pants') AND i_brand IN ('amalgimporto #1', 'edu packscholar #1', 'exportiimporto #1', 'importoamalg #1')))
|     stored statistics:
|       table: rows=360.00K size=33.54MB
|       columns: all
|     extrapolated-rows=disabled max-scan-range-rows=117.77K
|     mem-estimate=24.00MB mem-reservation=1.00MB thread-reservation=0
|     tuple-ids=0 row-size=74B cardinality=51
|     in pipelines: 00(GETNEXT) {code}
The CompoundPredicate in 00:SCAN is estimated to be far too selective, 
reducing 360K rows to just 51, while in reality the scan returns 18.53K rows.
{code:java}
|  00:SCAN S3                 4      4   18.000ms   24.000ms   18.53K          51    2.31 MB       24.00 MB  tpcds_3000_string_parquet_managed.item {code}
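
To put the gap in numbers, the short Python calculation below derives the selectivity implied by the planner estimate and by the observed row count. The inputs are taken from the plan and summary above; they are rounded display values (360.00K, 18.53K), so the results are approximate.

```python
table_rows = 360_000     # "table: rows=360.00K" in the plan
estimated_rows = 51      # "cardinality=51" in the plan
actual_rows = 18_530     # "18.53K" rows in the runtime summary


def implied_selectivity(rows_out, rows_in):
    """Fraction of input rows that pass the predicate."""
    return rows_out / rows_in


estimated_sel = implied_selectivity(estimated_rows, table_rows)
actual_sel = implied_selectivity(actual_rows, table_rows)
underestimate_factor = actual_rows / estimated_rows

print(f"estimated selectivity: {estimated_sel:.6f}")
print(f"actual selectivity:    {actual_sel:.6f}")
print(f"underestimated by:     {underestimate_factor:.0f}x")
```

The estimate is off by a factor of roughly 363, that is, more than two orders of magnitude.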
Selectivity estimation for this CompoundPredicate case should use an 
exponential backoff algorithm similar to the one in 
PlanNode.computeCombinedSelectivity().

[https://github.com/apache/impala/blob/3614a6a776819a1e918ce7fe833cd9e916d6002a/fe/src/main/java/org/apache/impala/planner/PlanNode.java#L730-L733]
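
For comparison, here is a minimal Python sketch of the two approaches: plain multiplication versus exponential backoff. It assumes the common form of the backoff, in which selectivities are sorted ascending and the i-th one (1-based) is raised to the power 1/i; the function names and the input values are hypothetical and are not taken from the Impala code or from TPC-DS statistics.

```python
def product_selectivity(selectivities):
    """Independence assumption: multiply all child selectivities."""
    result = 1.0
    for s in selectivities:
        result *= s
    return result


def backoff_selectivity(selectivities):
    """Exponential backoff: the most selective conjunct is applied fully and
    each subsequent one is dampened: s1 * s2^(1/2) * s3^(1/3) * ..."""
    result = 1.0
    for i, s in enumerate(sorted(selectivities)):
        result *= s ** (1.0 / (i + 1))
    # Keep the result within [0, 1].
    return max(0.0, min(1.0, result))


if __name__ == "__main__":
    sels = [0.1, 0.1, 0.1]  # made-up selectivities for three AND-ed predicates
    print("product:", product_selectivity(sels))
    print("backoff:", backoff_selectivity(sels))
```

For any inputs in (0, 1) the backoff estimate is never lower than the plain product, which is why it softens the underestimation for correlated predicates but can also overestimate when the predicates really are independent.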
 




[jira] [Commented] (IMPALA-12018) Consider runtime filters in resource estimates

2023-09-19 Thread Riza Suminto (Jira)



[ 
https://issues.apache.org/jira/browse/IMPALA-12018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766904#comment-17766904
 ] 

Riza Suminto commented on IMPALA-12018:
---

I have an idea: if a scan node receives a runtime filter from a join node, 
then that join node's selectivity can be applied to reduce the cardinality of 
the scan node. However, this comes with 2 requirements:
 # The runtime filter arrives on time, or it is guaranteed that the scan node 
will wait for the runtime filter to arrive (i.e., the join node right above the 
scan will not start pulling rows before its join build completes).
 # The runtime filter itself has accurate selectivity.

The second point can be tricky given that Impala's default 
RUNTIME_FILTER_ERROR_RATE == max_filter_error_rate == 0.75, and the join build 
cardinality itself can be underestimated, leading to an undersized bloom filter 
(IMPALA-12451).
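
The idea above could be sketched roughly as follows. This is a hypothetical Python illustration only: the function name, its parameters, and the two boolean flags are assumptions made for the example and do not correspond to existing planner code.

```python
def adjusted_scan_cardinality(scan_cardinality, join_selectivity,
                              filter_arrives_before_scan, filter_is_effective):
    """Apply the join's selectivity to the scan estimate only when both
    requirements hold; otherwise keep the original estimate."""
    if filter_arrives_before_scan and filter_is_effective:
        return round(scan_cardinality * join_selectivity)
    return scan_cardinality


if __name__ == "__main__":
    # Made-up numbers: a 1M-row scan feeding a join that keeps 25% of rows.
    print(adjusted_scan_cardinality(1_000_000, 0.25, True, True))
    print(adjusted_scan_cardinality(1_000_000, 0.25, False, True))
```

Falling back to the unreduced estimate whenever either requirement is in doubt keeps the resource estimate conservative.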

> Consider runtime filters in resource estimates
> --
>
> Key: IMPALA-12018
> URL: https://issues.apache.org/jira/browse/IMPALA-12018
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Csaba Ringhofer
>Assignee: Riza Suminto
>Priority: Major
>
> Currently Impala creates a plan first and looks for runtime filters based on 
> the complete plan.
> IMPALA-3573 is about considering runtime filters during join ordering, which 
> would be a major change. Meanwhile, it could also be useful to consider 
> selective-looking runtime filters in resource estimates without changing the 
> plan topology.




[jira] [Assigned] (IMPALA-12018) Consider runtime filters in resource estimates

2023-09-19 Thread Riza Suminto (Jira)



 [ 
https://issues.apache.org/jira/browse/IMPALA-12018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Riza Suminto reassigned IMPALA-12018:
-

Assignee: Riza Suminto

> Consider runtime filters in resource estimates
> --
>
> Key: IMPALA-12018
> URL: https://issues.apache.org/jira/browse/IMPALA-12018
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Csaba Ringhofer
>Assignee: Riza Suminto
>Priority: Major
>
> Currently Impala creates a plan first and looks for runtime filters based on 
> the complete plan.
> IMPALA-3573 is about considering runtime filters during join ordering, which 
> would be a major change. Meanwhile, it could also be useful to consider 
> selective-looking runtime filters in resource estimates without changing the 
> plan topology.




[jira] [Updated] (IMPALA-11877) Add support for DELETE statements for Iceberg tables

2023-09-19 Thread Michael Smith (Jira)



 [ 
https://issues.apache.org/jira/browse/IMPALA-11877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-11877:
---
Labels: docs-impacting impala-iceberg  (was: impala-iceberg)

> Add support for DELETE statements for Iceberg tables
> 
>
> Key: IMPALA-11877
> URL: https://issues.apache.org/jira/browse/IMPALA-11877
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend, Frontend
>Reporter: Zoltán Borók-Nagy
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>  Labels: docs-impacting, impala-iceberg
> Fix For: Impala 4.3.0
>
>
> Add support for DELETE statements for Iceberg tables.
> We can do it based on the following design doc: 
> https://docs.google.com/document/d/1GuRiJ3jjqkwINsSCKYaWwcfXHzbMrsd3WEMDOB11Xqw/edit#heading=h.5bmfhbmb4qdk
> Limitations:
> * only support merge-on-read
> * only write position delete files




[jira] [Updated] (IMPALA-11862) Document that the default value of ssl_cipher_list is not empty

2023-09-19 Thread Michael Smith (Jira)



 [ 
https://issues.apache.org/jira/browse/IMPALA-11862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-11862:
---
Component/s: Docs

> Document that the default value of ssl_cipher_list is not empty
> ---
>
> Key: IMPALA-11862
> URL: https://issues.apache.org/jira/browse/IMPALA-11862
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
>Priority: Major
> Fix For: Impala 4.3.0
>
>
> Since IMPALA-11240, the default value of ssl_cipher_list is not empty.
> Update 
> https://github.com/apache/impala/blob/20fe8fea58061f5da7d5c0e7d26755712d02ef79/docs/topics/impala_ssl.xml#L202
>  to show this.




[jira] [Updated] (IMPALA-11658) Implement Iceberg manifest caching configuration for Impala

2023-09-19 Thread Michael Smith (Jira)



 [ 
https://issues.apache.org/jira/browse/IMPALA-11658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-11658:
---
Labels: docs-impacting iceberg  (was: iceberg)

> Implement Iceberg manifest caching configuration for Impala
> ---
>
> Key: IMPALA-11658
> URL: https://issues.apache.org/jira/browse/IMPALA-11658
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Riza Suminto
>Assignee: Riza Suminto
>Priority: Major
>  Labels: docs-impacting, iceberg
> Fix For: Impala 4.3.0
>
>
> The Iceberg manifest caching feature has been approved upstream:
> [https://github.com/apache/iceberg/pull/4518]
> Once a new Iceberg release with this feature is available, Impala should 
> implement reading the manifest caching configuration for its Iceberg catalogs.




[jira] [Updated] (IMPALA-11758) Databases named "iceberg" confuses the parser, throws ParseException

2023-09-19 Thread Michael Smith (Jira)



 [ 
https://issues.apache.org/jira/browse/IMPALA-11758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-11758:
---
Labels: docs-impacting  (was: )

> Databases named "iceberg" confuses the parser, throws ParseException
> 
>
> Key: IMPALA-11758
> URL: https://issues.apache.org/jira/browse/IMPALA-11758
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Gergely Fürnstáhl
>Assignee: Gergely Fürnstáhl
>Priority: Major
>  Labels: docs-impacting
> Fix For: Impala 4.3.0
>
>
> Impala can't create a database named "iceberg", but Hive can. Valid queries 
> that use that database fail in Impala.
> {code:java}
> [localhost:21050] default> create database iceberg;
> Query: create database iceberg
> ERROR: ParseException: Syntax error in line 1:
> create database iceberg
>                 ^
> Encountered: ICEBERG
> Expected: DEFAULT, EXTENDED, FORMATTED, IF, IDENTIFIER {code}
> After creating the database in Hive:
> {code:java}
> [localhost:21050] default> use iceberg;
> Query: use iceberg
> ERROR: ParseException: Syntax error in line 1:
> use iceberg
>     ^
> Encountered: ICEBERG
> Expected: DEFAULT, IDENTIFIER
> CAUSED BY: Exception: Syntax error
>  {code}
> SELECT statements also fail on existing tables.
> Escaping the database name solves the issue, e.g.:
> {code:java}
> CREATE TABLE `iceberg`.`ice_9c` (i INT, t TIMESTAMP) PARTITIONED BY (j 
> BIGINT) STORED AS ICEBERG TBLPROPERTIES ('format-version' = '2');{code}
> Improving the exception message could help users in the future.




[jira] [Updated] (IMPALA-11482) Implement Iceberg table rollback feature

2023-09-19 Thread Michael Smith (Jira)



 [ 
https://issues.apache.org/jira/browse/IMPALA-11482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-11482:
---
Labels: docs-impacting impala-iceberg  (was: impala-iceberg)

> Implement Iceberg table rollback feature
> 
>
> Key: IMPALA-11482
> URL: https://issues.apache.org/jira/browse/IMPALA-11482
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Frontend
>Reporter: Zoltán Borók-Nagy
>Assignee: Andrew Sherman
>Priority: Major
>  Labels: docs-impacting, impala-iceberg
> Fix For: Impala 4.3.0
>
>
> We should allow rolling back an Iceberg table's data to its state at an older 
> table snapshot. 
> Rollback to the last snapshot before a specific timestamp
> {code}
> ALTER TABLE ice_t EXECUTE ROLLBACK('2022-05-12 00:00:00')
> {code}
> Rollback to a specific snapshot ID
> {code}
>  ALTER TABLE ice_t EXECUTE ROLLBACK(); 
> {code}
> However, to revert a rollback we might need to be able to change the table's 
> metadata_location property (like HIVE-26203).




[jira] [Updated] (IMPALA-4052) CREATE TABLE LIKE for Kudu tables

2023-09-19 Thread Michael Smith (Jira)



 [ 
https://issues.apache.org/jira/browse/IMPALA-4052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-4052:
--
Labels: docs-impacting docs-missing kudu  (was: kudu)

> CREATE TABLE LIKE for Kudu tables
> -
>
> Key: IMPALA-4052
> URL: https://issues.apache.org/jira/browse/IMPALA-4052
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 2.7.0
>Reporter: Dimitris Tsirogiannis
>Assignee: gaoxiaoqing
>Priority: Major
>  Labels: docs-impacting, docs-missing, kudu
> Fix For: Impala 4.3.0
>
>
> The semantics of CREATE TABLE LIKE when Kudu tables are involved, either as a 
> source or as a target table, are not well specified or properly implemented; 
> in some cases a misleading ImpalaRuntimeException is thrown. 
> Actions: 
> # Decide whether CREATE TABLE LIKE will be supported for Kudu tables. 
> # Implement whatever approach is decided
> # Properly document both in Impala and Kudu docs the supported operations




[jira] [Updated] (IMPALA-10860) Allow setting separate mem_limit for coordinators

2023-09-19 Thread Michael Smith (Jira)



 [ 
https://issues.apache.org/jira/browse/IMPALA-10860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-10860:
---
Labels: docs-impacting  (was: docs)

> Allow setting separate mem_limit for coordinators
> -
>
> Key: IMPALA-10860
> URL: https://issues.apache.org/jira/browse/IMPALA-10860
> Project: IMPALA
>  Issue Type: Task
>Reporter: Abhishek Rawat
>Assignee: Abhishek Rawat
>Priority: Critical
>  Labels: docs-impacting
> Fix For: Impala 4.3.0
>
>
> The mem_limit query option applies to all Impala coordinators and executors. 
> This may not be ideal for dedicated coordinators, since they generally need 
> less memory per query, and having the same memory limit reduces system-wide 
> query concurrency.
> We could add a new query option:
>  
> {code:java}
> mem_limit_coordinators
> {code}
>  




[jira] [Updated] (IMPALA-10860) Allow setting separate mem_limit for coordinators

2023-09-19 Thread Michael Smith (Jira)



 [ 
https://issues.apache.org/jira/browse/IMPALA-10860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-10860:
---
Labels: docs  (was: )

> Allow setting separate mem_limit for coordinators
> -
>
> Key: IMPALA-10860
> URL: https://issues.apache.org/jira/browse/IMPALA-10860
> Project: IMPALA
>  Issue Type: Task
>Reporter: Abhishek Rawat
>Assignee: Abhishek Rawat
>Priority: Critical
>  Labels: docs
> Fix For: Impala 4.3.0
>
>
> The mem_limit query option applies to all Impala coordinators and executors. 
> This may not be ideal for dedicated coordinators, since they generally need 
> less memory per query, and having the same memory limit reduces system-wide 
> query concurrency.
> We could add a new query option:
>  
> {code:java}
> mem_limit_coordinators
> {code}
>  




[jira] [Commented] (IMPALA-12390) Enable performance related clang-tidy checks

2023-09-19 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/IMPALA-12390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766838#comment-17766838
 ] 

ASF subversion and git services commented on IMPALA-12390:
--

Commit 3614a6a776819a1e918ce7fe833cd9e916d6002a in impala's branch 
refs/heads/master from gaurav1086
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=3614a6a77 ]

IMPALA-12390 (part 2): Enable some clang-tidy performance related checks

This enables the clang tidy performance check:
performance-inefficient-string-concatenation
"warning: string concatenation results in allocation of unnecessary
temporary strings"
Fix: Use StrCat() to concatenate multiple strings

Testing:
 - Ran bin/run_clang_tidy.sh with the new checks
 - Ran GVO

Change-Id: Ibad8bd0f12aab92ad874f5a6b9ec922dce7f3190
Reviewed-on: http://gerrit.cloudera.org:8080/20445
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Enable performance related clang-tidy checks
> 
>
> Key: IMPALA-12390
> URL: https://issues.apache.org/jira/browse/IMPALA-12390
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 4.3.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
> Fix For: Impala 4.3.0
>
>
> clang-tidy has several performance-related checks that seem like they would 
> be useful to enforce. Here are some examples:
> {noformat}
> /home/joemcdonnell/upstream/Impala/be/src/runtime/types.h:313:25: warning: 
> loop variable is copied but only used as const reference; consider making it 
> a const reference [performance-for-range-copy]
>         for (ColumnType child_type : col_type.children) {
>              ~~ ^
>              const &
> /home/joemcdonnell/upstream/Impala/be/src/catalog/catalog-util.cc:168:34: 
> warning: 'find' called with a string literal consisting of a single 
> character; consider using the more effective overload accepting a character 
> [performance-faster-string-find]
>       int pos = object_name.find(".");
>                                  ^~~~
>                                  '.'
> /home/joemcdonnell/upstream/Impala/be/src/util/decimal-util.h:55:53: warning: 
> the parameter 'b' is copied for each invocation but only used as a const 
> reference; consider making it a const reference 
> [performance-unnecessary-value-param]
>   static int256_t SafeMultiply(int256_t a, int256_t b, bool may_overflow) {
>                                             ^
>                                            const &
> /home/joemcdonnell/upstream/Impala/be/src/codegen/llvm-codegen.cc:847:5: 
> warning: 'push_back' is called inside a loop; consider pre-allocating the 
> vector capacity before the loop [performance-inefficient-vector-operation]
>     arguments.push_back(args_[i].type);
>     ^{noformat}
> In all, they seem to flag things that developers wouldn't ordinarily notice, 
> and they don't seem to have too many false positives. We should look into 
> enabling these.
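
Each of the four warnings quoted above has a mechanical fix. The following is a minimal sketch, not Impala code: ColumnType is a simplified stand-in for the real type, and CountNodes, DbNameOf, TotalLength and CollectNames are hypothetical helpers chosen only to show one fix each.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for a type that is expensive to copy (assumption).
struct ColumnType {
  std::string name;
  std::vector<ColumnType> children;
};

// performance-for-range-copy: bind the loop variable by const reference
// so each child is not copied on every iteration.
std::size_t CountNodes(const ColumnType& col_type) {
  std::size_t count = 1;
  for (const ColumnType& child_type : col_type.children) {
    count += CountNodes(child_type);
  }
  return count;
}

// performance-faster-string-find: search for the character '.' rather
// than the one-character string literal ".".
std::string DbNameOf(const std::string& object_name) {
  const std::size_t pos = object_name.find('.');
  return pos == std::string::npos ? std::string() : object_name.substr(0, pos);
}

// performance-unnecessary-value-param: a parameter that is only read is
// taken by const reference instead of by value.
std::size_t TotalLength(const std::vector<std::string>& parts) {
  std::size_t total = 0;
  for (const std::string& part : parts) total += part.size();
  return total;
}

// performance-inefficient-vector-operation: reserve the final capacity
// once before calling push_back in a loop.
std::vector<std::string> CollectNames(const std::vector<ColumnType>& args) {
  std::vector<std::string> names;
  names.reserve(args.size());
  for (const ColumnType& arg : args) names.push_back(arg.name);
  assert(names.size() == args.size());
  return names;
}
```

None of these changes alter behavior; they only remove copies or reallocations, which is why such checks tend to be safe to apply broadly.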




[jira] [Created] (IMPALA-12453) TestIcebergV2Table::test_delete_partitioned slows down then times out in exhaustive mode

2023-09-19 Thread Laszlo Gaal (Jira)

Laszlo Gaal created IMPALA-12453:


 Summary: TestIcebergV2Table::test_delete_partitioned slows down 
then times out in exhaustive mode
 Key: IMPALA-12453
 URL: https://issues.apache.org/jira/browse/IMPALA-12453
 Project: IMPALA
  Issue Type: Bug
Affects Versions: Impala 4.3.0
Reporter: Laszlo Gaal


Exhaustive runs slow down, then eventually time out during Iceberg deletion 
tests. First the interval between reported test steps increased from minutes to 
two hours (maybe an internal timeout?), then the build's internal timeout for 
the whole test phase was triggered, which killed the test phase with core dumps.

This is the transition from PASSED to FAILED in the test run:{code}
19:44:16 
query_test/test_iceberg.py::TestIcebergV2Table::test_delete_partitioned[protocol:
 beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
parquet/none | disable_optimized_iceberg_v2_read: 1] 
19:46:37 [gw4] PASSED 
query_test/test_iceberg.py::TestIcebergV2Table::test_delete_partitioned[protocol:
 beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
parquet/none | disable_optimized_iceberg_v2_read: 1] 
19:46:37 
query_test/test_iceberg.py::TestIcebergV2Table::test_delete_partitioned[protocol:
 beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
parquet/none | disable_optimized_iceberg_v2_read: 0] 
19:48:18 [gw4] FAILED 
query_test/test_iceberg.py::TestIcebergV2Table::test_delete_partitioned[protocol:
 beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
parquet/none | disable_optimized_iceberg_v2_read: 0] 
{code}




[jira] [Assigned] (IMPALA-12432) Keep LdapKerberosImpalaShellTest* compatible with older guava versions

2023-09-19 Thread Laszlo Gaal (Jira)



 [ 
https://issues.apache.org/jira/browse/IMPALA-12432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Gaal reassigned IMPALA-12432:


Assignee: Joe McDonnell

> Keep LdapKerberosImpalaShellTest* compatible with older guava versions
> --
>
> Key: IMPALA-12432
> URL: https://issues.apache.org/jira/browse/IMPALA-12432
> Project: IMPALA
>  Issue Type: Task
>  Components: Frontend
>Affects Versions: Impala 4.3.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
> Fix For: Impala 4.3.0
>
>
> LdapKerberosImpalaShellTestBase.java and LdapKerberosImpalaShellTest.java use 
> the ImmutableMap.of function with 8+ pairs. Older versions of guava like 
> 28.1-jre do not have ImmutableMap.of() for that number of arguments.
> Since we often want to use the guava version that the underlying Hadoop/Hive 
> use, it can be useful for compatibility to be able to build against older 
> guava (like 28.1-jre).
> Most other code is fine, so if we switch these locations to use 
> ImmutableMap.builder(), then the whole codebase can compile 
> with the older guava (while remaining forward compatible as well).




[jira] [Commented] (IMPALA-12402) Make CatalogdMetaProvider's cache concurrency level configurable

2023-09-19 Thread Maxwell Guo (Jira)



[ 
https://issues.apache.org/jira/browse/IMPALA-12402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766811#comment-17766811
 ] 

Maxwell Guo commented on IMPALA-12402:
--

It seems the final test run failed in the py tests.

> Make CatalogdMetaProvider's cache concurrency level configurable
> 
>
> Key: IMPALA-12402
> URL: https://issues.apache.org/jira/browse/IMPALA-12402
> Project: IMPALA
>  Issue Type: Improvement
>  Components: fe
>Reporter: Maxwell Guo
>Assignee: Maxwell Guo
>Priority: Minor
>  Labels: pull-request-available
>
> When the cluster contains many databases and tables (for example, more than 
> 10 tables) and we restart the impalad, the local cache (cache_) in 
> CatalogdMetaProvider needs to go through a loading process. 
> As we know, the concurrencyLevel of Google's Guava cache is set to 4 by 
> default. 
> But if there are many tables, the loading process needs more time and the 
> probability of lock contention increases; see 
> [here|https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/CacheBuilder.java#L437].
>  
> So we propose adding some configuration options here, the first being the 
> concurrency level of the cache.




[jira] [Updated] (IMPALA-12293) Optimize statement for compacting Iceberg tables

2023-09-19 Thread Noemi Pap-Takacs (Jira)



 [ 
https://issues.apache.org/jira/browse/IMPALA-12293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noemi Pap-Takacs updated IMPALA-12293:
--
Description: 
A simple syntax to compact Iceberg tables. It executes the following tasks:
 * compact small files
 * rewrite partitions according to latest spec
 * merge delete deltas

{code:java}
Syntax:

OPTIMIZE TABLE 
[ REWRITE DATA ]
[ ( { FILE_SIZE_THRESHOLD | MIN_INPUT_FILES } =  [, ... ] ) ]
[ WHERE  ];{code}
Limitations - OPTIMIZE TABLE cannot be executed on the following tables:
 * Non-Iceberg tables.
 * Tables with complex-type columns. Currently, Impala does not support 
writing complex types.
 * Tables whose 'write.format.default' is not Parquet. Impala can only write 
Parquet files.

  was:
A simple syntax to compact Iceberg tables. It executes the following tasks:
 * compact small files
 * rewrite partitions according to latest spec
 * merge delete deltas

{code:java}
Syntax:

OPTIMIZE TABLE 
[ REWRITE DATA ]
[ ( { FILE_SIZE_THRESHOLD | MIN_INPUT_FILES } =  [, ... ] ) ]
[ WHERE  ];{code}
 


> Optimize statement for compacting Iceberg tables
> 
>
> Key: IMPALA-12293
> URL: https://issues.apache.org/jira/browse/IMPALA-12293
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Frontend
>Reporter: Noemi Pap-Takacs
>Assignee: Noemi Pap-Takacs
>Priority: Major
>  Labels: impala-iceberg
>
> A simple syntax to compact Iceberg tables. It executes the following tasks:
>  * compact small files
>  * rewrite partitions according to latest spec
>  * merge delete deltas
> {code:java}
> Syntax:
> OPTIMIZE TABLE 
> [ REWRITE DATA ]
> [ ( { FILE_SIZE_THRESHOLD | MIN_INPUT_FILES } =  [, ... ] ) ]
> [ WHERE  ];{code}
> Limitations - OPTIMIZE TABLE cannot be executed on the following tables:
>  * Non-Iceberg tables.
>  * Tables with complex-type columns. Currently, Impala does not support 
> writing complex types.
>  * Tables whose 'write.format.default' is not Parquet. Impala can only write 
> Parquet files.




[jira] [Assigned] (IMPALA-12448) Refreshing a non-existent partition may be stuck for a long time

2023-09-19 Thread Zhi Tang (Jira)



 [ 
https://issues.apache.org/jira/browse/IMPALA-12448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhi Tang reassigned IMPALA-12448:
-

Assignee: Zhi Tang

> Refreshing a non-existent partition  may be stuck for a long time
> -
>
> Key: IMPALA-12448
> URL: https://issues.apache.org/jira/browse/IMPALA-12448
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.2.0
>Reporter: Zhi Tang
>Assignee: Zhi Tang
>Priority: Major
> Attachments: image-2023-09-14-11-23-55-517.png
>
>
> When sync_ddl is set to true, refreshing a non-existent partition may be 
> stuck for a long time until the table has a new update log added to 
> topicUpdateLog_.




[jira] [Assigned] (IMPALA-11971) Simple syntax to Compact, RePartition, Clean Orphans etc for Iceberg Tables

2023-09-19 Thread Noemi Pap-Takacs (Jira)



 [ 
https://issues.apache.org/jira/browse/IMPALA-11971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noemi Pap-Takacs reassigned IMPALA-11971:
-

Assignee: (was: Noemi Pap-Takacs)

> Simple syntax to Compact, RePartition, Clean Orphans etc for Iceberg Tables
> ---
>
> Key: IMPALA-11971
> URL: https://issues.apache.org/jira/browse/IMPALA-11971
> Project: IMPALA
>  Issue Type: Epic
>  Components: Frontend
>Reporter: Manish Maheshwari
>Priority: Major
>  Labels: impala-iceberg
>
> Impala supports overwriting Iceberg tables. Overwriting Iceberg tables / 
> partitions can be used as a cheap way to implement the following without the 
> user having to run Spark jobs:
>  * compact small files
>  * rewrite partitions according to latest spec
>  * merge deltas due to deletes and updates (if any) due to Merge-on-read 
> strategy
>  * delete orphan files in table/partition
> Doing all of this as part of an overwrite partition is not intuitive. We 
> should support a syntax verb like `compact` or `consolidate` or `tune` to do 
> these operations 
> {code:java}
> alter table compact table;
> alter table compact table partition <>;
> alter table compact table partition <>;{code}




[jira] [Updated] (IMPALA-12452) support for functions like percentile in hive

2023-09-19 Thread Weizisheng (Jira)



 [ 
https://issues.apache.org/jira/browse/IMPALA-12452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weizisheng updated IMPALA-12452:

Description: 
Hi, do we have any plans for built-in functions like the ‘percentile‘ series in 
Hive?  This may be helpful when I want to migrate tasks from Hive to Impala.

[https://cwiki.apache.org/confluence/display/hive/languagemanual+udf#LanguageManualUDF-Built-inFunctions|http://example.com/]

  was:
Hi, is there any plan for built-in functions like the ‘percentile‘ series in Hive.  
This may be helpful when users want to migrate tasks from Hive to Impala.

[https://cwiki.apache.org/confluence/display/hive/languagemanual+udf#LanguageManualUDF-Built-inFunctions|http://example.com/]


> support for functions like percentile in hive
> -
>
> Key: IMPALA-12452
> URL: https://issues.apache.org/jira/browse/IMPALA-12452
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Frontend
>Reporter: Weizisheng
>Priority: Major
>
> Hi, do we have any plans for built-in functions like the ‘percentile‘ series in 
> Hive?  This may be helpful when I want to migrate tasks from Hive to Impala.
> [https://cwiki.apache.org/confluence/display/hive/languagemanual+udf#LanguageManualUDF-Built-inFunctions|http://example.com/]




[jira] [Updated] (IMPALA-12452) support for functions like percentile in Hive

2023-09-19 Thread Weizisheng (Jira)



 [ 
https://issues.apache.org/jira/browse/IMPALA-12452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weizisheng updated IMPALA-12452:

Summary: support for functions like percentile in Hive  (was: support for 
functions like percentile in hive)

> support for functions like percentile in Hive
> -
>
> Key: IMPALA-12452
> URL: https://issues.apache.org/jira/browse/IMPALA-12452
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Frontend
>Reporter: Weizisheng
>Priority: Major
>
> Hi, do we have any plans for built-in functions like the ‘percentile‘ series in 
> Hive?  This may be helpful when I want to migrate tasks from Hive to Impala.
> [https://cwiki.apache.org/confluence/display/hive/languagemanual+udf#LanguageManualUDF-Built-inFunctions|http://example.com/]




[jira] [Created] (IMPALA-12452) support for functions like percentile in hive

2023-09-19 Thread Weizisheng (Jira)

Weizisheng created IMPALA-12452:
---

 Summary: support for functions like percentile in hive
 Key: IMPALA-12452
 URL: https://issues.apache.org/jira/browse/IMPALA-12452
 Project: IMPALA
  Issue Type: New Feature
  Components: Frontend
Reporter: Weizisheng


hi， is there any plan for built-in functions like the ‘percentile‘ series in Hive.  
This may be helpful when users want to migrate tasks from Hive to Impala.

[https://cwiki.apache.org/confluence/display/hive/languagemanual+udf#LanguageManualUDF-Built-inFunctions|http://example.com]




[jira] [Updated] (IMPALA-12452) support for functions like percentile in hive

2023-09-19 Thread Weizisheng (Jira)



 [ 
https://issues.apache.org/jira/browse/IMPALA-12452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weizisheng updated IMPALA-12452:

Description: 
Hi, is there any plan for built-in functions like the ‘percentile‘ series in Hive.  
This may be helpful when users want to migrate tasks from Hive to Impala.

[https://cwiki.apache.org/confluence/display/hive/languagemanual+udf#LanguageManualUDF-Built-inFunctions|http://example.com/]

  was:
hi， is there any plan for built-in functions like the ‘percentile‘ series in Hive.  
This may be helpful when users want to migrate tasks from Hive to Impala.

[https://cwiki.apache.org/confluence/display/hive/languagemanual+udf#LanguageManualUDF-Built-inFunctions|http://example.com]


> support for functions like percentile in hive
> -
>
> Key: IMPALA-12452
> URL: https://issues.apache.org/jira/browse/IMPALA-12452
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Frontend
>Reporter: Weizisheng
>Priority: Major
>
> Hi, is there any plan for built-in functions like the ‘percentile‘ series in 
> Hive.  This may be helpful when users want to migrate tasks from Hive to Impala.
> [https://cwiki.apache.org/confluence/display/hive/languagemanual+udf#LanguageManualUDF-Built-inFunctions|http://example.com/]




[jira] [Commented] (IMPALA-9641) Query hang when containing alias names as empty backticks

2023-09-19 Thread Weizisheng (Jira)



[ 
https://issues.apache.org/jira/browse/IMPALA-9641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766670#comment-17766670
 ] 

Weizisheng commented on IMPALA-9641:


Hi, will this be backported to 3.4.x?

> Query hang when containing alias names as empty backticks
> -
>
> Key: IMPALA-9641
> URL: https://issues.apache.org/jira/browse/IMPALA-9641
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.4.0
>Reporter: Quanlong Huang
>Assignee: Tamas Mate
>Priority: Blocker
>  Labels: hang
> Fix For: Impala 4.0.0
>
>
> The following query will hang in an infinite loop:
> {code:java}
> select 1 as "``";
> {code}
> Stacktrace of its compiler thread:
> {code:java}
> "Thread-19" #34 prio=5 os_prio=0 tid=0x12fc nid=0x5514 runnable [0x7f2abda41000]
>    java.lang.Thread.State: RUNNABLE
> at java.io.FileOutputStream.writeBytes(Native Method)
> at java.io.FileOutputStream.write(FileOutputStream.java:326)
> at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> - locked <0x0005cc90f7b8> (a java.io.BufferedOutputStream)
> at java.io.PrintStream.write(PrintStream.java:482)
> - locked <0x0005cc90f798> (a java.io.PrintStream)
> at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
> at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
> at sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:104)
> - locked <0x0005cc90f8d8> (a java.io.OutputStreamWriter)
> at java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:185)
> at java.io.PrintStream.write(PrintStream.java:527)
> - locked <0x0005cc90f798> (a java.io.PrintStream)
> at java.io.PrintStream.print(PrintStream.java:669)
> at java.io.PrintStream.println(PrintStream.java:806)
> - locked <0x0005cc90f798> (a java.io.PrintStream)
> at org.antlr.runtime.BaseRecognizer.emitErrorMessage(BaseRecognizer.java:344)
> at org.antlr.runtime.BaseRecognizer.displayRecognitionError(BaseRecognizer.java:194)
> at org.antlr.runtime.Lexer.reportError(Lexer.java:261)
> at org.antlr.runtime.Lexer.nextToken(Lexer.java:103)
> at org.apache.impala.analysis.ToSqlUtils.hiveNeedsQuotes(ToSqlUtils.java:145)
> at org.apache.impala.analysis.ToSqlUtils.getIdentSql(ToSqlUtils.java:199)
> at org.apache.impala.analysis.SlotRef.(SlotRef.java:58)
> at org.apache.impala.analysis.SelectStmt$SelectAnalyzer.analyzeSelectClause(SelectStmt.java:283)
> at org.apache.impala.analysis.SelectStmt$SelectAnalyzer.analyze(SelectStmt.java:215)
> at org.apache.impala.analysis.SelectStmt$SelectAnalyzer.access$100(SelectStmt.java:199)
> at org.apache.impala.analysis.SelectStmt.analyze(SelectStmt.java:192)
> at org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:473)
> at org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:437)
> at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1530)
> at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1497)
> at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1467)
> at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:154)
> {code}
> org.antlr.runtime.Lexer keeps emitting the same error message to stderr 
> (which is redirected to impalad.ERROR):
> {code:java}
> line 1:0 rule Identifier failed predicate: {allowQuotedId()}?
> line 1:0 rule Identifier failed predicate: {allowQuotedId()}?
> line 1:0 rule Identifier failed predicate: {allowQuotedId()}?
> {code}



