[jira] [Commented] (IMPALA-12458) When impalad and catalogd are started concurrently on a Kerberos-enabled node, catalogd will report authentication failure

2024-01-30 Thread Quanlong Huang (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812554#comment-17812554
 ] 

Quanlong Huang commented on IMPALA-12458:
-

Sorry to be late on this. Is this a concurrency issue that occurs only when 
catalogd and impalad launch in parallel on the same node? Or does it also 
happen when launching impalad on a node that already runs catalogd?

> When impalad and catalogd are started concurrently on a Kerberos-enabled 
> node, catalogd will report authentication failure
> --
>
> Key: IMPALA-12458
> URL: https://issues.apache.org/jira/browse/IMPALA-12458
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 4.0.0
> Environment: RHEL7
>Reporter: Davy Xu
>Priority: Major
> Attachments: Catalog.log, Impala.log
>
>
> [~tlipcon], would you please take a look at this problem?
> After enabling Kerberos for Impala, start impalad and catalogd concurrently 
> on a node. The catalogd log periodically reports authentication failures, 
> with the following log: 
> 2023-09-23 10:55:08,821 INFO catalog: Invalidating all metadata. Version: 0
> 2023-09-23 10:55:08,983 WARN catalog: Exception encountered while connecting 
> to the server : org.apache.hadoop.security.AccessControlException: Client 
> cannot authenticate via:[TOKEN, KERBEROS]
> 2023-09-23 10:55:08,995 WARN catalog: Exception encountered while connecting 
> to the server : org.apache.hadoop.security.AccessControlException: Client 
> cannot authenticate via:[TOKEN, KERBEROS]
> 2023-09-23 10:55:09,052 ERROR catalog: Error loading cache pools: 
> Java exception follows:
> java.io.IOException: DestHost:destPort host-10-235-65-168:9000 , 
> LocalHost:localPort host-10-235-65-90/191.188.1.223:0. Failed on local 
> exception: java.io.IOException: 
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
> via:[TOKEN, KERBEROS]
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:840)
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:815)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1566)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1508)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1405)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>   at com.sun.proxy.$Proxy9.listCachePools(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.listCachePools(ClientNamenodeProtocolTranslatorPB.java:1502)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>   at com.sun.proxy.$Proxy10.listCachePools(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocol.CachePoolIterator.makeRequest(CachePoolIterator.java:51)
>   at 
> org.apache.hadoop.hdfs.protocol.CachePoolIterator.makeRequest(CachePoolIterator.java:33)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequest(BatchedRemoteIterator.java:77)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequestIfNeeded(BatchedRemoteIterator.java:85)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.hasNext(BatchedRemoteIterator.java:99)
>   at 
> org.apache.impala.catalog.CatalogServiceCatalog$CachePoolReader.run(CatalogServiceCatalog.java:544)
>   at 
> org.apache.impala.catalog.CatalogServiceCatalog.reset(CatalogServiceCatalog.java:1904

[jira] [Updated] (IMPALA-12692) Typo in docs about random() function

2024-01-30 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-12692:

Labels: newbie  (was: )

> Typo in docs about random() function
> 
>
> Key: IMPALA-12692
> URL: https://issues.apache.org/jira/browse/IMPALA-12692
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Affects Versions: Impala 4.4.0
>Reporter: Joe McDonnell
>Priority: Major
>  Labels: newbie
>
> The docs for rand()/random() have a typo that refers to "randome":
> {noformat}
>           RAND(), RAND(BIGINT seed), RANDOME(), RANDOME(BIGINT seed){noformat}
> [https://github.com/apache/impala/blob/master/docs/topics/impala_math_functions.xml#L1466]
> It should be random().
> {noformat}
> [localhost:21050] functional> select randome();
> Query: select randome()
> Query submitted at: 2024-01-08 22:50:44 (Coordinator: 
> http://joemcdonnell-22743:25000)
> ERROR: AnalysisException: functional.randome() unknown for database 
> functional. Currently this db has 0 functions.
> [localhost:21050] functional> select random();
> Query: select random()
> Query submitted at: 2024-01-08 22:51:21 (Coordinator: 
> http://joemcdonnell-22743:25000)
> Query progress can be monitored at: 
> http://joemcdonnell-22743:25000/query_plan?query_id=7e467178f9a8af8e:771d01ee
> +--+
> | random()     |
> +--+
> | 0.4784972579 |
> +--+
> Fetched 1 row(s) in 0.11s
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-12693) Typo in link for ltrim in string functions docs

2024-01-30 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-12693:

Labels: newbie  (was: )

> Typo in link for ltrim in string functions docs
> ---
>
> Key: IMPALA-12693
> URL: https://issues.apache.org/jira/browse/IMPALA-12693
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Affects Versions: Impala 4.4.0
>Reporter: Joe McDonnell
>Priority: Major
>  Labels: newbie
>
> The link text for this URL is wrong:
> {noformat}
>       
>         LTRI 
>       {noformat}
> It should be LTRIM, not LTRI.






[jira] [Commented] (IMPALA-12713) Impala fails to start in IPv6 environment

2024-01-30 Thread Quanlong Huang (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812552#comment-17812552
 ] 

Quanlong Huang commented on IMPALA-12713:
-

Thanks for reporting the issue. Here is where the error comes from:
{code:cpp}
  if (getaddrinfo(hostname.c_str(), NULL, &hints, &addr_info) != 0) {
    stringstream ss;
    ss << "Could not find IPv4 address for: " << hostname;
    return Status(ss.str());
  }{code}
[https://github.com/apache/impala/blob/adfe82c97c0772cf9d336d88254f0a8b3acc7957/be/src/util/network-util.cc#L79-L83]
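
For illustration only, here is a minimal Python sketch (not Impala's code, and not a proposed patch) of the general idea behind making such a lookup tolerate IPv6-only hosts: query with AF_UNSPEC, prefer an IPv4 result, and fall back to IPv6 when no IPv4 address exists. The function names pick_address and resolve are hypothetical.

```python
import socket

def pick_address(addr_infos, prefer=socket.AF_INET):
    """Return the first address of the preferred family, else the first
    IPv4/IPv6 address seen, else None. addr_infos uses the tuple layout
    returned by socket.getaddrinfo()."""
    fallback = None
    for family, _type, _proto, _canon, sockaddr in addr_infos:
        if family == prefer:
            return sockaddr[0]
        if fallback is None and family in (socket.AF_INET, socket.AF_INET6):
            fallback = sockaddr[0]
    return fallback

def resolve(hostname):
    """Resolve hostname to an IP address string (hypothetical helper)."""
    try:
        # AF_UNSPEC asks the resolver for both IPv4 and IPv6 results.
        infos = socket.getaddrinfo(hostname, None,
                                   family=socket.AF_UNSPEC,
                                   type=socket.SOCK_STREAM)
    except socket.gaierror as e:
        raise RuntimeError(f"Could not find IP address for: {hostname}") from e
    addr = pick_address(infos)
    if addr is None:
        raise RuntimeError(f"Could not find IP address for: {hostname}")
    return addr
```

Whether preferring IPv4 is the right policy for Impala is an open question for the ticket; the sketch only shows that the lookup itself need not be restricted to one address family.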

> Impala fails to start in IPv6 environment
> -
>
> Key: IMPALA-12713
> URL: https://issues.apache.org/jira/browse/IMPALA-12713
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0.0
>Reporter: Davy Xu
>Priority: Major
>
> Impala fails to start in the IPv6 environment. The error log is as follows:
> Could not find IPv4 address for: myhost. Impalad exiting.






[jira] [Commented] (IMPALA-12771) Impala catalogd events-skipped may mark the wrong number

2024-01-30 Thread Maxwell Guo (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812553#comment-17812553
 ] 

Maxwell Guo commented on IMPALA-12771:
--

Thanks for the reminder. I think I have used it before.

> Impala catalogd events-skipped may mark the wrong number
> 
>
> Key: IMPALA-12771
> URL: https://issues.apache.org/jira/browse/IMPALA-12771
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Reporter: Maxwell Guo
>Assignee: Maxwell Guo
>Priority: Minor
>
> See the description of [event-skipped 
> metric|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java#L237]
>  
> {code:java}
>   // total number of events which are skipped because of the flag setting or
>   // in case of [CREATE|DROP] events on [DATABASE|TABLE|PARTITION] which were ignored
>   // because the [DATABASE|TABLE|PARTITION] was already [PRESENT|ABSENT] in the catalogd.
> {code}
>  
> For CREATE and DROP events on Database/Table/Partition (AddPartition is also 
> included), when the database or table is not found in the cache, we skip 
> processing the event and increment the events-skipped metric by 1.
> However, I found some inconsistencies for alter table and Reload events:
> * For alter table, renaming a table also increments the events-skipped 
> metric; see [oldTblRemoved to be false 
> |https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java#L1653]
> * Reload events are not mentioned in the description of events-skipped, but 
> the metric is incremented when the event is an old event.
> * Besides, if the table is in the blacklist, the metric is also incremented.
> In summary, I think this description is inconsistent with the actual 
> implementation.
> So can we also mark the events-skipped metric for alter partition events and 
> modify the description to cover all the skipped events?






[jira] [Commented] (IMPALA-12723) delete unused repo

2024-01-30 Thread Quanlong Huang (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812549#comment-17812549
 ] 

Quanlong Huang commented on IMPALA-12723:
-

[~morningman] Maybe you want to create an INFRA jira?

> delete unused repo
> --
>
> Key: IMPALA-12723
> URL: https://issues.apache.org/jira/browse/IMPALA-12723
> Project: IMPALA
>  Issue Type: Task
>  Components: Infrastructure
>Reporter: Mingyu Chen
>Priority: Major
>
> Dear infra:
> I created the wrong repo by mistake:
> https://github.com/apache/doris-doris-streamloader
> Please help me delete it; it is an empty repo.






[jira] [Assigned] (IMPALA-12755) could not resolve artifact 'cdh-root:6.x-20220131.115617-21547795'

2024-01-30 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang reassigned IMPALA-12755:
---

Assignee: Quanlong Huang

> could not resolve artifact 'cdh-root:6.x-20220131.115617-21547795'
> --
>
> Key: IMPALA-12755
> URL: https://issues.apache.org/jira/browse/IMPALA-12755
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0, Impala 3.4.1
>Reporter: guojingfeng
>Assignee: Quanlong Huang
>Priority: Major
> Attachments: image-2024-01-25-10-53-04-706.png
>
>
> We could not resolve 'cdh-root:6.x-20220131.115617-21547795' when building 
> impala-3.4.x versions. 
> {code:java}
> [INFO] BUILD FAILURE
> [ERROR] Failed to execute goal on project impala-minimal-hive-exec: Could not 
> resolve dependencies for project 
> org.apache.impala:impala-minimal-hive-exec:jar:0.1-SNAPSHOT: Failed to 
> collect dependencies at org.apache.hive:hive-exec:jar:2.1.1-cdh6.x-SNAPSHOT: 
> Failed to read artifact descriptor for 
> org.apache.hive:hive-exec:jar:2.1.1-cdh6.x-SNAPSHOT: Could not find artifact 
> com.cloudera.cdh:cdh-root:pom:6.x-20220131.115617-21547795 in 
> cdh.rcs.releases.repo 
> (https://repository.cloudera.com/artifactory/cdh-releases-rcs) -> [Help 
> 1]{code}
> There is only metadata in 'cdh.rcs.releases.repo'; the artifact itself is 
> missing.
> !image-2024-01-25-10-53-04-706.png!






[jira] [Resolved] (IMPALA-12762) Build package from scratch failed by cmake error

2024-01-30 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang resolved IMPALA-12762.
-
Fix Version/s: Impala 4.4.0
   Resolution: Fixed

Resolving this. Thanks [~zhangyifan27] for the quick fix!

> Build package from scratch failed by cmake error
> 
>
> Key: IMPALA-12762
> URL: https://issues.apache.org/jira/browse/IMPALA-12762
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 4.3.0
>Reporter: Quanlong Huang
>Assignee: YifanZhang
>Priority: Major
> Fix For: Impala 4.4.0
>
>
> Saw the following error when building the RPM package from scratch:
> {noformat}
> CMake Error at be/CMakeLists.txt:774 (ADD_DEPENDENCIES):
>   Cannot add target-level dependencies to non-existent target
>   "unified-be-test".
>   The add_dependencies works for top-level logical targets created by the
>   add_executable, add_library, or add_custom_target commands.  If you want to
>   add file-level dependencies see the DEPENDS option of the add_custom_target
>   and add_custom_command commands.
> Call Stack (most recent call first):
>   be/CMakeLists.txt:795 (ADD_UNIFIED_BE_TEST)
>   be/src/exec/json/CMakeLists.txt:36 (ADD_UNIFIED_BE_LSAN_TEST) {noformat}
> The command is
> {code:java}
> ./buildall.sh -noclean -notests -package{code}
> or
> {code:java}
> ./buildall.sh -noclean -notests -release -package{code}
> However, if I build the binaries (including tests) with -package, it succeeds.
> {code:java}
> ./buildall.sh -noclean -skiptests -release -package{code}






[jira] [Commented] (IMPALA-12771) Impala catalogd events-skipped may mark the wrong number

2024-01-30 Thread Quanlong Huang (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812545#comment-17812545
 ] 

Quanlong Huang commented on IMPALA-12771:
-

[~maxwellguo] Thanks for raising this! Agree that the events-skipped metric is 
inconsistent in some cases. Feel free to submit a patch.

Note that we use Gerrit for code review: 
https://cwiki.apache.org/confluence/display/IMPALA/Using+Gerrit+to+submit+and+review+patches

CC [~hemanth619], [~VenuReddy]

> Impala catalogd events-skipped may mark the wrong number
> 
>
> Key: IMPALA-12771
> URL: https://issues.apache.org/jira/browse/IMPALA-12771
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Reporter: Maxwell Guo
>Assignee: Maxwell Guo
>Priority: Minor
>
> See the description of [event-skipped 
> metric|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java#L237]
>  
> {code:java}
>   // total number of events which are skipped because of the flag setting or
>   // in case of [CREATE|DROP] events on [DATABASE|TABLE|PARTITION] which were ignored
>   // because the [DATABASE|TABLE|PARTITION] was already [PRESENT|ABSENT] in the catalogd.
> {code}
>  
> For CREATE and DROP events on Database/Table/Partition (AddPartition is also 
> included), when the database or table is not found in the cache, we skip 
> processing the event and increment the events-skipped metric by 1.
> However, I found some inconsistencies for alter table and Reload events:
> * For alter table, renaming a table also increments the events-skipped 
> metric; see [oldTblRemoved to be false 
> |https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java#L1653]
> * Reload events are not mentioned in the description of events-skipped, but 
> the metric is incremented when the event is an old event.
> * Besides, if the table is in the blacklist, the metric is also incremented.
> In summary, I think this description is inconsistent with the actual 
> implementation.
> So can we also mark the events-skipped metric for alter partition events and 
> modify the description to cover all the skipped events?






[jira] [Commented] (IMPALA-12471) Scripts to run unit-tests of external jdbc table for MySQL

2024-01-30 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812544#comment-17812544
 ] 

ASF subversion and git services commented on IMPALA-12471:
--

Commit adfe82c97c0772cf9d336d88254f0a8b3acc7957 in impala's branch 
refs/heads/master from gaurav1086
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=adfe82c97 ]

IMPALA-12471 PART-2: skip mysql ext jdbc tests if
setup environment fails.

This patch modifies the mysql tests to be marked as xfailed
if setting up the mysql environment fails.

Change-Id: Ib7829aed09d25ff3e636004f3d1f32ecc6f37299
Reviewed-on: http://gerrit.cloudera.org:8080/20975
Reviewed-by: Impala Public Jenkins 
Tested-by: Wenzhe Zhou 


> Scripts to run unit-tests of external jdbc table for MySQL
> --
>
> Key: IMPALA-12471
> URL: https://issues.apache.org/jira/browse/IMPALA-12471
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Wenzhe Zhou
>Assignee: gaurav singh
>Priority: Major
> Fix For: Impala 4.4.0
>
>
> Need to write scripts to run unit-tests of external jdbc table for MySQL. 
> These scripts could be used as references for running unit tests against 
> other SQL servers, like MSSql, Oracle, etc.
> Including:
>download, install and start MySQL server
>create user
>create table and load data
>download jdbc driver and copy driver jar file to hdfs
>write sqls to create Impala external data source and external jdbc table
>run sqls from impala-shell command line






[jira] [Commented] (IMPALA-12765) Balance consecutive partitions better for Iceberg tables

2024-01-30 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812543#comment-17812543
 ] 

ASF subversion and git services commented on IMPALA-12765:
--

Commit 62a3168eca955119ad7c01b1f4d91a9702efd397 in impala's branch 
refs/heads/master from Zoltan Borok-Nagy
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=62a3168ec ]

IMPALA-12765: Balance consecutive partitions better for Iceberg tables

During remote read scheduling Impala does the following:

Non-Iceberg tables
 * The scheduler processes the scan ranges in partition key order
 * The scheduler selects N executors as candidates
 * The scheduler chooses the executor from the candidates based on
   minimum number of assigned bytes
 * So consecutive partitions are more likely to be assigned to
   different executors

Iceberg tables
 * The scheduler processes the scan ranges in random order
 * The scheduler selects N executors as candidates
 * The scheduler chooses the executor from the candidates based on
   minimum number of assigned bytes
 * So consecutive partitions (by partition key order) are assigned
   randomly, i.e. there's a higher chance of clustering

With this patch, IcebergScanNode orders its file descriptors based on
their paths, so we will have a more balanced scheduling for consecutive
partitions. It is especially important for queries that prune partitions
via runtime filters (e.g. due to a JOIN), because even if we schedule
the scan ranges evenly, the scan ranges that survive the
runtime filters can still be clustered on certain executors.

E.g. TPC-DS Q22 has the following JOIN and WHERE predicates:

 inv_date_sk=d_date_sk and
 d_month_seq between 1199 and 1199 + 11

The Inventory table is partitioned by column inv_date_sk, and we filter
the rows in the joined table by 'd_month_seq between 1199 and
1199 + 11'. This means that we will only need a range of partitions from
the Inventory table, but that range will only be revealed during
runtime. Scheduling neighbouring partitions to different executors means
that the surviving partitions are spread across executors more evenly.
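
The effect described above can be sketched with a toy simulation. This is a simplified illustration in Python, not Impala's scheduler: every executor is treated as a candidate for every file, all files are assumed to have the same size, and the names schedule, surviving_load, the executor names and the file paths are hypothetical.

```python
import random

def schedule(file_paths, executors, size_of=lambda path: 1):
    """Assign each file, in the given order, to the executor with the
    fewest assigned bytes so far (ties go to the earlier executor).
    Simplification: all executors are candidates for every file."""
    assigned_bytes = {e: 0 for e in executors}
    assignment = {}
    for path in file_paths:
        chosen = min(executors, key=lambda e: assigned_bytes[e])
        assigned_bytes[chosen] += size_of(path)
        assignment[path] = chosen
    return assignment

def surviving_load(assignment, surviving_paths):
    """Count how many surviving files each executor has to scan."""
    load = {}
    for path in surviving_paths:
        executor = assignment[path]
        load[executor] = load.get(executor, 0) + 1
    return load

# One file per partition; partition keys are consecutive.
paths = [f"inv_date_sk={d}/data.parquet" for d in range(100, 140)]
executors = ["exec-0", "exec-1", "exec-2", "exec-3"]
# A contiguous range of partitions that a runtime filter keeps.
survivors = paths[8:16]

# Files processed in path order versus in a shuffled order.
ordered = schedule(sorted(paths), executors)
shuffled_paths = paths[:]
random.Random(0).shuffle(shuffled_paths)
shuffled = schedule(shuffled_paths, executors)

print("path order:  ", surviving_load(ordered, survivors))
print("random order:", surviving_load(shuffled, survivors))
```

With path ordering, the toy scheduler degenerates to round-robin over neighbouring partitions, so any contiguous surviving range is split evenly. With a shuffled order the total per-executor load is still even, but the surviving range may not be.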

Testing:
 * e2e test

Change-Id: I60773965ecbb4d8e659db158f1f0ac76086d5578
Reviewed-on: http://gerrit.cloudera.org:8080/20973
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Balance consecutive partitions better for Iceberg tables
> 
>
> Key: IMPALA-12765
> URL: https://issues.apache.org/jira/browse/IMPALA-12765
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Zoltán Borók-Nagy
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>  Labels: impala-iceberg
>
> During scheduling Impala does the following:
> * Non-Iceberg tables
> ** The scheduler processes the scan ranges in partition key order
> ** The scheduler selects N replicas as candidates
> ** The scheduler chooses the executor from the candidates based on minimum 
> number of assigned bytes
> ** So consecutive partitions are more likely to be assigned to different 
> executors
> * Iceberg tables
> ** The scheduler processes the scan ranges in random order
> ** The scheduler selects N replicas as candidates
> ** The scheduler chooses the executor from the candidates based on minimum 
> number of assigned bytes
> ** So consecutive partitions (by partition key order) are assigned randomly, 
> i.e. there's a higher chance of clustering
> If the IcebergScanNode ordered its file descriptors based on their paths we 
> would have a more balanced scheduling for consecutive partitions. Queries 
> that operate on a range of partitions are quite common, so it makes sense to 
> optimize that case.






[jira] [Commented] (IMPALA-12756) [DOCS] Unicode column name support documentation

2024-01-30 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812513#comment-17812513
 ] 

ASF subversion and git services commented on IMPALA-12756:
--

Commit ab445195b0d4eb78eda16c7501f49e1fa554530e in impala's branch 
refs/heads/master from pranavyl
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=ab445195b ]

IMPALA-12756: [DOCS] Unicode column name support documentation

The patch focuses on documenting that Impala supports unicode
column names, consistent with Hive's current support (as we use
Hive MetaStore to store table metadata).

Change-Id: I3d43d942a3ea069020f06adab6ea77e62ad5ffbe
Reviewed-on: http://gerrit.cloudera.org:8080/20950
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> [DOCS] Unicode column name support documentation
> 
>
> Key: IMPALA-12756
> URL: https://issues.apache.org/jira/browse/IMPALA-12756
> Project: IMPALA
>  Issue Type: Documentation
>Reporter: Pranav Yogi Lodha
>Assignee: Pranav Yogi Lodha
>Priority: Major
>
> The patch focuses on documenting that Impala  supports unicode column names, 
> consistent with Hive's current support (as we use Hive MetaStore to store 
> table metadata).






[jira] [Commented] (IMPALA-12762) Build package from scratch failed by cmake error

2024-01-30 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812512#comment-17812512
 ] 

ASF subversion and git services commented on IMPALA-12762:
--

Commit 18a77cd3bcafa7d650177ad8b6ad1db8cf21a21d in impala's branch 
refs/heads/master from zhangyifan27
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=18a77cd3b ]

IMPALA-12762: Fix cmake error in package building

This patch adds extra processing of option 'BUILD_WITH_NO_TESTS' in
be/src/exec/json/CMakeLists.txt, so test targets will not be generated
by CMake when building Impala with -package and -notests.

Testing:
  - Run './buildall.sh -noclean -notests -package' with no error

Change-Id: Ice0cbb0671d915f997fa74217521a82be164ae57
Reviewed-on: http://gerrit.cloudera.org:8080/20965
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Build package from scratch failed by cmake error
> 
>
> Key: IMPALA-12762
> URL: https://issues.apache.org/jira/browse/IMPALA-12762
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 4.3.0
>Reporter: Quanlong Huang
>Assignee: YifanZhang
>Priority: Major
>
> Saw the following error when building the RPM package from scratch:
> {noformat}
> CMake Error at be/CMakeLists.txt:774 (ADD_DEPENDENCIES):
>   Cannot add target-level dependencies to non-existent target
>   "unified-be-test".
>   The add_dependencies works for top-level logical targets created by the
>   add_executable, add_library, or add_custom_target commands.  If you want to
>   add file-level dependencies see the DEPENDS option of the add_custom_target
>   and add_custom_command commands.
> Call Stack (most recent call first):
>   be/CMakeLists.txt:795 (ADD_UNIFIED_BE_TEST)
>   be/src/exec/json/CMakeLists.txt:36 (ADD_UNIFIED_BE_LSAN_TEST) {noformat}
> The command is
> {code:java}
> ./buildall.sh -noclean -notests -package{code}
> or
> {code:java}
> ./buildall.sh -noclean -notests -release -package{code}
> However, if I build the binaries (including tests) with -package, it succeeds.
> {code:java}
> ./buildall.sh -noclean -skiptests -release -package{code}






[jira] [Commented] (IMPALA-12763) Union with string struct crashes in ASAN

2024-01-30 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812514#comment-17812514
 ] 

ASF subversion and git services commented on IMPALA-12763:
--

Commit 46f04313212952ae2e8f432cb622457918bae6cd in impala's branch 
refs/heads/master from Daniel Becker
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=46f043132 ]

IMPALA-12763: Union with string struct crashes in ASAN

In ASAN builds, if we UNION ALL an array containing a struct of a string
with itself, Impala crashes. This is how to reproduce it:

In Hive:
  create table su (arr ARRAY<STRUCT<s:STRING>>) stored as parquet;
  insert into su values (array(named_struct("s", "A")));

In Impala:
  select 1, arr from su
  union all select 2, arr from su;

The ASAN error message indicates a heap-use-after-free.

Normally, UNIONs of structs are not supported yet (see IMPALA-10752),
but if the struct is inside an array it is allowed now. This was
probably not intentional and it leads to the above error, so this change
disables structs in unions completely, including embedded structs.
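
As a rough sketch of what "including embedded structs" implies, the check has to walk the whole type tree rather than look only at the top-level type. The Python below is an illustration under assumed conventions, not the code from this patch: types are modelled as plain strings or tuples, and contains_struct and check_union_operands are hypothetical names.

```python
def contains_struct(type_desc):
    """Return True if the type is a struct or embeds one at any depth.

    Assumed type model (hypothetical):
      "STRING", "INT", ...            scalar types
      ("ARRAY", item_type)
      ("MAP", key_type, value_type)
      ("STRUCT", {field_name: field_type})
    """
    if isinstance(type_desc, str):
        return False
    kind = type_desc[0]
    if kind == "STRUCT":
        return True
    # ARRAY and MAP: recurse into the item, key and value types.
    return any(contains_struct(child) for child in type_desc[1:])

def check_union_operands(column_types):
    """Reject a UNION whose operand columns contain structs anywhere."""
    for t in column_types:
        if contains_struct(t):
            raise ValueError(f"UNION does not support struct types: {t!r}")

# The column type from the reproduction above: an array of struct<s:string>.
arr_type = ("ARRAY", ("STRUCT", {"s": "STRING"}))
print(contains_struct(arr_type))
```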

Testing:
 - adjusted existing tests
 - added a query that tests that types with embedded structs are not
   allowed in a UNION statement, in mixed-collections-and-structs.test

Change-Id: Id728f1254b74636be594a33313a478b0b77c7ae4
Reviewed-on: http://gerrit.cloudera.org:8080/20970
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Union with string struct crashes in ASAN
> 
>
> Key: IMPALA-12763
> URL: https://issues.apache.org/jira/browse/IMPALA-12763
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
>
> In ASAN builds, if we UNION ALL an array containing a struct of a string with 
> itself, Impala crashes. This is how to reproduce it:
> In Hive:
>  
> {code:java}
> create table su (arr ARRAY<STRUCT<s:STRING>>) stored as parquet;
> insert into su values (array(named_struct("s", "A")));
> {code}
> In Impala:
> {code:java}
> select 1, arr from su
>   union all select 2, arr from su;{code}
> The ASAN error message indicates a heap-use-after-free.






[jira] [Commented] (IMPALA-12767) Upgrade Guava to 32.0.1 due to CVE-2023-2976

2024-01-30 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812511#comment-17812511
 ] 

ASF subversion and git services commented on IMPALA-12767:
--

Commit f8302d90369f2c0fd912963eb764063f4a9a41e2 in impala's branch 
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=f8302d903 ]

IMPALA-12767: Upgrade Guava to 32.0.1 due to CVE-2023-2976

This patch upgrades the Guava version from 31.1-jre to 32.0.1-jre to address
CVE-2023-2976.

Testing:
- Run and pass full build
  ./buildall.sh -skiptests -notests

Change-Id: Id932ed32200fba4f24b4fdd108546ac4494fc3e8
Reviewed-on: http://gerrit.cloudera.org:8080/20972
Tested-by: Impala Public Jenkins 
Reviewed-by: Joe McDonnell 


>  Upgrade Guava to 32.0.1 due to CVE-2023-2976
> -
>
> Key: IMPALA-12767
> URL: https://issues.apache.org/jira/browse/IMPALA-12767
> Project: IMPALA
>  Issue Type: Task
>  Components: Frontend
>Reporter: Riza Suminto
>Assignee: Riza Suminto
>Priority: Critical
> Fix For: Impala 4.4.0
>
>
> {quote}Use of Java's default temporary directory for file creation in 
> `FileBackedOutputStream` in Google Guava versions 1.0 to 31.1 on Unix systems 
> and Android Ice Cream Sandwich allows other users and apps on the machine 
> with access to the default Java temporary directory to be able to access the 
> files created by the class. Even though the security vulnerability is fixed 
> in version 32.0.0, we recommend using version 32.0.1 as version 32.0.0 breaks 
> some functionality under Windows.
> {quote}
> [https://nvd.nist.gov/vuln/detail/CVE-2023-2976]
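
The exposure described in the quote comes from temporary files that other local users can read. As a hedged illustration only (this is not part of the Impala patch, and it is not Guava code), a caller that creates its own temporary files could request owner-only permissions explicitly. The class name PrivateTempFile is hypothetical, and the sketch assumes a POSIX file system; on other file systems the attribute is not supported and the call fails.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileAttribute;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper, not an Impala or Guava class.
final class PrivateTempFile {
  private PrivateTempFile() {}

  // Creates a temp file in the default temp directory that only the owner
  // can read or write (assumes a POSIX file system).
  static Path create(String prefix) {
    Set<PosixFilePermission> ownerOnly = PosixFilePermissions.fromString("rw-------");
    FileAttribute<Set<PosixFilePermission>> attr =
        PosixFilePermissions.asFileAttribute(ownerOnly);
    try {
      return Files.createTempFile(prefix, ".tmp", attr);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Calling PrivateTempFile.create("spill") returns the path of the new file; the caller remains responsible for deleting it.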






[jira] [Commented] (IMPALA-10752) Support for UNION two structs

2024-01-30 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812515#comment-17812515
 ] 

ASF subversion and git services commented on IMPALA-10752:
--

Commit 46f04313212952ae2e8f432cb622457918bae6cd in impala's branch 
refs/heads/master from Daniel Becker
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=46f043132 ]

IMPALA-12763: Union with string struct crashes in ASAN

In ASAN builds, if we UNION ALL an array containing a struct of a string
with itself, Impala crashes. This is how to reproduce it:

In Hive:
  create table su (arr ARRAY<STRUCT<s:STRING>>) stored as parquet;
  insert into su values (array(named_struct("s", "A")));

In Impala:
  select 1, arr from su
    union all select 2, arr from su;

The ASAN error message indicates a heap-use-after-free.

Normally, UNIONs of structs are not supported yet (see IMPALA-10752),
but if the struct is inside an array it is allowed now. This was
probably not intentional and it leads to the above error, so this change
disables structs in unions completely, including embedded structs.
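
Rejecting embedded structs amounts to a recursive walk over the type tree. The following is a minimal sketch of that idea under simplified assumptions; TypeCheck, Kind and Type are hypothetical stand-ins, not Impala's analysis classes, and the real check in the patch may be structured differently.

```java
import java.util.List;

// Hypothetical, simplified type model used only for this illustration.
final class TypeCheck {
  enum Kind { SCALAR, ARRAY, MAP, STRUCT }

  static final class Type {
    final Kind kind;
    final List<Type> children;

    Type(Kind kind, Type... children) {
      this.kind = kind;
      this.children = List.of(children);
    }
  }

  private TypeCheck() {}

  // Returns true if the type is a struct or contains one at any nesting depth.
  static boolean containsStruct(Type type) {
    if (type.kind == Kind.STRUCT) return true;
    for (Type child : type.children) {
      if (containsStruct(child)) return true;
    }
    return false;
  }
}
```

With this model, an array whose element is a struct of a string is reported as containing a struct, while an array of strings is not.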

Testing:
 - adjusted existing tests
 - added a query that tests that types with embedded structs are not
   allowed in a UNION statement, in mixed-collections-and-structs.test

Change-Id: Id728f1254b74636be594a33313a478b0b77c7ae4
Reviewed-on: http://gerrit.cloudera.org:8080/20970
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Support for UNION two structs
> -
>
> Key: IMPALA-10752
> URL: https://issues.apache.org/jira/browse/IMPALA-10752
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend, Frontend
>Reporter: Gabor Kaszab
>Priority: Major
>
> {code:java}
> select id, tiny_struct from complextypes_structs
> union all
> select id, tiny_struct from complextypes_structs;
> {code}
> Result is the following error:
> {code:java}
> ERROR: AnalysisException: Incompatible return types 'STRUCT' and 
> 'STRUCT' of exprs 'tiny_struct' and 'tiny_struct'.
> {code}
> where complextypes_structs is the following (note: the same happens with Parquet):
> {code:java}
> CREATE TABLE complextypes_structs (
> id int,
> tiny_struct struct,
> small_struct struct
> ) STORED AS ORC;
> {code}






[jira] [Resolved] (IMPALA-12742) DELETE/UPDATE Iceberg table partitioned by DATE fails with error

2024-01-30 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IMPALA-12742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy resolved IMPALA-12742.

Fix Version/s: Impala 4.4.0
   Resolution: Fixed

> DELETE/UPDATE Iceberg table partitioned by DATE fails with error
> 
>
> Key: IMPALA-12742
> URL: https://issues.apache.org/jira/browse/IMPALA-12742
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend, Catalog
>Reporter: Noemi Pap-Takacs
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>  Labels: impala-iceberg
> Fix For: Impala 4.4.0
>
>
> Iceberg tables can be identity partitioned by any type, e.g. int, date and 
> even float.
> If a table is partitioned, the file path contains the partition value in 
> human-readable form. When an UPDATE or DELETE command is executed, the delete 
> file contains the file path of the referenced data file. It seems that DATE 
> values are converted to this form incorrectly, so the Catalog cannot parse 
> them and throws an error.
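
To make the round trip concrete, here is a hedged sketch of converting a DATE between a days-since-epoch integer and its human-readable path form. It assumes the date is held as days since 1970-01-01 and written as yyyy-MM-dd; DatePartitionPath and its methods are hypothetical names, and this is not the code touched by the fix.

```java
import java.time.LocalDate;

// Hypothetical helper illustrating the two directions of the conversion.
final class DatePartitionPath {
  private DatePartitionPath() {}

  // Days since 1970-01-01 -> ISO date string, e.g. 0 -> "1970-01-01".
  static String toPathValue(int daysSinceEpoch) {
    return LocalDate.ofEpochDay(daysSinceEpoch).toString();
  }

  // ISO date string -> days since 1970-01-01; throws DateTimeParseException
  // if the value is not in yyyy-MM-dd form.
  static int fromPathValue(String pathValue) {
    return Math.toIntExact(LocalDate.parse(pathValue).toEpochDay());
  }
}
```

If the writer emitted the raw integer instead of the formatted string, fromPathValue would fail to parse it, which is the kind of mismatch the description suggests.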
>  
>  






[jira] [Updated] (IMPALA-12773) Log the snapshot id add it to the plan node for Iceberg queries

2024-01-30 Thread Tamas Mate (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tamas Mate updated IMPALA-12773:

Labels: impala-iceberg  (was: )

> Log the snapshot id add it to the plan node for Iceberg queries 
> 
>
> Key: IMPALA-12773
> URL: https://issues.apache.org/jira/browse/IMPALA-12773
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 4.4.0
>Reporter: Tamas Mate
>Priority: Major
>  Labels: impala-iceberg
>
> For supportability purposes, Impala should track the snapshot id that will be 
> used to query the Iceberg table. This could help identify problems like:
> - whether two engines are reading from the same snapshot
> - tracing back which snapshot was read by a specific query, which is useful to 
> see if there were any changes between query executions.
> The snapshot id could be logged at INFO level and also added to the query plan 
> tree as an attribute of a SCAN node.






[jira] [Created] (IMPALA-12773) Log the snapshot id add it to the plan node for Iceberg queries

2024-01-30 Thread Tamas Mate (Jira)
Tamas Mate created IMPALA-12773:
---

 Summary: Log the snapshot id add it to the plan node for Iceberg 
queries 
 Key: IMPALA-12773
 URL: https://issues.apache.org/jira/browse/IMPALA-12773
 Project: IMPALA
  Issue Type: Improvement
  Components: Frontend
Affects Versions: Impala 4.4.0
Reporter: Tamas Mate


For supportability purposes, Impala should track the snapshot id that will be 
used to query the Iceberg table. This could help identify problems like:
- whether two engines are reading from the same snapshot
- tracing back which snapshot was read by a specific query, which is useful to 
see if there were any changes between query executions.

The snapshot id could be logged at INFO level and also added to the query plan 
tree as an attribute of a SCAN node.






[jira] [Resolved] (IMPALA-12767) Upgrade Guava to 32.0.1 due to CVE-2023-2976

2024-01-30 Thread Riza Suminto (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Riza Suminto resolved IMPALA-12767.
---
Fix Version/s: Impala 4.4.0
   Resolution: Fixed

>  Upgrade Guava to 32.0.1 due to CVE-2023-2976
> -
>
> Key: IMPALA-12767
> URL: https://issues.apache.org/jira/browse/IMPALA-12767
> Project: IMPALA
>  Issue Type: Task
>  Components: Frontend
>Reporter: Riza Suminto
>Assignee: Riza Suminto
>Priority: Critical
> Fix For: Impala 4.4.0
>
>
> {quote}Use of Java's default temporary directory for file creation in 
> `FileBackedOutputStream` in Google Guava versions 1.0 to 31.1 on Unix systems 
> and Android Ice Cream Sandwich allows other users and apps on the machine 
> with access to the default Java temporary directory to be able to access the 
> files created by the class. Even though the security vulnerability is fixed 
> in version 32.0.0, we recommend using version 32.0.1 as version 32.0.0 breaks 
> some functionality under Windows.
> {quote}
> [https://nvd.nist.gov/vuln/detail/CVE-2023-2976]






[jira] [Commented] (IMPALA-12763) Union with string struct crashes in ASAN

2024-01-30 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812334#comment-17812334
 ] 

Daniel Becker commented on IMPALA-12763:


Note that the error only occurs if codegen is enabled.

> Union with string struct crashes in ASAN
> 
>
> Key: IMPALA-12763
> URL: https://issues.apache.org/jira/browse/IMPALA-12763
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
>
> In ASAN builds if we UNION ALL an array containing a struct of a string with 
> itself Impala crashes. This is how to reproduce it:
> In Hive:
>  
> {code:java}
> create table su (arr ARRAY<STRUCT<s:STRING>>) stored as parquet;
> insert into su values (array(named_struct("s", "A")));
> {code}
> In Impala:
> {code:java}
> select 1, arr from su
>   union all select 2, arr from su;{code}
> The ASAN error message indicates a heap-use-after-free.






[jira] [Updated] (IMPALA-12772) [DOCS] Update the documentation of identifiers

2024-01-30 Thread Pranav Yogi Lodha (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pranav Yogi Lodha updated IMPALA-12772:
---
Summary: [DOCS] Update the documentation of identifiers  (was: Update the 
documentation of identifiers)

> [DOCS] Update the documentation of identifiers
> --
>
> Key: IMPALA-12772
> URL: https://issues.apache.org/jira/browse/IMPALA-12772
> Project: IMPALA
>  Issue Type: Documentation
>Reporter: Pranav Yogi Lodha
>Priority: Major
>
> Documentation of identifiers seems to be a bit outdated or inaccurate. 
> This Jira focuses on investigating and identifying all such discrepancies 
> and updating the documentation accordingly.






[jira] [Assigned] (IMPALA-12772) [DOCS] Update the documentation of identifiers

2024-01-30 Thread Pranav Yogi Lodha (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pranav Yogi Lodha reassigned IMPALA-12772:
--

Assignee: Pranav Yogi Lodha

> [DOCS] Update the documentation of identifiers
> --
>
> Key: IMPALA-12772
> URL: https://issues.apache.org/jira/browse/IMPALA-12772
> Project: IMPALA
>  Issue Type: Documentation
>Reporter: Pranav Yogi Lodha
>Assignee: Pranav Yogi Lodha
>Priority: Major
>
> Documentation of identifiers seems to be a bit outdated or inaccurate. 
> This Jira focuses on investigating and identifying all such discrepancies 
> and updating the documentation accordingly.






[jira] [Created] (IMPALA-12772) Update the documentation of identifiers

2024-01-30 Thread Pranav Yogi Lodha (Jira)
Pranav Yogi Lodha created IMPALA-12772:
--

 Summary: Update the documentation of identifiers
 Key: IMPALA-12772
 URL: https://issues.apache.org/jira/browse/IMPALA-12772
 Project: IMPALA
  Issue Type: Documentation
Reporter: Pranav Yogi Lodha


Documentation of identifiers seems to be a bit outdated or inaccurate. This 
Jira focuses on investigating and identifying all such discrepancies and 
updating the documentation accordingly.






[jira] [Commented] (IMPALA-12771) Impala catalogd events-skipped may mark the wrong number

2024-01-30 Thread Maxwell Guo (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812236#comment-17812236
 ] 

Maxwell Guo commented on IMPALA-12771:
--

Besides, I found an interesting piece of code [here 
|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java#L1023]:
TableLoadingException and DatabaseNotFoundException are caught in the 
method [here 
|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java#L1015],
 but the inner function 
[reloadTableIfExists|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L2902]
 has already caught the exception and does not re-throw it, so in my view the 
outer function has no need to handle these two exceptions. I also think that an 
INFO-level log is not suitable 
[here|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L2903];
 WARN-level logging would be better.

> Impala catalogd events-skipped may mark the wrong number
> 
>
> Key: IMPALA-12771
> URL: https://issues.apache.org/jira/browse/IMPALA-12771
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Reporter: Maxwell Guo
>Assignee: Maxwell Guo
>Priority: Minor
>
> See the description of [event-skipped 
> metric|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java#L237]
>  
> {code:java}
>   // total number of events which are skipped because of the flag setting or
>   // in case of [CREATE|DROP] events on [DATABASE|TABLE|PARTITION] which were ignored
>   // because the [DATABASE|TABLE|PARTITION] was already [PRESENT|ABSENT] in the catalogd.
> {code}
>  
> For CREATE and DROP events on a Database/Table/Partition (AddPartition is 
> also included), when the database or table is not found in the cache, we 
> skip event processing and increment the events-skipped metric by 1.
> But I found some open questions here for alter table and Reload events:
> * For alter table, renaming a table also increments the events-skipped 
> metric; see [oldTblRemoved to be false 
> |https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java#L1653]
> * Reload events are not mentioned in the description of events-skipped, 
> but the metric is incremented when the event is an older event;
> * Besides, if the table is in the blacklist, the metric is also incremented.
> In summary, I think this description is inconsistent with the actual 
> implementation.
> So can we also increment the events-skipped metric for alter partition events 
> and modify the description to cover all skipped events?






[jira] [Commented] (IMPALA-12771) Impala catalogd events-skipped may mark the wrong number

2024-01-30 Thread Maxwell Guo (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812237#comment-17812237
 ] 

Maxwell Guo commented on IMPALA-12771:
--

If you think my suggestion is reasonable, I will submit a PR later




[jira] [Commented] (IMPALA-12771) Impala catalogd events-skipped may mark the wrong number

2024-01-30 Thread Maxwell Guo (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812215#comment-17812215
 ] 

Maxwell Guo commented on IMPALA-12771:
--

ping [~huangqiang] 




[jira] [Commented] (IMPALA-12771) Impala catalogd events-skipped may mark the wrong number

2024-01-30 Thread Maxwell Guo (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812214#comment-17812214
 ] 

Maxwell Guo commented on IMPALA-12771:
--

With an alter partition event, we may find that the database or table is not 
found, or that the table is not loaded; see [here 
|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java#L1057].
 In that case we just skip the event and write a debug log: "Ignoring the 
event since the table is not found". 
But we actually skip the handling of these events when the table is not found, 
so I think we can also increment the events-skipped metric if the table is not 
found, is an IncompleteTable, or was removed from the catalog. 

Besides, we only increment the events-skipped metric for events filtered by the 
isSelfEvent and isOlderEvent methods; see 
[isSelfEvent|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java#L857]
 and 
[isOlderEvent|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java#L1198].
For the 
[canBeSkipped|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java#L1703]
 method, however, the metric is not incremented. I think we can also increment it there.
 




[jira] [Created] (IMPALA-12771) Impala catalogd events-skipped may mark the wrong number

2024-01-30 Thread Maxwell Guo (Jira)
Maxwell Guo created IMPALA-12771:


 Summary: Impala catalogd events-skipped may mark the wrong number
 Key: IMPALA-12771
 URL: https://issues.apache.org/jira/browse/IMPALA-12771
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Reporter: Maxwell Guo
Assignee: Maxwell Guo


See the description of [event-skipped 
metric|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java#L237]
 

{code:java}
  // total number of events which are skipped because of the flag setting or
  // in case of [CREATE|DROP] events on [DATABASE|TABLE|PARTITION] which were ignored
  // because the [DATABASE|TABLE|PARTITION] was already [PRESENT|ABSENT] in the catalogd.
{code}
 
For CREATE and DROP events on a Database/Table/Partition (AddPartition is 
also included), when the database or table is not found in the cache, we 
skip event processing and increment the events-skipped metric by 1.
But I found some open questions here for alter table and Reload events:

* For alter table, renaming a table also increments the events-skipped metric; 
see [oldTblRemoved to be false 
|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java#L1653]
* Reload events are not mentioned in the description of events-skipped, 
but the metric is incremented when the event is an older event;
* Besides, if the table is in the blacklist, the metric is also incremented.

In summary, I think this description is inconsistent with the actual 
implementation.
So can we also increment the events-skipped metric for alter partition events 
and modify the description to cover all skipped events?
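
One way to picture the proposal is a single helper through which every skip path reports, so that the metric and its description cannot drift apart. The sketch below is hypothetical: SkippedEventsCounter and the Reason values are illustrative names chosen from the cases listed above, not existing Impala classes or metrics.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical counter; not part of MetastoreEventsProcessor.
final class SkippedEventsCounter {
  // Illustrative reasons taken from the cases discussed in this issue.
  enum Reason {
    FLAG_DISABLED, ALREADY_PRESENT_OR_ABSENT, TABLE_NOT_FOUND,
    OLDER_EVENT, SELF_EVENT, BLACKLISTED
  }

  private final AtomicLong total = new AtomicLong();
  // Populated once in the constructor and never structurally modified afterwards.
  private final Map<Reason, AtomicLong> byReason = new EnumMap<>(Reason.class);

  SkippedEventsCounter() {
    for (Reason reason : Reason.values()) {
      byReason.put(reason, new AtomicLong());
    }
  }

  // Every code path that skips an event would call this exactly once.
  void markSkipped(Reason reason) {
    total.incrementAndGet();
    byReason.get(reason).incrementAndGet();
  }

  long total() {
    return total.get();
  }

  long count(Reason reason) {
    return byReason.get(reason).get();
  }
}
```

Keeping a per-reason breakdown next to the total is a design choice of this sketch; it would make it easy to see which of the cases above dominates without changing what events-skipped itself reports.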






[jira] [Created] (IMPALA-12770) ExprRewriter enter infinite loop for nested Case statements

2024-01-30 Thread Wenzhe Zhou (Jira)
Wenzhe Zhou created IMPALA-12770:


 Summary: ExprRewriter enter infinite loop for nested Case 
statements
 Key: IMPALA-12770
 URL: https://issues.apache.org/jira/browse/IMPALA-12770
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Reporter: Wenzhe Zhou


ExprRewriter enters an infinite loop when running the following query with a 
nested CASE statement:
{code:java}
select
  case
    case ''
      when 'abc' then t4.string_col
    end
    when 'none' then 'Total'
  end
as fcol from functional.alltypes as t4 limit 1;
{code}

jstack shows that Impala enters an infinite loop in the ExprRewriter functions:
{code:java}
"Thread-16" #39 prio=5 os_prio=0 tid=0x0e188000 nid=0x90ec8 runnable [0x7fa8b3d23000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.impala.service.FeSupport.NativeEvalExprsWithoutRow(Native Method)
    at org.apache.impala.service.FeSupport.EvalExprsWithoutRowBounded(FeSupport.java:261)
    at org.apache.impala.service.FeSupport.EvalExprWithoutRowBounded(FeSupport.java:205)
    at org.apache.impala.analysis.LiteralExpr.createBounded(LiteralExpr.java:214)
    at org.apache.impala.rewrite.FoldConstantsRule.apply(FoldConstantsRule.java:68)
    at org.apache.impala.rewrite.ExprRewriter.applyRuleBottomUp(ExprRewriter.java:85)
    at org.apache.impala.rewrite.ExprRewriter.applyRuleRepeatedly(ExprRewriter.java:71)
    at org.apache.impala.rewrite.ExprRewriter.rewrite(ExprRewriter.java:55)
    at org.apache.impala.rewrite.SimplifyConditionalsRule.simplifyCaseExpr(SimplifyConditionalsRule.java:240)
    at org.apache.impala.rewrite.SimplifyConditionalsRule.apply(SimplifyConditionalsRule.java:71)
    at org.apache.impala.rewrite.ExprRewriter.applyRuleBottomUp(ExprRewriter.java:85)
    at org.apache.impala.rewrite.ExprRewriter.applyRuleRepeatedly(ExprRewriter.java:71)
    at org.apache.impala.rewrite.ExprRewriter.rewrite(ExprRewriter.java:55)
    at org.apache.impala.analysis.SelectList.rewriteExprs(SelectList.java:100)
    at org.apache.impala.analysis.SelectStmt.rewriteExprs(SelectStmt.java:1517)
    at org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:585)
    at org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:492)
    at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:2397)
    at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:2144)
    at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1913)
    at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:169)
{code}

The issue does not happen if a matching value is found for the inner 'case' 
statement, or if an 'else' branch is added to the inner 'case' statement. 
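
For readers unfamiliar with this failure mode: a rewriter that applies rules until nothing changes only terminates if the rules reach a fixed point. The sketch below is a generic illustration over strings, not Impala's ExprRewriter, and the pass limit is an assumed safeguard for the demo rather than the fix for this issue. BoundedRewriter and its parameters are hypothetical names.

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical fixed-point rewriter over strings, for illustration only.
final class BoundedRewriter {
  private BoundedRewriter() {}

  // Applies all rules in order, pass after pass, until a pass changes nothing.
  // Throws if no fixed point is reached within maxPasses passes.
  static String rewrite(String expr, List<UnaryOperator<String>> rules, int maxPasses) {
    String current = expr;
    for (int pass = 0; pass < maxPasses; pass++) {
      String next = current;
      for (UnaryOperator<String> rule : rules) {
        next = rule.apply(next);
      }
      if (next.equals(current)) {
        return current; // fixed point reached
      }
      current = next;
    }
    throw new IllegalStateException(
        "No fixed point after " + maxPasses + " passes; last result: " + current);
  }
}
```

A rule that keeps producing a different result on every pass never reaches a fixed point; with the limit, the loop fails fast instead of spinning as in the jstack output above.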


