[jira] [Commented] (IMPALA-13170) InconsistentMetadataFetchException due to database dropped when showing databases

2024-07-01 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-13170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17861099#comment-17861099
 ] 

ASF subversion and git services commented on IMPALA-13170:
--

Commit 00d0b0dda1e215d8e91ff52688fe6654bee52282 in impala's branch 
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=00d0b0dda ]

IMPALA-9441,IMPALA-13170: Ops listing dbs/tables should handle db not exists

We have some operations listing the dbs/tables in the following steps:
  1. Get the db list
  2. Do something on the db which could fail if the db no longer exists
For instance, when authorization is enabled, SHOW DATABASES needs a
step 2 to get the owner of each db. This is fine in the legacy catalog
mode since the whole Db object is cached on the coordinator side.
However, in the local catalog mode, the msDb could be missing from the
local cache. The coordinator then triggers a getPartialCatalogObject RPC
to load it from catalogd. If the db no longer exists in catalogd, this
step fails.

The same applies to GetTables HS2 requests that list all tables in all
dbs. In step 2 we list the table names for a db. Although the db exists
when we get the db list, it could be dropped before we start listing the
table names in it.

This patch adds code to handle the exceptions raised when a db no longer
exists. It also changes GetSchemas to not list the table names, which
avoids the same issue.

Tests:
 - Add e2e tests

Change-Id: I2bd40d33859feca2bbd2e5f1158f3894a91c2929
Reviewed-on: http://gerrit.cloudera.org:8080/21546
Reviewed-by: Yida Wu 
Tested-by: Impala Public Jenkins 
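
The two-step pattern described in the commit message can be sketched
roughly as below. This is only an illustration under stated assumptions,
not the code from the patch: MetaSource, InMemorySource,
DbNotFoundException and listDbsOwnedBy are hypothetical names, and the
owner comparison merely stands in for the real authorization check. The
point it shows is that a db which disappears between step 1 and step 2
is skipped instead of failing the whole listing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ListDbsSketch {
    /** Hypothetical stand-in for the "db no longer exists" failure. */
    static class DbNotFoundException extends RuntimeException {
        DbNotFoundException(String db) { super("Database not found: " + db); }
    }

    /** Hypothetical metadata source; fetchOwner models the step-2 load. */
    interface MetaSource {
        List<String> listDbNames();
        String fetchOwner(String dbName);
    }

    /** In-memory source in which a db can vanish after its name was listed. */
    static class InMemorySource implements MetaSource {
        private final Map<String, String> owners = new HashMap<>();
        private final List<String> listedNames = new ArrayList<>();

        void addDb(String name, String owner) {
            owners.put(name, owner);
            listedNames.add(name);
        }

        /** Simulates a concurrent DROP DATABASE that lands after step 1. */
        void dropAfterListing(String name) { owners.remove(name); }

        @Override public List<String> listDbNames() {
            return new ArrayList<>(listedNames);
        }

        @Override public String fetchOwner(String dbName) {
            String owner = owners.get(dbName);
            if (owner == null) throw new DbNotFoundException(dbName);
            return owner;
        }
    }

    /** Lists dbs owned by user, skipping dbs dropped between the two steps. */
    static List<String> listDbsOwnedBy(MetaSource source, String user) {
        List<String> result = new ArrayList<>();
        for (String dbName : source.listDbNames()) {          // step 1
            try {
                if (user.equals(source.fetchOwner(dbName))) { // step 2
                    result.add(dbName);
                }
            } catch (DbNotFoundException e) {
                // Dropped after step 1: treat as absent, keep listing.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        InMemorySource source = new InMemorySource();
        source.addDb("sales", "alice");
        source.addDb("test_hive", "alice");
        source.dropAfterListing("test_hive");
        System.out.println(listDbsOwnedBy(source, "alice")); // prints [sales]
    }
}
```

Catching only the "not found" case is a deliberate choice in this sketch:
any other failure still propagates, so unrelated errors are not hidden.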



[jira] [Commented] (IMPALA-9441) TestHS2.test_get_schemas is flaky in local catalog mode

2024-07-01 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17861098#comment-17861098
 ] 

ASF subversion and git services commented on IMPALA-9441:
-

Commit 00d0b0dda1e215d8e91ff52688fe6654bee52282 in impala's branch 
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=00d0b0dda ]

IMPALA-9441,IMPALA-13170: Ops listing dbs/tables should handle db not exists

We have some operations listing the dbs/tables in the following steps:
  1. Get the db list
  2. Do something on the db which could fail if the db no longer exists
For instance, when authorization is enabled, SHOW DATABASES needs a
step 2 to get the owner of each db. This is fine in the legacy catalog
mode since the whole Db object is cached on the coordinator side.
However, in the local catalog mode, the msDb could be missing from the
local cache. The coordinator then triggers a getPartialCatalogObject RPC
to load it from catalogd. If the db no longer exists in catalogd, this
step fails.

The same applies to GetTables HS2 requests that list all tables in all
dbs. In step 2 we list the table names for a db. Although the db exists
when we get the db list, it could be dropped before we start listing the
table names in it.

This patch adds code to handle the exceptions raised when a db no longer
exists. It also changes GetSchemas to not list the table names, which
avoids the same issue.

Tests:
 - Add e2e tests

Change-Id: I2bd40d33859feca2bbd2e5f1158f3894a91c2929
Reviewed-on: http://gerrit.cloudera.org:8080/21546
Reviewed-by: Yida Wu 
Tested-by: Impala Public Jenkins 



[jira] [Resolved] (IMPALA-9441) TestHS2.test_get_schemas is flaky in local catalog mode

2024-07-01 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang resolved IMPALA-9441.

Fix Version/s: Impala 4.5.0
   Resolution: Fixed

> TestHS2.test_get_schemas is flaky in local catalog mode
> ---
>
> Key: IMPALA-9441
> URL: https://issues.apache.org/jira/browse/IMPALA-9441
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Reporter: Sahil Takiar
> Assignee: Quanlong Huang
> Priority: Critical
> Fix For: Impala 4.5.0
>
>
> Saw this once on a ubuntu-16.04-dockerised-tests job:
> {code:java}
> Error Message
> hs2/hs2_test_suite.py:63: in add_session lambda: fn(self)) 
> hs2/hs2_test_suite.py:44: in add_session_helper fn() 
> hs2/hs2_test_suite.py:63: in  lambda: fn(self)) 
> hs2/test_hs2.py:423: in test_get_schemas 
> TestHS2.check_response(get_schemas_resp) hs2/hs2_test_suite.py:131: in 
> check_response assert response.status.statusCode == expected_status_code 
> E   assert 3 == 0 E+  where 3 = 3 E+where 3 = 
> TStatus(errorCode=None, errorMessage="DatabaseNotFoundException: Database 
> 'test_compute_stats_impala_2201_e794b8f' not found\n", sqlState='HY000', 
> infoMessages=None, statusCode=3).statusCode E+  where 
> TStatus(errorCode=None, errorMessage="DatabaseNotFoundException: Database 
> 'test_compute_stats_impala_2201_e794b8f' not found\n", sqlState='HY000', 
> infoMessages=None, statusCode=3) = TStatus(errorCode=None, 
> errorMessage="DatabaseNotFoundException: Database 
> 'test_compute_stats_impala_2201_e794b8f' not found\n", sqlState='HY000', 
> infoMessages=None, statusCode=3) E+where TStatus(errorCode=None, 
> errorMessage="DatabaseNotFoundException: Database 
> 'test_compute_stats_impala_2201_e794b8f' not found\n", sqlState='HY000', 
> infoMessages=None, statusCode=3) = 
> TGetSchemasResp(status=TStatus(errorCode=None, 
> errorMessage="DatabaseNotFoundException: Database 
> 'test_compute_stats_i...nHandle(hasResultSet=False, modifiedRowCount=None, 
> operationType=3, operationId=THandleIdentifier(secret='', guid=''))).status
> Stacktrace
> hs2/hs2_test_suite.py:63: in add_session
> lambda: fn(self))
> hs2/hs2_test_suite.py:44: in add_session_helper
> fn()
> hs2/hs2_test_suite.py:63: in 
> lambda: fn(self))
> hs2/test_hs2.py:423: in test_get_schemas
> TestHS2.check_response(get_schemas_resp)
> hs2/hs2_test_suite.py:131: in check_response
> assert response.status.statusCode == expected_status_code
> E   assert 3 == 0
> E+  where 3 = 3
> E+where 3 = TStatus(errorCode=None, 
> errorMessage="DatabaseNotFoundException: Database 
> 'test_compute_stats_impala_2201_e794b8f' not found\n", sqlState='HY000', 
> infoMessages=None, statusCode=3).statusCode
> E+  where TStatus(errorCode=None, 
> errorMessage="DatabaseNotFoundException: Database 
> 'test_compute_stats_impala_2201_e794b8f' not found\n", sqlState='HY000', 
> infoMessages=None, statusCode=3) = TStatus(errorCode=None, 
> errorMessage="DatabaseNotFoundException: Database 
> 'test_compute_stats_impala_2201_e794b8f' not found\n", sqlState='HY000', 
> infoMessages=None, statusCode=3)
> E+where TStatus(errorCode=None, 
> errorMessage="DatabaseNotFoundException: Database 
> 'test_compute_stats_impala_2201_e794b8f' not found\n", sqlState='HY000', 
> infoMessages=None, statusCode=3) = 
> TGetSchemasResp(status=TStatus(errorCode=None, 
> errorMessage="DatabaseNotFoundException: Database 
> 'test_compute_stats_i...nHandle(hasResultSet=False, modifiedRowCount=None, 
> operationType=3, operationId=THandleIdentifier(secret='', guid=''))).status 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-13170) InconsistentMetadataFetchException due to database dropped when showing databases

2024-07-01 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-13170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang resolved IMPALA-13170.
-
Fix Version/s: Impala 4.5.0
   Resolution: Fixed

> InconsistentMetadataFetchException due to database dropped when showing 
> databases
> -
>
> Key: IMPALA-13170
> URL: https://issues.apache.org/jira/browse/IMPALA-13170
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 3.4.0
> Reporter: Yida Wu
> Assignee: Quanlong Huang
> Priority: Major
> Fix For: Impala 4.5.0
>
>
> Using impalad 3.4.0, an InconsistentMetadataFetchException occurs when 
> running "show databases" in Impala while simultaneously executing "drop 
> database" to drop the newly created database in Hive.
> Steps to reproduce:
> 1. Create a database (Hive)
> 2. Create tables (Hive)
> 3. Drop the tables (Hive)
> 4. Run show databases (Impala) while dropping the database (Hive)
> Logs in Impalad:
> {code:java}
> I0610 02:18:32.435815 278475 CatalogdMetaProvider.java:1354] 1:2] 
> Invalidated objects in cache: [list of database names, HMS_METADATA for DB 
> test_hive]
> I0610 02:18:32.436224 278475 jni-util.cc:288] 1:2] 
> org.apache.impala.catalog.local.InconsistentMetadataFetchException: Fetching 
> DATABASE failed. Could not find TCatalogObject(type:DATABASE, 
> catalog_version:0, db:TDatabase(db_name:test_hive))
>   at 
> org.apache.impala.catalog.local.CatalogdMetaProvider.sendRequest(CatalogdMetaProvider.java:424)
>   at 
> org.apache.impala.catalog.local.CatalogdMetaProvider.access$100(CatalogdMetaProvider.java:185)
>   at 
> org.apache.impala.catalog.local.CatalogdMetaProvider$2.call(CatalogdMetaProvider.java:643)
>   at 
> org.apache.impala.catalog.local.CatalogdMetaProvider$2.call(CatalogdMetaProvider.java:638)
>   at 
> org.apache.impala.catalog.local.CatalogdMetaProvider.loadWithCaching(CatalogdMetaProvider.java:521)
>   at 
> org.apache.impala.catalog.local.CatalogdMetaProvider.loadDb(CatalogdMetaProvider.java:635)
>   at org.apache.impala.catalog.local.LocalDb.getMetaStoreDb(LocalDb.java:91) 
>   at org.apache.impala.catalog.local.LocalDb.getOwnerUser(LocalDb.java:294)
>   at org.apache.impala.service.Frontend.getDbs(Frontend.java:1066)
>   at org.apache.impala.service.JniFrontend.getDbs(JniFrontend.java:301)
> I0610 02:18:32.436257 278475 status.cc:129] 1:2] 
> InconsistentMetadataFetchException: Fetching DATABASE failed. Could not find 
> TCatalogObject(type:DATABASE, catalog_version:0, 
> {code}
> Logs in Catalog:
> {code:java}
> I0610 02:18:16.190133 222885 MetastoreEvents.java:505] EventId: 141467532 
> EventType: CREATE_DATABASE Successfully added database test_hive 
> ...
> I0610 02:18:32.276082 222885 MetastoreEvents.java:516] EventId: 141467562 
> EventType: DROP_DATABASE Creating event 141467562 of type DROP_DATABASE on 
> database test_hive
> I0610 02:18:32.277876 222885 MetastoreEvents.java:254] Total number of events 
> received: 6 Total number of events filtered out: 0
> I0610 02:18:32.277910 222885 MetastoreEvents.java:258] Incremented skipped 
> metric to 2564
> I0610 02:18:32.279537 222885 MetastoreEvents.java:505] EventId: 141467562 
> EventType: DROP_DATABASE Removed Database test_hive
> {code}
> The case is similar to IMPALA-9441. We may want to handle the error in a 
> better way in Frontend.getDbs().






[jira] [Created] (IMPALA-13191) Do Not Create Sort Node with Constant Ordering Expression

2024-07-01 Thread Noemi Pap-Takacs (Jira)
Noemi Pap-Takacs created IMPALA-13191:
-

 Summary: Do Not Create Sort Node with Constant Ordering Expression
 Key: IMPALA-13191
 URL: https://issues.apache.org/jira/browse/IMPALA-13191
 Project: IMPALA
  Issue Type: Bug
  Components: fe, Frontend
Reporter: Noemi Pap-Takacs


Rows are sorted before inserting into partitioned Iceberg tables. See 
Planner.createPreDmlSort().
If we update the partitioning column of the table, the sort ordering expression 
will be the partition column. If we set the new value to a constant (writing 
to only one partition), the ordering expression evaluates to a constant. The 
Sort Node therefore gets a constant ordering expression and does unnecessary 
work ordering rows whose sort keys are all equal.
For example:

{code:java}
create table ice_part partitioned by spec (l_discount) stored by iceberg 
tblproperties('format-version'='2') as select * from tpch_parquet.lineitem 
where l_linenumber=1;
explain update ice_part set l_discount=0.11 where l_discount>0.07;
{code}

The output of explain - the plan - contains a Sort Node with the following 
ordering expression:

{code:java}
04:SORT
|  order by: 0.11 ASC NULLS LAST
{code}

It is unnecessary to create a Sort Node to sort rows by a constant ordering 
expression. Constant expressions should be omitted, just like empty ones.
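
A minimal sketch of the proposed behaviour, assuming a simplified
expression model. OrderingExpr, dropConstantExprs and needsSort are
hypothetical names used only for illustration; they are not the Impala
planner's classes, and an actual change in Planner.createPreDmlSort()
may look quite different. The idea shown is to filter constant ordering
expressions first and to create a Sort Node only if something remains.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ConstantSortKeySketch {
    /** Hypothetical, minimal expression model; not Impala's Expr class. */
    static final class OrderingExpr {
        final String sql;
        final boolean constant;

        OrderingExpr(String sql, boolean constant) {
            this.sql = sql;
            this.constant = constant;
        }
    }

    /** Keeps only the ordering expressions that can differ between rows. */
    static List<OrderingExpr> dropConstantExprs(List<OrderingExpr> exprs) {
        List<OrderingExpr> kept = new ArrayList<>();
        for (OrderingExpr e : exprs) {
            if (!e.constant) kept.add(e);
        }
        return kept;
    }

    /** A sort is needed only if a non-constant ordering expression remains. */
    static boolean needsSort(List<OrderingExpr> exprs) {
        return !dropConstantExprs(exprs).isEmpty();
    }

    public static void main(String[] args) {
        // Models "update ice_part set l_discount=0.11": the only key is constant.
        List<OrderingExpr> constantOnly =
            Arrays.asList(new OrderingExpr("0.11", true));
        System.out.println(needsSort(constantOnly)); // false

        // A non-constant partition expression still requires the sort.
        List<OrderingExpr> mixed = Arrays.asList(
            new OrderingExpr("0.11", true),
            new OrderingExpr("l_shipdate", false));
        System.out.println(needsSort(mixed)); // true
    }
}
```

An empty input list also yields needsSort == false, which matches the
remark that constant expressions should be treated like empty ones.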







[jira] [Updated] (IMPALA-13191) Do Not Create Sort Node with Constant Ordering Expression

2024-07-01 Thread Noemi Pap-Takacs (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-13191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noemi Pap-Takacs updated IMPALA-13191:
--
Labels: performance  (was: easyfix performance)

> Do Not Create Sort Node with Constant Ordering Expression
> -
>
> Key: IMPALA-13191
> URL: https://issues.apache.org/jira/browse/IMPALA-13191
> Project: IMPALA
> Issue Type: Bug
> Components: fe, Frontend
> Reporter: Noemi Pap-Takacs
> Priority: Major
> Labels: performance
>
> Rows are sorted before inserting into partitioned Iceberg tables. See 
> Planner.createPreDmlSort().
> If we update the partitioning column of the table, the sort ordering 
> expression will be the partition column. If we set the new value to a 
> constant (writing to only one partition), the ordering expression evaluates 
> to a constant. The Sort Node therefore gets a constant ordering expression 
> and does unnecessary work ordering rows whose sort keys are all equal.
> For example:
> {code:java}
> create table ice_part partitioned by spec (l_discount) stored by iceberg 
> tblproperties('format-version'='2') as select * from tpch_parquet.lineitem 
> where l_linenumber=1;
> explain update ice_part set l_discount=0.11 where l_discount>0.07;
> {code}
> The output of explain - the plan - contains a Sort Node with the following 
> ordering expression:
> {code:java}
> 04:SORT
> |  order by: 0.11 ASC NULLS LAST
> {code}
> It is unnecessary to create a Sort Node to sort rows by a constant ordering 
> expression. Constant expressions should be omitted, just like empty ones.


