[ https://issues.apache.org/jira/browse/IMPALA-11509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Sherman reassigned IMPALA-11509:
---------------------------------------

    Assignee: Andrew Sherman

> Dropping files of Iceberg during table loading may cause Impalad to stuck in infinite loop
> ------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-11509
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11509
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 4.1.0
>            Reporter: Gabor Kaszab
>            Assignee: Andrew Sherman
>            Priority: Critical
>              Labels: iceberg, impala-iceberg
>
> This issue is very similar to https://issues.apache.org/jira/browse/IMPALA-11502. The repro steps are also almost identical; however, in this case the table's folder should be dropped right when the INSERT INTO starts.
> Repro steps:
> 1) Create the Iceberg table:
> {code:java}
> DROP DATABASE IF EXISTS `drop_incomplete_table` CASCADE;
> CREATE DATABASE `drop_incomplete_table`;
> CREATE TABLE drop_incomplete_table.iceberg_tbl (i int) stored as iceberg
>     tblproperties('iceberg.catalog'='hadoop.catalog',
>                   'iceberg.catalog_location'='/test-warehouse/drop_incomplete_table');
> {code}
> 2) For this step timing is essential, and it might take a few tries to hit the issue. Run the INSERT INTO and drop the HDFS folder at the same time. Executing them manually is fine; this doesn't require scripting.
> {code:java}
> INSERT INTO drop_incomplete_table.iceberg_tbl VALUES (1), (2), (3);
> hdfs dfs -rm -r hdfs://localhost:20500/test-warehouse/drop_incomplete_table
> {code}
> You will know you have hit the issue when the Impala shell starts to hang.
> The jstack of the hanging impalad (not the catalogd) will contain this for one of the threads:
> {code:java}
> "Thread-15" #30 prio=5 os_prio=0 tid=0x000000000db2a000 nid=0x56f4 in Object.wait() [0x00007f0e7b59a000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at org.apache.impala.catalog.ImpaladCatalog.waitForCatalogUpdate(ImpaladCatalog.java:290)
>         - locked <0x0000000724f7cdc0> (a java.lang.Object)
>         at org.apache.impala.analysis.StmtMetadataLoader.loadTables(StmtMetadataLoader.java:229)
>         at org.apache.impala.analysis.StmtMetadataLoader.loadTables(StmtMetadataLoader.java:141)
>         at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:2001)
>         at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1913)
>         at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1737)
>         at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:164)
> {code}
> Initially, Iceberg tables are created as IncompleteTables, and when a query touches the table they are loaded as IcebergTable. It seems to me that when we run the first query after creating the table, dropping the files with certain timing can get us into a state where the table appears as a "missingTable" in StmtMetadataLoader.loadTable(); however, when a prioritized table load is requested, the Catalog says that the table is already loaded. As a result, the table always appears as a "missingTable" and we never get out of the [while loop|https://github.com/apache/impala/blob/62e20d1ba842a3f27395251c57dea9850f462fc9/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java#L196] in loadTables().
> I managed to repro this using HiveCatalog, but I had no luck reproducing it with non-Iceberg, traditional Hive tables.
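
The suspected loop can be illustrated with a small, self-contained model. This is only a sketch of the behaviour described in the report, not Impala's actual code: `LoadLoopSketch`, `FakeCatalogd`, `FakeLocalCatalog`, `prioritizeLoad`, and the `maxIterations` bound are all hypothetical names introduced here. The bound exists only so the sketch terminates; the report describes a loop that has no such exit.

```java
import java.util.HashSet;
import java.util.Set;

public class LoadLoopSketch {
    /** Hypothetical stand-in for the catalogd's view of which tables are loaded. */
    static final class FakeCatalogd {
        private final Set<String> loaded = new HashSet<>();

        void markLoaded(String table) { loaded.add(table); }

        /** Returns true only if a load was scheduled, i.e. an update will follow. */
        boolean prioritizeLoad(String table) { return !loaded.contains(table); }
    }

    /** Hypothetical stand-in for the impalad-local catalog cache. */
    static final class FakeLocalCatalog {
        private final Set<String> complete = new HashSet<>();

        boolean isMissing(String table) { return !complete.contains(table); }

        void applyUpdate(String table) { complete.add(table); }
    }

    /**
     * Models the retry loop: while the table is missing locally, request a
     * prioritized load and wait for a catalog update. Returns the number of
     * iterations used, or -1 if maxIterations was exhausted (the "hang" case).
     */
    static int loadTable(FakeCatalogd catalogd, FakeLocalCatalog local,
                         String table, int maxIterations) {
        int iterations = 0;
        while (local.isMissing(table)) {
            if (iterations == maxIterations) return -1;
            iterations++;
            if (catalogd.prioritizeLoad(table)) {
                // Healthy path: the load happens and the update reaches the local cache.
                catalogd.markLoaded(table);
                local.applyUpdate(table);
            }
            // Otherwise the catalogd considers the table loaded and sends nothing.
            // In the reported hang, the wait for an update would time out here
            // and the loop would start over with the table still missing.
        }
        return iterations;
    }

    public static void main(String[] args) {
        // Healthy case: the catalogd has not loaded the table yet.
        System.out.println("healthy: "
            + loadTable(new FakeCatalogd(), new FakeLocalCatalog(), "iceberg_tbl", 5));

        // Inconsistent case: catalogd says "loaded", the local cache says "missing".
        FakeCatalogd catalogd = new FakeCatalogd();
        catalogd.markLoaded("iceberg_tbl");
        System.out.println("stuck: "
            + loadTable(catalogd, new FakeLocalCatalog(), "iceberg_tbl", 5));
    }
}
```

In the inconsistent case neither side changes state between iterations, so without the artificial bound the loop would never exit, which matches the TIMED_WAITING thread in the jstack output.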