[ https://issues.apache.org/jira/browse/IMPALA-9120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sahil Takiar updated IMPALA-9120:
---------------------------------
    Priority: Critical  (was: Major)

> Refreshing an ABFS table with a deleted directory fails
> -------------------------------------------------------
>
>                 Key: IMPALA-9120
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9120
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Critical
>
> The following fails on ABFS (but succeeds on HDFS):
> {code:java}
> hdfs dfs -mkdir /test-external-table
> ./bin/impala-shell.sh
> [localhost:21000] default> create external table test (col int) location '/test-external-table';
> [localhost:21000] default> select * from test;
> hdfs dfs -rm -r -skipTrash /test-external-table
> ./bin/impala-shell.sh
> [localhost:21000] default> refresh test;
> ERROR: TableLoadingException: Refreshing file and block metadata for 1 paths for table default.test: failed to load 1 paths. Check the catalog server log for more details.{code}
> This causes the test tests/query_test/test_hdfs_file_mods.py::TestHdfsFileMods::test_file_modifications[modification_type: delete_directory | ...] to fail on ABFS as well.
> The error from catalogd is:
> {code:java}
> E1104 22:38:53.748571 87486 ParallelFileMetadataLoader.java:102] Loading file and block metadata for 1 paths for table test_file_modifications_d0471c2c.t1 encountered an error loading data for path abfss://[]@[].dfs.core.windows.net/test-warehouse/test_file_modifications_d0471c2c
> Java exception follows:
> java.util.concurrent.ExecutionException: java.io.FileNotFoundException: GET https://[].dfs.core.windows.net/[]?resource=filesystem&maxResults=5000&directory=test-warehouse/test_file_modifications_d0471c2c&timeout=90&recursive=false
> StatusCode=404
> StatusDescription=The specified path does not exist.
> ErrorCode=PathNotFound
> ErrorMessage=The specified path does not exist.
> RequestId:[]
> Time:2019-11-04T22:38:53.7469083Z
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>     at org.apache.impala.catalog.ParallelFileMetadataLoader.load(ParallelFileMetadataLoader.java:99)
>     at org.apache.impala.catalog.HdfsTable.loadFileMetadataForPartitions(HdfsTable.java:606)
>     at org.apache.impala.catalog.HdfsTable.loadAllPartitions(HdfsTable.java:547)
>     at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:973)
>     at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:896)
>     at org.apache.impala.catalog.TableLoader.load(TableLoader.java:83)
>     at org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:244)
>     at org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:241)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: GET https://[].dfs.core.windows.net/[]?resource=filesystem&maxResults=5000&directory=test-warehouse/test_file_modifications_d0471c2c&timeout=90&recursive=false
> StatusCode=404
> StatusDescription=The specified path does not exist.
> ErrorCode=PathNotFound
> ErrorMessage=The specified path does not exist.
> RequestId:[]
> Time:2019-11-04T22:38:53.7469083Z
>     at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.checkException(AzureBlobFileSystem.java:957)
>     at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.listStatus(AzureBlobFileSystem.java:351)
>     at org.apache.hadoop.fs.FileSystem.listStatusBatch(FileSystem.java:1790)
>     at org.apache.hadoop.fs.FileSystem$DirListingIterator.fetchMore(FileSystem.java:2058)
>     at org.apache.hadoop.fs.FileSystem$DirListingIterator.hasNext(FileSystem.java:2047)
>     at org.apache.impala.common.FileSystemUtil$RecursingIterator.hasNext(FileSystemUtil.java:722)
>     at org.apache.impala.common.FileSystemUtil$FilterIterator.hasNext(FileSystemUtil.java:679)
>     at org.apache.impala.catalog.FileMetadataLoader.load(FileMetadataLoader.java:166)
>     at org.apache.impala.catalog.ParallelFileMetadataLoader.lambda$load$0(ParallelFileMetadataLoader.java:93)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
>     at com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:61)
>     at com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:45)
>     at org.apache.impala.catalog.ParallelFileMetadataLoader.load(ParallelFileMetadataLoader.java:93)
>     ... 11 more
> Caused by: GET https://[].dfs.core.windows.net/[]?resource=filesystem&maxResults=5000&directory=test-warehouse/test_file_modifications_d0471c2c&timeout=90&recursive=false
> StatusCode=404
> StatusDescription=The specified path does not exist.
> ErrorCode=PathNotFound
> ErrorMessage=The specified path does not exist.
> RequestId:[]
> Time:2019-11-04T22:38:53.7469083Z
>     at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:134)
>     at org.apache.hadoop.fs.azurebfs.services.AbfsClient.listPath(AbfsClient.java:180)
>     at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:526)
>     at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.listStatus(AzureBlobFileSystem.java:348)
>     ... 23 more {code}
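
For illustration only: the trace above shows the FileNotFoundException for the deleted directory propagating out of the directory listing and failing the whole refresh. One way a loader could tolerate this is to treat a missing directory as an empty listing. The sketch below shows that idea with plain java.nio rather than the Hadoop FileSystem API, so NoSuchFileException stands in for the FileNotFoundException in the trace. The class TolerantLister and the method listOrEmpty are hypothetical names, not Impala or Hadoop code, and this is not necessarily how the issue should be fixed.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration; not part of Impala or Hadoop. */
final class TolerantLister {
  private TolerantLister() {}

  /**
   * Lists the direct children of dir. A directory that does not exist is
   * reported as an empty listing instead of an error. Other I/O errors
   * still propagate to the caller.
   */
  static List<Path> listOrEmpty(Path dir) throws IOException {
    List<Path> entries = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
      for (Path entry : stream) {
        entries.add(entry);
      }
    } catch (NoSuchFileException e) {
      // Assumption: a deleted directory simply has no files to load.
      // This is the java.nio analogue of the FileNotFoundException above.
      return Collections.emptyList();
    }
    return entries;
  }
}
```

Calling TolerantLister.listOrEmpty on a path that was removed returns an empty list, while an existing directory returns its entries. Note that in the trace the exception surfaces lazily from hasNext() on the listing iterator, so an equivalent guard on the Hadoop side would have to wrap the iteration as well as the initial call.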