[ 
https://issues.apache.org/jira/browse/IMPALA-10579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang reassigned IMPALA-10579:
---------------------------------------

    Assignee: Quanlong Huang

> Deadloop in table metadata loading when using an invalid RemoteIterator
> -----------------------------------------------------------------------
>
>                 Key: IMPALA-10579
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10579
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>
> The file listing thread in catalogd goes into an infinite loop if it gets a 
> RemoteIterator on a non-existent path. The first call to 
> RemoteIterator.hasNext() throws a FileNotFoundException. However, this 
> exception is caught and the loop continues. Because curFile_ is still null 
> and the next hasNext() call throws again, the loop never terminates. 
> Related code: 
> [https://github.com/apache/impala/blob/d89c04bf806682d3449c566ce979632bd2ac5b29/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java#L789-L814]
> {code:java}
>   static class FilterIterator implements RemoteIterator<FileStatus> {
>     ...
>     public boolean hasNext() throws IOException {
>       ...
>       while (curFile_ == null) {
>         FileStatus next;
>         try {
>           if (!baseIterator_.hasNext()) return false; // <---- throws FileNotFoundException
>           ...
>           next = baseIterator_.next();
>         } catch (FileNotFoundException ex) {
>           ...
>           LOG.warn(ex.getMessage());
>           continue;  // <--------- catches the exception and continues into an infinite loop
>         }
>         if (!isInIgnoredDirectory(startPath_, next)) {
>           curFile_ = next;
>           return true;
>         }
>       }
>       return true;
>     }
> {code}
> *When does the path being loaded not exist?*
>  This happens when the metadata (table/partition location) in HMS still 
> references the path, but the path has actually been removed from storage.
> *When will Impala get such an invalid RemoteIterator?*
>  With FileSystem implementations that don't override 
> FileSystem#listStatusIterator(), e.g. S3AFileSystem before 
> HADOOP-17281, AzureBlobFileSystem, and GoogleHadoopFileSystem.
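
The failure mode described above can be reproduced without Hadoop. The following is a minimal sketch of one possible way to avoid the loop, not necessarily the fix adopted in Impala. RemoteIter, MissingPathIter, ListIter, FilterIter and countFiles are hypothetical stand-ins written for this illustration; only the overall shape of hasNext() follows the quoted FilterIterator. The idea is to handle the two calls separately: a FileNotFoundException from hasNext() ends the iteration, while one from next() only skips that entry.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class FilterIteratorSketch {
  /** Hypothetical stand-in for org.apache.hadoop.fs.RemoteIterator. */
  interface RemoteIter<T> {
    boolean hasNext() throws IOException;
    T next() throws IOException;
  }

  /** Simulates an iterator on a missing path: every hasNext() call throws. */
  static class MissingPathIter implements RemoteIter<String> {
    int calls = 0;  // number of hasNext() calls, to show the loop is bounded
    public boolean hasNext() throws IOException {
      calls++;
      throw new FileNotFoundException("path does not exist");
    }
    public String next() throws IOException {
      throw new FileNotFoundException("path does not exist");
    }
  }

  /** Well-behaved iterator over an in-memory list of file names. */
  static class ListIter implements RemoteIter<String> {
    private final Iterator<String> it;
    ListIter(List<String> names) { it = names.iterator(); }
    public boolean hasNext() { return it.hasNext(); }
    public String next() { return it.next(); }
  }

  /** Filters out hidden entries (names starting with '.'), as an assumed filter rule. */
  static class FilterIter implements RemoteIter<String> {
    private final RemoteIter<String> base;
    private String cur = null;
    FilterIter(RemoteIter<String> base) { this.base = base; }

    public boolean hasNext() throws IOException {
      while (cur == null) {
        boolean more;
        try {
          more = base.hasNext();
        } catch (FileNotFoundException ex) {
          // The listing itself cannot proceed: stop instead of retrying forever.
          return false;
        }
        if (!more) return false;
        String next;
        try {
          next = base.next();
        } catch (FileNotFoundException ex) {
          // A single entry vanished mid-listing: skip it and move on.
          continue;
        }
        if (!next.startsWith(".")) cur = next;
      }
      return true;
    }

    public String next() throws IOException {
      if (!hasNext()) throw new NoSuchElementException();
      String result = cur;
      cur = null;
      return result;
    }
  }

  /** Counts the entries that survive filtering. */
  static int countFiles(RemoteIter<String> base) {
    try {
      FilterIter it = new FilterIter(base);
      int n = 0;
      while (it.hasNext()) { it.next(); n++; }
      return n;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(countFiles(new MissingPathIter()));
    System.out.println(countFiles(new ListIter(Arrays.asList("a", ".tmp", "b"))));
  }
}
```

With the catch placed around both calls, as in the quoted code, the MissingPathIter case would never return. Note that skipping on a failed next() still assumes the underlying iterator advances past the bad entry.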



