[jira] [Commented] (HADOOP-16458) LocatedFileStatusFetcher scans failing intermittently against S3 store

2021-07-14 Thread Matteo Martignon (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380756#comment-17380756
 ] 

Matteo Martignon commented on HADOOP-16458:
---

We are seeing the same intermittent failures on GCP with the DataFusion 
service:
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input 
Pattern gs:///*/*/*/*.parquet matches 0 files.

> LocatedFileStatusFetcher scans failing intermittently against S3 store
> --
>
> Key: HADOOP-16458
> URL: https://issues.apache.org/jira/browse/HADOOP-16458
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.3.0
> Environment: S3 + S3Guard
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Major
> Fix For: 3.3.0
>
>
> Intermittent failure of LocatedFileStatusFetcher.getFileStatuses(), which is 
> using globStatus to find files.
> I'd say "turn s3guard on" except this appears to be the case, and the dataset 
> being read is over 1h old.
> Which means it is harder than I'd like to blame S3 for what would sound like 
> an inconsistency.
> We're hampered by the number of debug level statements in the globber code 
> being approximately none; there's no debugging to turn on. All we know is 
> that globFiles returns null without any explanation.
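
The description above turns on a glob call that returns null with no 
explanation. As a rough illustration of why that matters to a caller, the 
sketch below separates "no result at all" from "pattern matched nothing" 
before reporting a failure. It is not the Hadoop Globber code: the class, 
enum and method names are hypothetical, and a plain Object[] stands in for 
the FileStatus[] that a real glob call would return.

```java
// Minimal sketch, assuming a glob call may return null, an empty array,
// or a populated array. All names here are hypothetical stand-ins.
final class GlobResultSketch {

    // The three cases a caller may want to tell apart.
    enum Outcome { NO_RESULT, NO_MATCHES, MATCHED }

    // Classifies a glob result without dereferencing a null array.
    static Outcome classify(Object[] globResult) {
        if (globResult == null) {
            return Outcome.NO_RESULT;
        }
        if (globResult.length == 0) {
            return Outcome.NO_MATCHES;
        }
        return Outcome.MATCHED;
    }

    // Produces a message that says which case occurred, so that a log
    // line records more than "matches 0 files".
    static String explain(String pattern, Object[] globResult) {
        switch (classify(globResult)) {
            case NO_RESULT:
                return "Glob of " + pattern + " returned null";
            case NO_MATCHES:
                return "Glob of " + pattern + " matched no entries";
            default:
                return "Glob of " + pattern + " matched "
                    + globResult.length + " entries";
        }
    }
}
```

A caller could log the string from explain() before deciding whether to 
raise an exception, so the two failure cases stay distinguishable in the 
logs.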



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16458) LocatedFileStatusFetcher scans failing intermittently against S3 store

2021-06-23 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368077#comment-17368077
 ] 

Steve Loughran commented on HADOOP-16458:
-

Hmm. Given the purpose of that change was to make the underlying cause of a 
failure visible, it's hard to feel *too* bad that the visibility of the root 
cause is now a problem.

The exception being raised is still the same, so the change David has added is 
backwards compatible. The old code contained the assumption that the inner 
cause on InvalidInputException was always null. With the Hive fix the error log 
will now actually contain whatever the underlying cause of that 
InvalidInputException is, which should benefit everyone.
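
For anyone handling these exceptions downstream, here is a minimal sketch of 
reporting an exception without assuming anything about its inner cause. It 
uses a stand-in exception class rather than the real Hadoop 
InvalidInputException, and every name in it is hypothetical; it only 
illustrates walking the cause chain with the standard Throwable.getCause() 
call.

```java
import java.io.IOException;

// Minimal sketch: describe an exception together with any nested causes.
// StandInInputException is a hypothetical stand-in, not the Hadoop class.
final class CauseLoggingSketch {

    // An IOException subclass that may or may not carry a cause.
    static final class StandInInputException extends IOException {
        StandInInputException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Upper bound on how many nested causes are followed, as a guard
    // against a cyclic cause chain.
    private static final int MAX_DEPTH = 16;

    // Builds a one-line description of the exception and each nested
    // cause. A null cause simply ends the chain.
    static String describe(Throwable t) {
        StringBuilder sb = new StringBuilder();
        Throwable current = t;
        int depth = 0;
        while (current != null && depth < MAX_DEPTH) {
            if (depth > 0) {
                sb.append(" <- caused by: ");
            }
            sb.append(current.getClass().getSimpleName())
              .append(": ")
              .append(current.getMessage());
            current = current.getCause();
            depth++;
        }
        return sb.toString();
    }
}
```

Code written this way behaves the same whether the cause is null, as the 
old code assumed, or populated, as it can be after this change.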




[jira] [Commented] (HADOOP-16458) LocatedFileStatusFetcher scans failing intermittently against S3 store

2021-06-22 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17367798#comment-17367798
 ] 

Wei-Chiu Chuang commented on HADOOP-16458:
--

Looks like it caused a small regression in Hive. See this comment in HIVE-24484.
https://issues.apache.org/jira/browse/HIVE-24484?focusedCommentId=17367606&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17367606




[jira] [Commented] (HADOOP-16458) LocatedFileStatusFetcher scans failing intermittently against S3 store

2019-10-01 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942172#comment-16942172
 ] 

Hudson commented on HADOOP-16458:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17429 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17429/])
HADOOP-16458. LocatedFileStatusFetcher.getFileStatuses failing
(stevel: rev 1921e94292f0820985a0cfbf8922a2a1a67fe921)
* (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.java
* (add) hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/auth/ITestRestrictedReadAccess.java
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java
* (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java
* (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/InvalidInputException.java
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Globber.java
* (add) hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestLocatedFileStatusFetcher.java
* (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/LocatedFileStatusFetcher.java
* (add) hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AFSMainOperations.java
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Invoker.java
* (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/test/LambdaTestUtils.java
* (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/InvalidInputException.java

