[ https://issues.apache.org/jira/browse/HADOOP-16202?focusedWorklogId=536032&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-536032 ]
ASF GitHub Bot logged work on HADOOP-16202:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/Jan/21 14:58
            Start Date: 14/Jan/21 14:58
    Worklog Time Spent: 10m
      Work Description: steveloughran commented on a change in pull request #2168:
URL: https://github.com/apache/hadoop/pull/2168#discussion_r557456802

##########
File path: hadoop-common-project/hadoop-common/src/site/markdown/filesystem/openfile.md
##########
@@ -0,0 +1,290 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+# `openFile()`
+
+Create a builder to open a file, supporting options
+both standard and filesystem specific. The return
+value of the `build()` call is a `Future<FSDataInputStream>`,
+which must be waited on. The file opening may be
+asynchronous, and it may actually be postponed (including
+permission/existence checks) until reads are actually
+performed.
+
+This API call was added to `FileSystem` and `FileContext` in
+Hadoop 3.3.0; it was tuned in Hadoop 3.3.1 as follows.
+
+* Support `opt(key, long)` and `must(key, long)`.
+* Declare that `withFileStatus(null)` is allowed.
+* Declare that `withFileStatus(status)` only checks
+  the filename of the path, not the full path.
+  This is needed to support passthrough/mounted filesystems.
+
+
+### <a name="openfile(path)"></a> `FSDataInputStreamBuilder openFile(Path path)`

Review comment:
oops. Will do.
I won't rename the fsdatainputstreambuilder.md file though: it has already shipped and I don't want to break links across versions. I will change all text references.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 536032)
    Time Spent: 7.5h  (was: 7h 20m)

> Stabilize openFile() and adopt internally
> -----------------------------------------
>
>                 Key: HADOOP-16202
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16202
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs, fs/s3, tools/distcp
>    Affects Versions: 3.3.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> The {{openFile()}} builder API lets us add new options when reading a file.
> Add an option {{"fs.s3a.open.option.length"}} which takes a long and allows
> the length of the file to be declared. If set, *no check for the existence of
> the file is issued when opening the file*.
> Also: withFileStatus() to take any FileStatus implementation, rather than
> only S3AFileStatus, and not check that the path matches the path being
> opened. Needed to support viewFS-style wrapping and mounting.
> Adopt where appropriate, to stop clusters with S3A reads switched to
> random IO from killing download/localization:
> * fs shell copyToLocal
> * distcp
> * IOUtils.copy
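
To make the semantics described above concrete, here is a minimal, stdlib-only sketch of an openFile()-style builder. It is a toy model and not Hadoop's classes: `OpenFileSketch`, its `Builder`, the `existenceChecks` counter, and the string result are all hypothetical names invented for illustration. The option key is the one proposed in the issue text. The sketch assumes that unrecognised `must()` keys are rejected while unrecognised `opt()` keys are ignored, and it chooses to reject them synchronously in `build()`; the real implementation may report the failure elsewhere.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of an openFile()-style builder. NOT the Hadoop API. */
public class OpenFileSketch {
    /** Option key taken from the issue text; the only key this sketch understands. */
    static final String LENGTH = "fs.s3a.open.option.length";

    /** Counts simulated existence probes, so the effect of declaring a length is visible. */
    static final AtomicInteger existenceChecks = new AtomicInteger();

    static final class Builder {
        private final String path;
        private final Map<String, String> optional = new HashMap<>();
        private final Map<String, String> mandatory = new HashMap<>();

        Builder(String path) {
            this.path = path;
        }

        /** Optional option: silently ignored if the key is not understood. */
        Builder opt(String key, long value) {
            optional.put(key, Long.toString(value));
            return this;
        }

        /** Mandatory option: build() fails if the key is not understood. */
        Builder must(String key, long value) {
            mandatory.put(key, Long.toString(value));
            return this;
        }

        /** Returns a future; the simulated open only runs when the future is evaluated. */
        CompletableFuture<String> build() {
            for (String key : mandatory.keySet()) {
                if (!LENGTH.equals(key)) {
                    throw new IllegalArgumentException("Unsupported mandatory option: " + key);
                }
            }
            final String declared = mandatory.getOrDefault(LENGTH, optional.get(LENGTH));
            final String target = path;
            return CompletableFuture.supplyAsync(() -> {
                if (declared == null) {
                    // No declared length: the sketch must probe for the file.
                    existenceChecks.incrementAndGet();
                }
                return target + " opened, length=" + (declared == null ? "probed" : declared);
            });
        }
    }

    public static void main(String[] args) throws Exception {
        // The caller has to wait on the future, as the openfile.md text requires.
        System.out.println(new Builder("/data/a.csv").build().get());
        System.out.println(new Builder("/data/b.csv").opt(LENGTH, 1024L).build().get());
        System.out.println("existence checks: " + existenceChecks.get());
    }
}
```

In this model only the first open increments the probe counter, which mirrors the claim that a declared length lets the open skip the existence check. Against a real Hadoop filesystem the equivalent call chain would start from `FileSystem.openFile(path)`, and the exact option names and overloads should be checked against the release in use.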