[jira] [Commented] (HADOOP-16101) Use lighter-weight alternatives to innerGetFileStatus where possible
[ https://issues.apache.org/jira/browse/HADOOP-16101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17488919#comment-17488919 ]

Monthon Klongklaew commented on HADOOP-16101:
---------------------------------------------

After some investigation, I think it is lightweight enough at this point. innerGetFileStatus became a lot simpler once S3Guard was removed. We also have https://issues.apache.org/jira/browse/HADOOP-17415, which covers reading a file without the initial HEAD request.

Should we consider closing this one and creating a new task for a rename builder with a FileStatus param?

> Use lighter-weight alternatives to innerGetFileStatus where possible
> --------------------------------------------------------------------
>
> Key: HADOOP-16101
> URL: https://issues.apache.org/jira/browse/HADOOP-16101
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Sean Mackrory
> Priority: Major
>
> Discussion in HADOOP-15999 highlighted the heaviness of a full
> innerGetFileStatus call, where many usages of it may need a lighter weight
> fileExists, etc. Let's investigate usage of innerGetFileStatus and slim it
> down where possible.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16101) Use lighter-weight alternatives to innerGetFileStatus where possible
[ https://issues.apache.org/jira/browse/HADOOP-16101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825160#comment-16825160 ]

Steve Loughran commented on HADOOP-16101:
-----------------------------------------

Update: I think the new openFile() builder should take a withSource(FileStatus) param. If you already have the file status, there is no need to repeat yourself. We can do the same for rename.
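To make the withSource(FileStatus) suggestion concrete, here is a minimal sketch against a toy in-memory store. It is not Hadoop code: OpenWithStatusSketch, Store, Status and OpenBuilder are hypothetical stand-ins, and only the method name withSource comes from the comment above. It assumes the only reason to probe the store at open time is to obtain the status, so a caller-supplied status lets the builder skip the HEAD.

```java
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;

/** Toy model, not Hadoop code: every type here is a hypothetical stand-in. */
final class OpenWithStatusSketch {

  /** Stand-in for FileStatus: only the fields this sketch needs. */
  static final class Status {
    final String path;
    final long length;

    Status(String path, long length) {
      this.path = path;
      this.length = length;
    }
  }

  /** In-memory stand-in for an object store that counts HEAD requests. */
  static final class Store {
    private final Map<String, byte[]> objects = new HashMap<>();
    int headRequests = 0;

    void put(String path, byte[] data) {
      objects.put(path, data);
    }

    /** Simulated HEAD: one counted round trip per call. */
    Status head(String path) throws FileNotFoundException {
      headRequests++;
      byte[] data = objects.get(path);
      if (data == null) {
        throw new FileNotFoundException(path);
      }
      return new Status(path, data.length);
    }
  }

  /** Builder that probes the store only when no status was supplied. */
  static final class OpenBuilder {
    private final Store store;
    private final String path;
    private Status source; // stays null unless withSource() is called

    OpenBuilder(Store store, String path) {
      this.store = store;
      this.path = path;
    }

    OpenBuilder withSource(Status status) {
      this.source = status;
      return this;
    }

    /** Returns the status the stream would be opened with. */
    Status build() throws FileNotFoundException {
      return source != null ? source : store.head(path);
    }
  }

  public static void main(String[] args) throws FileNotFoundException {
    Store store = new Store();
    store.put("/data/a.csv", new byte[42]);

    // The caller already holds a status, e.g. from an earlier probe or listing.
    Status known = store.head("/data/a.csv");

    new OpenBuilder(store, "/data/a.csv").withSource(known).build();
    System.out.println("HEAD requests after open with status: " + store.headRequests);

    new OpenBuilder(store, "/data/a.csv").build();
    System.out.println("HEAD requests after plain open: " + store.headRequests);
  }
}
```

The trade-off in this sketch is that a supplied status is trusted as-is: if the object changed or was deleted after the status was obtained, the open would not notice until data is actually requested.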
[jira] [Commented] (HADOOP-16101) Use lighter-weight alternatives to innerGetFileStatus where possible
[ https://issues.apache.org/jira/browse/HADOOP-16101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16764855#comment-16764855 ]

Steve Loughran commented on HADOOP-16101:
-----------------------------------------

I've thought about not doing the HEAD first; see HADOOP-13712.

We've been constrained by the expectation that "if the file doesn't exist, open() must fail". With the new openFile() and its future<> response, we have a bit more leeway.

h3. Now may be the time to change the spec there and say "if you open a file with openFile(), failures may not surface until the stream is read()".

FWIW, even though getFileStatus does three checks, on the successful path ("the file is present") only the initial HEAD is used. The failure case does make all three calls, with the last two essentially choosing between FNFE and some path-is-a-directory exception (which may be FNFE anyway, as it is on some filesystems). Because it's the failure path, optimising it is probably less beneficial than saving 200ms on every file open, which we could do if we purge that initial HEAD and go straight to the GET on read.
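The proposed spec change ("failures may not surface until the stream is read()") can be illustrated with a small model. This is a sketch under assumptions, not the S3A implementation: LazyOpenSketch, Store, LazyStream, openEager and openLazy are hypothetical names, and the store is an in-memory map that merely counts simulated HEAD and GET requests.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Toy model, not Hadoop code: every type here is a hypothetical stand-in. */
final class LazyOpenSketch {

  /** In-memory stand-in for an object store that counts requests. */
  static final class Store {
    private final Map<String, byte[]> objects = new HashMap<>();
    int headRequests = 0;
    int getRequests = 0;

    void put(String path, byte[] data) {
      objects.put(path, data);
    }

    /** Simulated HEAD: existence probe only. */
    void head(String path) throws FileNotFoundException {
      headRequests++;
      if (!objects.containsKey(path)) {
        throw new FileNotFoundException(path);
      }
    }

    /** Simulated GET: returns the object's bytes. */
    byte[] get(String path) throws FileNotFoundException {
      getRequests++;
      byte[] data = objects.get(path);
      if (data == null) {
        throw new FileNotFoundException(path);
      }
      return data;
    }
  }

  /** Stream that issues its GET only when read() is first called. */
  static final class LazyStream {
    private final Store store;
    private final String path;
    private byte[] data; // fetched on first read
    private int pos = 0;

    LazyStream(Store store, String path) {
      this.store = store;
      this.path = path;
    }

    int read() throws IOException {
      if (data == null) {
        data = store.get(path); // a missing file surfaces here
      }
      return pos < data.length ? (data[pos++] & 0xff) : -1;
    }
  }

  /** Eager open: HEAD first, so a missing file fails at open time. */
  static LazyStream openEager(Store store, String path) throws FileNotFoundException {
    store.head(path);
    return new LazyStream(store, path);
  }

  /** Lazy open: no HEAD; the future completes without contacting the store. */
  static CompletableFuture<LazyStream> openLazy(Store store, String path) {
    return CompletableFuture.completedFuture(new LazyStream(store, path));
  }

  public static void main(String[] args) throws Exception {
    Store store = new Store();
    LazyStream in = openLazy(store, "/missing").get(); // open itself succeeds
    try {
      in.read();
    } catch (FileNotFoundException e) {
      System.out.println("failure surfaced on read(): " + e.getMessage());
    }
    System.out.println("HEAD=" + store.headRequests + " GET=" + store.getRequests);
  }
}
```

In this model the lazy path saves one round trip per successful open, at the cost of moving the FileNotFoundException from open time to the first read().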
[jira] [Commented] (HADOOP-16101) Use lighter-weight alternatives to innerGetFileStatus where possible
[ https://issues.apache.org/jira/browse/HADOOP-16101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16763826#comment-16763826 ]

Sean Mackrory commented on HADOOP-16101:
----------------------------------------

The other detail here is that we don't think we need to do a HEAD at all before doing the actual GET work when reading.