[jira] Commented: (HADOOP-6223) New improved FileSystem interface for those implementing new files systems.

Sanjay Radia (JIRA) Mon, 14 Sep 2009 17:07:27 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755259#action_12755259
 ]


Sanjay Radia commented on HADOOP-6223:
--------------------------------------

@Doug >I do not yet see a case for rename2.
I understand that Owen presented to you the case for rename2 and option 1 in 
private discussions last week and that you were not convinced.

Let me summarize the case for option 1 and rename2 for the benefit of the rest 
of the community.
Please refer to options 1 and 2 above in this Jira. Note that *both* options 
get us to the *same end state*: a new parallel stack where applications call 
FileContext, which in turn calls AFS-impls. Option 1 uses the existing 
FileSystem API and implementation in the early phase as we migrate from the 
old stack to the new stack. As I have mentioned above, I have been going with 
option 1 instead of option 2; my patches in HADOOP-4952 have been based on 
option 1. I did strongly consider option 2 but felt that it raised the risk in 
this project (details below).

*Does the FileSystem API have to be enhanced to support FileContext?*
Yes. If you look at the patches for FileContext (HADOOP-4952), they have added 
3 protected methods: getInitialWorkingDir(), createAbsPerm(), and 
mkdirsAbsPerm() (btw, in the latest patch the last two methods were renamed to 
primitiveCreate() and primitiveMkdirs()).
These 3 methods are all declared protected and hence are not visible to 
applications. Once we have the full new stack, these methods can be deleted.

*What has this got to do with rename2()?*
It turns out that our rename implementation is broken. Fixing the 
FileSystem#rename spec would potentially break applications. Given that we are 
introducing a new fs API (FileContext), it has been proposed that we leave the 
old FileSystem#rename and its spec as is and simply add a new *protected* 
method, FileSystem#rename2(). Its sole purpose is to support 
FileContext#rename, like the other 3 protected methods mentioned above.
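
To make the shape of the rename2 proposal concrete, here is a minimal Java sketch. It is not the Hadoop code: ToyFileSystem and ToyFileContext are hypothetical stand-ins, paths are plain strings, and the specific difference between the two methods (the old rename returning false on failure, rename2 throwing) is assumed purely for illustration, since the comment above does not spell out what is broken in the rename spec. The point is only that the public method keeps its old behaviour while a protected method carries the new semantics for the new API.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; none of these classes are the real Hadoop ones.
class Rename2Sketch {

    /** Stand-in for the existing FileSystem class. */
    static class ToyFileSystem {
        protected final Set<String> paths = new HashSet<>();

        /** Old public method: spec left untouched (assumed here to report failure as false). */
        public boolean rename(String src, String dst) {
            if (!paths.contains(src) || paths.contains(dst)) {
                return false;
            }
            paths.remove(src);
            paths.add(dst);
            return true;
        }

        /** New protected method: stricter semantics (assumed), hidden from applications. */
        protected void rename2(String src, String dst) {
            if (!paths.contains(src)) {
                throw new IllegalArgumentException("source does not exist: " + src);
            }
            if (paths.contains(dst)) {
                throw new IllegalStateException("destination already exists: " + dst);
            }
            paths.remove(src);
            paths.add(dst);
        }
    }

    /** Stand-in for FileContext: same package, so it may call the protected method. */
    static class ToyFileContext {
        private final ToyFileSystem fs;

        ToyFileContext(ToyFileSystem fs) {
            this.fs = fs;
        }

        public void rename(String src, String dst) {
            fs.rename2(src, dst);
        }
    }

    public static void main(String[] args) {
        ToyFileSystem fs = new ToyFileSystem();
        fs.paths.add("/a");
        ToyFileContext fc = new ToyFileContext(fs);

        // Old API: existing callers still get a boolean result.
        System.out.println("old rename of missing path: " + fs.rename("/missing", "/b"));

        // New API: goes through the protected method.
        fc.rename("/a", "/b");
        System.out.println("paths after new rename: " + fs.paths);
    }
}
```

Because the new method is protected, code outside the package that only holds a ToyFileSystem reference cannot call it, so existing applications see no change in the public surface.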

*Why did you choose to go with option 1 and not option 2?*
Option 1 was easier to get started with because it leveraged the existing 
FileSystem to the fullest. AFS, on the other hand, was debated as soon as it 
started, and option 2 itself was questioned. I felt that the community needed 
some time to digest this Jira. Comments from 3 folks is very little in contrast 
to the large number of comments in the FileContext Jira. Further, my intuition 
told me that there were a number of details to be resolved. The FileSystem 
design and implementation are very messy and I didn't want to simply carry 
forward its design without debate.

Over the weekend, as I explored option 2, my intuition proved correct: here is 
a list of issues to be resolved for AFS. While none of them are impossible to 
solve, they are not trivial either.
* Where should the cache go: in FC or in AFS? Is the cache keyed off the config 
or not? (The cache in FS seems to be somewhat tied to the config; I think we 
need to look at that closely.) The cache has leaked through the FileSystem API; 
I would like to avoid that for AFS.
* Delete-on-exit: should we raise it to FC or leave it in AFS? There are 
certain assumptions made by the current delete-on-exit that seem incorrect and 
should be revisited.
* What do we do about the public close method?
* Statistics features in FS: where do they go in the new world?
Given the above, I felt it was wiser to go with option 1 since its only cost 
is a few protected methods. Further, even in option 2 these protected methods 
would have helped us, since they would have simplified delegation from AFS to 
FileSystem.
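
As a rough illustration of that delegation, the following Java sketch shows an AFS-style implementation forwarding to an existing FileSystem through a protected method. Everything here is assumed for illustration: the class names, the string paths, the boolean return value, and the primitiveMkdirs signature are hypothetical and are not taken from the HADOOP-4952 patches. The sketch also assumes that the protected method takes only absolute paths, while the old public method still resolves relative paths against a working directory.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of option 2 delegating to the old stack; not Hadoop code.
class DelegationSketch {

    /** Stand-in for the existing FileSystem class. */
    static class ToyLegacyFileSystem {
        protected final Set<String> dirs = new TreeSet<>();
        private final String workingDir = "/user/demo";

        /** Old public method: resolves relative paths against the working directory. */
        public boolean mkdirs(String path) {
            String abs = path.startsWith("/") ? path : workingDir + "/" + path;
            return primitiveMkdirs(abs);
        }

        /** Protected method (signature assumed): absolute paths only, no defaults. */
        protected boolean primitiveMkdirs(String absolutePath) {
            if (!absolutePath.startsWith("/")) {
                throw new IllegalArgumentException("not absolute: " + absolutePath);
            }
            return dirs.add(absolutePath);
        }
    }

    /** Stand-in for the new AFS API: no working directory, no config defaults. */
    static abstract class ToyAbstractFileSystem {
        abstract boolean mkdir(String absolutePath);
    }

    /** AFS implementation that simply forwards to the old FileSystem. */
    static class DelegatingToyAfs extends ToyAbstractFileSystem {
        private final ToyLegacyFileSystem legacy;

        DelegatingToyAfs(ToyLegacyFileSystem legacy) {
            this.legacy = legacy;
        }

        @Override
        boolean mkdir(String absolutePath) {
            // One-line delegation, possible because the protected method already exists.
            return legacy.primitiveMkdirs(absolutePath);
        }
    }

    public static void main(String[] args) {
        ToyLegacyFileSystem legacy = new ToyLegacyFileSystem();
        ToyAbstractFileSystem afs = new DelegatingToyAfs(legacy);

        afs.mkdir("/data/logs"); // new stack, absolute path only
        legacy.mkdirs("tmp");    // old stack, resolved against the working directory
        System.out.println(legacy.dirs);
    }
}
```

With the protected method in place, the delegating class needs no path resolution or default handling of its own.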

It had always been my goal that, as soon as FileContext was committed, I would 
complete this AFS jira and perhaps even switch from option 1 to option 2 midway 
if there was sufficient time.

So far I don't understand the objections to option 1 (and to rename2); 
protected methods seem reasonable in this situation. Is this a style issue? If 
the objections are minor, I feel it is better to give this AFS jira sufficient 
time for community discussion and go with option 1. If there are serious 
objections to option 1, then by all means let's put all the wood behind the 
option 2 arrow.

BTW Option 1 would have been completed by this Friday according to our original 
plan. Option 2 will not be completed by the freeze date on Friday but we have 
started work on it.


> New improved FileSystem interface for those implementing new files systems.
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-6223
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6223
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Sanjay Radia
>            Assignee: Sanjay Radia
>
> The FileContext API (HADOOP-4952) provides an improved interface for the 
> application writer.
> This lets us simplify the FileSystem API since it will no longer need to deal 
> with notions of default filesystem [ / ],  wd, and config
> defaults for blocksize, replication factor etc. Further it will not need the 
> many overloaded methods for create() and open() since
> the FileContext API provides that convenience.
> The FileSystem API can be simplified and can now be restricted to those 
> implementing new file systems.
> This jira proposes that we create new file system API,  and deprecate 
> FileSystem API after a few releases.

