[ 
https://issues.apache.org/jira/browse/HADOOP-7973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187777#comment-13187777
 ] 

Robert Joseph Evans commented on HADOOP-7973:
---------------------------------------------

{quote}Is FsShell a publicly supported API now?{quote}
FsShell is marked as @InterfaceAudience.Private on trunk, so no, it is not a 
publicly supported API.  However, it is used directly by Pig, and possibly 
others.  The use that we are referring to is an Oozie action like the following.

{code}
<action name="copy">
    <java>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <name>mapred.job.queue.name</name>
                <value>${queueName}</value>
            </property>
        </configuration>
        <main-class>org.apache.hadoop.fs.FsShell</main-class>
        <arg>-cp</arg>
        <arg>${from}</arg>
        <arg>${to}</arg>
    </java>
</action>
{code}

This is more or less the same as calling {code}hadoop fs -cp $from $to{code}.  It 
is done this way because Oozie does not support copy from the fs action, 
since Oozie does not want significant amounts of data flowing to or from the 
node Oozie is running on.  Yes, this is technically a violation of our interface 
visibility guidelines, but only very slightly, because it is trying to act very 
much like {code}hadoop fs{code}, which is a public interface.  I am OK with 
telling the customer to fix their usage of this long term, because this is not 
what they are supposed to do.  We have already told them this, but the practice 
is quite pervasive.

It worked before and it no longer works, simply because our internal 
code, FsShell, is ignoring the guideline that we tell everyone else to follow: 
don't call FileSystem.close.  That kind of reminds me of the scene from "The 
Emperor's New Groove" ["Why do we even have that 
lever"|http://www.youtube.com/watch?v=AGdFiA0A_c0].  If this API is not supposed 
to be called, then why has it not been deprecated and replaced with something 
that has cleaner semantics that users actually understand?
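
To make the failure mode concrete, here is a minimal sketch in plain Java.  It 
deliberately does not use the Hadoop classes: ToyFs and ToyCache are 
hypothetical stand-ins that only model the one behavior under discussion, 
namely that a cache hands the same instance to every caller, so a close by one 
caller is seen by all of them.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedCloseDemo {

    /** Hypothetical stand-in for a file system client; not a Hadoop class. */
    static class ToyFs {
        private boolean open = true;

        String read(String path) {
            if (!open) {
                throw new IllegalStateException("Filesystem closed");
            }
            return "contents of " + path;
        }

        void close() {
            open = false;
        }
    }

    /** Hypothetical cache: the same key always yields the same instance. */
    static class ToyCache {
        private final Map<String, ToyFs> cache = new HashMap<>();

        ToyFs get(String uri) {
            return cache.computeIfAbsent(uri, k -> new ToyFs());
        }
    }

    public static void main(String[] args) {
        ToyCache cache = new ToyCache();
        ToyFs usedByShell = cache.get("hdfs://nn"); // e.g. the FsShell-like caller
        ToyFs usedByJob = cache.get("hdfs://nn");   // e.g. the surrounding job

        usedByShell.close(); // one caller tidies up after itself

        try {
            usedByJob.read("/data");
        } catch (IllegalStateException e) {
            System.out.println("second caller failed: " + e.getMessage());
        }
    }
}
```

Both variables refer to the same object, so the second caller fails even though 
it never called close itself.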

{quote}I've seen this bite users as well but its more so cause they do not 
understand how to use the FS objects than anything else:{quote}
That suggests to me that there is something wrong with the API if people 
who use our main interface have to have a deep understanding of how FileSystem 
caching works and, what is more, that it can be disabled.  I believe that we may 
want to leave FileSystem.close in place but deprecate it, and provide a method 
that does the expected thing: close the FileSystem if it is not part of 
the cache, or do nothing if it is part of the cache.  At the same time, we would 
update FsShell to use this new API.
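
A rough sketch of the semantics I have in mind, again in plain Java rather than 
against the real Hadoop classes.  ToyFs, ToyCache, and the release method are 
hypothetical names invented for illustration; this is not an existing API and 
not the attached patch.

```java
import java.util.HashMap;
import java.util.Map;

public class CacheAwareCloseDemo {

    /** Hypothetical stand-in for a file system client; not a Hadoop class. */
    static class ToyFs {
        private boolean open = true;

        boolean isOpen() {
            return open;
        }

        void close() {
            open = false;
        }
    }

    /** Hypothetical cache that can also hand out private, uncached instances. */
    static class ToyCache {
        private final Map<String, ToyFs> cache = new HashMap<>();

        /** Shared instance: the same key always yields the same object. */
        ToyFs get(String uri) {
            return cache.computeIfAbsent(uri, k -> new ToyFs());
        }

        /** Private instance that is never placed in the cache. */
        ToyFs newInstance() {
            return new ToyFs();
        }

        /**
         * Hypothetical replacement for calling close directly: closes the
         * instance only if the cache does not own it, otherwise does nothing.
         */
        void release(ToyFs fs) {
            if (!cache.containsValue(fs)) {
                fs.close();
            }
        }
    }

    public static void main(String[] args) {
        ToyCache cache = new ToyCache();
        ToyFs shared = cache.get("hdfs://nn");
        ToyFs priv = cache.newInstance();

        cache.release(shared); // no-op: other callers may still hold it
        cache.release(priv);   // really closed: nobody else can hold it

        System.out.println("shared open: " + shared.isOpen());
        System.out.println("private open: " + priv.isOpen());
    }
}
```

With something along these lines, a caller such as FsShell could always release 
what it obtained without needing to know whether caching is enabled.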

I want to reiterate that I am not condoning the behavior that has exposed this 
issue.  But we have customers that are doing this, and I would really like to 
unblock them.  Especially if I can unblock them with a tiny change on our part 
instead of a massive change on their part.  Especially if doing so seems to fix 
an API that is causing problems. 
                
> DistributedFileSystem close has severe consequences
> ---------------------------------------------------
>
>                 Key: HADOOP-7973
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7973
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 1.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-7973.patch
>
>
> The way {{FileSystem#close}} works is very problematic.  Since the 
> {{FileSystems}} are cached, any {{close}} by any caller will cause problems 
> for every other reference to it.  Will add more detail in the comments.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
