On Thu, May 22, 2014 at 12:48 PM, Aaron Davidson <ilike...@gmail.com> wrote:

> In Spark 0.9.0 and 0.9.1, we stopped using the FileSystem cache correctly,
> and we just recently resumed using it in 1.0 (and in 0.9.2) when this issue
> was fixed: https://issues.apache.org/jira/browse/SPARK-1676
>

Interesting...


> Prior to this fix, each Spark task created and cached its own FileSystems
> due to a bug in how the FS cache handles UGIs. The big problem that arose
> was that these FileSystems were never closed, so they just kept piling up.
> We considered two solutions: (1) share the FS cache among all tasks, or
> (2) give each task its own FS cache, closing all of its FSes after the
> task completes.
>

Since the FS cache is in hadoop-common-project, it's not so much a bug in
HDFS as a bug in Hadoop.  So even if you're using, say, Lustre, you'll
still get the same issues with org.apache.hadoop.fs.FileSystem and its
global cache.
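To make that concrete, here's a quick sketch (the namenode URI is made up)
of how the global cache hands the same instance to everyone asking for the
same scheme/authority/UGI:

    import java.net.URI
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.FileSystem

    val conf = new Configuration()
    // The static cache inside org.apache.hadoop.fs.FileSystem is keyed by
    // (scheme, authority, UGI), regardless of which implementation backs
    // the scheme, so two get() calls with the same key return one object:
    val a = FileSystem.get(new URI("hdfs://nn.example.com:8020/"), conf)
    val b = FileSystem.get(new URI("hdfs://nn.example.com:8020/"), conf)
    assert(a eq b)  // same instance: a.close() would break users of b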

> We chose solution (1) for 3 reasons:
>  - It does not rely on the behavior of a bug in HDFS.
>  - It is the most performant option.
>  - It is most consistent with the semantics of the (albeit broken) FS
> cache.
>
> Since this behavior was changed in 1.0, it could be considered a
> regression. We should consider the exact behavior we want out of the FS
> cache. For Spark's purposes, it seems fine to cache FileSystems across
> tasks, as Spark does not close FileSystems. The issue that comes up is that
> user code which uses FileSystem.get() but then closes the FileSystem can
> screw up Spark processes which were using that FileSystem. The workaround
> for users would be to use FileSystem.newInstance() if they want full
> control over the lifecycle of their FileSystems.
>
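
To spell out that failure mode, a minimal sketch (the paths are invented):

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    val conf = new Configuration()

    // User code: FileSystem.get() hands back the cached instance that
    // Spark's own tasks are also using...
    val shared = FileSystem.get(conf)
    shared.close()  // ...so this close() poisons it for everyone, and
                    // Spark's next call fails with "Filesystem closed".

    // The workaround: newInstance() returns a private instance whose
    // lifecycle the caller fully owns.
    val mine = FileSystem.newInstance(conf)
    try {
      mine.exists(new Path("/user/me/output"))  // hypothetical path
    } finally {
      mine.close()  // safe: nothing else holds a reference to `mine`
    }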

The current solution seems reasonable, as long as Spark processes:
1. don't change the current working directory (doing so isn't thread-safe
and will affect all other users of that FS object; sketched below)
2. don't close the FileSystem object
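
For point 1, a sketch of why chdir on a shared FS is dangerous (the
paths are invented):

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    val fs = FileSystem.get(new Configuration())  // shared cached instance

    // One task changes the working directory...
    fs.setWorkingDirectory(new Path("/tmp/scratch"))

    // ...and every other task resolving a relative path through the same
    // cached instance is silently redirected:
    fs.exists(new Path("data/part-00000"))  // now /tmp/scratch/data/part-00000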

Another solution would be to essentially build your own FS cache on top of
newInstance.  I don't think it would be that much code.  This might be
nicer because you could implement things like closing FileSystem objects
that haven't been used in a while; a rough sketch follows.
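
Something like this, perhaps (the class name, keying, and 10-minute idle
timeout are all invented for illustration, not anything Spark ships):

    import java.net.URI
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.FileSystem
    import scala.collection.mutable

    class FsCache(conf: Configuration, idleMillis: Long = 10 * 60 * 1000L) {
      private case class Entry(fs: FileSystem, var lastUsed: Long)
      private val entries = mutable.HashMap.empty[String, Entry]

      def get(uri: URI): FileSystem = synchronized {
        val now = System.currentTimeMillis()

        // Close and evict FileSystems that have sat idle too long.
        val stale = entries.collect {
          case (key, e) if now - e.lastUsed > idleMillis => key
        }
        stale.foreach(key => entries.remove(key).foreach(_.fs.close()))

        val key = uri.getScheme + "://" + uri.getAuthority
        val entry = entries.getOrElseUpdate(key,
          // newInstance bypasses the global FileSystem cache entirely,
          // so these instances are ours alone to close.
          Entry(FileSystem.newInstance(uri, conf), now))
        entry.lastUsed = now
        entry.fs
      }
    }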

cheers,
Colin



> On Thu, May 22, 2014 at 12:06 PM, Colin McCabe <cmcc...@alumni.cmu.edu>
> wrote:
>
> > The FileSystem cache is something that has caused a lot of pain over the
> > years.  Unfortunately we (in Hadoop core) can't change the way it works
> > now because there are too many users depending on the current behavior.
> >
> > Basically, the idea is that when you request a FileSystem with certain
> > options with FileSystem#get, you might get a reference to an FS object
> > that already exists, from our FS cache singleton.  Unfortunately, this
> > also means that someone else can change the working directory on you or
> > close the FS underneath you.  The FS is basically shared mutable state,
> > and you don't know whom you're sharing with.
> >
> > It might be better for Spark to call FileSystem#newInstance, which
> > bypasses the FileSystem cache and always creates a new object.  If Spark
> > can hang on to the FS for a while, it can get the benefits of caching
> > without the downsides.  In HDFS, multiple FS instances can also share
> > things like the socket cache between them.
> >
> > best,
> > Colin
> >
> >
> > On Thu, May 22, 2014 at 10:06 AM, Marcelo Vanzin <van...@cloudera.com>
> > wrote:
> >
> > > Hi Kevin,
> > >
> > > On Thu, May 22, 2014 at 9:49 AM, Kevin Markey <kevin.mar...@oracle.com>
> > > wrote:
> > > > The FS closed exception only affects the cleanup of the staging
> > > > directory, not the final success or failure.  I've not yet tested the
> > > > effect of changing my application's initialization, use, or closing
> > > > of FileSystem.
> > >
> > > Without going and reading more of the Spark code, if your app is
> > > explicitly close()'ing the FileSystem instance, it may be causing the
> > > exception. If Spark is caching the FileSystem instance, your app is
> > > probably closing that same instance (which it got from the HDFS
> > > library's internal cache).
> > >
> > > It would be nice if you could test that theory; it might be worth
> > > knowing that's the case so that we can tell people not to do that.
> > >
> > > --
> > > Marcelo
> > >
> >
>
