The FileSystem cache is something that has caused a lot of pain over the
years.  Unfortunately we (in Hadoop core) can't change the way it works now
because there are too many users depending on the current behavior.

Basically, the idea is that when you request a FileSystem with certain
options via FileSystem#get, you might get a reference to an FS object that
already exists, from our FS cache singleton.  Unfortunately, this also
means that someone else can change the working directory on you or close
the FS underneath you.  The FS is basically shared mutable state, and you
don't know whom you're sharing it with.
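
Roughly, the hazard looks like this (just an untested sketch; the class
name and path are made up for illustration):

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SharedFsExample {
      public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();

        // Both calls go through the FileSystem cache, so for the same
        // URI, Configuration, and user they return the *same* object.
        FileSystem fs1 = FileSystem.get(conf);
        FileSystem fs2 = FileSystem.get(conf);
        System.out.println("same instance? " + (fs1 == fs2));  // true

        // Closing "your" handle closes the shared instance...
        fs2.close();

        // ...so any other code still holding fs1 now fails, e.g. with
        // "java.io.IOException: Filesystem closed" on HDFS.
        fs1.listStatus(new Path("/tmp"));
      }
    }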

It might be better for Spark to call FileSystem#newInstance, which bypasses
the FileSystem cache and always creates a new object.  If Spark can hang on
to the FS for a while, it can get the benefits of caching without the
downsides.  In HDFS, multiple FS instances can also share things like the
socket cache between them.
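
For comparison, a non-cached instance looks something like this (same
caveats as above, reusing the conf from the earlier sketch; how long to
keep the instance around is up to the application):

    // newInstance() bypasses the cache, so this FS is private to the
    // caller and nobody else can close it out from under you.
    FileSystem privateFs = FileSystem.newInstance(conf);
    try {
      privateFs.listStatus(new Path("/tmp"));
      // ... reuse privateFs for as long as it's convenient ...
    } finally {
      // Safe: this only closes our own instance, not a shared one.
      privateFs.close();
    }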

best,
Colin


On Thu, May 22, 2014 at 10:06 AM, Marcelo Vanzin <van...@cloudera.com> wrote:

> Hi Kevin,
>
> On Thu, May 22, 2014 at 9:49 AM, Kevin Markey <kevin.mar...@oracle.com>
> wrote:
> > The FS closed exception only affects the cleanup of the staging
> directory,
> > not the final success or failure.  I've not yet tested the effect of
> > changing my application's initialization, use, or closing of FileSystem.
>
> Without going and reading more of the Spark code, if your app is
> explicitly close()'ing the FileSystem instance, it may be causing the
> exception. If Spark is caching the FileSystem instance, your app is
> probably closing that same instance (which it got from the HDFS
> library's internal cache).
>
> It would be nice if you could test that theory; it would be worth
> knowing whether that's the case so that we can tell people not to do that.
>
> --
> Marcelo
>
