Re: [DISCUSS] - Committing client code to 3rd Party FileSystems within Hadoop Common

Harsh J Thu, 23 May 2013 11:38:15 -0700

I think we do a fairly good work maintaining a stable and public FileSystem
and FileContext API for third-party plugins to exist outside of Apache
Hadoop but still be able to work well across versions.

The question of test pops up though, specifically that of testing against
trunk to catch regressions across various implementations, but it'd be much
work for us to also maintain glusterfs dependencies and mechanisms as part
of trunk.

We do provide trunk build snapshot artifacts publicly for downstream
projects to test against, which I think may help cover the continuous
testing concerns, if there are those.

Right now, I don't think the S3 FS we maintain really works all that well.
I also recall, per recent conversations on the lists, that AMZN has started
shipping their own library for a better implementation rather than
perfecting the implementation we have here (correct me if am wrong but I
think the changes were not all contributed back). I see some work going on
for OpenStack's Swift, for which I think Steve also raised a similar
discussion here: http://search-hadoop.com/m/W1S5h2SrxlG, but I don't recall
if the conversation proceeded at the time.

What's your perspective as the releaser though? Would you not find
maintaining this outside easier, especially in terms of maintaining your
code for quicker releases, for both bug fixes and features - also given
that you can CI it against Apache Hadoop trunk at the same time?

On Thu, May 23, 2013 at 11:47 PM, Stephen Watt <sw...@redhat.com> wrote:

> (Resending - I think the first time I sent this out it got lost within all
> the ByLaws voting)
>
> Hi Folks
>
> My name is Steve Watt and I am presently working on enabling glusterfs to
> be used as a Hadoop FileSystem. Most of the work thus far has involved
> developing a Hadoop FileSystem plugin for glusterfs. I'm getting to the
> point where the plugin is becoming stable and I've been trying to
> understand where the right place is to host/manage/version it.
>
> Steve Loughran was kind enough to point out a few past threads in the
> community (such as
> http://lucene.472066.n3.nabble.com/Need-to-add-fs-shim-to-use-QFS-td4012118.html)
> that show a project disposition to move away from Hadoop Common containing
> client code (plugins) for 3rd party FileSystems. This makes sense and
> allows the filesystem plugin developer more autonomy as well as reduces
> Hadoop Common's dependence on 3rd Party libraries.
>
> Before I embark down that path, can the PMC/Committers verify that the
> preference is still to have client code for 3rd Party FileSystems hosted
> and managed outside of Hadoop Common?
>
> Regards
> Steve Watt
>

-- 
Harsh J

Re: [DISCUSS] - Committing client code to 3rd Party FileSystems within Hadoop Common

Reply via email to