RE: Re: Regarding HDFS and YARN support for S3

Naganarasimha G R (Naga) Sun, 28 Sep 2014 01:26:51 -0700

Hi Jay,
Thanks a lot for replying. That clarifies most of it, but some parts are still 
not clear.
Some clarifications from my side:
| When you say "HDFS does not support fs.AbstractFileSystem.s3.impl".... That 
is true.  If your file system is configured using HDFS, then s3 urls will not 
be used, ever.
:) I don't think I am making this basic mistake. We have configured 
"viewfs://nsX" for "fs.defaultFS", and one of the mounts is S3, i.e. 
"fs.viewfs.mounttable.nsX.link./uds" points to "s3://hadoop/test1/".
With this setup, running "./yarn jar" fails to even create a YARNRunner 
instance, as there is no mapping for "fs.AbstractFileSystem.s3.impl". And as 
per the code, even if "fs.defaultFS" is set to S3 it will not work, as there is 
no mapping for an S3 implementation of the AbstractFileSystem interface.
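
To make the setup above concrete, here is a minimal sketch of how a mount link 
maps a viewfs path onto its target. It is a simplified model, not Hadoop's 
ViewFs code: the class MountTableSketch and the method resolve are 
hypothetical names, and the configuration is a plain Map rather than a Hadoop 
Configuration. Only the property name and the two URIs come from this mail.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Simplified model of a viewfs mount table; NOT Hadoop's actual ViewFs code.
class MountTableSketch {

    static final String PREFIX = "fs.viewfs.mounttable.";

    // Maps a path inside mount table `table` to its target URI string.
    // Returns null when no link covers the path.
    static String resolve(Map<String, String> conf, String table, String path) {
        String linkPrefix = PREFIX + table + ".link.";
        String bestMount = "";
        String bestTarget = null;
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (!e.getKey().startsWith(linkPrefix)) {
                continue;
            }
            String mount = e.getKey().substring(linkPrefix.length());
            boolean covers = path.equals(mount) || path.startsWith(mount + "/");
            // Prefer the longest mount point that covers the path.
            if (covers && mount.length() > bestMount.length()) {
                bestMount = mount;
                bestTarget = e.getValue();
            }
        }
        if (bestTarget == null) {
            return null;
        }
        // Drop a trailing slash on the target so the join has no "//".
        String target = bestTarget.endsWith("/")
                ? bestTarget.substring(0, bestTarget.length() - 1)
                : bestTarget;
        return target + path.substring(bestMount.length());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("fs.viewfs.mounttable.nsX.link./uds", "s3://hadoop/test1/");

        // "/uds/job/input" is a made-up path used only for illustration.
        String resolved = resolve(conf, "nsX", "/uds/job/input");
        System.out.println(resolved);                          // s3://hadoop/test1/job/input
        System.out.println(URI.create(resolved).getScheme());  // s3
    }
}
```

The resolved target has the scheme "s3", which is why the lookup then needs a 
binding for that scheme ("fs.AbstractFileSystem.s3.impl" in our case).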


These are my further queries:

  1.  What is the purpose of the AbstractFileSystem and FileSystem interfaces?
  2.  Does the default HDFS package (code) support configuring S3? I see an S3 
implementation of the FileSystem interface (org.apache.hadoop.fs.s3.S3FileSystem) 
but none for AbstractFileSystem, so I presume it doesn't support S3 
completely. What is the reason for not supporting both?
  3.  If I need to support Amazon S3, do I need to extend and implement 
AbstractFileSystem and configure "fs.AbstractFileSystem.s3.impl", or is there 
something more I need to take care of?

Regards,

Naga



Huawei Technologies Co., Ltd.
Mobile:  +91 9980040283
Email: naganarasimh...@huawei.com
Huawei Technologies Co., Ltd.
Bantian, Longgang District,Shenzhen 518129, P.R.China
http://www.huawei.com


________________________________
From: jay vyas [jayunit100.apa...@gmail.com]
Sent: Saturday, September 27, 2014 02:41
To: common-u...@hadoop.apache.org
Subject: Re:

See https://wiki.apache.org/hadoop/HCFS/

Yes, YARN is written to the FileSystem interface.  It works on S3FileSystem, 
GlusterFileSystem, and any other HCFS.

We have run, and continue to run, the many tests in Apache Bigtop's test suite 
against our Hadoop clusters running on alternative file system implementations,
and they work.

When you say "HDFS does not support fs.AbstractFileSystem.s3.impl".... That is 
true.  If your file system is configured using HDFS, then s3 urls will not be 
used, ever.

When you create a FileSystem object in Hadoop, it reads the URI (e.g. 
"glusterfs:///") and then finds the file system binding in your core-site.xml 
(e.g. fs.AbstractFileSystem.glusterfs.impl).

So the URI must have a corresponding entry in the core-site.xml.
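
As a rough illustration of that lookup, here is a minimal sketch. It is a 
simplified model and not Hadoop's real code: the class SchemeBindingSketch, 
its methods, and the class name "example.GlusterAbstractFs" are hypothetical, 
and a plain Map stands in for the settings loaded from core-site.xml. Only the 
property name pattern and the two URIs come from this thread.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Simplified model of the scheme-to-binding lookup; NOT Hadoop's real code.
class SchemeBindingSketch {

    // Builds the property name for a scheme, following the pattern quoted above.
    static String abstractFsKey(String scheme) {
        return "fs.AbstractFileSystem." + scheme + ".impl";
    }

    // Returns the class name configured for the URI's scheme,
    // or empty when the configuration has no binding for it.
    static Optional<String> lookupImpl(Map<String, String> conf, URI uri) {
        String scheme = uri.getScheme();
        if (scheme == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(conf.get(abstractFsKey(scheme)));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Hypothetical class name, used only for illustration.
        conf.put(abstractFsKey("glusterfs"), "example.GlusterAbstractFs");

        // A scheme with a binding resolves; one without a binding does not.
        System.out.println(lookupImpl(conf, URI.create("glusterfs:///")).isPresent());      // true
        System.out.println(lookupImpl(conf, URI.create("s3://hadoop/test1/")).isPresent()); // false
    }
}
```

In this model, the missing entry for "s3" corresponds to the case where no 
"fs.AbstractFileSystem.s3.impl" property is configured.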

As a reference implementation, you can see 
https://github.com/gluster/glusterfs-hadoop/blob/master/conf/core-site.xml




On Fri, Sep 26, 2014 at 10:10 AM, Naganarasimha G R (Naga) 
<garlanaganarasi...@huawei.com> wrote:
Hi All,

I have the following doubts about the pluggable FileSystem and YARN:
1. If all the implementations should extend FileSystem, then why is there a 
parallel class, AbstractFileSystem, which ViewFS extends?
2. Is YARN supposed to run on any pluggable 
org.apache.hadoop.fs.FileSystem, such as S3?
If it is supposed to, then when a job is submitted, YARNRunner on the client 
side calls FileContext.getFileContext(this.conf),
which in turn calls FileContext.getAbstractFileSystem(), which throws an 
exception for S3.
So I am not able to run a YARN job on ViewFS with S3 as a mount. And based on 
the code, even if I configure only S3, it is still going to fail.
3. Does HDFS not support "fs.AbstractFileSystem.s3.impl" with some default 
class, similar to org.apache.hadoop.fs.s3.S3FileSystem?


Regards,

Naga



Huawei Technologies Co., Ltd.
Mobile:  +91 9980040283
Email: naganarasimh...@huawei.com
Huawei Technologies Co., Ltd.
http://www.huawei.com




--
jay vyas
