[ 
https://issues.apache.org/jira/browse/HADOOP-12756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131602#comment-15131602
 ] 

shimingfei commented on HADOOP-12756:
-------------------------------------

Thanks for your detailed comments, Steve.
OSS is very similar to S3, so the testing will be similar. We already have an 
implementation, and it works well for our use cases and micro-benchmarks (sort 
and terasort) on both Hadoop and Spark.

You are right that the work can be packaged as an independent jar that users' 
applications load as an external library. But we think it is better to 
integrate it into Hadoop, as a module under Hadoop tools, for easier 
maintenance and ease of use.

> Incorporate Aliyun OSS file system implementation
> -------------------------------------------------
>
>                 Key: HADOOP-12756
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12756
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs
>            Reporter: shimingfei
>            Assignee: shimingfei
>         Attachments: OSS integration.pdf
>
>
> Aliyun OSS is widely used among China’s cloud users, but currently it is not 
> easy to access data stored on OSS from a user’s Hadoop/Spark application, 
> because Hadoop has no native support for OSS.
> This work aims to integrate Aliyun OSS with Hadoop. With simple configuration, 
> Spark/Hadoop applications can read data from and write data to OSS without 
> any code change, narrowing the gap between users’ applications and data 
> storage, as has been done for S3 in Hadoop.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
