[ 
https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350369#comment-14350369
 ] 

Sean Busbey commented on HDFS-6200:
-----------------------------------

The dependencies you bring with you are an integral part of the interface you 
define for downstream clients. While I agree that it can be a separate subtask, 
it has to be considered as part of how you structure the overall approach.

{quote}
Unfortunately the dependency is a real one – the webhdfs server on DN uses 
DFSClient to read data from HDFS.
{quote}

Our own internal use of client interfaces isn't the same thing as use by 
downstream applications. For one, we don't have to worry about what dependencies we 
bring with us in the internal case because by definition we're in control of 
both the client interface and the place it's being used.

In the approach I'm suggesting, the original code for the client would still 
live in hadoop-hdfs, so the webhdfs server would be free to use DFSClient. 
If that is unappealing for some reason, perhaps we should structure things with 
an internal client artifact, e.g.:
{noformat}
    hadoop-hdfs -- depends on --> hadoop-hdfs-client-internal
    hadoop-hdfs-client -- depends on --> hadoop-hdfs-client-internal
{noformat}
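
In Maven terms, that layout might look roughly like the fragment below, 
declared in both hadoop-hdfs/pom.xml and hadoop-hdfs-client/pom.xml. This is 
only a sketch: hadoop-hdfs-client-internal is a hypothetical artifact name 
that does not exist today, and using the project version is an assumption.
{noformat}
    <!-- hypothetical artifact; the name is illustrative only -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-hdfs-client-internal</artifactId>
      <!-- assumption: released in lockstep with the rest of the build -->
      <version>${project.version}</version>
    </dependency>
{noformat}
Downstream applications would then declare only hadoop-hdfs-client, and the 
set of transitive dependencies that artifact exposes could be managed 
separately from what the server needs.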

> Create a separate jar for hdfs-client
> -------------------------------------
>
>                 Key: HDFS-6200
>                 URL: https://issues.apache.org/jira/browse/HDFS-6200
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Haohui Mai
>            Assignee: Haohui Mai
>         Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, 
> HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, 
> HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch
>
>
> Currently the hadoop-hdfs jar contains both the hdfs server and the hdfs 
> client. As discussed in the hdfs-dev mailing list 
> (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser),
> downstream projects are forced to bring in additional dependencies in order to 
> access hdfs. These additional dependencies can sometimes be difficult to manage 
> for projects like Apache Falcon and Apache Oozie.
> This jira proposes to create a new project, hadoop-hdfs-client, which 
> contains the client side of the hdfs code. Downstream projects can use this 
> jar instead of hadoop-hdfs to avoid unnecessary dependencies.
> Note that it does not break the compatibility of downstream projects. This is 
> because old downstream projects implicitly depend on hadoop-hdfs-client 
> through the hadoop-hdfs jar.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
