[ 
https://issues.apache.org/jira/browse/HADOOP-13016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235234#comment-15235234
 ] 

Allen Wittenauer commented on HADOOP-13016:
-------------------------------------------


a) We should fix hadoop-hdfs-client so that it actually is the lean client, 
rather than building Yet Another Client jar.  Otherwise users are left asking 
"Do I use hdfs-client or hdfs-lean-client?"

b) Adding aws is a huge can of worms and I'm very much against it:

  1. It's a slippery slope of file system support; everyone and their dog is 
going to want their custom fs in it.  Either they all get it or none do.
  2. It means moving aws back into the main classpath again, which means also 
getting extra dependencies that not all clusters want or need.
  3. You can't have a lean client with "minimal" components AND have aws 
support.  It's completely contradictory as to purpose.

AWS shouldn't have been moved like it was in branch-2, but the damage is done.  
It wasn't the first massively incompatible change in branch-2 and won't be the 
last given the track record.

> reinstate hadoop-hdfs as dependency of hadoop-client, create 
> hadoop-lean-client for minimal deployments
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-13016
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13016
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: build
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>
> The split of hadoop-hdfs and hadoop-hdfs-client is breaking code of mine 
> whose builds declared a dependency on hadoop-client and expected all of HDFS 
> to make it in.
> I'm finding this first, because I'm building and testing downstream code 
> against branch-2; I find myself having to explicitly declare a dependency on 
> hadoop-hdfs to make things work again.
> We've also seen problems downstream (e.g. Spark) where the move of s3n 
> classes to hadoop-aws has broken code which expects it to be there.
> At the same time, I see the merits in a lean, low-dependency client, which 
> hadoop-client and its dependencies are not today.
> I propose
> # reinstate hadoop-hdfs as dependency of hadoop-client
> # add hadoop-aws as a dependency of hadoop-client, but without pulling in any 
> amazon-aws JARs.
> # create hadoop-lean-client for minimal deployments, stripping out all 
> extraneous dependencies.
> # for hadoop-lean-client, have a compatibility statement of "we will strip 
> out anything we can from this, even over point releases". That is, anything 
> that can be dropped in future, will.
> This will give downstream projects a choice: the old POM with everything, the 
> lean POM for new apps.
> And, by reinstating hadoop-hdfs, things will build again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
