[ 
https://issues.apache.org/jira/browse/HADOOP-15739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16612151#comment-16612151
 ] 

Steve Loughran commented on HADOOP-15739:
-----------------------------------------

mvn dependency analysis isn't 100% reliable, as it doesn't notice transitive use 
or use through reflection.

I prefer to dump the tree to a file and review it by hand, which is what I see 
you doing too. The resulting file diffs nicely.
{code}
mvn dependency:tree -Dverbose > target/dependencies.txt
{code}
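
As a rough sketch of the "diffs nicely" part: dump the tree before and after a 
pom change, then compare the two files. The helper names and file names here are 
made up for illustration; only the mvn invocation is the one above.

```shell
#!/bin/sh
# Sketch only: helper names and file paths are placeholders.

# Dump the verbose dependency tree of the current module to the given file.
dump_tree() {
  mkdir -p "$(dirname "$1")"
  mvn dependency:tree -Dverbose > "$1"
}

# Print a unified diff of two dumps.
# diff exits 1 when the files differ, which is the expected case here,
# so that exit code is masked rather than treated as a failure.
compare_trees() {
  diff -u "$1" "$2" || true
}

# Usage, run from the module directory:
#   dump_tree target/dependencies-before.txt
#   ...edit pom.xml...
#   dump_tree target/dependencies-after.txt
#   compare_trees target/dependencies-before.txt target/dependencies-after.txt
```

Note that anything under target/ goes away on mvn clean, so copy the "before" 
dump elsewhere if you need to keep it.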

# Given junit is pulling in hamcrest-core, don't worry about it. The Hamcrest 
library gives you more assertions (such as on the state of collections); it's 
worth knowing what it offers and picking it up if you ever need them.
# jax-b & jsr311: implies something thinks it's doing XML & REST calls. If 
nothing is doing that, fine.
# jackson-databind is part of the jackson set, which does need careful 
management, so if it is being used, yes, make it explicit.

Regarding hadoop-cloud-storage/pom.xml, its goal is to be the "lean" edition 
of all the cloud store artifacts, one which cuts out hadoop-common and expects 
apps to have imported it elsewhere (and excluded the stuff they don't want).
See [https://github.com/apache/spark/blob/master/hadoop-cloud/pom.xml#L219] 
for an example, noting that we still have to purge stuff there.

The goal I have is that
* if you depend on hadoop-azure directly, you get everything needed for wasb 
and abfs to work, including hadoop-common, hadoop-auth and other dependencies.
* if you depend on hadoop-cloud-storage, you get everything needed for wasb and 
abfs to work, *excluding hadoop-common/hadoop-auth*, and excluding any other 
cruft which has got in.
* someone, some time, will produce a hadoop-cloud-storage-shaded which also 
shades the store stuff, depends on the other shaded hadoop artifacts, and 
selectively uses shaded external JARs. (Not the AWS SDK, even though it's 
brittle, as that's already a 50MB shaded artifact. For the azure SDKs it'd 
depend on how big they were and how traumatic updates proved to be.)

For now, regarding those maven warnings, I'd recommend working out where those 
dependencies are being used, and making decisions about provided/compile/test 
scope based on that.
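
One rough way to work that out is to grep the sources for each dependency's 
packages. A sketch, assuming the standard src/main/java and src/test/java 
layout; the suggest_scope helper is hypothetical. It only sees direct imports, 
so like the mvn analysis it misses reflection, and it can't tell you when 
provided is the right answer rather than compile.

```shell
#!/bin/sh
# Hypothetical helper: guess a scope for a dependency from where its
# package prefix is imported. Only direct (and static) imports are detected.
#   $1 = module directory
#   $2 = package prefix, e.g. com.fasterxml.jackson.databind
suggest_scope() {
  module="$1"
  pkg="$2"
  if grep -rqF -e "import $pkg" -e "import static $pkg" \
       "$module/src/main/java" 2>/dev/null; then
    echo "compile"   # production code imports it (could still be 'provided')
  elif grep -rqF -e "import $pkg" -e "import static $pkg" \
       "$module/src/test/java" 2>/dev/null; then
    echo "test"      # only test code imports it
  else
    echo "unused"    # no direct import found; check reflection/transitive use
  fi
}

# Usage, from the module directory:
#   suggest_scope . com.fasterxml.jackson.databind
#   suggest_scope . org.hamcrest
```

Treat an "unused" answer as a prompt to look harder, not as proof that the 
dependency can go.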
 

> ABFS: remove unused maven dependencies and add used undeclared dependencies
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-15739
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15739
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: build, fs/azure
>    Affects Versions: HADOOP-15407
>            Reporter: Da Zhou
>            Assignee: Da Zhou
>            Priority: Major
>         Attachments: HADOOP-15739-HADOOP-15407-001.patch
>
>




