Josh Rosen created SPARK-20573:
----------------------------------

             Summary: --packages fails when transitive dependency can only be 
resolved from repository specified in POM's <repositories> tag
                 Key: SPARK-20573
                 URL: https://issues.apache.org/jira/browse/SPARK-20573
             Project: Spark
          Issue Type: Bug
          Components: Spark Submit
    Affects Versions: 2.1.0, 2.0.0
            Reporter: Josh Rosen


With a clean Ivy cache, run the following command:

{code}
./bin/spark-shell --packages com.twitter.elephantbird:elephant-bird-core:4.4
{code}

This will fail with {{unresolved dependency: 
com.hadoop.gplcompression#hadoop-lzo;0.4.16: not found}}.

If you look at the elephant-bird-core POM (at 
http://central.maven.org/maven2/com/twitter/elephantbird/elephant-bird-core/4.4/elephant-bird-core-4.4.pom)
you'll see a direct dependency on hadoop-lzo. This library is only present in 
Twitter's public Maven repository, hosted at http://maven.twttr.com. The 
elephant-bird-core POM does not directly declare Twitter's external repository. 
Instead, that external repository is inherited from elephant-bird-core's parent 
POM (at 
http://central.maven.org/maven2/com/twitter/elephantbird/elephant-bird/4.4/elephant-bird-4.4.pom).
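
One way to check which POM actually declares a repository is to list the URLs inside each POM's <repositories> block. The helper below is a rough sketch, not part of Spark or Ivy: the function name pom_repo_urls is hypothetical, it matches line by line rather than parsing XML, and it assumes each tag sits on its own line in a locally downloaded POM file.

```shell
# Hypothetical helper: print every <url> found inside the <repositories>
# block of the POM file given as $1.
# Line-oriented sketch, not an XML parser; assumes one tag per line.
pom_repo_urls() {
  # First sed keeps only the lines between <repositories> and </repositories>;
  # second sed extracts the text between <url> and </url> on those lines.
  sed -n '/<repositories>/,/<\/repositories>/p' "$1" |
    sed -n 's:.*<url>\(.*\)</url>.*:\1:p'
}

# Usage, after saving the two POMs linked above (local file names are assumptions):
#   pom_repo_urls elephant-bird-core-4.4.pom
#   pom_repo_urls elephant-bird-4.4.pom
```

Running it against both the child and the parent POM should make it visible where the repository declaration lives, without reading the XML by hand.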

From the Ivy output, it looks like Ivy didn't even attempt to resolve from 
the Twitter repo:

{code}
:: problems summary ::
:::: WARNINGS
                module not found: com.hadoop.gplcompression#hadoop-lzo;0.4.16

        ==== local-m2-cache: tried

          file:/Users/joshrosen/.m2/repository/com/hadoop/gplcompression/hadoop-lzo/0.4.16/hadoop-lzo-0.4.16.pom

          -- artifact com.hadoop.gplcompression#hadoop-lzo;0.4.16!hadoop-lzo.jar:

          file:/Users/joshrosen/.m2/repository/com/hadoop/gplcompression/hadoop-lzo/0.4.16/hadoop-lzo-0.4.16.jar

        ==== local-ivy-cache: tried

          /Users/joshrosen/.ivy2/local/com.hadoop.gplcompression/hadoop-lzo/0.4.16/ivys/ivy.xml

          -- artifact com.hadoop.gplcompression#hadoop-lzo;0.4.16!hadoop-lzo.jar:

          /Users/joshrosen/.ivy2/local/com.hadoop.gplcompression/hadoop-lzo/0.4.16/jars/hadoop-lzo.jar

        ==== central: tried

          https://repo1.maven.org/maven2/com/hadoop/gplcompression/hadoop-lzo/0.4.16/hadoop-lzo-0.4.16.pom

          -- artifact com.hadoop.gplcompression#hadoop-lzo;0.4.16!hadoop-lzo.jar:

          https://repo1.maven.org/maven2/com/hadoop/gplcompression/hadoop-lzo/0.4.16/hadoop-lzo-0.4.16.jar

        ==== spark-packages: tried

          http://dl.bintray.com/spark-packages/maven/com/hadoop/gplcompression/hadoop-lzo/0.4.16/hadoop-lzo-0.4.16.pom

          -- artifact com.hadoop.gplcompression#hadoop-lzo;0.4.16!hadoop-lzo.jar:

          http://dl.bintray.com/spark-packages/maven/com/hadoop/gplcompression/hadoop-lzo/0.4.16/hadoop-lzo-0.4.16.jar

                ::::::::::::::::::::::::::::::::::::::::::::::

                ::          UNRESOLVED DEPENDENCIES         ::

                ::::::::::::::::::::::::::::::::::::::::::::::

                :: com.hadoop.gplcompression#hadoop-lzo;0.4.16: not found

                ::::::::::::::::::::::::::::::::::::::::::::::
{code}

If you manually specify the Twitter repository as an additional external 
repository then everything works fine.
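
For illustration, the workaround might look like the sketch below. The report does not say how the extra repository was supplied, so the use of spark-submit's --repositories option (a comma-separated list of additional remote repositories searched for --packages coordinates) is an assumption, as are the variable names. The snippet only prints the command so it can be reviewed before running.

```shell
# Coordinate taken from the failing command in the report.
PACKAGE="com.twitter.elephantbird:elephant-bird-core:4.4"
# Repository that hosts hadoop-lzo; per the report it is declared only in the parent POM.
EXTRA_REPO="http://maven.twttr.com"

# Assumption: --repositories is the mechanism used to add the extra repository.
WORKAROUND="./bin/spark-shell --packages $PACKAGE --repositories $EXTRA_REPO"

# Print the command for review; from the Spark root, run it with: eval "$WORKAROUND"
echo "$WORKAROUND"
```

Naming the repository explicitly avoids relying on Ivy picking it up from the parent POM, which is the step that appears to be skipped.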

This is a somewhat frustrating behavior from an end-user's point of view 
because unless they dig through the POMs themselves it is not obvious why 
things are broken or how to fix them. When Maven resolves this coordinate it 
properly fetches the transitive dependencies from the additional repositories 
specified in the referencing POMs. My hunch is that this behavior is caused by 
either a bug in Ivy itself or a bug in Spark's usage / configuration of the 
embedded Ivy resolver.

It would be great to see if we can find other test cases to narrow down the 
scope of the bug. I'm wondering whether POM-specified repositories will work if 
they're specified in the POM of the top-level dependency being resolved. It 
would also be useful to determine whether Ivy handles additional repositories 
in the top-level of transitive dependencies' POMs: maybe the problem is the 
specific combination of transitive dep + repository inherited from that dep's 
parent POM.


