Hey Nathan,

I like the first idea better. Let's see what others think. I'd be happy to
review your PR afterwards!

Best,
Burak

On Thu, Jun 18, 2015 at 9:53 PM, Nathan McCarthy <
nathan.mccar...@quantium.com.au> wrote:

>  Hey,
>
>  Spark Submit adds Maven Central & the Spark Bintray repo to the
> ChainResolver before it adds any external resolvers.
> https://github.com/apache/spark/blob/branch-1.4/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L821
>
>
>  When running on a cluster without internet access, this means the spark
> shell takes forever to launch as it tries these two remote repos before the
> ones specified in the --repositories list. In our case we have a proxy
> which the cluster can access, and we supply it via --repositories. This is
> also a problem for users who maintain a proxy for Maven/Ivy repos with
> something like Nexus/Artifactory.
>
>  I see two options for a fix:
>
>    - Change the order repos are added to the ChainResolver, making the
>    --repositories supplied repos come before anything else.
>    
> https://github.com/apache/spark/blob/branch-1.4/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L843
>
>    - Have a config option (like spark.jars.ivy.useDefaultRemoteRepos,
>    default true) which, when false, won't add Maven Central & Bintray to
>    the ChainResolver.
>
> Happy to do a PR now for this if someone can give me a recommendation on
> which option would be better.
>
>  JIRA here: https://issues.apache.org/jira/browse/SPARK-8475
>
>  Cheers,
> Nathan
>
>
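
To make the two options above concrete, here is a rough sketch in plain
Scala. It only models the lookup order as a list of repo names; it is not
the SparkSubmit code and does not use Ivy. ResolverOrder, userFirst,
withFlag and the repo labels are made up for illustration, and the boolean
mirrors the config name proposed in the quoted mail.

```scala
// Hypothetical model of the resolver chain; names are illustrative only.
object ResolverOrder {
  // Stand-ins for the two default remote repos mentioned in the thread.
  val defaultRepos: Seq[String] = Seq("central", "spark-bintray")

  /** Option 1: repos passed via --repositories are tried before the defaults. */
  def userFirst(userRepos: Seq[String]): Seq[String] =
    userRepos ++ defaultRepos

  /**
   * Option 2: keep the current order (defaults first), but let a flag
   * modelled on the proposed spark.jars.ivy.useDefaultRemoteRepos
   * drop the defaults entirely when it is false.
   */
  def withFlag(userRepos: Seq[String],
               useDefaultRemoteRepos: Boolean = true): Seq[String] =
    if (useDefaultRemoteRepos) defaultRepos ++ userRepos else userRepos
}
```

With a single proxy repo, userFirst(Seq("proxy")) puts "proxy" at the head
of the chain but still falls through to the two remote repos if an artifact
is missing, while withFlag(Seq("proxy"), useDefaultRemoteRepos = false)
leaves only "proxy", so a cluster without internet access never tries the
remote repos at all.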
