[jira] [Commented] (SPARK-11085) Add support for HTTP proxy

2016-06-02 Thread Ion Alberdi (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-11085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15311906#comment-15311906
 ] 

Ion Alberdi commented on SPARK-11085:
-

Hello all,
I reproduced the error using the Docker setup from
https://github.com/Yannael/kafka-sparkstreaming-cassandra.
Here is what I tried in order to pass the -Dhttp.proxyHost and
-Dhttp.proxyPort parameters:

- the javaopts workaround mentioned above

- setting spark.executor.extraJavaOptions. The launched command becomes
% java org.apache.spark.deploy.SparkSubmit ... --conf 
spark.driver.extraJavaOptions=-Dhttp.proxyHost= 
-Dhttp.proxyPort=
I wonder whether the shell is able to parse that line and thus pass both 
parameters (-Dhttp.proxyHost= and -Dhttp.proxyPort=) to 
org.apache.spark.deploy.SparkSubmit

- setting "--driver-java-options", which results in
% java org.apache.spark.deploy.SparkSubmit ... -Dhttp.proxyHost= 
-Dhttp.proxyPort=
Even though the shell seems more likely to parse the two options in this 
form, the packages are not downloaded because the HTTP requests do not go 
through the proxy.
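
One way to check whether both options actually reached the JVM is to print the proxy-related system properties from inside it. The sketch below is only illustrative: the class name ProxyProps and the helper proxyProps are made up for this example and are not part of Spark; the four property names are the standard JVM networking properties.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ProxyProps {
    // Standard JVM networking properties for HTTP and HTTPS proxies.
    static final String[] KEYS = {
        "http.proxyHost", "http.proxyPort",
        "https.proxyHost", "https.proxyPort"
    };

    /** Collects the proxy properties visible to this JVM; unset ones map to "<unset>". */
    static Map<String, String> proxyProps() {
        Map<String, String> props = new LinkedHashMap<>();
        for (String key : KEYS) {
            props.put(key, System.getProperty(key, "<unset>"));
        }
        return props;
    }

    public static void main(String[] args) {
        // Print one "name = value" line per property, in the order of KEYS.
        proxyProps().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Run with, for example, java -Dhttp.proxyHost=proxy.example -Dhttp.proxyPort=8080 ProxyProps (proxy.example is a placeholder). If an option was lost or merged into another argument on the way to the JVM, the corresponding line should show "<unset>" or an unexpected value.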





> Add support for HTTP proxy 
> ---
>
> Key: SPARK-11085
> URL: https://issues.apache.org/jira/browse/SPARK-11085
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Shell, Spark Submit
>Reporter: Dustin Cote
>Priority: Minor
>
> Add a way to update ivysettings.xml for the spark-shell and spark-submit to 
> support proxy settings for clusters that need to access a remote repository 
> through an http proxy.  Typically this would be done like:
> JAVA_OPTS="$JAVA_OPTS -Dhttp.proxyHost=proxy.host -Dhttp.proxyPort=8080 
> -Dhttps.proxyHost=proxy.host.secure -Dhttps.proxyPort=8080"
> Directly in the ivysettings.xml would look like:
>  
>  proxyport="8080" 
> nonproxyhosts="nonproxy.host"/> 
>  
> Even better would be a way to customize the ivysettings.xml with command 
> options.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-11085) Add support for HTTP proxy

2016-06-02 Thread Ion Alberdi (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-11085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15312244#comment-15312244
 ] 

Ion Alberdi commented on SPARK-11085:
-

More precisely, it seems that when a URL like
http://dl.bintray.com/spark-packages/maven/com/datastax/spark/spark-cassandra-connector_2.11/1.6.0-M2/spark-cassandra-connector_2.11-1.6.0-M2.pom

is tried, the request goes through the proxy.

However, for a Maven-compatible link like

https://repo1.maven.org/maven2/com/datastax/spark/spark-cassandra-connector_2.11/1.6.0-M2/spark-cassandra-connector_2.11-1.6.0-M2.jar

the proxy is not taken into account.
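
A possible explanation, not confirmed in this thread, is that the JVM's default proxy selector chooses a proxy per URL scheme: http.proxyHost and http.proxyPort apply to http:// URLs, while https:// URLs are governed by https.proxyHost and https.proxyPort (the issue description sets both pairs). The sketch below asks java.net.ProxySelector which proxy it would pick for each kind of URL when only the http.* pair is set. It assumes the download relies on java.net's default proxy selection; proxy.example:8080 and the class name ProxySchemeCheck are placeholders.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

public class ProxySchemeCheck {
    /** Returns the proxies the JVM's default selector would use for the given URL. */
    static List<Proxy> proxiesFor(String url) {
        return ProxySelector.getDefault().select(URI.create(url));
    }

    public static void main(String[] args) {
        // Hypothetical proxy; only the http.* pair is set, as in the commands above.
        System.setProperty("http.proxyHost", "proxy.example");
        System.setProperty("http.proxyPort", "8080");

        // No connection is opened here; the selector only reports its choice.
        System.out.println("http  -> "
            + proxiesFor("http://dl.bintray.com/spark-packages/maven/"));
        System.out.println("https -> "
            + proxiesFor("https://repo1.maven.org/maven2/"));
    }
}
```

With only the http.* properties set, the first line should report an HTTP proxy and the second a direct connection, which would match the behaviour described above. Adding -Dhttps.proxyHost and -Dhttps.proxyPort is the obvious thing to try next.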






[jira] [Commented] (SPARK-11085) Add support for HTTP proxy

2016-06-02 Thread Ion Alberdi (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-11085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15312367#comment-15312367
 ] 

Ion Alberdi commented on SPARK-11085:
-

To reproduce, on a network that needs an http_proxy to reach 
http://dl.bintray.com and https://repo1.maven.org:

% spark-shell --packages org.apache.spark:spark-streaming-kafka_2.11:1.6.1,com.datastax.spark:spark-cassandra-connector_2.11:1.6.1-M2 --driver-java-options "-Dhttp.proxyHost= -Dhttp.proxyPort="
...

 spark-packages: tried

  
http://dl.bintray.com/spark-packages/maven/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.pom

  -- artifact 
org.apache.spark#spark-streaming-kafka_2.11;1.6.1!spark-streaming-kafka_2.11.jar:

  
http://dl.bintray.com/spark-packages/maven/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.jar

module not found: 
com.datastax.spark#spark-cassandra-connector_2.11;1.6.1-M2

Indeed,
http://dl.bintray.com/spark-packages/maven/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.pom
does not exist.

However, the following error

ERRORS
Server access error at url 
https://repo1.maven.org/maven2/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.pom
 (java.net.ConnectException: Connection timed out)

is due to the proxy configuration not being taken into account, as
https://repo1.maven.org/maven2/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.pom
exists.

The difference between the two resolvers can be seen at
https://github.com/apache/spark/blob/0a3026990bd0cbad53f0001da793349201104958/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L904
One has its root set and the other does not; the latter apparently gets its 
URL from
https://github.com/apache/ant-ivy/blob/master/src/java/org/apache/ivy/plugins/resolver/IBiblioResolver.java#L71

I'm currently trying to figure out why the proxy is not taken into account 
when using an IBiblioResolver that does not have its root set.









[jira] [Comment Edited] (SPARK-11085) Add support for HTTP proxy

2016-06-02 Thread Ion Alberdi (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-11085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15312367#comment-15312367
 ] 

Ion Alberdi edited comment on SPARK-11085 at 6/2/16 2:16 PM:
-

To reproduce, on a network that needs an http_proxy to reach 
http://dl.bintray.com and https://repo1.maven.org:

% spark-shell --packages org.apache.spark:spark-streaming-kafka_2.11:1.6.1,com.datastax.spark:spark-cassandra-connector_2.11:1.6.1-M2 --driver-java-options "-Dhttp.proxyHost= -Dhttp.proxyPort="
...

 spark-packages: tried

  
http://dl.bintray.com/spark-packages/maven/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.pom

  -- artifact 
org.apache.spark#spark-streaming-kafka_2.11;1.6.1!spark-streaming-kafka_2.11.jar:

  
http://dl.bintray.com/spark-packages/maven/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.jar

module not found: 
com.datastax.spark#spark-cassandra-connector_2.11;1.6.1-M2

Indeed,
http://dl.bintray.com/spark-packages/maven/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.pom
does not exist.

However, the following error

ERRORS
Server access error at url 
https://repo1.maven.org/maven2/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.pom
 (java.net.ConnectException: Connection timed out)

is due to the proxy configuration not being taken into account, as
https://repo1.maven.org/maven2/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.pom
exists.

The difference between the two resolvers can be seen at
https://github.com/apache/spark/blob/0a3026990bd0cbad53f0001da793349201104958/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L904
One has its root set and the other does not; the latter apparently gets its 
URL from
https://github.com/apache/ant-ivy/blob/master/src/java/org/apache/ivy/plugins/resolver/IBiblioResolver.java#L71

I'm currently trying to figure out why the proxy is not taken into account 
when using an IBiblioResolver that does not have its root set.












