[jira] [Commented] (SPARK-11085) Add support for HTTP proxy
[ https://issues.apache.org/jira/browse/SPARK-11085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15311906#comment-15311906 ]

Ion Alberdi commented on SPARK-11085:
-------------------------------------

Hello to all,

I can reproduce the error using the Docker setup in https://github.com/Yannael/kafka-sparkstreaming-cassandra.

Here is what I tried in order to pass the -Dhttp.proxyHost and -Dhttp.proxyPort parameters:

- the javaopts workaround mentioned above
- setting spark.executor.extraJavaOptions. The launched command becomes
  % java org.apache.spark.deploy.SparkSubmit ... --conf spark.driver.extraJavaOptions=-Dhttp.proxyHost= -Dhttp.proxyPort=
  I wonder whether the shell is able to parse that line and thus pass both parameters (-Dhttp.proxyHost= and -Dhttp.proxyPort=) to org.apache.spark.deploy.SparkSubmit
- setting "--driver-java-options", which ends up as
  % java org.apache.spark.deploy.SparkSubmit ... -Dhttp.proxyHost= -Dhttp.proxyPort=
  Even though the shell seems more likely to parse the two parameters in this form, the packages are not downloaded, as the HTTP requests do not go through the proxy.

> Add support for HTTP proxy
> ---
>
> Key: SPARK-11085
> URL: https://issues.apache.org/jira/browse/SPARK-11085
> Project: Spark
> Issue Type: Improvement
> Components: Spark Shell, Spark Submit
> Reporter: Dustin Cote
> Priority: Minor
>
> Add a way to update ivysettings.xml for the spark-shell and spark-submit to
> support proxy settings for clusters that need to access a remote repository
> through an http proxy. Typically this would be done like:
> JAVA_OPTS="$JAVA_OPTS -Dhttp.proxyHost=proxy.host -Dhttp.proxyPort=8080
> -Dhttps.proxyHost=proxy.host.secure -Dhttps.proxyPort=8080"
> Directly in the ivysettings.xml would look like:
>
> proxyport="8080"
> nonproxyhosts="nonproxy.host"/>
>
> Even better would be a way to customize the ivysettings.xml with command
> options.
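On the question of whether the shell passes both -D parameters along: a minimal sketch of how a POSIX shell splits such a line, assuming the command were typed as shown in the comment. The helper count_args and the proxy host and port values are hypothetical and only there for illustration. Note that the "launched command" printed by Spark may simply be a rendering without quotes, so this does not prove the arguments were actually split in the reported run.

```shell
#!/bin/sh
# Hypothetical helper: prints how many arguments it received.
count_args() {
  echo "$#"
}

# Placeholder proxy values (assumptions, not taken from the thread).
PROXY_HOST=proxy.example.com
PROXY_PORT=8080

# Unquoted: the shell splits on the space, so -Dhttp.proxyPort=... becomes
# a separate argument instead of being part of the --conf value.
count_args --conf spark.driver.extraJavaOptions=-Dhttp.proxyHost=$PROXY_HOST -Dhttp.proxyPort=$PROXY_PORT

# Quoted: both -D options stay inside the single --conf value.
count_args --conf "spark.driver.extraJavaOptions=-Dhttp.proxyHost=$PROXY_HOST -Dhttp.proxyPort=$PROXY_PORT"
```

The first call reports 3 arguments and the second reports 2, so when typing the --conf form by hand the whole key=value pair needs to be quoted for both options to reach the driver JVM.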
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-11085) Add support for HTTP proxy
[ https://issues.apache.org/jira/browse/SPARK-11085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15312244#comment-15312244 ]

Ion Alberdi commented on SPARK-11085:
-------------------------------------

More precisely, it seems that when a URL like
http://dl.bintray.com/spark-packages/maven/com/datastax/spark/spark-cassandra-connector_2.11/1.6.0-M2/spark-cassandra-connector_2.11-1.6.0-M2.pom
is tried, the request goes through the proxy.

However, when going to a Maven-compatible link like
https://repo1.maven.org/maven2/com/datastax/spark/spark-cassandra-connector_2.11/1.6.0-M2/spark-cassandra-connector_2.11-1.6.0-M2.jar
the proxy is not taken into account.
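One thing that may be worth ruling out, given that the http:// URL goes through the proxy while the https:// URL does not: the JVM keeps separate proxy properties per scheme, so -Dhttp.proxyHost and -Dhttp.proxyPort alone do not apply to https:// URLs. The issue description already lists the https.* pair. Whether this explains the behaviour reported here is not confirmed in this thread. The sketch below only builds and prints a command line (dry run); the helper proxy_java_opts and the proxy host and port are hypothetical.

```shell
#!/bin/sh
# Placeholder proxy values (assumptions, not taken from the thread).
PROXY_HOST=proxy.example.com
PROXY_PORT=8080

# Hypothetical helper: builds the -D options for both URL schemes.
# The JVM reads http.* for http:// URLs and https.* for https:// URLs.
proxy_java_opts() {
  host=$1
  port=$2
  printf '%s' "-Dhttp.proxyHost=$host -Dhttp.proxyPort=$port -Dhttps.proxyHost=$host -Dhttps.proxyPort=$port"
}

OPTS=$(proxy_java_opts "$PROXY_HOST" "$PROXY_PORT")

# Dry run: print the command instead of launching spark-shell.
echo spark-shell --driver-java-options "\"$OPTS\""
```

Passing the value as one quoted string keeps all four options inside a single --driver-java-options argument.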
[jira] [Commented] (SPARK-11085) Add support for HTTP proxy
[ https://issues.apache.org/jira/browse/SPARK-11085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15312367#comment-15312367 ]

Ion Alberdi commented on SPARK-11085:
-------------------------------------

To reproduce, on a network that needs an http_proxy to reach http://dl.bintray.com and https://repo1.maven.org:

% spark-shell --packages org.apache.spark:spark-streaming-kafka_2.11:1.6.1,com.datastax.spark:spark-cassandra-connector_2.11:1.6.1-M2 --driver-java-options "-Dhttp.proxyHost= -Dhttp.proxyPort="
...
 spark-packages: tried
  http://dl.bintray.com/spark-packages/maven/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.pom
  -- artifact org.apache.spark#spark-streaming-kafka_2.11;1.6.1!spark-streaming-kafka_2.11.jar:
  http://dl.bintray.com/spark-packages/maven/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.jar
 module not found: com.datastax.spark#spark-cassandra-connector_2.11;1.6.1-M2

Indeed, http://dl.bintray.com/spark-packages/maven/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.pom does not exist. However, the following error

ERRORS
 Server access error at url https://repo1.maven.org/maven2/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.pom (java.net.ConnectException: Connection timed out)

is due to the proxy configuration not being taken into account, as https://repo1.maven.org/maven2/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.pom exists.

The difference between the two is at
https://github.com/apache/spark/blob/0a3026990bd0cbad53f0001da793349201104958/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L904
One resolver has its root set and the other does not; the latter apparently gets its URL from
https://github.com/apache/ant-ivy/blob/master/src/java/org/apache/ivy/plugins/resolver/IBiblioResolver.java#L71

I'm currently trying to figure out why the proxy is not used by an IBiblioResolver that does not have its root set.
[jira] [Comment Edited] (SPARK-11085) Add support for HTTP proxy
[ https://issues.apache.org/jira/browse/SPARK-11085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15312367#comment-15312367 ]

Ion Alberdi edited comment on SPARK-11085 at 6/2/16 2:16 PM:
-------------------------------------------------------------

To reproduce, on a network that needs an http_proxy to reach http://dl.bintray.com and https://repo1.maven.org:

% spark-shell --packages org.apache.spark:spark-streaming-kafka_2.11:1.6.1,com.datastax.spark:spark-cassandra-connector_2.11:1.6.1-M2 --driver-java-options "-Dhttp.proxyHost= -Dhttp.proxyPort="
...
 spark-packages: tried
  http://dl.bintray.com/spark-packages/maven/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.pom
  -- artifact org.apache.spark#spark-streaming-kafka_2.11;1.6.1!spark-streaming-kafka_2.11.jar:
  http://dl.bintray.com/spark-packages/maven/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.jar
 module not found: com.datastax.spark#spark-cassandra-connector_2.11;1.6.1-M2

Indeed, http://dl.bintray.com/spark-packages/maven/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.pom does not exist. However, the following error

ERRORS
 Server access error at url https://repo1.maven.org/maven2/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.pom (java.net.ConnectException: Connection timed out)

is due to the proxy configuration not being taken into account, as https://repo1.maven.org/maven2/org/apache/spark/spark-streaming-kafka_2.11/1.6.1/spark-streaming-kafka_2.11-1.6.1.pom exists.

The difference between the two is at
https://github.com/apache/spark/blob/0a3026990bd0cbad53f0001da793349201104958/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L904
One resolver has its root set and the other does not; the latter apparently gets its URL from
https://github.com/apache/ant-ivy/blob/master/src/java/org/apache/ivy/plugins/resolver/IBiblioResolver.java#L71

I'm currently trying to figure out why the proxy is not used by an IBiblioResolver that does not have its root set.