Jaehyeon Kim created SPARK-23632:
------------------------------------

             Summary: sparkR.session() error with spark packages - JVM is not ready after 10 seconds
                 Key: SPARK-23632
                 URL: https://issues.apache.org/jira/browse/SPARK-23632
             Project: Spark
          Issue Type: Bug
          Components: SparkR
    Affects Versions: 2.3.0, 2.2.1, 2.2.0
            Reporter: Jaehyeon Kim
Hi,

When I execute _sparkR.session()_ with _org.apache.hadoop:hadoop-aws:2.8.2_ as follows,

{code:java}
library(SparkR, lib.loc=file.path(Sys.getenv('SPARK_HOME'), 'R', 'lib'))

ext_opts <- '-Dhttp.proxyHost=10.74.1.25 -Dhttp.proxyPort=8080 -Dhttps.proxyHost=10.74.1.25 -Dhttps.proxyPort=8080'

sparkR.session(master = "spark://master:7077",
               appName = 'ml demo',
               sparkConfig = list(spark.driver.memory = '2g'),
               sparkPackages = 'org.apache.hadoop:hadoop-aws:2.8.2',
               spark.driver.extraJavaOptions = ext_opts)
{code}

I see a *JVM is not ready after 10 seconds* error. Below are some of the log messages.

{code:java}
Ivy Default Cache set to: /home/rstudio/.ivy2/cache
The jars for the packages stored in: /home/rstudio/.ivy2/jars
:: loading settings :: url = jar:file:/usr/local/spark-2.2.1/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.apache.hadoop#hadoop-aws added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
	confs: [default]
	found org.apache.hadoop#hadoop-aws;2.8.2 in central
...
...
	found javax.servlet.jsp#jsp-api;2.1 in central
Error in sparkR.sparkContext(master, appName, sparkHome, sparkConfigMap,  :
  JVM is not ready after 10 seconds
...
...
	found joda-time#joda-time;2.9.4 in central
downloading https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.8.2/hadoop-aws-2.8.2.jar ...
...
...
	xmlenc#xmlenc;0.52 from central in [default]
	---------------------------------------------------------------------
	|                  |            modules            ||   artifacts   |
	|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
	---------------------------------------------------------------------
	|      default     |   76  |   76  |   76  |   0   ||   76  |   76  |
	---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
	confs: [default]
	76 artifacts copied, 0 already retrieved (27334kB/56ms)
{code}

It works fine if I re-execute it after the package and its dependencies have been downloaded, so the failure only occurs when the initial download takes longer than the fixed wait. I suspect it's because of this part - https://github.com/apache/spark/blob/master/R/pkg/R/sparkR.R#L181

{code:java}
if (!file.exists(path)) {
  stop("JVM is not ready after 10 seconds")
}
{code}

I wonder if it would be possible to update this so that a user can determine how long to wait - see the sketch below.

Thanks.

Regards
Jaehyeon
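For illustration, here is a minimal sketch of what I have in mind. This is not the actual SparkR code; the environment variable name _SPARKR_JVM_WAIT_SECONDS_ and the helper function are hypothetical, and only the general polling-for-the-connection-info-file idea mirrors what _sparkR.sparkContext()_ does today.

{code:java}
# Hypothetical sketch: let the user override the maximum JVM startup wait via an
# environment variable (SPARKR_JVM_WAIT_SECONDS is an invented name), falling
# back to the current fixed default. `path` is the backend connection-info file
# that the launched JVM writes once it is ready.
waitForJvm <- function(path, defaultSeconds = 10) {
  timeout <- as.numeric(Sys.getenv("SPARKR_JVM_WAIT_SECONDS",
                                   unset = as.character(defaultSeconds)))
  deadline <- Sys.time() + timeout
  while (Sys.time() < deadline) {
    if (file.exists(path)) {
      return(invisible(TRUE))
    }
    Sys.sleep(0.1)  # poll until the backend writes its connection-info file
  }
  stop(sprintf("JVM is not ready after %s seconds", timeout))
}
{code}

With something like this, a user behind a slow connection or proxy could set the variable before calling _sparkR.session()_ instead of waiting for the package download to fail once and then re-running.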