Good for the file :) No, it keeps going... as if it were waiting for something.

jg

> On Jul 10, 2016, at 22:55, ayan guha <guha.a...@gmail.com> wrote:
>
> Is this terminating the execution or spark application still runs after this
> error?
>
> One thing for sure, it is looking for local file on driver (ie your mac) @
> location: file:/Users/jgp/Documents/Data/restaurants-data.json
>
>> On Mon, Jul 11, 2016 at 12:33 PM, Jean Georges Perrin <j...@jgp.net> wrote:
>>
>> I have my dev environment on my Mac. I have a dev Spark server on a freshly
>> installed physical Ubuntu box.
>>
>> I had some connection issues, but it is now all fine.
>>
>> In my code, running on the Mac, I have:
>>
>> 1 SparkConf conf = new SparkConf().setAppName("myapp").setMaster("spark://10.0.100.120:7077");
>> 2 JavaSparkContext javaSparkContext = new JavaSparkContext(conf);
>> 3 javaSparkContext.setLogLevel("WARN");
>> 4 SQLContext sqlContext = new SQLContext(javaSparkContext);
>> 5
>> 6 // Restaurant Data
>> 7 df = sqlContext.read().option("dateFormat", "yyyy-mm-dd").json(source.getLocalStorage());
>>
>>
>> 1) Clarification question: This code runs on my mac, connects to the server,
>> but line #7 assumes the file is on my mac, not on the server, right?
>>
>> 2) On line 7, I get an exception:
>>
>> 16-07-10 22:20:04:143 DEBUG - address: jgp-MacBook-Air.local/10.0.100.100 isLoopbackAddress: false, with host 10.0.100.100 jgp-MacBook-Air.local
>> 16-07-10 22:20:04:240 INFO org.apache.spark.sql.execution.datasources.json.JSONRelation - Listing file:/Users/jgp/Documents/Data/restaurants-data.json on driver
>> 16-07-10 22:20:04:288 DEBUG org.apache.hadoop.util.Shell - Failed to detect a valid hadoop home directory
>> java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
>>     at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:225)
>>     at org.apache.hadoop.util.Shell.<clinit>(Shell.java:250)
>>     at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
>>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.setInputPaths(FileInputFormat.java:447)
>>     at org.apache.spark.sql.execution.datasources.json.JSONRelation.org$apache$spark$sql$execution$datasources$json$JSONRelation$$createBaseRdd(JSONRelation.scala:98)
>>     at org.apache.spark.sql.execution.datasources.json.JSONRelation$$anonfun$4$$anonfun$apply$1.apply(JSONRelation.scala:115)
>>     at org.apache.spark.sql.execution.datasources.json.JSONRelation$$anonfun$4$$anonfun$apply$1.apply(JSONRelation.scala:115)
>>     at scala.Option.getOrElse(Option.scala:120)
>>     at org.apache.spark.sql.execution.datasources.json.JSONRelation$$anonfun$4.apply(JSONRelation.scala:115)
>>     at org.apache.spark.sql.execution.datasources.json.JSONRelation$$anonfun$4.apply(JSONRelation.scala:109)
>>     at scala.Option.getOrElse(Option.scala:120)
>>
>> Do I have to install HADOOP on the server? - I imagine that from:
>> java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
>>
>> TIA,
>>
>> jg
>
> --
> Best Regards,
> Ayan Guha