I'm trying to unit test a function that reads in a JSON file, manipulates
the resulting DataFrame, and then returns a Scala Map.

The function has the following signature:
def ingest(dataLocation: String, sc: SparkContext, sqlContext: SQLContext)
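
For context, here is a simplified stand-in for what ingest does; the real
column names and transformation logic are different, and "someColumn" below
is just a placeholder:

import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

// Simplified sketch only: read the JSON into a DataFrame, do some
// DataFrame work, and collect the result into a Scala Map.
// sc is unused in this sketch but is part of the signature under test.
def ingest(dataLocation: String, sc: SparkContext, sqlContext: SQLContext): Map[String, Long] = {
  val df = sqlContext.read.json(dataLocation)
  df.groupBy("someColumn").count()
    .collect()
    .map(row => row.getString(0) -> row.getLong(1))
    .toMap
}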

I've created a bootstrap spec for Spark jobs that instantiates the
SparkContext and SQLContext like so:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

@transient var sc: SparkContext = _
@transient var sqlContext: SQLContext = _

override def beforeAll(): Unit = {
  // Clear ports left over from any previous context in this JVM.
  System.clearProperty("spark.driver.port")
  System.clearProperty("spark.hostPort")

  // master and appName are defined elsewhere in the spec.
  val conf = new SparkConf()
    .setMaster(master)
    .setAppName(appName)

  sc = new SparkContext(conf)
  sqlContext = new SQLContext(sc)
}
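
Note that the spec has no tear-down yet. If it matters, the afterAll I would
presumably add (assuming ScalaTest's BeforeAndAfterAll, which the spec mixes
in) would look like this:

override def afterAll(): Unit = {
  // Stop the context so the next suite can create a fresh one;
  // only one SparkContext may run per JVM (see SPARK-2243).
  if (sc != null) {
    sc.stop()
    sc = null
  }
  sqlContext = null
  System.clearProperty("spark.driver.port")
  System.clearProperty("spark.hostPort")
}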

When I do not include the sqlContext, my tests run. Once I add it, I get
the following errors:

16/02/04 17:31:58 WARN SparkContext: Another SparkContext is being
constructed (or threw an exception in its constructor).  This may indicate
an error, since only one SparkContext may be running in this JVM (see
SPARK-2243). The other SparkContext was created at:
org.apache.spark.SparkContext.<init>(SparkContext.scala:81)

16/02/04 17:31:59 ERROR SparkContext: Error initializing SparkContext.
akka.actor.InvalidActorNameException: actor name [ExecutorEndpoint] is not
unique!

and finally:

[info] IngestSpec:
[info] Exception encountered when attempting to run a suite with class
name: com.company.package.IngestSpec *** ABORTED ***
[info]   akka.actor.InvalidActorNameException: actor name
[ExecutorEndpoint] is not unique!


What do I need to do to make a SQLContext available throughout my tests?

Thanks,

-- Steve
