Hello all,

I am trying to unit test the classes involved in my Spark job. To test
them in isolation, I want to mock out the Spark classes they depend on
(like SparkContext and Broadcast). However, I have realised that these are
classes instead of traits. My first question is: why?

It is quite hard to mock classes using ScalaTest+ScalaMock, because any
class to be mocked has to be annotated with
org.scalamock.annotation.mock as per
http://www.scalatest.org/user_guide/testing_with_mock_objects#generatedMocks.
I cannot do that in my case, as the classes I am trying to mock belong to
Spark and I cannot annotate them.

Am I missing something? Is there a better way to do this? Here is my
current test code:

    val sparkContext = mock[SparkInteraction]
    val trainingDatasetLoader = mock[DatasetLoader]
    val broadcastTrainingDatasetLoader = mock[Broadcast[DatasetLoader]]
    def transformerFunction(
        source: Iterator[(HubClassificationData, String)]): Iterator[String] = {
      source.map(_._2)
    }
    val classificationResultsRDD = mock[RDD[String]]
    val classificationResults = Array("","","")
    val inputRDD = mock[RDD[(HubClassificationData, String)]]

    inSequence{
      inAnyOrder{
        (sparkContext.broadcast[DatasetLoader] _)
          .expects(trainingDatasetLoader)
          .returns(broadcastTrainingDatasetLoader)
      }
    }

    val sparkInvoker = new SparkJobInvoker(sparkContext, trainingDatasetLoader)

    when(inputRDD.mapPartitions(transformerFunction)).thenReturn(classificationResultsRDD)
    sparkInvoker.invoke(inputRDD)

Thanks,
Saket
