Hello all,

I am trying to unit test the classes involved in my Spark job. To test them in isolation, I want to mock out the Spark classes (such as SparkContext and Broadcast). However, I have realised that these are classes rather than traits. My first question is: why?
It is quite hard to mock classes with ScalaTest + ScalaMock, because a class that needs to be mocked has to be annotated with org.scalamock.annotation.mock, as per http://www.scalatest.org/user_guide/testing_with_mock_objects#generatedMocks. I cannot do that in my case, since the classes I am trying to mock belong to Spark. Am I missing something? Is there a better way to do this?

This is the test I have so far:

    val sparkContext = mock[SparkInteraction]
    val trainingDatasetLoader = mock[DatasetLoader]
    val broadcastTrainingDatasetLoader = mock[Broadcast[DatasetLoader]]

    def transformerFunction(source: Iterator[(HubClassificationData, String)]): Iterator[String] = {
      source.map(_._2)
    }

    val classificationResultsRDD = mock[RDD[String]]
    val classificationResults = Array("", "", "")
    val inputRDD = mock[RDD[(HubClassificationData, String)]]

    inSequence {
      inAnyOrder {
        (sparkContext.broadcast[DatasetLoader] _).expects(trainingDatasetLoader).returns(broadcastTrainingDatasetLoader)
      }
    }

    val sparkInvoker = new SparkJobInvoker(sparkContext, trainingDatasetLoader)
    when(inputRDD.mapPartitions(transformerFunction)).thenReturn(classificationResultsRDD)
    sparkInvoker.invoke(inputRDD)

Thanks,
Saket
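
For reference, here is a minimal sketch of one possible workaround: keep the Spark types behind small traits that the job owns, and test against a hand-written fake instead of a generated mock. Everything in it is an assumption made for illustration. BroadcastHandle, RecordingSparkInteraction and Transformers are hypothetical names, and this SparkInteraction is a guess at what such a wrapper trait might look like, not the definition used in the test above. The sketch deliberately has no Spark dependency, so it shows only the shape of the approach.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for org.apache.spark.broadcast.Broadcast[T]:
// exposes only the one member the job is assumed to need.
trait BroadcastHandle[T] {
  def value: T
}

// Hypothetical wrapper trait around the SparkContext calls the job uses.
// A production implementation would delegate to SparkContext.broadcast.
trait SparkInteraction {
  def broadcast[T](payload: T): BroadcastHandle[T]
}

// Hand-written test double: records every broadcast payload and
// returns a handle that simply holds the payload in memory.
final class RecordingSparkInteraction extends SparkInteraction {
  private val recorded = ListBuffer.empty[Any]

  // Payloads passed to broadcast, in call order.
  def broadcasts: List[Any] = recorded.toList

  override def broadcast[T](payload: T): BroadcastHandle[T] = {
    recorded += payload
    new BroadcastHandle[T] { def value: T = payload }
  }
}

// The per-partition logic as a plain function over iterators,
// so it can be tested without an RDD at all.
object Transformers {
  def secondElements[A](source: Iterator[(A, String)]): Iterator[String] =
    source.map(_._2)
}
```

With this shape, a test can pass a RecordingSparkInteraction to the class under test and assert on broadcasts afterwards, and the partition function can be exercised directly with an ordinary Iterator. The cost is one extra layer of indirection between the job and Spark.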