This depends on your target setup! For my open source libraries I run the
Spark integration tests (in a dedicated folder alongside the unit tests)
against a local Spark master, but I also use a MiniDFS cluster (to simulate
HDFS on a single node) and sometimes a MiniYARN cluster (see
https://wiki.apache.org/hadoop/HowToDevelopUnitTests).
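
To give an idea of the local-master approach, here is a minimal sketch of
such an integration test in Scala with ScalaTest. The class and test names
are hypothetical, and it assumes spark-core and scalatest on the test
classpath:

// Minimal sketch of an integration test against a local Spark master.
// Class/test names are hypothetical; assumes spark-core and scalatest
// on the test classpath.
import org.apache.spark.{SparkConf, SparkContext}
import org.scalatest.{BeforeAndAfterAll, FlatSpec, Matchers}

class WordCountSpec extends FlatSpec with BeforeAndAfterAll with Matchers {

  private var sc: SparkContext = _

  override def beforeAll(): Unit = {
    // "local[2]" runs driver and executors inside this JVM, so no
    // separate master/worker processes have to be spawned for the test
    val conf = new SparkConf()
      .setMaster("local[2]")
      .setAppName("integration-test")
    sc = new SparkContext(conf)
  }

  override def afterAll(): Unit = {
    sc.stop()
  }

  "A word count" should "count each word across the input" in {
    val counts = sc.parallelize(Seq("a b", "b c"))
      .flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)
      .collectAsMap()
    counts("b") shouldBe 2
  }
}

For the MiniDFS part you would additionally start a MiniDFSCluster (from
hadoop-minicluster) in beforeAll and point the input/output paths of your
job at its fs.defaultFS URI.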

An example can be found here:
https://github.com/ZuInnoTe/hadoopcryptoledger/tree/master/examples/spark-bitcoinblock

or, if you need Scala:
https://github.com/ZuInnoTe/hadoopcryptoledger/tree/master/examples/scala-spark-bitcoinblock

In both cases the tests live in the integration-tests folder (Java) or the
it folder (Scala).

For Spark Streaming I have no open source example at hand, but basically you
need to simulate the source and the rest works as above.
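
As a sketch of what "simulate the source" can look like: Spark Streaming
ships a queueStream input where each queued RDD becomes one micro-batch, so
you can feed test data without a real Kafka or socket source. The names
below are hypothetical and the sleep-based waiting is only for illustration:

// Minimal sketch of a streaming test that simulates the source with a
// queueStream instead of a real input (assumes spark-streaming on the
// classpath; names are hypothetical).
import scala.collection.mutable
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingTestSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("streaming-test")
    val ssc = new StreamingContext(conf, Seconds(1))

    // The queue stands in for the real source; each queued RDD is
    // consumed as one micro-batch
    val queue = mutable.Queue[RDD[String]]()
    val results = mutable.ListBuffer[Array[(String, Int)]]()

    val stream = ssc.queueStream(queue, oneAtATime = true)
    stream.flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)
      .foreachRDD(rdd => results += rdd.collect()) // runs on the driver

    ssc.start()
    queue += ssc.sparkContext.parallelize(Seq("a b b"))
    Thread.sleep(3000) // let a batch interval pass; a real test would poll
    ssc.stop(stopSparkContext = true, stopGracefully = true)

    // With a single queued RDD, all counts arrive in one batch
    assert(results.flatten.toMap.apply("b") == 2)
  }
}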

I will eventually write a blog post about this with more details.

> On 7 Mar 2017, at 13:04, kant kodali <kanth...@gmail.com> wrote:
> 
> Hi All,
> 
> How to unit test spark streaming or spark in general? How do I test the 
> results of my transformations? Also, more importantly, don't we need to spawn 
> master and worker JVMs, either on one or multiple nodes?
> 
> Thanks!
> kant
