I suggest the RandomRDDs API. It provides generators for RDDs of i.i.d. random samples (uniform, normal, Poisson, and so on). Writing wrappers around it might be a good approach.
https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.mllib.random.RandomRDDs$
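
For what it's worth, here is a rough sketch of what such a wrapper could look like. The object name RandomData, its method names, and the default partition count and seed are all made up for illustration. Only RandomRDDs.normalRDD and RandomRDDs.uniformRDD come from Spark itself. Both return standard distributions, so the wrapper just shifts and scales the samples.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.random.RandomRDDs
import org.apache.spark.rdd.RDD

// Hypothetical wrapper object: the name and method shapes are illustrative only.
object RandomData {
  // Samples from N(mean, stdDev^2), built by shifting and scaling N(0, 1) draws.
  def gaussian(sc: SparkContext, n: Long, mean: Double, stdDev: Double,
               partitions: Int = 4, seed: Long = 42L): RDD[Double] =
    RandomRDDs.normalRDD(sc, n, partitions, seed).map(x => mean + stdDev * x)

  // Samples from U(lo, hi), built by rescaling U(0, 1) draws.
  def uniform(sc: SparkContext, n: Long, lo: Double, hi: Double,
              partitions: Int = 4, seed: Long = 42L): RDD[Double] =
    RandomRDDs.uniformRDD(sc, n, partitions, seed).map(x => lo + (hi - lo) * x)
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Local master chosen only so the sketch can run without a cluster.
    val conf = new SparkConf().setAppName("random-rdd-demo").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val heights = RandomData.gaussian(sc, n = 1000L, mean = 170.0, stdDev = 10.0)
    println(heights.stats()) // count, mean, stdev, max, min of the sample
    sc.stop()
  }
}
```

Fixing the seed makes runs reproducible. Drop the default if you want fresh samples each time.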