No. All RDDs, including those returned by randomSplit(), are read-only. Scaling train_cv or test_cv produces new RDDs and leaves the original data unchanged.
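
In case it helps, here is a minimal sketch of how you could check this yourself. It assumes a local Spark installation; the names scale and demo are made up for illustration and are not part of the Spark API, and multiplying by a constant only stands in for whatever scaling you actually apply.

```python
def scale(rdd, factor):
    """Return a NEW RDD with every element multiplied by `factor`.

    `rdd.map` is a transformation: it builds a new RDD and never
    modifies the RDD it is called on.
    """
    return rdd.map(lambda x: x * factor)


def demo():
    # Imported here so the helper above can be read without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext("local[1]", "rdd-immutability-demo")
    try:
        training = sc.parallelize([1.0, 2.0, 3.0, 4.0])
        before = training.collect()

        # Same call as in the question, with a fixed seed for repeatability.
        train_cv, test_cv = training.randomSplit(weights=[0.75, 0.25], seed=42)

        # "Scale" both splits; each call returns a new RDD.
        scaled_train = scale(train_cv, 10.0)
        scaled_test = scale(test_cv, 10.0)

        # Force evaluation of the scaled RDDs.
        scaled_train.count()
        scaled_test.count()

        # The source RDD still holds the original values.
        assert training.collect() == before
    finally:
        sc.stop()
```

Calling demo() should finish without an assertion error, since map only ever returns a new RDD.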

On Mon, Apr 27, 2015 at 11:28 AM, Pagliari, Roberto <rpagli...@appcomsci.com> wrote:
> Suppose I have something like the code below
>
> for idx in xrange(0, 10):
>     train_test_split = training.randomSplit(weights=[0.75, 0.25])
>     train_cv = train_test_split[0]
>     test_cv = train_test_split[1]
>     # scale train_cv and test_cv
>
> by scaling train_cv and test_cv, will the original data be affected?
>
> Thanks,