1. No.
2. The seed per partition is fixed, so it should generate
non-overlapping subsets.
3. There was a bug in 1.0, which was fixed in 1.0.1 and 1.1.
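
To illustrate point 2, here is a minimal sketch in plain Python (not Spark code) of why a fixed seed yields non-overlapping subsets. It assumes each fold draws one uniform random number per element from the same seed, keeps an element for validation when the draw falls in that fold's slice of [0, 1), and sends it to training otherwise. The function name `k_fold` and its parameters are hypothetical, and the sketch models a single partition only.

```python
import random


def k_fold(data, num_folds, seed):
    """Return a list of (training, validation) pairs, one per fold.

    Hypothetical helper for illustration; it is not the MLUtils implementation.
    """
    folds = []
    for fold in range(num_folds):
        # Slice of [0, 1) owned by this fold.
        lower = fold / num_folds
        upper = (fold + 1) / num_folds
        # Re-seeding with the same value replays the same sequence of draws,
        # so every fold sees the same random number for a given element.
        rng = random.Random(seed)
        training, validation = [], []
        for item in data:
            x = rng.random()
            if lower <= x < upper:
                validation.append(item)
            else:
                training.append(item)
        folds.append((training, validation))
    return folds


if __name__ == "__main__":
    for training, validation in k_fold(list(range(20)), num_folds=4, seed=42):
        print(len(training), len(validation))
```

Because the draws are identical across folds, each element lands in exactly one validation set, and the training set of a fold is the exact complement of its validation set. If the seed differed between the training pass and the validation pass, the two could overlap.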
Best,
Xiangrui
On Thu, Oct 9, 2014 at 11:05 AM, Nan Zhu zhunanmcg...@gmail.com wrote:
Hi, all
When we use MLUtils.kFold to generate training
Thanks, Xiangrui,
I found the reason for the overlapping training and test sets
….
Another counter-intuitive issue related to
https://github.com/apache/spark/pull/2508
Best,
--
Nan Zhu
On Friday, October 10, 2014 at 2:19 AM, Xiangrui Meng wrote:
1. No.
2. The seed per partition