Hey,
I have an RDD[(String, Boolean)]. I want to keep all rows where the Boolean is true and randomly keep only some of the rows where it is false, so that in the final result there are about 10 times as many negative rows as positive ones. What would be the most efficient way to do this? Thanks,
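
To make the question concrete, here is a rough sketch of the kind of thing I mean. It is only an illustration under some assumptions, not necessarily the most efficient way: the names `DownsampleNegatives`, `downsample`, `ratio` and `seed` are made up for this example, and the test data in `main` is synthetic. It counts both classes, works out what fraction of the negatives to keep, and samples only the negatives with `RDD.sample`. Since `sample` decides row by row, the final ratio is only approximately 10:1.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object DownsampleNegatives {

  // Hypothetical helper: keep every positive row and roughly `ratio`
  // negative rows per positive row.
  def downsample(rdd: RDD[(String, Boolean)],
                 ratio: Double,
                 seed: Long): RDD[(String, Boolean)] = {
    // One pass over the data to count positives and negatives.
    val counts = rdd.map(_._2).countByValue()
    val pos = counts.getOrElse(true, 0L)
    val neg = counts.getOrElse(false, 0L)

    // Fraction of negatives to keep, capped at 1.0 when there are
    // fewer negatives than ratio * positives.
    val fraction =
      if (neg == 0L) 0.0 else math.min(1.0, ratio * pos / neg)

    val positives = rdd.filter { case (_, label) => label }
    // Sampling without replacement; each negative row is kept
    // independently, so the resulting count is approximate.
    val negatives = rdd
      .filter { case (_, label) => !label }
      .sample(withReplacement = false, fraction, seed)

    positives.union(negatives)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("downsample").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Synthetic data: 1 positive row per 100 rows.
    val data = sc.parallelize((1 to 10000).map(i => (s"row$i", i % 100 == 0)))
    // The RDD is traversed more than once, so caching it is probably worthwhile.
    data.cache()

    val result = downsample(data, ratio = 10.0, seed = 42L)
    println(result.map(_._2).countByValue())

    sc.stop()
  }
}
```

This reads the data more than once (one pass for the counts, then the two filters), which is the part I am unsure about. If the ratio of positives to negatives were already roughly known, the counting pass could be skipped and the fraction passed in directly.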