I need to filter outliers out of a DataFrame, checking every column. I can
list the columns manually, like this:

df.filter(x => math.abs(x.get(0).toString().toDouble - means(0)) <= 3 * stddevs(0))
  .filter(x => math.abs(x.get(1).toString().toDouble - means(1)) <= 3 * stddevs(1))
  ...

But I want to turn it into a general function that can handle a variable
number of columns. How could I do that? Thanks in advance!
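
For reference, the per-column check above could probably be folded into a single predicate that loops over the column indices. The following is only a minimal sketch in plain Scala, not tested against Spark: withinBounds, the parameter k, and the sample rows are hypothetical names and data made up for illustration, while means and stddevs are assumed to be per-column sequences as in the chain above.

```scala
// Hypothetical helper: true when every value lies within k standard
// deviations of the mean of its column (k = 3 matches the chain above).
def withinBounds(values: Seq[Double],
                 means: Seq[Double],
                 stddevs: Seq[Double],
                 k: Double = 3.0): Boolean =
  values.indices.forall(i => math.abs(values(i) - means(i)) <= k * stddevs(i))

// Stand-in data (made up): two columns, three rows.
val means   = Seq(10.0, 100.0)
val stddevs = Seq(1.0, 5.0)
val rows    = Seq(
  Seq(10.5, 98.0),  // both columns within 3 standard deviations
  Seq(20.0, 101.0), // first column is an outlier
  Seq(9.0, 130.0)   // second column is an outlier
)

// Keep only the rows for which every column passes the check.
val kept = rows.filter(r => withinBounds(r, means, stddevs))
```

With the DataFrame itself, the call would presumably look like df.filter(x => withinBounds(x.toSeq.map(_.toString.toDouble), means, stddevs)), assuming every column is numeric and non-null and that means and stddevs hold one entry per column in column order.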


Regards,

Shawn
