I need to filter outliers out of a DataFrame, checking every column. I can list all the columns manually, like this:

    df.filter(x => math.abs(x.get(0).toString().toDouble - means(0)) <= 3 * stddevs(0))
      .filter(x => math.abs(x.get(1).toString().toDouble - means(1)) <= 3 * stddevs(1))
      ...

But I want to turn it into a general function that can handle a variable number of columns. How could I do that?

Thanks in advance!

Regards,
Shawn
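
One possible way to generalize the per-column check is to loop over the column indices with forall instead of chaining one filter per column. The following is only a minimal sketch, not a definitive implementation. The names OutlierFilter, withinBounds, and k are hypothetical. It assumes that means and stddevs hold one entry per column in column order, and that every value is non-null and parses as a number, the same assumptions the chained version makes.

```scala
object OutlierFilter {
  // Hypothetical helper: returns true when every value lies within
  // k standard deviations of the mean of its column.
  // Assumes values, means and stddevs all have the same length and order,
  // and that no value is null (toString would throw on null).
  def withinBounds(
      values: Seq[Any],
      means: Seq[Double],
      stddevs: Seq[Double],
      k: Double = 3.0): Boolean =
    values.indices.forall { i =>
      math.abs(values(i).toString.toDouble - means(i)) <= k * stddevs(i)
    }
}

// Assumed usage with the DataFrame from the question (not run here):
//   df.filter(x => OutlierFilter.withinBounds(x.toSeq, means, stddevs))
```

Because forall stops at the first column that fails the check, a row is kept only if all of its columns pass, which should match the behaviour of the chained filters.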