Really appreciate the discussion on outliers. I come from an engineering signal processing background, and my thinking has generally been that an outlier is outside a threshold of

- distance from the mean
- rarity

that we don't need or want to capture in whatever model we're building.

In my recent work (bioinformatics), I've seen that it's common to Winsorize the data. I am a bit uncomfortable with this, though it seems to be standard practice. Do people have thoughts here?

Cheers,
-Gus
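
For concreteness, here is a minimal sketch of the two approaches contrasted above: flagging points beyond a distance-from-the-mean threshold, versus Winsorizing (clamping extreme values to chosen quantiles rather than dropping them). It is written in Python with only the standard library. The function names, the multiplier `k`, the 5%/95% cut-offs, the simple nearest-rank quantile rule, and the sample data are all illustrative assumptions, not anything prescribed in the discussion.

```python
from statistics import mean, stdev


def flag_outliers(values, k=2.5):
    """Return the values lying more than k sample standard deviations from the mean.

    k is an arbitrary, illustrative threshold.
    """
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) > k * s]


def winsorize(values, lower=0.05, upper=0.95):
    """Clamp values to the empirical [lower, upper] quantiles.

    Quantiles are taken with a crude rank rule (index truncated toward zero);
    statistical packages typically interpolate, so results can differ slightly.
    """
    ordered = sorted(values)
    n = len(ordered)
    lo = ordered[int(lower * (n - 1))]
    hi = ordered[int(upper * (n - 1))]
    # Every point is kept; only its magnitude is capped at lo or hi.
    return [min(max(v, lo), hi) for v in values]


# Hypothetical measurements with one extreme reading.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9, 25.0]

flagged = flag_outliers(data)   # candidates to remove or inspect
clamped = winsorize(data)       # same length as data, extremes pulled inward

print("flagged:", flagged)
print("winsorized:", clamped)
```

The difference that matters here is that thresholding changes the sample size, while Winsorizing keeps every observation but replaces the extreme values with less extreme ones, which shrinks the apparent variance.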