Dear All: the data file is attached in .txt format.


*Re:* detect and replace outliers by the average



The attached dataset contains a grouping column “*factor*” and two data
columns, “x1” and “x2”, with some NA values. I need some help to detect the
outliers and replace them, together with the NAs, with the average within
each level (0, 1, 2) of “*factor*”, separately for “x1” and “x2”.



I tried the code below, but it did not accomplish what I want to do.



The average within each level should be computed after discarding the outliers.



data<-read.csv("G:/20-Spring_2023/Outliers/data.csv", header=TRUE)

data

replace_outlier_with_mean <- function(x) {

  replace(x, x %in% boxplot.stats(x)$out, mean(x, na.rm=TRUE))  #### , na.rm=TRUE NOT working

}

data[] <- lapply(data, replace_outlier_with_mean)
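For reference, the result I am after can be sketched as follows. This is only a rough sketch, assuming the boxplot rule (boxplot.stats() with its default coef = 1.5) is an acceptable definition of an outlier. The helper name replace_outlier_and_na and the column name x1_clean are made up for illustration, and the small data frame simply retypes the level 0 and level 1 rows of x1 from the attached data. The attempt above differs in three ways: it ignores the grouping column, its mean still includes the outliers, and NA values are never matched by %in% against the $out vector, so they stay NA.

```r
# Hypothetical helper: replaces boxplot-rule outliers and NAs in one vector
# with the mean of the values that remain after both are set aside.
replace_outlier_and_na <- function(x) {
  out <- x %in% boxplot.stats(x)$out   # TRUE for values flagged as outliers
  bad <- out | is.na(x)                # outliers and missing values together
  x[bad] <- mean(x[!bad])              # mean of the retained values only
  x
}

# Level 0 and level 1 rows of x1, retyped from the attached data
dat <- data.frame(
  factor = rep(c(0, 1), times = c(8, 10)),
  x1 = c(700, 700, 470, 710, 5555, 610, 710, 610,
         690, 580, 690, NA, 450, 700, 400, 6666, 500, 680)
)

# ave() applies the helper separately within each level of the grouping column
dat$x1_clean <- ave(dat$x1, dat$factor, FUN = replace_outlier_and_na)
dat
```

In level 0 only 5555 is flagged, and it becomes 4510/7 (about 644.29), the mean of the other seven values. In level 1 both 6666 and the NA become 586.25. For the full data, the same ave() call could be repeated for x2, leaving the “factor” column untouched.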





Thank you all very much for your help in advance.





with many thanks

abou
______________________


*AbouEl-Makarim Aboueissa, PhD*

*Professor, Mathematics and Statistics*
*Graduate Coordinator*

*Department of Mathematics and Statistics*
*University of Southern Maine*
factor  x1      x2
0       700     700
0       700     500
0       470     470
0       710     560
0       5555    520
0       610     720
0       710     670
0       610     9999
1       690     620
1       580     540
1       690     690
1       NA      401
1       450     580
1       700     700
1       400     8888
1       6666    600
1       500     400
1       680     650
2       117     63
2       120     68
2       130     73
2       120     69
2       125     54
2       999     70
2       165     62
2       130     987
2       123     70
2               78
2               98
2               5
2       321     NA
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
