I am working on a function that removes outliers for regression analysis. I treat a data point as an outlier if its studentized residual is above 3 or below -3. The code below is what I have so far for the function:

    x = c(1:20)
    y = c(1,3,4,2,5,6,18,8,10,8,11,13,14,14,15,85,17,19,19,20)
    data1 = data.frame(x,y)

    rm.outliers = function(dataset,dependent,independent){
        dataset$predicted = predict(lm(dependent~independent))
        dataset$stdres = rstudent(lm(dependent~independent))
        m = 1
        for(i in 1:length(dataset$stdres)){
            dataset$outlier_counter[i] = if(dataset$stdres[i] >= 3 | dataset$stdres[i] <= -3) {m} else{0}
        }
        j = length(which(dataset$outlier_counter >= 1))
        while(j>=1){
            print(dataset[which(dataset$outlier_counter >= 1),])
            dataset = dataset[which(dataset$outlier_counter == 0),]
            dataset$predicted = predict(lm(dependent~independent))
            dataset$stdres = rstudent(lm(dependent~independent))
            m = m+1
            for(k in 1:length(dataset$stdres)){
                dataset$outlier_counter[k] = if(dataset$stdres[k] >= 3 | dataset$stdres[k] <= -3) {m} else{0}
            }
            j = length(which(dataset$outlier_counter >= 1))
        }
        return(dataset)
    }

The problem is that I get the following output and error when I run rm.outliers(data1,data1$y,data1$x):

        x  y predicted   stdres outlier_counter
    16 16 85  22.98647 24.04862               1
    Error in `$<-.data.frame`(`*tmp*`, "predicted", value = c(0.114285714285714, :
      replacement has 20 rows, data has 19

Note: the outlier_counter variable records which "round" of the loop marked the data point as an outlier.

A fix would be a HUGE help to me and a few buddies who run a lot of different regression tests.

Thanks, and if the question is still confusing, please ask.

-----
- AK
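
A note on the likely cause, with a sketch of one possible fix. In the posted function, `dependent` and `independent` are the full-length vectors `data1$y` and `data1$x`, so `lm(dependent~independent)` is refitted on all 20 points even after `dataset` has been cut to 19 rows, and the 20 predicted values no longer fit into the data frame. The sketch below passes the column names as strings instead and refits on whatever rows remain. The names `rm_outliers`, `cutoff`, `outlier_round`, `kept`, and `removed` are made up for this illustration, and the code assumes the two columns contain no missing values; it is meant as a starting point, not a definitive implementation.

```r
x <- 1:20
y <- c(1,3,4,2,5,6,18,8,10,8,11,13,14,14,15,85,17,19,19,20)
data1 <- data.frame(x, y)

# Hypothetical helper: `dependent` and `independent` are column NAMES
# (strings), so every refit uses only the rows still left in `dataset`.
rm_outliers <- function(dataset, dependent, independent, cutoff = 3) {
  model_formula <- reformulate(independent, response = dependent)
  dataset$outlier_round <- 0      # 0 means "never flagged"
  removed <- dataset[0, ]         # empty frame with the same columns
  round_no <- 0

  repeat {
    fit <- lm(model_formula, data = dataset)
    stdres <- rstudent(fit)
    # Flag rows whose studentized residual is at or beyond +/- cutoff
    flagged <- !is.na(stdres) & abs(stdres) >= cutoff
    if (!any(flagged)) break

    round_no <- round_no + 1
    dataset$outlier_round[flagged] <- round_no
    removed <- rbind(removed, dataset[flagged, ])
    dataset <- dataset[!flagged, ]
  }

  # `fit` and `stdres` come from the last refit, which used exactly
  # the rows that are still in `dataset` (assumes no missing values).
  dataset$predicted <- fitted(fit)
  dataset$stdres <- stdres
  list(kept = dataset, removed = removed)
}

result <- rm_outliers(data1, "y", "x")
print(result$removed)   # rows dropped, with the round that flagged them
print(result$kept)      # remaining rows with predicted values and residuals
```

Returning the removed rows alongside the kept ones takes over the role of `outlier_counter` in the original: each removed row keeps the round in which it was flagged, so nothing has to be printed inside the loop.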