For the last point (cluttered text), look at spread.labels in the plotrix 
package and spread.labs in the TeachingDemos package (I favor the latter, but 
I could be slightly biased).  Doing more than what those two functions do 
becomes really complicated really fast.
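
To give a feel for what label spreading means, here is a minimal base-R 
sketch.  It is not the code from either package: spread_simple, its 
arguments, and the toy data are made up for illustration, and it only pushes 
crowded positions upward.  With TeachingDemos installed, a call along the 
lines of spread.labs(y[is_out], mindiff = 2) would be the thing to try in 
its place.

```r
# Hypothetical helper (not from plotrix or TeachingDemos): walk through the
# sorted positions and push each one up until it sits at least `mindiff`
# above its lower neighbour. Results are returned in the original order.
spread_simple <- function(x, mindiff) {
  ord <- order(x)
  s <- x[ord]
  for (i in seq_along(s)[-1]) {
    if (s[i] - s[i - 1] < mindiff) s[i] <- s[i - 1] + mindiff
  }
  out <- numeric(length(x))
  out[ord] <- s
  out
}

# Toy data (an assumption for this sketch): three crowded outliers and one
# isolated outlier above the upper whisker.
y <- c(1:20, 40, 40.5, 41, 60)
labs <- paste0("obs", seq_along(y))

# Find the outliers the same way boxplot() does, without drawing anything.
bp <- boxplot(y, plot = FALSE)
is_out <- y %in% bp$out

# Spread the label heights, then draw labels with connectors to the points.
label_y <- spread_simple(y[is_out], mindiff = 2)
boxplot(y)
text(1.2, label_y, labs[is_out], pos = 4)
segments(1, y[is_out], 1.2, label_y)
```

The value of mindiff is in data units, so it has to be chosen for the plot 
at hand; something a little larger than strheight("A"), computed after the 
plot has been drawn, is a reasonable starting point.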

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org]
> On Behalf Of Tal Galili
> Sent: Wednesday, January 26, 2011 4:05 PM
> To: r-help@r-project.org
> Subject: [R] boxplot - code for labeling outliers - any suggestions for
> improvements?
> 
> Hello all,
> I wrote a small function to add labels for outliers in a boxplot.
> This function will only work on a simple boxplot/formula command (e.g.,
> something like boxplot(y~x)).
> 
> Code + example follows in this e-mail.
> 
> I'd be happy for any suggestions on how to improve this code, for
> example:
> 
>    - Handle boxplot.matrix (which shouldn't be too hard to do)
>    - Handle cases of complex formulas (e.g., boxplot(y~a*b))
>    - Handle cases where there are many outliers, leading to a clutter of
>      text (I have no idea how to solve this systematically)
> 
> 
> Best,
> Tal
> ------------------------------
> 
> 
> # the function
> boxplot.add.outlier.text <- function(DATA, x_name, y_name, label_name)
> {
>   # return the rows of one group whose y value lies beyond the whiskers
>   boxplot.outlier.data <- function(xx, y_name)
>   {
>     y <- xx[, y_name]
>     boxplot_range <- range(boxplot.stats(y)$stats)
>     ss <- (y < boxplot_range[1]) | (y > boxplot_range[2])
>     return(xx[ss, ])
>   }
> 
>   require(plyr)
>   # ddply accepts the grouping variable as a character string,
>   # so no eval(parse()) is needed
>   outlier_df <- ddply(DATA, x_name, boxplot.outlier.data, y_name = y_name)
>   # build the formula y_name ~ x_name from the two strings
>   formu <- reformulate(x_name, response = y_name)
>   boxdata <- boxplot(formu, data = DATA, plot = FALSE)
>   boxdata_group_name <- boxdata$names[boxdata$group]
>   boxdata_outlier_df <- data.frame(group = boxdata_group_name,
>                                    y = boxdata$out, x = boxdata$group)
>   # label each outlier just to the right of its point
>   for(i in seq_len(nrow(boxdata_outlier_df)))
>   {
>     ss <- (outlier_df[, x_name] %in% boxdata_outlier_df[i, ]$group) &
>           (outlier_df[, y_name] %in% boxdata_outlier_df[i, ]$y)
>     current_label <- outlier_df[ss, label_name]
>     temp_x <- boxdata_outlier_df[i, "x"]
>     temp_y <- boxdata_outlier_df[i, "y"]
>     text(temp_x, temp_y, current_label, pos = 4)
>   }
> 
>   list(boxdata_outlier_df = boxdata_outlier_df, outlier_df = outlier_df)
> }
> 
> # example:
> boxplot(decrease ~ treatment, data = OrchardSprays, log = "y", col =
> "bisque")
> boxplot.add.outlier.text(OrchardSprays, "treatment", "decrease",
> "colpos")
> 
> 
> 
> 
> ----------------Contact Details:----------------
> Contact me: tal.gal...@gmail.com |  972-52-7275845
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
> www.r-statistics.com (English)
> ------------------------------------------------
> 
> 
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
