Dear list,

I'm trying to write a set of generic functions to make contingency tables from data.frames. They run fine so far (I'm still learning R), but I think they can be improved.

I would like to filter the data.frame, i.e., eliminate all non-numeric variables,
but I don't know how to do it. Please help me.

Below is one of my functions ('er' refers to EasieR, because I'm trying to build a package for myself and my students):

#2. Tables from data.frames
#2.1---er.table.df.br (User define breaks and right)------------
er.table.df.br <- function(df,
                           breaks = c('Sturges', 'Scott', 'FD'),
                           right = FALSE) {

  # is.data.frame() already returns a logical, so test it directly
  if (!is.data.frame(df))
    stop('need "data.frame" data')

  dim_df <- dim(df)

  tmpList <- list()

  for (i in 1:dim_df[2]) {

    x <- as.matrix(df[ ,i])
    x <- na.omit(x)

    k <- switch(breaks[1],
                'Sturges' = nclass.Sturges(x),
                'Scott'   = nclass.scott(x),
                'FD'      = nclass.FD(x),
                stop("'breaks' must be 'Sturges', 'Scott' or 'FD'"))

    tmp      <- range(x)
    classIni <- tmp[1] - tmp[2]/100
    classEnd <- tmp[2] + tmp[2]/100
    R        <- classEnd-classIni
    h        <- R/k

    # Absolute frequency
    f <- table(cut(x, breaks = seq(classIni, classEnd, h), right = right))

    # Relative frequency
    fr <- f/length(x)

    # Relative frequency, %
    frP <- 100*(f/length(x))

    # Cumulative frequency
    fac <- cumsum(f)

    # Cumulative frequency, %
    facP <- 100*(cumsum(f/length(x)))

    fi   <- round(f, 2)
    fr   <- round(as.numeric(fr), 2)
    frP  <- round(as.numeric(frP), 2)
    fac  <- round(as.numeric(fac), 2)
    facP <- round(as.numeric(facP),2)

    # Table
    res <- data.frame(fi, fr, frP, fac, facP)
    names(res) <- c('Class limits', 'fi', 'fr', 'fr(%)', 'fac', 'fac(%)')
    tmpList <- c(tmpList, list(res))
  }
  names(tmpList) <- names(df)
  return(tmpList)
}

To try the function:

#a) Runs fine
y1=rnorm(100, 10, 1)
y2=rnorm(100, 58, 4)
y3=rnorm(100, 500, 10)
mydf=data.frame(y1, y2, y3)
#tbdf=er.table.df.br(mydf, breaks = 'Sturges', right = FALSE)
#tbdf=er.table.df.br(mydf, breaks = 'Scott', right = FALSE)
tbdf=er.table.df.br(mydf, breaks = 'FD', right = FALSE)
print(tbdf)


#b) One of the problems: y4 is not numeric
y1=rnorm(100, 10, 1)
y2=rnorm(100, 58, 4)
y3=rnorm(100, 500, 10)
y4=rep(letters[1:10], 10)
mydf=data.frame(y1, y2, y3, y4)
tbdf=er.table.df.br(mydf, breaks = 'Scott', right = FALSE)
print(tbdf)

Could anyone give me a hint how to work around this?
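
To make the goal concrete, here is a minimal sketch of the kind of filtering I have in mind. The helper name 'er.numeric.only' is hypothetical (it is not part of my package), and I am assuming that testing each column with is.numeric() is an acceptable way to decide what to keep; there may well be a better approach:

```r
# Hypothetical helper (name invented for illustration): keep only the
# numeric columns of a data.frame and drop everything else.
er.numeric.only <- function(df) {
  if (!is.data.frame(df))
    stop('need "data.frame" data')

  # One TRUE/FALSE per column; vapply() always returns a logical vector,
  # even when the data.frame has no columns.
  keep <- vapply(df, is.numeric, logical(1))

  # drop = FALSE keeps the result a data.frame even with a single column
  df[, keep, drop = FALSE]
}

set.seed(1)
mydf <- data.frame(y1 = rnorm(100, 10, 1),
                   y2 = rnorm(100, 58, 4),
                   y4 = rep(letters[1:10], 10))

numdf <- er.numeric.only(mydf)
print(names(numdf))
```

If something like this is reasonable, the result (numdf) could be passed to er.table.df.br in place of the original data.frame, so that the loop only sees numeric columns.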

PS: Excuse my bad English ;-)
--
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html