Hi,

I am relatively new to R, and when writing functions I run into problems with
missing values. I would like my functions to ignore rows that have missing
values in the variables passed as arguments to the function (as is the case,
for example, in Stata). Note that I don't want my function to drop rows that
have missing values elsewhere, i.e. in variables that are not arguments of my
function.

As an example: here is a clustering function I wrote:


cl <- function(dat, na.rm = TRUE, fm, cluster){
  attach(dat, warn.conflicts = F)
  library(sandwich)
  library(lmtest)
  M <- length(unique(cluster))
  N <- length(cluster)
  K <- fm$rank
  dfc <- (M/(M-1))*((N-1)/(N-K))
  uj <- data.frame(apply(estfun(fm), 2,
                         function(x) data.frame(tapply(x, cluster, sum))))
  vcovCL <- dfc*sandwich(fm, meat=crossprod(uj)/N)
  coeftest(fm, vcovCL)
}


When I run my function, I get the message:


Error in tapply(x, cluster, sum) : arguments must have same length
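
The length mismatch can be reproduced with base R alone. The sketch below uses made-up data and hypothetical column names (score, age, size, class), and residuals() stands in for the columns of estfun(fm). It assumes the default na.action (na.omit), under which lm() silently drops rows with a missing model variable, so the fitted object has fewer rows than the unfiltered cluster vector:

```r
# Toy data, invented for illustration: one NA in a model variable (age)
# and one NA in a variable the model does not use (size).
dat <- data.frame(
  score = c(10, 12, 9, 14, 11, 13),
  age   = c(7, 8, NA, 9, 8, 7),
  size  = c(120, NA, 118, 130, 125, 121),
  class = c("a", "a", "b", "b", "c", "c")
)

fm <- lm(score ~ age, data = dat)

n_used    <- length(residuals(fm))  # rows lm() actually used
n_cluster <- length(dat$class)      # rows in the unfiltered cluster vector

# tapply() needs both arguments to have the same length, so this call fails;
# try() captures the error instead of stopping the script.
res <- try(tapply(residuals(fm), dat$class, sum), silent = TRUE)
```

Here n_used and n_cluster differ by one (the row with missing age), which is the condition that triggers the error.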


If I instead use attach(na.omit(dat), warn.conflicts = F) and drop the
"na.rm = TRUE" argument, my function runs, but only on the rows that have no
missing values AT ALL. I don't care about missing values in variables that my
function does not use.


For example, I have information on children's size. If I want to regress
scores on age and parents' education, clustering on class, I would like missing
values in size not to interfere (i.e. if an observation has scores, age,
parents' education, and class, but not size, I don't want to drop it).
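
To make the desired row selection concrete, here is a small sketch with made-up data. The column names (score, age, educ, class, size) are hypothetical stand-ins for the variables described above. It contrasts dropping rows with any missing value against dropping only rows with a missing value in the variables the analysis uses:

```r
# Toy data, invented for illustration.
dat <- data.frame(
  score = c(10, 12, 9, 14, 11, 13),
  age   = c(7, 8, NA, 9, 8, 7),
  educ  = c(12, 16, 12, 10, 14, 16),
  class = c("a", "a", "b", "b", "c", "c"),
  size  = c(120, NA, 118, 130, 125, 121)
)

# Variables the analysis actually uses (size is deliberately left out).
vars <- c("score", "age", "educ", "class")

# TRUE for rows with no NA in the selected columns only.
keep     <- complete.cases(dat[, vars])
dat_used <- dat[keep, ]

# For comparison: na.omit() drops a row if ANY column is missing.
dat_strict <- na.omit(dat)
```

In this toy example dat_used keeps the observation with missing size, while dat_strict drops it.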


I looked at the code of "lm" to see how its na.action handling works, but I
couldn't figure it out. The way lm deals with missing values is exactly what I
would like my function to do.
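
For reference, a fitted lm object records which rows it dropped: na.action(fm) returns their indices, and model.frame(fm) keeps the original row names of the rows that were used. The sketch below (made-up data, hypothetical column names, default na.action assumed) uses those row names to cut the cluster variable down to the same rows as the model. It is one possible approach, not necessarily the recommended one:

```r
# Toy data, invented for illustration.
dat <- data.frame(
  score = c(10, 12, 9, 14, 11, 13),
  age   = c(7, 8, NA, 9, 8, 7),
  educ  = c(12, 16, 12, 10, 14, 16),
  class = c("a", "a", "b", "b", "c", "c"),
  size  = c(120, NA, 118, 130, 125, 121)
)

fm <- lm(score ~ age + educ, data = dat)

# Indices of the rows lm() removed (NULL when nothing was removed).
dropped <- na.action(fm)

# Row names of the rows that entered the fit, taken from the model frame.
used <- rownames(model.frame(fm))

# Cluster variable restricted to exactly those rows.
cluster_used <- dat[used, "class"]

# Lengths now agree, so per-cluster sums can be computed.
sums <- tapply(residuals(fm), cluster_used, sum)
```

The row with missing size stays in the fit because size is not part of the formula; only the row with missing age is dropped.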


I tried to write:

cl <- function(dat, fm, cluster, na.action){
  attach(dat, warn.conflicts = F)
  library(sandwich)
  library(lmtest)
  M <- length(unique(cluster))
  N <- length(cluster)
  K <- fm$rank
  dfc <- (M/(M-1))*((N-1)/(N-K))
  uj <- data.frame(apply(estfun(fm), 2,
                         function(x) data.frame(tapply(x, cluster, sum))))
  vcovCL <- dfc*sandwich(fm, meat=crossprod(uj)/N)
  coeftest(fm, vcovCL)
}

attr(cl, "na.action") <- na.exclude


but it still didn't work...


Any ideas of how to deal with this issue?

Thank you for your answers!

Edmund


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
