What I would do:

# read in your sample data (copied to the clipboard)
mbr <- read.table( "clipboard", header = TRUE, stringsAsFactors = FALSE )

# create a vector with the codes you want to consider
code.list <- c("A","B","C","D","E")

# reduce the data accordingly
mbr <- mbr[ mbr$code %in% code.list, ]

# get your model matrix using reshape
# (cast() counts the claims per member and code)
library( reshape )
model.matrix <- as.data.frame( cast( melt( mbr ), value ~ code ) )

# Cosmetics: name the first column and turn the counts into 0/1 indicators
colnames( model.matrix )[1] <- "Member"
model.matrix[ 2 : ncol( model.matrix ) ] <- ifelse( model.matrix[ 2 : ncol( model.matrix ) ] > 0, 1, 0 )

On Thursday 06 March 2014 19:23:03 Mckinstry, Craig wrote:
> I have a medical insurance claims datafile divided into blocks by member,
> with multiple lines per member. I am processing these into a one-line-per-member
> model matrix. Member block sizes vary from 1 to 50+. I am matching attributes in
> the claims data to columns in the model matrix and have been getting by with a
> for loop, but for large file sizes it takes much too long. Is there a
> vectorized/apply-based method to do this more efficiently?
>
> input data:
>
> member code
> 1      A
> 1      C
> 1      F
> 2      B
> 2      E
> 3      D
> 3      A
> 3      B
> 3      D
> 4      G
> 4      A
>
> code.list <- c("A","B","C","D","E")
> for(i in 1:n.mbr){
>   mbr.i <- dat[dat$Rmbr==mbr.list[i],]  #EXTRACT BLOCK OF MEMBER CLAIMS
>   matrix.mat[i,unique(match(mbr.i$code,code.list))] <- 1
> }
>
> output model.matrix
> Member A B C D E
> 1      1 0 1 0 0
> 2      0 1 0 0 1
> 3      1 1 0 1 0
> 4      1 0 0 0 0
>
> Craig McKinstry
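
A possible alternative without the reshape package, as a rough sketch only: base R's table() can do the cross-tabulation directly. It assumes the columns are named member and code as in the sample data; the object names counts, wide and model.mat are made up for this illustration.

```r
# Sample data from the question (column names member/code are assumed)
mbr <- data.frame(
  member = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4),
  code   = c("A", "C", "F", "B", "E", "D", "A", "B", "D", "G", "A"),
  stringsAsFactors = FALSE
)
code.list <- c("A", "B", "C", "D", "E")

# Cross-tabulate member by code. Setting the factor levels to code.list
# restricts the columns; codes outside the list become NA and are not counted.
counts <- table( mbr$member, factor( mbr$code, levels = code.list ) )

# Turn the contingency table into a data frame with one row per member
wide <- as.data.frame.matrix( counts )

# Replace the counts by 0/1 indicators
wide[] <- lapply( wide, function(x) as.integer( x > 0 ) )

# Add the member id (taken from the row names) as the first column
model.mat <- cbind( Member = as.integer( rownames( wide ) ), wide )
print( model.mat )
```

Because the codes are filtered through the factor levels rather than by dropping rows first, a member whose codes are all outside code.list should still show up as a row of zeros.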
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.