I haven't quite learned to use 'cast' yet; I have always used the 'apply' family of functions (here, 'by') for this type of processing:

> x <- "id patient_id date code class eala
+ ID1564262 1562 6.4.200612:00 5555 1 NA
+ ID1564262 1562 6.4.200612:00 5555 1 NA
+ ID1564264 1365 14.2.200614:35 5555 1 50
+ ID1564265 1342 7.4.200614:30 2222 2 50
+ ID1564266 1648 7.4.200614:30 2222 2 50
+ ID1564267 1263 10.2.200615:45 2222 2 10
+ ID1564267 1263 10.2.200615:45 3333 3 10
+ ID1564269 5646 13.5.200617:02 3333 3 10
+ ID1564270 7561 13.5.200617:02 6666 1 10
+ ID1564271 1676 15.5.200620:41 2222 2 20"
>
> x.in <- read.table(textConnection(x), header=TRUE)
> # 'by' seems to drop NAs so convert to a character string for processing
> x.in$eala <- ifelse(is.na(x.in$eala), "NA", as.character(x.in$eala))
> # convert date to POSIXlt so we can access the year and month
> myDate <- strptime(x.in$date, "%d.%m.%Y%H:%M")
> x.in$year <- myDate$year + 1900
> x.in$month <- myDate$mon+1
> # split the data by eala, year, month and summarize
> x.by <- by(x.in, list(x.in$eala, x.in$year, x.in$month), function(x){
+     data.frame(eala=x$eala[1], month=x$month[1], year=x$year[1],
+         icount=length(unique(x$id)), pcount=length(unique(x$patient_id)),
+         count1=sum(x$class == 1), count2=sum(x$class == 2), count3=sum(x$class == 3))
+ })
> # convert back to a data frame
> do.call(rbind, x.by)
  eala month year icount pcount count1 count2 count3
1   10     2 2006      1      1      0      1      1
2   50     2 2006      1      1      1      0      0
3   50     4 2006      2      2      0      2      0
4   NA     4 2006      1      1      2      0      0
5   10     5 2006      2      2      1      0      1
6   20     5 2006      1      1      0      1      0
>

On 2/20/07, Lauri Nikkinen <[EMAIL PROTECTED]> wrote:
>
> Hi R-users,
>
> I have a data set like this (first ten rows):
>
> id        patient_id date            code class eala
> ID1564262 1562       6.4.2006 12:00  5555 1     NA
> ID1564262 1562       6.4.2006 12:00  5555 1     NA
> ID1564264 1365       14.2.2006 14:35 5555 1     50
> ID1564265 1342       7.4.2006 14:30  2222 2     50
> ID1564266 1648       7.4.2006 14:30  2222 2     50
> ID1564267 1263       10.2.2006 15:45 2222 2     10
> ID1564267 1263       10.2.2006 15:45 3333 3     10
> ID1564269 5646       13.5.2006 17:02 3333 3     10
> ID1564270 7561       13.5.2006 17:02 6666 1     10
> ID1564271 1676       15.5.2006 20:41 2222 2     20
>
> How can I do a new (pivot?) data.frame in R which I can achieve by MS SQL:
>
> select eala,
> datepart(month, date) as month,
> datepart(year, date) as year,
> count(distinct id) as id_count,
> count(distinct patient_id) as patient_count,
> count(distinct(case when class = 1 then code else null end)) as count_1,
> count(distinct(case when class = 2 then code else null end)) as count_2,
> count(distinct(case when class = 3 then code else null end)) as count_3
> into temp2
> from temp1
> group by datepart(month, date), datepart(year, date), eala
> order by datepart(month, date), datepart(year, date), eala
>
> I tried something like this but could not go further:
>
> stats <- function(x) {
>     count <- function(x) length(na.omit(x))
>     c(
>       n = count(x),
>       uniikit = length(unique(x))
>     )
> }
> library(reshape)
> attach(dframe)
> dfm <- melt(dframe, measure.var=c("id","patient_id"), id.var=c("code",""this should be month"",""this should be year), variable_name="variable")
>
> cast(dfm, code + month + year ~ variable, stats)
>
> Regards,
>
> Lauri
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
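
One difference worth noting: the SQL in the quoted message counts distinct codes per class, while count1 to count3 in the reply count rows (sum(x$class == 1) and so on). If the distinct-code semantics are what is wanted, the following is a minimal sketch of one way to get them, assuming the same sample data and the same fused date format as in the reply. The helper name n_distinct_code is made up for illustration, and split/lapply is used here in place of by only to keep the example short.

```r
# Same sample data as in the reply (date and time fused into one field)
x <- "id patient_id date code class eala
ID1564262 1562 6.4.200612:00 5555 1 NA
ID1564262 1562 6.4.200612:00 5555 1 NA
ID1564264 1365 14.2.200614:35 5555 1 50
ID1564265 1342 7.4.200614:30 2222 2 50
ID1564266 1648 7.4.200614:30 2222 2 50
ID1564267 1263 10.2.200615:45 2222 2 10
ID1564267 1263 10.2.200615:45 3333 3 10
ID1564269 5646 13.5.200617:02 3333 3 10
ID1564270 7561 13.5.200617:02 6666 1 10
ID1564271 1676 15.5.200620:41 2222 2 20"

x.in <- read.table(textConnection(x), header = TRUE, stringsAsFactors = FALSE)

# keep the NA group by turning missing eala values into the string "NA"
x.in$eala <- ifelse(is.na(x.in$eala), "NA", as.character(x.in$eala))

# extract year and month from the fused date-time string
myDate <- strptime(x.in$date, "%d.%m.%Y%H:%M")
x.in$year <- myDate$year + 1900
x.in$month <- myDate$mon + 1

# hypothetical helper: number of distinct codes among rows of one class
n_distinct_code <- function(d, cls) length(unique(d$code[d$class == cls]))

# split by eala, year, month (dropping empty combinations) and summarize
groups <- split(x.in, list(x.in$eala, x.in$year, x.in$month), drop = TRUE)
res <- do.call(rbind, lapply(groups, function(d) {
  data.frame(eala = d$eala[1], month = d$month[1], year = d$year[1],
             icount = length(unique(d$id)),
             pcount = length(unique(d$patient_id)),
             count1 = n_distinct_code(d, 1),
             count2 = n_distinct_code(d, 2),
             count3 = n_distinct_code(d, 3))
}))
rownames(res) <- NULL
res
```

With the sample data, only the two April 2006 groups should differ from the row counts in the reply: count2 for eala = 50 and count1 for eala = NA become 1 instead of 2, because each of those groups contains a single distinct code.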