Matthew Wilson writes:
> (...)
> Step 3:
>
> I print a list of number of sales per customer ID, ranking the customer
> IDs from most to least. I use a SAS proc freq step for this:
>
> proc freq data=d2 order=freq;
> tables custid;
> run;
>
> and the output would look like this:
>
> custid freq
> 111 2
> 112 1
>
>
> Again, I have no idea how to do step 3 in R.

If you are going to work with this data stored in a SQL database, it will
probably be more efficient to do this sort of manipulation directly in SQL.
Anyway, the following lines of R code will do what you described:

    # Example data: one entry per sale, holding the customer ID
    cust <- c(111, 111, 112)

    # For each distinct ID, count how many times it occurs in cust
    cc <- data.frame(t(sapply(unique(cust), function(level, vec) {
        c(custid = level, freq = sum(vec == level))
    }, cust)))

    # Rank the IDs from most to least frequent
    cc[order(cc$freq, decreasing = TRUE), ]

Cheers,

--
Fernando Henrique Ferraz P. da Rosa
http://www.ime.usp.br/~feferraz
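
For comparison, the same frequency listing can probably be written more compactly with base R's table(), which counts the occurrences of each distinct value. This is only a sketch on the same toy vector: the name cc2 is made up for illustration, and the as.numeric() conversion assumes the customer IDs are numeric.

```r
# Same toy data as in the reply: one entry per sale
cust <- c(111, 111, 112)

# table() tabulates how often each distinct ID occurs
counts <- table(cust)

# Build a data frame with one row per customer ID
# (cc2 is a hypothetical name; IDs are assumed to be numeric)
cc2 <- data.frame(custid = as.numeric(names(counts)),
                  freq = as.vector(counts))

# Rank the IDs from most to least frequent
cc2 <- cc2[order(cc2$freq, decreasing = TRUE), ]
cc2
```

The explicit data.frame() call keeps the column names custid and freq, matching the layout of the SAS proc freq output quoted above.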