[R] Data transformation & cleaning

pip56789 Tue, 27 Sep 2011 21:00:36 -0700

Hi,

I have a few methodological and implementation questions for ya'll. Thank
you in advance for your help. I have a dataset that reflects people's
preference choices. I want to see if there's any kind of clustering effect
among certain preference choices (e.g. do people who pick choice A also pick
choice D).


I have a data set that has one record per user ID, per preference choice.
It's a "long" form of a data set that looks like this: 

ID | Page
123 | Choice A
123 | Choice B
456 | Choice A
456 | Choice B
...

I thought that I should do the following

1. Make the data set "wide", counting the observations so the data looks
like this:
ID | Count of Preference A | Count of Preference B
123 | 1 | 1
...

Using 
table1 <- dcast(data,ID ~ Page,fun.aggregate=length,value_var='Page' )

2. Create a correlation matrix of preferences
cor(table2[,-1])

How would I restrict my correlation to show preferences that met a minimum
sample threshold? Can you confirm if the two following commands do the same
thing? What would I do from here (or am I taking the wrong approach)
table1 <- dcast(data,Page ~ Page,fun.aggregate=length,value_var='Page' )
table2 <- with(data, table(Page,Page))


many thanks,
Peter

--
View this message in context: 
http://r.789695.n4.nabble.com/Data-transformation-cleaning-tp3849889p3849889.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Data transformation & cleaning

Reply via email to