Re: [R] help with statistics in R - how to measure the effect of users in groups
Hi Petr,

It's not an equation. It's my mistake; the * characters were meant to be field separators for the example data. I should have just used blank spaces, as follows:

users  Group1  Group2  Group3
u1     10      5       N/A
u2     6       N/A     4
u3     5       2       3

Regards
Gawesh

On Mon, Oct 10, 2011 at 9:32 AM, Petr PIKAL petr.pi...@precheza.cz wrote:

> Hi
>
> I do not understand much about your equations. I think you should look at "Practical Regression and Anova using R" by J. Faraway.
>
> Having a data frame DF with columns users, groups, results, you could do
>
>     fit <- lm(results ~ groups, data = DF)
>
> Regards
> Petr
>
> > Hi,
> >
> > I'm a newbie to R. My knowledge of statistics is mostly self-taught. My problem is how to measure the effect of users in groups. I can calculate a particular attribute for a user in a group. But my hypothesis is that the users' attributes are not independent of each other and that a user's attribute depends on the group, i.e. that a user's behaviour changes based on the group.
> >
> > Let me give an example:
> >
> > users*Group 1*Group 2*Group 3
> > u1*10*5*n/a
> > u2*6*n/a*4
> > u3*5*2*3
> >
> > For example, I want to be able to prove that u1's behaviour is different in Group 1 than in the other groups, and that the particular thing about Group 1 is that users in Group 1 tend to have a higher value of the attribute under measurement. Hence, can I use R to test my hypothesis? I'm willing to learn, so if this is very simple, just point me in the direction of any online resources about it. At the moment, I don't even know how to define this class of problems. That would be a start.
> >
> > Regards
> > Gawesh
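For readers who want to follow along, the example table from this message can be typed straight into R. This is only a minimal sketch and not part of the original posts; the object name `wide` is an assumption chosen for illustration.

```r
# Hypothetical object name `wide`; the values are the example data from the post.
wide <- read.table(
  text = "users Group1 Group2 Group3
u1 10 5 N/A
u2 6 N/A 4
u3 5 2 3",
  header = TRUE,
  na.strings = "N/A",       # treat the literal string N/A as missing
  stringsAsFactors = TRUE   # make `users` a factor
)

str(wide)
colSums(is.na(wide))        # number of missing cells per column
```

Reading from inline text rather than the clipboard keeps the example reproducible on any platform.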
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] help with statistics in R - how to measure the effect of users in groups
OK. You should transform your data to long format to use lm:

    library(reshape2)  # melt() comes from the reshape/reshape2 package
    test <- read.table("clipboard", header = TRUE, na.strings = "N/A")
    test.m <- melt(test)
    # Using users as id variables
    fit <- lm(value ~ variable, data = test.m)
    summary(fit)

    Call:
    lm(formula = value ~ variable, data = test.m)

    Residuals:
       1    2    3    4    6    8    9
     3.0 -1.0 -2.0  1.5 -1.5  0.5 -0.5

    Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
    (Intercept)       7.000      1.258   5.563  0.00511 **
    variableGroup2   -3.500      1.990  -1.759  0.15336
    variableGroup3   -3.500      1.990  -1.759  0.15336
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 2.179 on 4 degrees of freedom
      (2 observations deleted due to missingness)
    Multiple R-squared: 0.525,  Adjusted R-squared: 0.2875
    F-statistic: 2.211 on 2 and 4 DF,  p-value: 0.2256

No significant difference among groups, but I am not sure if this is the correct way to evaluate.

    library(ggplot2)
    p <- ggplot(test.m, aes(x = variable, y = value, colour = users))
    p + geom_point()

There is some sign that u3 has the lowest value in each group. However, there is not enough data to include users in the fit.

Regards
Petr
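Petr's remark that there is not enough data to include users in the fit can be checked by counting residual degrees of freedom. The following is a sketch in base R only (it uses `stack()` instead of `melt()`); the names `wide`, `long`, `fit_group` and `fit_both` are assumptions made for illustration, not names from the thread.

```r
# Example data from the thread, entered directly (base R only, no reshape2).
wide <- data.frame(
  users  = c("u1", "u2", "u3"),
  Group1 = c(10, 6, 5),
  Group2 = c(5, NA, 2),
  Group3 = c(NA, 4, 3)
)

# stack() turns the three group columns into one `values` column
# plus an `ind` factor naming the source column.
stacked <- stack(wide[, -1])
long <- data.frame(
  users = factor(rep(wide$users, times = 3)),
  group = stacked$ind,
  value = stacked$values
)

fit_group <- lm(value ~ group, data = long)          # same model as Petr's fit
fit_both  <- lm(value ~ group + users, data = long)  # adds a user main effect

coef(fit_group)
df.residual(fit_group)  # residual degrees of freedom, group-only model
df.residual(fit_both)   # what is left after also estimating user effects
```

With seven observed values and five coefficients in the additive model, only two residual degrees of freedom remain, so any test of user effects on this toy data would be very weak.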
Re: [R] help with statistics in R - how to measure the effect of users in groups
Thanks Petr. I will try it on the real data. But that will only show whether or not the groups are different. Is there any way I can test whether the users are different when they are in different groups?

Regards
Gawesh
Re: [R] help with statistics in R - how to measure the effect of users in groups
Hello,

In the package qualityTools you can find one way to perform this analysis, through the gageRR() function. To me, the effect of an operator on the measurement system (reproducibility) is equivalent to the effect you are trying to study: that of your users when they are in different groups.

Regards,
Carlos Ortega
www.qualityexcellence.es
Re: [R] help with statistics in R - how to measure the effect of users in groups
Assuming your data are in a data frame, yourdat, as:

User  Group  Value
u1    1      10
u2    2      5
u3    3      NA
...(etc)

where Group is **explicitly coerced to be a factor**, then you want the User x Group interaction, obtained from

    lm(Value ~ Group * User, data = yourdat)

However:

a) You'll get some kind of warning message, or coefficients reported as NA, if not all Group x User combinations are present in the data.

b) Moreover, no statistics can be calculated if there are no replicates of User x Group combinations.

If you do not know why either of these is the case, get local help or study any linear models (regression) text or online tutorial, as these last issues have nothing to do with R.

-- Bert
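Bert's two caveats can be seen directly on the toy data from earlier in the thread. This is a sketch under stated assumptions: the data frame `long` and the column names `user`, `group` and `value` are chosen here for illustration and do not come from the posts.

```r
# Toy data from the thread in long format (hypothetical column names).
long <- data.frame(
  user  = factor(rep(c("u1", "u2", "u3"), times = 3)),
  group = factor(rep(c("Group1", "Group2", "Group3"), each = 3)),
  value = c(10, 6, 5,   5, NA, 2,   NA, 4, 3)
)

# Which user x group cells were actually observed, and how often?
cells <- xtabs(~ user + group, data = long[!is.na(long$value), ])
cells

# Full interaction model: main effects plus the user x group interaction.
fit_int <- lm(value ~ group * user, data = long)

sum(is.na(coef(fit_int)))  # coefficients that cannot be estimated (empty cells)
df.residual(fit_int)       # zero residual df: no replicates, so no error estimate
```

Calling `summary(fit_int)` on such a fit reports the inestimable coefficients as NA and cannot produce standard errors, which is the practical form of caveats (a) and (b).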
Re: [R] help with statistics in R - how to measure the effect of users in groups
I should have added...

If your design is not nearly balanced, main effects and interactions will not have any natural interpretation, because they will be (partially) confounded. (I realize "nearly" is not a very useful characterization, but I do not know a better one, as it probably depends on the scientific context of your data.)

Again, if you do not know what this means, get statistical help as I previously suggested. Or you might want to try the stats.stackexchange.com website.

-- Bert
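One way to see the confounding Bert describes is that, in an unbalanced design, the sequential sum of squares attributed to a factor depends on the order in which terms enter the model. The sketch below uses the toy data from the thread; the names `long`, `a1`, `a2`, `ss_group_first` and `ss_group_last` are assumptions made for illustration.

```r
# Toy data from the thread, missing cells dropped (hypothetical column names).
long <- na.omit(data.frame(
  user  = factor(rep(c("u1", "u2", "u3"), times = 3)),
  group = factor(rep(c("Group1", "Group2", "Group3"), each = 3)),
  value = c(10, 6, 5,   5, NA, 2,   NA, 4, 3)
))

# anova() on an lm fit gives sequential sums of squares:
# each term is adjusted only for the terms listed before it.
a1 <- anova(lm(value ~ group + user, data = long))
a2 <- anova(lm(value ~ user + group, data = long))

ss_group_first <- a1["group", "Sum Sq"]  # group entered first
ss_group_last  <- a2["group", "Sum Sq"]  # group entered after user

c(first = ss_group_first, last = ss_group_last)
```

In a balanced design the two numbers would agree; here they differ, so "the group effect" has no single natural value.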
Re: [R] help with statistics in R - how to measure the effect of users in groups
Groups are different treatments given to Users for your Outcome (measurement) of interest. Take this idea forward and you will have an answer.

Anupam.
Re: [R] help with statistics in R - how to measure the effect of users in groups
Hi Bert,

The real situation is like what you suggested: user x group interactions. The users can be in more than one group. In fact, the data that I am trying to analyse consist of users, with online forums as groups, and the attribute being measured is the number of posts made by each user in a particular forum. My hypothesis is that the number of posts a user makes to a forum depends on the forum. For example, if the user is in a forum that is active, he contributes more than when he is in a forum that is less active. I guess there will be some users who contribute the same irrespective of the forum.

I hope this makes sense.

Regards
Gawesh

On Mon, Oct 10, 2011 at 4:50 PM, Bert Gunter gunter.ber...@gene.com wrote:

> Yes, of course. But then one gets into additional problems with carryover effects, etc. Also, one then has a repeated measures problem (User is the experimental unit) and my previous advice is nonsense. Like you, I have no idea what his real situation is.
>
> -- Bert
>
> On Mon, Oct 10, 2011 at 8:39 AM, Anupam anupa...@gmail.com wrote:
>
> > It is possible to give multiple treatments, one at a time, to the same pool of patients. You are correct that interactions may be important in this problem. I am only trying to help him frame the problem using an analogy.
> >
> > Anupam.
> >
> > From: Bert Gunter [mailto:gunter.ber...@gene.com]
> > Sent: Monday, October 10, 2011 8:21 PM
> > To: Anupam
> > Cc: gj
> > Subject: Re: [R] help with statistics in R - how to measure the effect of users in groups
> >
> > > If that is the case, and each user can appear in only one group, there is no group x user interaction, the poster's question was nonsense, and one analyzes the group effect only, as originally shown.
> > >
> > > -- Bert
Re: [R] help with statistics in R - how to measure the effect of users in groups
OK. So my original advice and warnings are correct. However, now there is an additional wrinkle because your response is a count, which is not a continuous measurement. For this, you'll need glm(..., family = poisson) instead of lm(...), where the ... is the stuff I gave you before. A backup approach is there aren't too many small counts (below about 10, say) is to take the square root of the counts and analyze that via lm(). In either approach, your interpretation becomes more difficult -- e.g. have you any experience with glm's = generalized linear models? Moreover, if there are large numbers of users -- e.g. dozens (and you may have hundreds or thousands -- of course the interaction will be significant, but so what? For this you'll need to re-frame the question. So given all this and what appears to be your relative ignorance of statistics, I strongly recommend that you get local statistical help. Or just forget about formal statistical analysis altogether and do some sensible plotting. Finally, that's it for me on this. I will offer you no more advice. -- Bert On Mon, Oct 10, 2011 at 9:40 AM, gj gaw...@gmail.com wrote: Hi Bert, The real situation is like what you suggested, user x group interactions. The users can be in more than one group. In fact, the data that I am trying to analyse constitute of users, online forums as groups and the attribute under measure is the number of posts made by each user in a particular forum. My hypothesis is that the number of posts a user makes to a forum is dependent on the forum. For example if the user is in a forum that is active he contributes more compared to when he is in a forum that is less active. I guess there will be some users who contribute the same irrespective of the forum. I hope this makes sense. Regards Gawesh On Mon, Oct 10, 2011 at 4:50 PM, Bert Gunter gunter.ber...@gene.comwrote: Yes, of course. But then one gets into additional problems with carryover effects,etc. 
Also, one then has a repeated measures problem (User is the experimental unit) and my previous advice is nonsense. Like you, I have no idea what his real situation is.

-- Bert

On Mon, Oct 10, 2011 at 8:39 AM, Anupam anupa...@gmail.com wrote:

It is possible to give multiple treatments, one at a time, to the same pool of patients. You are correct that interactions may be important in this problem. I am only trying to help him frame the problem using an analogy.

Anupam.

From: Bert Gunter [mailto:gunter.ber...@gene.com]
Sent: Monday, October 10, 2011 8:21 PM
To: Anupam
Cc: gj
Subject: Re: [R] help with statistics in R - how to measure the effect of users in groups

If that is the case, and each user can appear in only one group, there is no group x user interaction, the poster's question was nonsense, and one analyzes the group effect only, as originally shown.

-- Bert

On Mon, Oct 10, 2011 at 7:43 AM, Anupam anupa...@gmail.com wrote:

Groups are different treatments given to Users for your Outcome (measurement) of interest. Take this idea forward and you will have an answer.

Anupam.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter
Sent: Monday, October 10, 2011 7:36 PM
To: gj
Cc: r-help@r-project.org
Subject: Re: [R] help with statistics in R - how to measure the effect of users in groups

Assuming your data are in a data frame, yourdat, as:

User  Group  Value
u1    1      10
u2    2      5
u3    3      NA
...(etc)

where Group is **explicitly coerced to be a factor,** then you want the User x Group interaction, obtained from

    lm(Value ~ Group*User, data = yourdat)

However, you'll get some kind of warning message if

a) Not all Group x User combinations are present in the data

b) Moreover, no statistics can be calculated if there are no replicates of User x Group combinations.
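Points a) and b) can be seen directly on the three-user example data from the start of this thread. The sketch below is only an illustration and not part of Bert's message: the data frame name yourdat and its columns follow his layout, the values are the thread's example with the N/A cells left out, and everything runs in base R.

```r
# Example data from the thread in long format. The N/A cells are simply
# absent, so not all Group x User combinations are present (point a).
yourdat <- data.frame(
  User  = factor(c("u1", "u2", "u3", "u1", "u3", "u2", "u3")),
  Group = factor(c(1, 1, 1, 2, 2, 3, 3)),  # explicitly a factor
  Value = c(10, 6, 5, 5, 2, 4, 3)
)

fit <- lm(Value ~ Group * User, data = yourdat)

# Some interaction coefficients cannot be estimated and come back as NA.
coef(fit)

# One observation per cell means no replicates (point b): the model
# reproduces the data exactly and leaves no residual degrees of freedom,
# so no standard errors or tests are available.
df.residual(fit)
```

With real data the same check (df.residual() and NA coefficients) is a quick way to see whether there are enough replicates per User x Group cell before reading anything into summary(fit).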
If you do not know why either of these is the case, get local help or study any linear models (regression) text or online tutorial, as these last issues have nothing to do with R.

-- Bert

On Mon, Oct 10, 2011 at 3:48 AM, gj gaw...@gmail.com wrote:

Thanks Petr. I will try it on the real data. But that will only show whether or not the groups are different. Is there any way I can test if the users are different when they are in different groups?

Regards
Gawesh

On Mon, Oct 10, 2011 at 11:17 AM, Petr PIKAL petr.pi...@precheza.cz wrote:

Hi Petr,

It's not an equation. It's my mistake; the * are meant to be field separators for the example data. I should have just used blank spaces as follows:

users  Group1  Group2  Group3
u1     10      5       N/A
u2     6       N/A     4
u3     5       2       3

Regards
Gawesh

OK. You should transform your data to long format to
Re: [R] help with statistics in R - how to measure the effect of users in groups
Hi

I would try either some tree method (mvpart), or you can expand the lm model to include users as well:

    fit <- lm(value ~ variable + users, data = test.m)

Anyway, I am not an ultimate expert in statistics, so you should also consult some appropriate literature, which can be found on the CRAN website. Did you try to look into the book I recommended?

Petr

Thanks Petr. I will try it on the real data. But that will only show whether or not the groups are different. Is there any way I can test if the users are different when they are in different groups?

Regards
Gawesh

On Mon, Oct 10, 2011 at 11:17 AM, Petr PIKAL petr.pi...@precheza.cz wrote:

Hi Petr,

It's not an equation. It's my mistake; the * are meant to be field separators for the example data. I should have just used blank spaces as follows:

users  Group1  Group2  Group3
u1     10      5       N/A
u2     6       N/A     4
u3     5       2       3

Regards
Gawesh

OK. You should transform your data to long format to use lm:

    # melt() needs the reshape (or reshape2) package loaded; the post does not say which
    test <- read.table("clipboard", header = TRUE, na.strings = "N/A")
    test.m <- melt(test)
    Using users as id variables
    fit <- lm(value ~ variable, data = test.m)
    summary(fit)

    Call:
    lm(formula = value ~ variable, data = test.m)

    Residuals:
       1    2    3    4    6    8    9
     3.0 -1.0 -2.0  1.5 -1.5  0.5 -0.5

    Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
    (Intercept)       7.000      1.258   5.563  0.00511 **
    variableGroup2   -3.500      1.990  -1.759  0.15336
    variableGroup3   -3.500      1.990  -1.759  0.15336
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 2.179 on 4 degrees of freedom
      (2 observations deleted due to missingness)
    Multiple R-squared: 0.525,  Adjusted R-squared: 0.2875
    F-statistic: 2.211 on 2 and 4 DF,  p-value: 0.2256

No difference among groups, but I am not sure if this is the correct way to evaluate.

    library(ggplot2)
    p <- ggplot(test.m, aes(x = variable, y = value, colour = users))
    p + geom_point()

There is some sign that user u3 has the lowest value in each group. However, there is not enough data to include users in the fit.
Regards
Petr

On Mon, Oct 10, 2011 at 9:32 AM, Petr PIKAL petr.pi...@precheza.cz wrote:

Hi

I do not understand much about your equations. I think you should look at Practical Regression and Anova Using R by J. Faraway.

Given a data frame DF with columns users, groups and results, you could do

    fit <- lm(results ~ groups, data = DF)

Regards
Petr

Hi,

I'm a newbie to R. My knowledge of statistics is mostly self-taught. My problem is how to measure the effect of users in groups. I can calculate a particular attribute for a user in a group. But my hypothesis is that users' attributes are not independent of each other and that a user's attribute depends on the group, i.e. that a user's behaviour changes based on the group.

Let me give an example:

users*Group 1*Group 2*Group 3
u1*10*5*n/a
u2*6*n/a*4
u3*5*2*3

For example, I want to be able to prove that u1's behaviour is different in Group 1 than in the other groups, and that the particular thing about Group 1 is that users in Group 1 tend to have a higher value of the attribute being measured. Hence, can I use R to test my hypothesis? I'm willing to learn, so if this is very simple, just point me in the direction of any online resources about it. At the moment, I don't even know how to define this class of problems. That would be a start.

Regards
Gawesh

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
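Petr's reshaping step depends on melt() and on reading from the clipboard, so it cannot be rerun exactly as posted. As a rough base-R sketch of the same idea, the example data can be typed in directly, put into long format and passed to lm(). The object names wide, long and fit_users are made up here, and stack() is swapped in for melt(); the column names variable and value are kept to match his output.

```r
# The thread's example data in wide format, with N/A entered as NA.
wide <- data.frame(
  users  = c("u1", "u2", "u3"),
  Group1 = c(10, 6, 5),
  Group2 = c(5, NA, 2),
  Group3 = c(NA, 4, 3)
)

# stack() puts the three group columns into one 'values' column plus an
# 'ind' factor naming the column each value came from.
stacked <- stack(wide[c("Group1", "Group2", "Group3")])
long <- data.frame(
  users    = factor(rep(wide$users, times = 3)),
  variable = stacked$ind,
  value    = stacked$values
)

# Group-only model; lm() drops the two NA rows by default.
fit <- lm(value ~ variable, data = long)
coef(fit)

# Adding users as a main effect leaves very few residual degrees of freedom.
fit_users <- lm(value ~ variable + users, data = long)
df.residual(fit_users)
```

The group-only coefficients agree with the summary Petr posted, and the small residual degrees of freedom in the second model are what he means by there not being enough data to include users.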
Re: [R] help with statistics in R - how to measure the effect of users in groups
Hi

OK. So my original advice and warnings are correct. However, there is now an additional wrinkle because your response is a count, which is not a continuous measurement. For this, you'll need glm(..., family = poisson) instead of lm(...), where the ... is the stuff I gave you before. A backup approach, if there aren't too many small counts (below about 10, say), is to take the square root of the counts and analyze that via lm(). With either approach, your interpretation becomes more difficult -- e.g. do you have any experience with GLMs (generalized linear models)?

Moreover, if there are large numbers of users -- e.g. dozens (and you may have hundreds or thousands) -- of course the interaction will be significant, but so what? For this you'll need to re-frame the question.

So given all this and what appears to be your relative ignorance of statistics, I strongly recommend that you get local statistical help. Or just forget about formal statistical analysis altogether and do some sensible plotting.

That was actually my advice too:

    library(ggplot2)
    p <- ggplot(test.m, aes(x = variable, y = value, colour = users))
    p + geom_point()

Regards
Petr

Finally, that's it for me on this. I will offer you no more advice.

-- Bert

On Mon, Oct 10, 2011 at 9:40 AM, gj gaw...@gmail.com wrote:

Hi Bert,

The real situation is like what you suggested: user x group interactions. Users can be in more than one group. In fact, the data that I am trying to analyse consist of users, with online forums as the groups, and the attribute being measured is the number of posts made by each user in a particular forum. My hypothesis is that the number of posts a user makes to a forum depends on the forum. For example, a user contributes more in an active forum than in a less active one. I guess there will be some users who contribute the same irrespective of the forum. I hope this makes sense.
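The Poisson GLM and the square-root backup that Bert describes in this thread could be tried along the following lines. This is a sketch on simulated data only: the data frame posts, its columns (week, user, forum, n_posts), the number of users and forums, and the Poisson means are all made up for illustration, and nothing here reflects the poster's real forums. The weekly counts are an assumption added so that each user x forum cell has replicates.

```r
set.seed(1)  # simulated data only

# Hypothetical layout: 4 users x 3 forums, each observed over 6 weeks,
# so every user x forum cell has 6 replicate counts.
posts <- expand.grid(
  week  = 1:6,
  user  = factor(paste0("u", 1:4)),
  forum = factor(paste0("f", 1:3))
)

# Made-up truth: forum f1 is busier; there is no user x forum interaction.
mu <- ifelse(posts$forum == "f1", 12, 5)
posts$n_posts <- rpois(nrow(posts), lambda = mu)

# Additive model versus the model with the user x forum interaction.
fit_add <- glm(n_posts ~ forum + user, family = poisson, data = posts)
fit_int <- glm(n_posts ~ forum * user, family = poisson, data = posts)

# Deviance (likelihood-ratio) test of the interaction term.
anova(fit_add, fit_int, test = "Chisq")

# Backup approach from the thread: square-root counts analysed with lm().
fit_sqrt <- lm(sqrt(n_posts) ~ forum * user, data = posts)
anova(fit_sqrt)

# "Sensible plotting": mean posts per forum, one trace per user.
with(posts, interaction.plot(forum, user, n_posts, type = "b"))
```

With many users the interaction test will tend to come out significant whatever the practical size of the effect, which is Bert's "so what?" point; the plot is often the more informative output.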
[R] help with statistics in R - how to measure the effect of users in groups
Hi,

I'm a newbie to R. My knowledge of statistics is mostly self-taught. My problem is how to measure the effect of users in groups. I can calculate a particular attribute for a user in a group. But my hypothesis is that users' attributes are not independent of each other and that a user's attribute depends on the group, i.e. that a user's behaviour changes based on the group.

Let me give an example:

users*Group 1*Group 2*Group 3
u1*10*5*n/a
u2*6*n/a*4
u3*5*2*3

For example, I want to be able to prove that u1's behaviour is different in Group 1 than in the other groups, and that the particular thing about Group 1 is that users in Group 1 tend to have a higher value of the attribute being measured. Hence, can I use R to test my hypothesis? I'm willing to learn, so if this is very simple, just point me in the direction of any online resources about it. At the moment, I don't even know how to define this class of problems. That would be a start.

Regards
Gawesh