Re: [R] Using functions/loops for repetitive commands
Hello, Derek,

first of all, be very aware of what David Winsemius said: you are about to enter the area of "unprincipled data-mining" (as he called it) with its trap -- one of many -- of multiple testing. So, *if* you know what the consequences and possible remedies are, a purely R-syntactic "solution" to your problem might be the (again not fully tested) hack below.

> If so how can I change my code to automate the chisq.test in the same way I did for the wilcox.test?

Try

    lapply( [], function( y) chisq.test( y, $) )

or even shorter:

    lapply( [], chisq.test, $ )

However, in the resulting output you will not see the names of the variables that went into the first argument of chisq.test(). This is a little bit more complicated to resolve:

    lapply( names( []), function( y) eval( substitute( chisq.test( $y0, $tension), list( y0 = y) ) ) )

Still another possibility is to use xtabs() (with its summary method), which has a formula argument.

Hoping that you know what to do with the results -- Gerrit

Dr. Gerrit Eichner
Mathematical Institute, Room 212
Justus-Liebig-University Giessen
Arndtstr. 2, 35392 Giessen, Germany
gerrit.eich...@math.uni-giessen.de
Tel: +49-(0)641-99-32104
Fax: +49-(0)641-99-32109
http://www.uni-giessen.de/cms/eichner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
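The lapply() idea in this message can be sketched on the toy data Derek posted elsewhere in this thread. This is a minimal sketch, not the code Gerrit had in mind: the data frame name dat, the vectors categ and groups, and the use of expand.grid() to enumerate the pairs are assumptions made for illustration. Holm's correction via p.adjust() is shown as just one possible remedy for the multiple-testing trap. With only five rows the chi-squared approximation is unreliable, so the warnings are suppressed purely to keep the output readable.

```r
# Toy data copied from the thread (the name 'dat' is an assumption).
dat <- data.frame(
  sex     = c("M", "F", "F", "F", "M"),
  hiv     = c("Pos", "Neg", "Pos", "Pos", "Neg"),
  smoker  = c("Y", "Y", "Y", "N", "N"),
  alcohol = c("Y", "Y", "N", "N", "N")
)

categ  <- c("smoker", "alcohol")   # categorical outcomes (hypothetical name)
groups <- c("sex", "hiv")          # grouping variables (hypothetical name)

# Every outcome/group combination, one row per test.
pairs <- expand.grid(var = categ, group = groups, stringsAsFactors = FALSE)

# Run chisq.test() on the two columns of each pair.
tests <- lapply(seq_len(nrow(pairs)), function(i) {
  suppressWarnings(chisq.test(dat[[pairs$var[i]]], dat[[pairs$group[i]]]))
})

# Label each result so the output shows which variables were tested.
names(tests) <- paste(pairs$var, pairs$group, sep = " by ")

# Collect the raw p-values and adjust them for multiple testing (Holm).
p.raw  <- sapply(tests, function(t) t$p.value)
p.holm <- p.adjust(p.raw, method = "holm")
print(cbind(p.raw, p.holm))
```

Naming the list after the fact avoids the eval(substitute()) construction; whether that is preferable depends on whether the variable names also need to appear inside each printed test.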
Re: [R] Using functions/loops for repetitive commands
On May 5, 2011, at 1:45 PM, dereksloan wrote:

> Thanks a lot, I understand what you say but I'm having problems - maybe with the syntax or the specific command.
>
> You are right - I have a dataframe to store the data and want to automate the analysis, i.e. I want to do a chisq.test to know if alcohol intake (Y/N) differs between sexes, then if smoking (Y/N) differs between sexes, then if alcohol intake or smoking differ by hiv status.
>
> The command within my data frame for each individual comparison is e.g. chisq.test(alcohol,sex)... then repeat it for all combinations of variables.

I don't generally answer questions that support shotgun approaches to manufacturing p-values, for fear of encouraging unprincipled data-mining ... unless it is clear that the questioner understands what he is doing from a statistical point of view. So my apologies. I probably shouldn't have even posted in this case. I misunderstood the question and thought it was just a quick syntactic fix. I now understand it to be more involved, and it really demands more care and respect than I was giving it.

> but using lapply I'm still unsure how to design the loop. I'll keep trying - let me know if you have more ideas.
>
> Derek

--
David Winsemius, MD
West Hartford, CT
Re: [R] Using functions/loops for repetitive commands
Thanks a lot, I understand what you say but I'm having problems - maybe with the syntax or the specific command.

You are right - I have a dataframe to store the data and want to automate the analysis, i.e. I want to do a chisq.test to know if alcohol intake (Y/N) differs between sexes, then if smoking (Y/N) differs between sexes, then if alcohol intake or smoking differ by hiv status.

The command within my data frame for each individual comparison is e.g. chisq.test(alcohol,sex)... then repeat it for all combinations of variables, but using lapply I'm still unsure how to design the loop. I'll keep trying - let me know if you have more ideas.

Derek
Re: [R] Using functions/loops for repetitive commands
On May 5, 2011, at 1:08 PM, dereksloan wrote:

> Thanks David, I did notice that and I got his code to work using wilcox.test for the continuous variables. The problem is that when I tried to alter the code to do chisq.test on my categorical variables there is something wrong with the syntax and I don't know what.

Right.

    > ?chisq.test   # No mention of a formula argument seen
    > ?chisq.test.formula
    No documentation for 'chisq.test.formula' in specified packages and libraries:
    you could try '??chisq.test.formula'

`chisq.test` doesn't have a formula method, so sending it a formula will fail. Why aren't you sending it the arguments instead of turning them into strings?

> Derek

David Winsemius, MD
West Hartford, CT
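To illustrate the point about sending chisq.test() its arguments directly, here is a minimal sketch using two columns of the toy data from this thread. The data frame name dat is an assumption; the second form, building a contingency table with table() first, is an equivalent alternative and not something David proposed. The warnings about the chi-squared approximation are suppressed because five rows are far too few for a meaningful test.

```r
# Two categorical columns from the thread's example data ('dat' is an assumed name).
dat <- data.frame(
  sex    = c("M", "F", "F", "F", "M"),
  smoker = c("Y", "Y", "Y", "N", "N")
)

# Pass the two vectors themselves, not a formula and not their names as strings.
res1 <- suppressWarnings(chisq.test(dat$smoker, dat$sex))

# Equivalent: cross-tabulate first, then test the resulting table.
tab  <- table(smoker = dat$smoker, sex = dat$sex)
res2 <- suppressWarnings(chisq.test(tab))

print(tab)
print(res1$p.value)
print(res2$p.value)
```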
Re: [R] Using functions/loops for repetitive commands
Thanks David, I did notice that and I got his code to work using wilcox.test for the continuous variables. The problem is that when I tried to alter the code to do chisq.test on my categorical variables there is something wrong with the syntax and I don't know what.

Derek
Re: [R] Using functions/loops for repetitive commands
On May 5, 2011, at 10:01 AM, dereksloan wrote:

> Your code may be untested but it works - also helping me slowly to start understanding how to write functions. Thank you.
>
> However I still have difficulty. I also have some categorical variables to analyse by age & hiv status - i.e. my dataset expands to (for example):
>
>     id sex hiv age famsize bmi resprate smoker alcohol
>      1   M Pos  23       2  16       15      Y       Y
>      2   F Neg  24       5  18       14      Y       Y
>      3   F Pos  56      14  23       24      Y       N
>      4   F Pos  67       3  33       31      N       N
>      5   M Neg  34       2  21       23      N       N
>
> Using the template for the code you sent me I thought I could analyse the categorical variables by sex & hiv status using a chi-squared test. Long-hand this would be:
>
>     chisq.test(smoker,sex)
>     chisq.test(alcohol,sex)
>     chisq.test(smoker,hiv)
>     chisq.test(alcohol,hiv)
>
> Again I wanted to use a function to automate it and thought I could write:
>
>     categ<-c(smoker,alcohol)
>     group.name<-c(sex,hiv)
>     bl.chisq<-function(categ,group.name,){
>       lapply(categ, function(y){
>         form2<-as.formula(paste(y,group.name))

I haven't tested it but I suspect you failed to note that Eichner used sep="~" in his paste argument to as.formula().

>         chisq.test(form2,)
>       })
>     }
>     bl.chisq(categ,group.name,)
>
> but I get an error message:
>
>     Error in parse(text = x) : unexpected symbol in "smoker sex"
>
> What is wrong with the code? Is it because the wilcox.test is a formula (with a ~ symbol for modelling) whilst the chisq.test simply requires me to list raw data? If so how can I change my code to automate the chisq.test in the same way I did for the wilcox.test?
>
> Many thanks for any help!
>
> Derek

David Winsemius, MD
West Hartford, CT
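The parse error quoted in this message comes from the string that paste() builds. A minimal sketch of the difference, assuming the variable names are held as character strings (the object names bad.string, good.string, bad and good are made up for illustration). Eichner's posted function passes "~" as a separate argument to paste(); using sep = "~" has the same effect as far as as.formula() is concerned.

```r
y          <- "smoker"
group.name <- "sex"

# Default separator is a space, so the result is not a valid formula string.
bad.string <- paste(y, group.name)
print(bad.string)    # "smoker sex"

# With a tilde between the two names the string can be parsed as a formula.
good.string <- paste(y, group.name, sep = "~")
print(good.string)   # "smoker~sex"

# as.formula() fails on the first string and succeeds on the second.
bad  <- try(as.formula(bad.string), silent = TRUE)
good <- as.formula(good.string)

print(inherits(bad, "try-error"))
print(good)
```

Fixing the separator only removes the parse error. As noted elsewhere in this thread, chisq.test() has no formula method, so the resulting formula is useful for wilcox.test() or xtabs() but not for chisq.test() itself.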
Re: [R] Using functions/loops for repetitive commands
Hi Derek,

You can accomplish your loop jobs by the following means:

(a) use a for loop
(b) use a while loop
(c) use lapply, tapply, or sapply (I feel lapply is the elegant way)

--- For Loop ---

"for" loops are pretty simple to use and work almost the same as in other scripting languages you may know (I am referring to Matlab).

(Example 1) Let's say you know that you have to run 10 iterations; then you can run it as

    for(i in 1:10) print(i)   # prints the numbers from 1 to 10

(Example 2) You don't know how many iterations you need to run. The only thing you have is some vector, and you want to do some operation on that vector. You can do something like this:

    myVector<-c(20,45,23,45,89)
    for(i in seq_along(myVector)) print(myVector[i])

--- Using lapply ---

In "lapply" you mainly need to provide two things:

(1) First parameter: a vector or some sequence of numbers
(2) Second parameter: a function, which can be a user-defined function or a built-in one.

lapply will call the function for every element given in the first parameter. For example:

    x<-c(10,20,20)
    lapply(seq_along(x), function(i) {
      # your logic
    })

Note that the first parameter I sent is seq_along(x). The outcome of seq_along(x) will be 1, 2, 3. lapply takes each of these numbers and calls the function, so for the current data set it calls the function three times, with i = 1, i = 2 and i = 3 in turn. That means the logic inside the function is executed for each and every value specified in the first parameter of lapply.

I hope it helps you in some way. For your problem, I am guessing that you are using a data frame or matrix to store the data and that you want to automate the analysis, right? You can try using "lapply"; I think that would be efficient. Let me also try.

Regards,
Som Shekhar
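The two looping styles described in this message can be put side by side in a runnable form. This is a small sketch: squaring each element stands in for "your logic" and is an arbitrary choice, and the object names squares and squares.vec are made up for illustration.

```r
myVector <- c(20, 45, 23, 45, 89)

# for loop: visit each index of the vector and print the element.
for (i in seq_along(myVector)) {
  print(myVector[i])
}

x <- c(10, 20, 20)

# lapply over the indices: the function is called once per index
# and the results come back as a list.
squares <- lapply(seq_along(x), function(i) {
  x[i]^2          # placeholder for "your logic"
})
print(unlist(squares))

# sapply over the values themselves returns a plain vector instead of a list.
squares.vec <- sapply(x, function(v) v^2)
print(squares.vec)
```

lapply() always returns a list, which is convenient when each result is a complex object such as a test result; sapply() is handier when each call returns a single number.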
Re: [R] Using functions/loops for repetitive commands
Your code may be untested but it works - also helping me slowly to start understanding how to write functions. Thank you.

However I still have difficulty. I also have some categorical variables to analyse by age & hiv status - i.e. my dataset expands to (for example):

    id sex hiv age famsize bmi resprate smoker alcohol
     1   M Pos  23       2  16       15      Y       Y
     2   F Neg  24       5  18       14      Y       Y
     3   F Pos  56      14  23       24      Y       N
     4   F Pos  67       3  33       31      N       N
     5   M Neg  34       2  21       23      N       N

Using the template for the code you sent me I thought I could analyse the categorical variables by sex & hiv status using a chi-squared test. Long-hand this would be:

    chisq.test(smoker,sex)
    chisq.test(alcohol,sex)
    chisq.test(smoker,hiv)
    chisq.test(alcohol,hiv)

Again I wanted to use a function to automate it and thought I could write:

    categ<-c(smoker,alcohol)
    group.name<-c(sex,hiv)
    bl.chisq<-function(categ,group.name,){
      lapply(categ, function(y){
        form2<-as.formula(paste(y,group.name))
        chisq.test(form2,)
      })
    }
    bl.chisq(categ,group.name,)

but I get an error message:

    Error in parse(text = x) : unexpected symbol in "smoker sex"

What is wrong with the code? Is it because the wilcox.test is a formula (with a ~ symbol for modelling) whilst the chisq.test simply requires me to list raw data? If so how can I change my code to automate the chisq.test in the same way I did for the wilcox.test?

Many thanks for any help!

Derek
Re: [R] Using functions/loops for repetitive commands
Hello, Derek,

see below.

On Thu, 5 May 2011, dereksloan wrote:

> I still need to do some repetitive statistical analysis on some outcomes from a dataset. Take the following as an example:
>
>     id sex hiv age famsize bmi resprate
>      1   M Pos  23       2  16       15
>      2   F Neg  24       5  18       14
>      3   F Pos  56      14  23       24
>      4   F Pos  67       3  33       31
>      5   M Neg  34       2  21       23
>
> I want to know if there are statistically detectable differences in all of the continuous variables in my data set when subdivided by sex or hiv status (i.e. are age, family size, bmi and resprate different in my male and female patients or in hiv pos/neg patients).
>
> Of course I can use Wilcoxon or t-tests, e.g.:
>
>     wilcox.test(age~sex)
>     wilcox.test(famsize~sex)
>     wilcox.test(bmi~sex)
>     wilcox.test(resprate~sex)
>     wilcox.test(age~hiv)
>     wilcox.test(famsize~hiv)
>     wilcox.test(bmi~hiv)
>     wilcox.test(resprate~hiv)
>
> [snip]

Define, e.g.,

    my.wilcox.tests <- function( var.names, groupvar.name, data) {
      lapply( var.names,
              function( v) {
                form <- as.formula( paste( v, "~", groupvar.name))
                wilcox.test( form, data = data)
              } )
    }

and call something like

    my.wilcox.tests( , , data = )

Caveat: untested!

Hth -- Gerrit
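For reference, the function above can be exercised on the five rows quoted in this message. This is only a sketch under stated assumptions: the data frame name dat and the argument values in the call are illustrative and are not taken from Gerrit's post, and the names() line is an addition so that each result is labelled. With so few observations (and ties in famsize) the tests carry no real information, so the warnings are suppressed just to keep the output short.

```r
# Toy data copied from the quoted message ('dat' is an assumed name).
dat <- data.frame(
  id       = 1:5,
  sex      = c("M", "F", "F", "F", "M"),
  hiv      = c("Pos", "Neg", "Pos", "Pos", "Neg"),
  age      = c(23, 24, 56, 67, 34),
  famsize  = c(2, 5, 14, 3, 2),
  bmi      = c(16, 18, 23, 33, 21),
  resprate = c(15, 14, 24, 31, 23)
)

# The function as posted, plus names on the result (the names are an addition).
my.wilcox.tests <- function(var.names, groupvar.name, data) {
  res <- lapply(var.names, function(v) {
    form <- as.formula(paste(v, "~", groupvar.name))  # e.g. age ~ sex
    wilcox.test(form, data = data)
  })
  names(res) <- var.names
  res
}

cont.vars <- c("age", "famsize", "bmi", "resprate")

# One call per grouping variable; each returns a named list of test results.
by.sex <- suppressWarnings(my.wilcox.tests(cont.vars, "sex", data = dat))
by.hiv <- suppressWarnings(my.wilcox.tests(cont.vars, "hiv", data = dat))

# Pull the p-values out of each list.
print(sapply(by.sex, function(r) r$p.value))
print(sapply(by.hiv, function(r) r$p.value))
```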