Re: [R] General help - online statistics courses?
You should try Tutorvista.com (http://math.tutorvista.com/statistics.html); the site offers online statistics help to anyone who needs it. The site is very interactive and helps you ace your chosen subject. -- View this message in context: http://r.789695.n4.nabble.com/General-help-online-statistics-courses-tp3799327p3889240.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ANOVA from imported data has only 1 degree of freedom
Hi David, Apologies again, and thank you for your help. I've edited my original post to clarify what I was asking. What I meant was that the factor had only 1 degree of freedom when it should have had 2 (14 in total), so you're right: there were 14, but not in the right place. In SPSS you select one column as a factor and another as a dependent variable, so this wouldn't happen; it's easy to use but not that versatile, and very expensive. I've been told good things about R, so I'm trying to teach myself. I followed your suggestion and I now have the results I need. I'll take more care with posting in future. Yours, Sam -- View this message in context: http://r.789695.n4.nabble.com/ANOVA-from-imported-data-has-only-1-degree-of-freedom-tp3887528p3888322.html
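The thread does not show the data, so the following is only a sketch of one common cause of this symptom: a grouping column imported as a number is treated by aov() as a single numeric predictor (1 degree of freedom) unless it is wrapped in factor(). The data frame dat and its columns are made up for illustration.

```r
# Hypothetical data: 3 groups of 5 observations, group coded as 1, 2, 3
set.seed(1)
dat <- data.frame(group = rep(1:3, each = 5), y = rnorm(15))

# Numeric group: fitted as a slope, so it uses 1 degree of freedom
fit_num <- aov(y ~ group, data = dat)

# Factor group: fitted as 3 levels, so it uses 2 degrees of freedom
fit_fac <- aov(y ~ factor(group), data = dat)

# Degrees of freedom for (term, residuals) in each fit
df_num <- summary(fit_num)[[1]]$Df
df_fac <- summary(fit_fac)[[1]]$Df
print(df_num)
print(df_fac)
```

In both fits the degrees of freedom add up to 14; only the split between the term and the residuals changes.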
[R] HOW TO PASS MY JAVA ARGUMENT INTO RSCRIPT FILE
Hi, I am working in the Eclipse IDE and want to use Rscript to produce a statistical analysis. I tested some sample R code in a script and it works fine from Eclipse, but I don't know how to pass my Java values into the R script. I need some guidance; please help me. Thanks, Janarthanan M.
Re: [R] convert apply to lappy
Thanks a lot for the tip. It worked :)

From: Joshua Wiley jwiley.ps...@gmail.com
Cc: R-help@r-project.org
Sent: Sunday, October 9, 2011 6:32 PM
Subject: Re: [R] convert apply to lappy

Hi Alex, If data is a matrix, probably the easiest option would be:

tips <- as.data.frame(data)
mclapply(tips, foo)

By the way, I would recommend not using 'data' (which is also a function) as the name of the object storing your data. If your data set has many columns and performance is an issue, I might convert it to a list instead of a data frame. Note that if you wanted the equivalent of apply(tips, 1, foo), you could transpose your matrix first: as.data.frame(t(data)). lapply works on the columns of a data frame because each column is basically an element of a list (list apply). Cheers, Josh

Dear all, I want to convert an apply call to lapply. The reason is that there is a function, mclapply, that uses exactly the same format as lapply. My code looks like this:

mean_power_per_tip <- function(data) { return((apply(data[,],2,MeanTip))); }

where data is an [m,n] matrix. I would like to thank you in advance for your help. B.R Alex

-- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/
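To make the column/row correspondence concrete, here is a small sketch with a made-up matrix m and mean() standing in for the poster's MeanTip. mc.cores = 1L is used only so the example runs on any platform; a larger value is an assumption about the machine.

```r
library(parallel)

# Made-up 3 x 4 matrix standing in for the poster's data
m <- matrix(1:12, nrow = 3)

# Column-wise with apply(), as in the original code
col_means_apply <- apply(m, 2, mean)

# Same computation with lapply() on the columns of a data frame
cols <- as.data.frame(m)
col_means_lapply <- unlist(lapply(cols, mean), use.names = FALSE)

# Drop-in replacement with mclapply(); one core keeps this portable
col_means_mc <- unlist(mclapply(cols, mean, mc.cores = 1L), use.names = FALSE)

# Row-wise equivalent of apply(m, 1, mean): transpose first
row_means <- unlist(lapply(as.data.frame(t(m)), mean), use.names = FALSE)

print(col_means_apply)
print(row_means)
```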
Re: [R] Vector-subsetting with ZERO - Is behavior changeable?
Thank you very much. Learned something again! Joh

William Dunlap wrote:

You can use [1] on the output of FUN to ensure that exactly one value (perhaps NA from numeric(0)[1]) is returned. E.g.

index <- 1
sapply(list(c(1,2,3),c(1,2),c(1)),function(x){x[max(length(x)-index,0)][1]})
[1] 2 1 NA

I'll also put in a plug for vapply, which throws an error if FUN does not return what you expect it to:

vapply(list(c(1,2,3),c(1,2),c(1)),function(x){x[max(length(x)-index,0)]}, FUN.VALUE=numeric(1))
Error in vapply(list(c(1, 2, 3), c(1, 2), c(1)), function(x) { :
  values must be length 1, but FUN(X[[3]]) result is length 0
vapply(list(c(1,2,3),c(1,2),c(1)),function(x){x[max(length(x)-index,0)][1]}, FUN.VALUE=numeric(1))
[1] 2 1 NA

For long input vectors vapply can save a fair bit of memory and time over sapply.

Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Johannes Graumann
Sent: Wednesday, October 05, 2011 4:29 AM
To: r-h...@stat.math.ethz.ch
Subject: [R] Vector-subsetting with ZERO - Is behavior changeable?

Dear All, I have trouble generalizing some code.

index <- 0
sapply(list(c(1,2,3),c(1,2),c(1)),function(x){x[max(length(x)-index,0)]})

will yield the wished-for vector:

[1] 3 2 1

But in this case (trying to select the second-to-last element in each vector of the list)

index <- 1
sapply(list(c(1,2,3),c(1,2),c(1)),function(x){x[max(length(x)-index,0)]})

I end up with

[[1]]
[1] 2
[[2]]
[1] 1
[[3]]
numeric(0)

I would (massively) prefer something like

[1] 2 1 NA

My current implementation looks like

index <- 1
unlist(
  sapply(
    list(c(1,2,3),c(1,2),c(1)),
    function(x){
      value <- x[max(length(x)-index,0)]
      if(identical(value,numeric(0))){return(NA)} else {return(value)}
    }
  )
)
[1] 2 1 NA

Quite the inelegant eyesore. Any hints on how to do this better?
Thanks, Joh
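The suggestion above can be wrapped up as a small, self-contained sketch. The helper name pick_from_end is made up for illustration; the indexing expression is the one from the thread.

```r
# Hypothetical helper: element 'index' positions before the last one,
# or NA when the vector is too short
pick_from_end <- function(x, index) {
  # x[0] yields numeric(0); the trailing [1] turns that into NA
  x[max(length(x) - index, 0)][1]
}

xs <- list(c(1, 2, 3), c(1, 2), c(1))

# vapply() checks that every result is a single numeric value
last_elems   <- vapply(xs, pick_from_end, FUN.VALUE = numeric(1), index = 0)
second_last  <- vapply(xs, pick_from_end, FUN.VALUE = numeric(1), index = 1)

print(last_elems)
print(second_last)
```

Because vapply() always returns an atomic vector of the declared type, no unlist() or special-casing of numeric(0) is needed.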
Re: [R] Handling Time in R
Thanks a lot. That helped. One thing now is to have difftime(y,x) always report seconds. Sometimes the day changes between the two time stamps, and then the difference is reported in days. How can it always report only seconds? I would like to thank you in advance for your help. B.R Alex

From: jim holtman jholt...@gmail.com
Cc: R-help@r-project.org
Sent: Friday, October 7, 2011 5:34 PM
Subject: Re: [R] Handling Time in R

?ISOdatetime
x <- ISOdatetime(2011,10,6,16,23,30.539)
str(x)
 POSIXct[1:1], format: "2011-10-06 16:23:30"
y <- ISOdatetime(2011,10,6,16,23,30.939)
difftime(y,x)
Time difference of 0.399 secs

Dear all, I would like to ask your help regarding handling time stamps in R. I think I first need a reference to read about their logic and how I should handle them. For example, this is a struct I have:

str(MyStruct$TimeStamps)
 num [1:100, 1:6] 2011 2011 2011 2011 2011 ...
MyStruct$TimeStamps[1,]
[1] 2011.000 10.000 6.000 16.000 23.000 30.539

The last field contains seconds.milliseconds. How can I, for example, make calculations with time stamps, such as checking whether MyStruct$TimeStamps[1,] and MyStruct$TimeStamps[2,] differ by more than 300 milliseconds, or whether 3 days have passed? I would like to thank you in advance for your suggestions. B.R Alex

-- Jim Holtman Data Munger Guru What is the problem that you are trying to solve?
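As a sketch of how the six-column matrix from the original question could be turned into time stamps and compared: the matrix ts below is a made-up two-row stand-in for MyStruct$TimeStamps, and tz = "UTC" is an assumption (the thread does not say which time zone the data uses).

```r
# Made-up stand-in for MyStruct$TimeStamps:
# columns are year, month, day, hour, minute, seconds.milliseconds
ts <- rbind(c(2011, 10, 6, 16, 23, 30.539),
            c(2011, 10, 6, 16, 23, 30.939))

# ISOdatetime() is vectorised, so whole columns can be passed at once
stamps <- ISOdatetime(ts[, 1], ts[, 2], ts[, 3],
                      ts[, 4], ts[, 5], ts[, 6], tz = "UTC")

# Ask for the gap explicitly in seconds
gap <- difftime(stamps[2], stamps[1], units = "secs")

# The two checks from the question
more_than_300ms <- as.numeric(gap) > 0.3
more_than_3days <- as.numeric(gap, units = "days") > 3

print(gap)
print(c(more_than_300ms, more_than_3days))
```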
Re: [R] Handling Time in R
Check the help page; there is a parameter ('units', I think) that will let you specify that.
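The parameter mentioned in this reply is indeed called units. A minimal sketch, using the two dates that appear later in the thread and an assumed tz = "UTC" so the result does not depend on local daylight-saving rules:

```r
x <- ISOdatetime(2011, 6, 1, 11, 59, 1.09, tz = "UTC")
y <- ISOdatetime(2011, 6, 5, 11, 59, 1.09, tz = "UTC")

# Default: units are chosen automatically (here: days)
d_auto <- difftime(y, x)

# Explicit: always seconds, whatever the size of the gap
d_secs <- difftime(y, x, units = "secs")
secs <- as.numeric(d_secs)

print(d_auto)
print(d_secs)
```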
Re: [R] HOW TO PASS MY JAVA ARGUMENT INTO RSCRIPT FILE
Hello, use the rJava library to execute R code from Java or transfer values from Java to R: http://www.rforge.net/rJava/ -- View this message in context: http://r.789695.n4.nabble.com/HOW-TO-PASS-MY-JAVA-ARGUMENT-INTO-RSCRIPT-FILE-tp3889327p3889457.html
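A different route from the one suggested above, sketched here under the assumption that the Java program simply launches Rscript as an external process: values can be handed over as command-line arguments and read on the R side with commandArgs(). The file name analysis.R and the helper parse_numeric_args are made up for illustration.

```r
# analysis.R (hypothetical file name)
# Would be invoked from the Java side as, e.g.:  Rscript analysis.R 3 4.5 10

# Hypothetical helper: turn the trailing command-line arguments into numbers
parse_numeric_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  values <- suppressWarnings(as.numeric(args))
  if (anyNA(values)) stop("all arguments must be numeric")
  values
}

# In a real script this would be parse_numeric_args(); the explicit vector
# is used here only so the example runs without a command line
values <- parse_numeric_args(c("3", "4.5", "10"))
result <- mean(values)
print(result)
```

Whatever the script prints goes to standard output, which the calling process can read back.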
Re: [R] cspade error
That's a very late answer, but I just ran into the same problem and thought maybe someone else (someone browsing the archives, for instance) would appreciate a tip. There may be empty lines in your data file file_name.txt. If you remove them, or remove the corresponding transactions in data_ex, it should work (at least, it works for me). -- View this message in context: http://r.789695.n4.nabble.com/cspade-error-tp3774834p3889448.html
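One way to remove the empty lines before reading the file is sketched below with base R only. The helper drop_empty_lines and the sample file contents are made up for illustration; the real file layout is not shown in the thread.

```r
# Hypothetical helper: copy a text file, skipping empty or whitespace-only lines.
# Returns (invisibly) the number of lines that were dropped.
drop_empty_lines <- function(path_in, path_out) {
  lines <- readLines(path_in)
  kept <- lines[nzchar(trimws(lines))]
  writeLines(kept, path_out)
  invisible(length(lines) - length(kept))
}

# Demonstration on a temporary file with made-up contents
raw <- tempfile(fileext = ".txt")
clean <- tempfile(fileext = ".txt")
writeLines(c("1 10 2 A B", "", "1 20 1 C", "   ", "2 10 1 A"), raw)

n_removed <- drop_empty_lines(raw, clean)
print(n_removed)
print(readLines(clean))
```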
Re: [R] Handling Time in R
This did the trick as.numeric(diff(c(ISOdatetime(2011,6,1,11,59,1.09),ISOdatetime(2011,6,5,11,59,1.09 [1] 345600
[R] MGARCH BEKK estimation
Dear All, I want to estimate a bivariate GARCH model using the MGARCHBEKK package. I am not able to understand part of the call to this function:

mvBEKK.est(eps=lrdata, order = c(1,1), params = NULL, fixed = NULL, method = "BFGS", verbose = F)

What exactly does eps refer to here? It would be really useful if somebody could explain its meaning. With regards, Upananda -- Research Scholar, Department of HSS, IIT KGP; alternative mail id: up...@iitkgp.ac.in
[R] Odp: help with statistics in R - how to measure the effect of users in groups
Hi, I do not understand much about your equations. I think you should look at Practical Regression and Anova Using R by J. Faraway. Having a data frame DF with columns users, groups, results, you could do

fit <- lm(results~groups, data = DF)

Regards Petr

Hi, I'm a newbie to R. My knowledge of statistics is mostly self-taught. My problem is how to measure the effect of users in groups. I can calculate a particular attribute for a user in a group. But my hypothesis is that the users' attributes are not independent of each other and that a user's attribute depends on the group, i.e. that a user's behaviour changes based on the group. Let me give an example:

users  Group 1  Group 2  Group 3
u1     10       5        n/a
u2     6        n/a      4
u3     5        2        3

For example, I want to be able to prove that u1's behaviour is different in Group 1 than in other groups, and that the particular thing about Group 1 is that users in Group 1 tend to have a higher value of the attribute under measurement. Hence, can I use R to test my hypothesis? I'm willing to learn, so if this is very simple, just point me in the direction of any online resources about it. At the moment, I don't even know how to define this class of problems. That will be a start. Regards Gawesh
Re: [R] Handling Time in R
Difftime doesn't report things. When you print it, it automatically selects an appropriate human-readable unit to display in, but that does not change its internal representation. If you must convert to seconds, you can do so using the as.double generic (as.double.difftime) with a units parameter. --- Jeff Newmiller, Research Engineer (Solar/Batteries/Software/Embedded Controllers), DCN:jdnew...@dcn.davis.ca.us
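A short sketch of the conversion described in this reply, on a gap that crosses a day boundary. The two time stamps are made up, and tz = "UTC" is an assumption to keep the arithmetic independent of local time rules.

```r
start <- ISOdatetime(2011, 10, 6, 23, 59, 30, tz = "UTC")
end   <- ISOdatetime(2011, 10, 8,  0,  0, 30, tz = "UTC")

# Subtracting POSIXct values gives a difftime; its display unit is picked
# automatically (days here, since the gap is longer than one day)
d <- end - start
auto_unit <- units(d)

# as.double() with 'units' converts without caring what the display unit is
secs <- as.double(d, units = "secs")
mins <- as.double(d, units = "mins")

# Alternatively, change the unit of a copy of the difftime object itself
d_secs <- d
units(d_secs) <- "secs"

print(auto_unit)
print(c(secs = secs, mins = mins))
print(d_secs)
```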
Re: [R] Handling Time in R
Do you mean something like that? as.double(diff(c(ISOdatetime(2011,6,1,11,59,1.09),ISOdatetime(2011,6,5,11,59,1.09))),length=20) [1] 345600
Re: [R] help with statistics in R - how to measure the effect of users in groups
Hi Petr, It's not an equation. It's my mistake; the * characters were meant to be field separators for the example data. I should have just used blank spaces, as follows:

users  Group1  Group2  Group3
u1     10      5       N/A
u2     6       N/A     4
u3     5       2       3

Regards Gawesh
Re: [R] help with statistics in R - how to measure the effect of users in groups
OK. You should transform your data to long format to use lm:

library(reshape2)  # melt() is not in base R; reshape or reshape2 is assumed here
test <- read.table("clipboard", header=T, na.strings="N/A")
test.m <- melt(test)
Using users as id variables
fit <- lm(value~variable, data=test.m)
summary(fit)

Call:
lm(formula = value ~ variable, data = test.m)

Residuals:
   1    2    3    4    6    8    9
 3.0 -1.0 -2.0  1.5 -1.5  0.5 -0.5

Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)       7.000      1.258   5.563  0.00511 **
variableGroup2   -3.500      1.990  -1.759  0.15336
variableGroup3   -3.500      1.990  -1.759  0.15336
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.179 on 4 degrees of freedom
  (2 observations deleted due to missingness)
Multiple R-squared: 0.525, Adjusted R-squared: 0.2875
F-statistic: 2.211 on 2 and 4 DF, p-value: 0.2256

No difference among groups, but I am not sure if this is the correct way to evaluate.

library(ggplot2)
p <- ggplot(test.m, aes(x=variable, y=value, colour=users))
p + geom_point()

There is some sign that user3 has the lowest value in each group. However, there is not enough data to include users in the fit. Regards Petr
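For readers who want to rerun the fit above without the clipboard or an add-on package, here is a self-contained sketch that types in the example table and uses base R's stack() in place of melt(). The swap to stack() and the column names value and group are choices made for this illustration, not part of the original reply.

```r
# The example table from the thread, in wide format
wide <- data.frame(users  = c("u1", "u2", "u3"),
                   Group1 = c(10, 6, 5),
                   Group2 = c(5, NA, 2),
                   Group3 = c(NA, 4, 3))

# Long format: one row per (user, group) pair; stack() keeps the NAs
long <- data.frame(users = rep(wide$users, times = 3),
                   stack(wide[, c("Group1", "Group2", "Group3")]))
names(long) <- c("users", "value", "group")

# One-way model of value on group; rows with NA are dropped by lm()
fit <- lm(value ~ group, data = long)
coefs <- coef(fit)
resid_df <- df.residual(fit)

print(coefs)
print(resid_df)
```

The intercept is the Group1 mean and the other two coefficients are differences from it, which matches the estimates in the summary shown in the thread.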
[R] Converting factor into date
Dear R users, I have an elementary query. I have a dataset read from a text file with read.csv, but R converts the dates into a factor. To fix this, I use as.Date to convert the dates from factor to date format as follows (z has Date as a column):

z <- read.csv(data, header = TRUE, sep = "\t")
z$Date <- as.Date(z$Date, format = "%d/%m/%y/")

But during this operation I lose all my dates and get NAs instead. It would be helpful to have your input. Regards Vikram
Re: [R] Handling Time in R
No, read ?difftime and look at as.double. There is a units parameter that you must set if you want predictable results.
Re: [R] Converting factor into date
Convert to character first or use the as.is option to read.csv. The default is to try to convert the underlying integer form of factors to date, which is not what you intend.

---
Jeff Newmiller
---
Sent from my phone. Please excuse my brevity.

Vikram Bahure economics.vik...@gmail.com wrote:

Dear R users,
I have an elementary query. I have a dataset read from a text file with the read.csv command, but when I load the data in R the dates are converted into a factor. To fix this, I use as.Date to convert the dates from factor to Date using the following (z has Date as a column):

    z <- read.csv(data, header = TRUE, sep = "\t")
    z$Date <- as.Date(z$Date, format = "%d/%m/%y/")

But during this operation I lose all my dates and get NAs instead. It would be helpful to have your inputs.

Regards
Vikram
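Both suggestions can be sketched on a small made-up example. The dates and the `raw` text below are invented for illustration and stand in for the poster's file. Note also that the format string in the original post ends in a stray "/", which may by itself be enough to produce NA, and that since R 4.0.0 read.csv no longer converts strings to factors by default.

```r
# Dates as they look after read.csv() has turned the column into a factor.
z <- data.frame(Date = factor(c("10/10/11", "07/10/11", "25/12/10")))

# Route 1: convert the factor labels to character, then parse them.
# format = is an argument of as.Date(), not of as.character().
z$Date <- as.Date(as.character(z$Date), format = "%d/%m/%y")

# Route 2: keep read.csv() from creating factors in the first place.
raw <- "Date\tvalue\n10/10/11\t1\n07/10/11\t2\n"
z2 <- read.csv(text = raw, header = TRUE, sep = "\t", as.is = TRUE)
z2$Date <- as.Date(z2$Date, format = "%d/%m/%y")
```

Either route ends with a column of class Date; which one is more convenient depends on whether other columns should stay factors.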
Re: [R] help with statistics in R - how to measure the effect of users in groups
Thanks Petr. I will try it on the real data. But that will only show whether the groups are different or not. Is there any way I can test whether the users are different when they are in different groups?

Regards
Gawesh

On Mon, Oct 10, 2011 at 11:17 AM, Petr PIKAL petr.pi...@precheza.cz wrote:

Hi Petr,
It's not an equation. It's my mistake; the * are meant to be field separators for the example data. I should have just used blank spaces as follows:

    users  Group1  Group2  Group3
    u1     10      5       N/A
    u2     6       N/A     4
    u3     5       2       3

Regards
Gawesh

OK. You shall transform your data to long format to use lm.

    test <- read.table("clipboard", header=T, na.strings="N/A")
    test.m <- melt(test)
    Using users as id variables
    fit <- lm(value~variable, data=test.m)
    summary(fit)

    Call:
    lm(formula = value ~ variable, data = test.m)

    Residuals:
       1    2    3    4    6    8    9
     3.0 -1.0 -2.0  1.5 -1.5  0.5 -0.5

    Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
    (Intercept)       7.000      1.258   5.563  0.00511 **
    variableGroup2   -3.500      1.990  -1.759  0.15336
    variableGroup3   -3.500      1.990  -1.759  0.15336
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 2.179 on 4 degrees of freedom
      (2 observations deleted due to missingness)
    Multiple R-squared: 0.525, Adjusted R-squared: 0.2875
    F-statistic: 2.211 on 2 and 4 DF, p-value: 0.2256

No difference among groups, but I am not sure if this is the correct way to evaluate.

    library(ggplot2)
    p <- ggplot(test.m, aes(x=variable, y=value, colour=users))
    p + geom_point()

There is some sign that user3 has the lowest value in each group. However, there is not enough data to include users in the fit.

Regards
Petr

On Mon, Oct 10, 2011 at 9:32 AM, Petr PIKAL petr.pi...@precheza.cz wrote:

Hi
I do not understand much about your equations. I think you shall look at Practical Regression and Anova Using R by J. Faraway. Having a data frame DF with columns users, groups, results, you could do

    fit <- lm(results~groups, data = DF)

Regards
Petr

Hi,
I'm a newbie to R. My knowledge of statistics is mostly self-taught. My problem is how to measure the effect of users in groups. I can calculate a particular attribute for a user in a group. But my hypothesis is that the users' attributes are not independent of each other and that a user's attribute depends on the group, i.e. that users' behaviour changes based on the group. Let me give an example:

    users*Group 1*Group 2*Group 3
    u1*10*5*n/a
    u2*6*n/a*4
    u3*5*2*3

For example, I want to be able to prove that u1's behaviour is different in Group 1 than in the other groups, and that the particular thing about Group 1 is that users in Group 1 tend to have a higher value of the attribute under measurement. Hence, can I use R to test my hypothesis? I'm willing to learn, so if this is very simple, just point me in the direction of any online resources about it. At the moment, I don't even know how to define this class of problems. That will be a start.

Regards
Gawesh
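For readers who want to rerun the fit without the clipboard, here is a base-R sketch that rebuilds the example data in long format by hand (the thread used melt for this step) and then tries an additive model with a user term. The second model is only an illustration of the syntax: with seven observations it leaves two residual degrees of freedom, so, as noted in the thread, the data are far too thin to support conclusions, and a user-by-group interaction cannot be estimated without replicates. The object names are made up here.

```r
# Example data from the thread, in wide format.
wide <- data.frame(
  users  = c("u1", "u2", "u3"),
  Group1 = c(10, 6, 5),
  Group2 = c(5, NA, 2),
  Group3 = c(NA, 4, 3)
)

# Long format: one row per user and group.
long <- data.frame(
  users    = factor(rep(wide$users, times = 3)),
  variable = factor(rep(c("Group1", "Group2", "Group3"), each = 3)),
  value    = c(wide$Group1, wide$Group2, wide$Group3)
)

# Group effect only, as in the thread; rows with NA are dropped by lm().
fit_group <- lm(value ~ variable, data = long)

# Additive group + user model (syntax illustration only, see the caveat above).
fit_both <- lm(value ~ variable + users, data = long)

# Compare the two nested models.
cmp <- anova(fit_group, fit_both)
```

The coefficients of `fit_group` are simply the Group1 mean and the differences of the other group means from it, which is why Group2 and Group3 show the same estimate.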
[R] Type of Graph to use
Hi,
Please advise on what type of graph can be used to display the following data set. I have the following:

    Name  Class
    a     Class 1
    a     Class4
    b     Class2
    b     Class1
    d     Class3
    d     Class5
    e     Class4
    e     Class2

So each entry in Name can belong to more than one class. I want to represent the data so as to see where overlaps occur, that is, which names are in the same Class, and also which names are unique to a Class. I thought a Venn diagram would work, but this can only present numerical values for each Class; I would like each name to be represented by a dot or *. Any suggestions and how-tos would be appreciated.

--
Regards/Groete/Mit freundlichen Grüßen/recuerdos/meilleures salutations/distinti saluti/siong/duì yú/привет
Jurgens de Bruin
[R] pos in panel.text
Hi,
I need to vary the placement of data labels but I cannot assign a vector to the pos option. Vectors work fine with cex, for example. What could be the problem here?

    xyplot(Npop~Narea, data=size,
           scales=list(x=list(log=TRUE), y=list(log=TRUE)),
           xlab=expression(N[A]), ylab=expression(N[P]),
           panel=function(...) {
               panel.lines(..., type="l", col.line="black", lwd=.25)
               panel.xyplot(..., type="p", col="black", cex=.5, pch=20)
               panel.text(..., lab=t, cex=.5, pos=c(4,2))
           })

Many thanks,
Allan
Re: [R] Converting factor into date
Hi,
I am getting my results from the following:

    z$Date <- as.Date(as.character(z$Date), format="%d/%m/%y")

instead of:

    z$Date <- as.Date(as.character(z$Date, format="%d/%m/%y"))

Thanks again.

Regards
Vikram

On Mon, Oct 10, 2011 at 4:08 PM, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote:

Convert to character first or use the as.is option to read.csv. The default is to try to convert the underlying integer form of factors to date, which is not what you intend.
Re: [R] Type of Graph to use
See if this gives you the presentation you want:

    # stringsAsFactors = TRUE is needed in R >= 4.0, where strings are
    # no longer converted to factors by default
    x <- read.table(textConnection("Name Class
    a Class1
    a Class4
    b Class2
    b Class1
    d Class3
    d Class5
    e Class4
    e Class2"), header = TRUE, stringsAsFactors = TRUE)
    closeAllConnections()
    # add columns of numeric values of factors
    x$name <- as.integer(x$Name)
    x$class <- as.integer(x$Class)
    # create plot area
    plot(0
        , type = 'n'
        , xaxt = 'n'
        , yaxt = 'n'
        , xlab = ''
        , ylab = ''
        , xlim = c(0, max(x$class))
        , ylim = c(0, max(x$name))
        )
    # now plot the rectangles
    rect(
        xleft = x$class - 1
        , ybottom = x$name - 1
        , xright = x$class
        , ytop = x$name
        , col = x$name
        )
    # add the labels
    axis(1
        , at = seq(0.5, by = 1, length = length(levels(x$Class)))
        , labels = levels(x$Class)
        )
    axis(2
        , at = seq(0.5, by = 1, length = length(levels(x$Name)))
        , labels = levels(x$Name)
        )

--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
Re: [R] Type of Graph to use
Jim,
This should work. Would it be possible to plot * instead of the large rectangles?

--
Jurgens de Bruin
Re: [R] pos in panel.text
Thanks, Carlos,
Tried that, but no success; I am still getting this error message:

    Warning messages:
    1: In if (pos == 1) { :
      the condition has length > 1 and only the first element will be used
    2: In if (pos == 2) { :
      the condition has length > 1 and only the first element will be used

Thanks,
Allan

On 10/10/2011 12:10, Carlos Ortega wrote:

Hello,
To check the possible values of the pos parameter you need to review text(), as indicated in the lattice help of panel.text(). In text() it says:

    pos    a position specifier for the text. If specified this overrides
           any adj value given. Values of 1, 2, 3 and 4, respectively
           indicate positions below, to the left of, above and to the
           right of the specified coordinates.

So, the coordinates should be x=4, y=2 for your case. Additionally you can use the ltext() function, which is explained in the same panel.text() help.

Regards,
Carlos Ortega
www.qualityexcellence.es
--
Dr Allan Sikk
Lecturer in Baltic Politics
University College London, School of Slavonic and East European Studies
16 Taviton St, London WC1H 0BW, United Kingdom
tel: +44 (0)20 7679 4872
http://www.homepages.ucl.ac.uk/~tjmsasi/

Latest research:
- 'Newness as a Winning Formula for New Political Parties', Party Politics, forthcoming.
- 'Parties and Populism', Centre for European Politics, Security and Integration (CEPSI) Working Paper (2010), http://bit.ly/partiespopulism.
- (with Rein Taagepera) 'Parsimonius Model for Predicting Mean Cabinet Duration on the Basis of Electoral System', Party Politics, 16(2), 2010, 261-81.
- 'Force Mineure? The Effects of the EU on Party Politics in a Small Country: The Case of Estonia', Journal of Communist Studies and Transition Politics, 25(4), 2009, 468-90.
- (with Rune Andersen) 'Without a Tinge of Red: The Fall and Rise of Estonian Greens, 1987-2007', Journal of Baltic Studies, 40(3), 2009, 349-73.
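The warnings quoted above suggest that the panel.text() in use tested pos with a plain if(), so only a single value was honoured. One possible workaround, sketched here under that assumption, is to call panel.text() once per distinct pos value. The data frame `size` below is a made-up stand-in for the poster's data, and its `pos` column is an invented way of storing the wanted position per label.

```r
library(lattice)

# Hypothetical stand-in for the poster's 'size' data frame.
size <- data.frame(
  Narea = c(10, 100, 1000, 10000),
  Npop  = c(2, 15, 90, 700),
  t     = c("A", "B", "C", "D"),
  pos   = c(4, 2, 4, 2)          # wanted label position for each point
)

plt <- xyplot(Npop ~ Narea, data = size,
  scales = list(x = list(log = TRUE), y = list(log = TRUE)),
  xlab = expression(N[A]), ylab = expression(N[P]),
  panel = function(x, y, subscripts, ...) {
    # x and y arrive already log-transformed because of scales = list(log = TRUE)
    panel.xyplot(x, y, type = "p", col = "black", cex = 0.5, pch = 20)
    labs <- size$t[subscripts]
    poss <- size$pos[subscripts]
    # one panel.text() call per distinct position, each with a scalar pos
    for (pp in unique(poss)) {
      keep <- poss == pp
      panel.text(x[keep], y[keep], labels = labs[keep], cex = 0.5, pos = pp)
    }
  })
```

Printing `plt` draws the points with labels A and C to the right and B and D to the left. The `subscripts` argument is what ties the rows of `size` to the points of the current panel, so the same panel function should also work with conditioning variables.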
Re: [R] Type of Graph to use
Try this version:

    # stringsAsFactors = TRUE is needed in R >= 4.0, where strings are
    # no longer converted to factors by default
    x <- read.table(textConnection("Name Class
    a Class1
    a Class4
    b Class2
    b Class1
    d Class3
    d Class5
    e Class4
    e Class2"), header = TRUE, stringsAsFactors = TRUE)
    closeAllConnections()
    # add columns of numeric values of factors
    x$name <- as.integer(x$Name)
    x$class <- as.integer(x$Class)
    # create plot area
    plot(0
        , type = 'n'
        , xaxt = 'n'
        , yaxt = 'n'
        , xlab = ''
        , ylab = ''
        , xlim = c(0, max(x$class))
        , ylim = c(0, max(x$name))
        )
    # now plot the rectangles
    # rect(
    #     xleft = x$class - 1
    #     , ybottom = x$name - 1
    #     , xright = x$class
    #     , ytop = x$name
    #     , col = x$name
    #     )
    # plot * instead
    points(x$class - .5, x$name - .5, pch = "*", cex = 3)
    # add the labels
    axis(1
        , at = seq(0.5, by = 1, length = length(levels(x$Class)))
        , labels = levels(x$Class)
        )
    axis(2
        , at = seq(0.5, by = 1, length = length(levels(x$Name)))
        , labels = levels(x$Name)
        )

On Mon, Oct 10, 2011 at 8:27 AM, Jurgens de Bruin debrui...@gmail.com wrote:

Jim, This should work, would it be possible to plot * and not larg rec.

--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
Re: [R] Type of Graph to use
On Mon, Oct 10, 2011 at 6:49 AM, Jurgens de Bruin debrui...@gmail.com wrote:

I thought a Venn diagram would work, but this can only present numerical values for each Class; I would like each name to be represented by a dot or *.

Assuming DF is the indicated data.frame:

    library(gplots)
    with(DF, balloonplot(Name, Class, rep(1, nrow(DF)), label = FALSE))

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [R] pos in panel.text
Here's the code. The problem seems to be specific to lattice, as I can easily use a vector with pos in plot.

    trellis.device(,width=600, height = 400)
    xyplot(Npop~Narea,
           scales=list(x=list(log=TRUE, at=my.at,
                              labels = formatC(my.at, big.mark = ",", format="d")),
                       y=list(log=TRUE, at=c(1,10,100,1000,1,10,100))),
           panel=function(...) {
               panel.xyplot(..., type="p", col="black", cex=.5, pch=20)
               panel.text(x=log10(Narea), y=log10(Npop), lab=t, cex=.5, pos=c(4,2))
           }
    )

On 10/10/2011 13:58, Carlos Ortega wrote:

Hi Allan,
Please could you send the modified code, where the x and y coordinates should now appear? I do not fully understand the error message you get.

Regards,
Carlos Ortega
www.qualityexcellence.es
Re: [R] help with statistics in R - how to measure the effect of users in groups
Hello,
In the package qualityTools you can find one way to perform this analysis, through the gageRR() function. The effect of an operator on the measurement system (reproducibility) is, to me, equivalent to the effect you are trying to study of your users when they are in different groups.

Regards,
Carlos Ortega
www.qualityexcellence.es

On Mon, Oct 10, 2011 at 12:48 PM, gj gaw...@gmail.com wrote:

Thanks Petr. I will try it on the real data. But that will only show that the groups are different or not. Is there any way I can test if the users are different when they are in different groups?

Regards
Gawesh
Re: [R] Type of Graph to use
Thanks for all the help. Would it be possible to use a Venn diagram for this application?

--
Jurgens de Bruin
Re: [R] Type of Graph to use
Please ignore the Venn diagram question, as the diagram would be too complex to read when more than 3 categories are present.

On 10 October 2011 15:36, Jurgens de Bruin debrui...@gmail.com wrote:

Thanks for all the help, Their would it be possible to use a Venn Diagram for this application?

--
Jurgens de Bruin
Re: [R] Type of Graph to use
On Mon, Oct 10, 2011 at 8:49 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
> On Mon, Oct 10, 2011 at 6:49 AM, Jurgens de Bruin debrui...@gmail.com wrote:
>> Hi,
>>
>> Please advise on what type of graph can be used to display the following data set. I have the following:
>>
>> Name  Class
>> a     Class1
>> a     Class4
>> b     Class2
>> b     Class1
>> d     Class3
>> d     Class5
>> e     Class4
>> e     Class2
>>
>> So each entry in Name can belong to more than one class. I want to represent the data so as to see where overlaps occur, that is, which names are in the same class, and also which names are unique to a class. I thought a Venn diagram would work, but it can only present numerical values for each class; I would like each name to be represented by a dot or *.
>
> Assuming DF is the indicated data.frame:
>
>     library(gplots)
>     with(DF, balloonplot(Name, Class, rep(1, nrow(DF)), label = FALSE))

Here is one additional idea:

    xt <- xtabs(~ Class + Name, DF)
    symnum(xt, cutpoints = 0:2/2, symbols = c(".", "+"))

            Name
    Class    a b d e
      Class1 + + . .
      Class2 . + . +
      Class3 . . + .
      Class4 + . . +
      Class5 . . + .

--
Statistics Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
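For anyone who wants to try the cross-tabulation idea without the original data, here is a minimal sketch in base R. The data frame DF is rebuilt by hand from the table in the question, on the assumption that "Class 1" there was meant to be Class1; the helper names shared_names and single_classes are only illustrative.

```r
# Rebuild the example data (assumption: "Class 1" in the post means "Class1")
DF <- data.frame(
  Name  = c("a", "a", "b", "b", "d", "d", "e", "e"),
  Class = c("Class1", "Class4", "Class2", "Class1",
            "Class3", "Class5", "Class4", "Class2"),
  stringsAsFactors = TRUE
)

# Cross-tabulate: one row per class, one column per name, cells are 0/1 counts
xt <- xtabs(~ Class + Name, data = DF)

# Show membership as symbols: "." for absent (0), "+" for present (1)
print(symnum(xt, cutpoints = 0:2/2, symbols = c(".", "+")))

# Names listed per class, and the classes that hold exactly one name
shared_names   <- lapply(split(as.character(DF$Name), DF$Class), sort)
single_classes <- names(which(rowSums(xt) == 1))
```

The table answers both parts of the question at a glance: a row with several "+" marks is an overlap, and a row with a single "+" is a class with one unique name.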
[R] Request for moderated posting
This message has been passed on to the list moderators for approval. This is because you are not a subscriber to this list or the related squid-users list. If the message is relevant to the squid-dev mailing list, one of the moderators will accept it and it will be forwarded to the list automatically.

The squid-dev list is restricted to discussions about the development of Squid only. Configuration and usage questions are not accepted except on features not yet available in the current STABLE version of Squid.

If you wish to participate in the squid-dev mailing list, please subscribe by first sending a presentation of yourself, and of the areas of Squid development you are interested in helping with, to squid-...@squid-cache.org, and then a subscribe request as described below. Alternatively, if you are looking for general help with using or configuring Squid, subscribe to the squid-users list instead.

When you have introduced yourself and your intentions to the developers, you may request a subscription by sending an email to squid-dev-subscr...@squid-cache.org with no subject or body. If you would like to subscribe an email address other than the one you are posting from, or would like to see what other subscription options there are, send an email to squid-dev-h...@squid-cache.org to get help on doing this.

Please remember that squid-dev is aimed at Squid developers. If you want to contribute ideas and code, this list is for you. If you only want to track development, please use the public archives.

Thanks!
The Squid Developers
[R] Multiple imputation on subgroups
Dear R-users,

I want to multiply impute missing scores, but only for a few subgroups in my data (variable 'subgroups': impute only for subgroups 2 and 3). Does anyone know how to do this in MICE? This is my script for the multiple imputation:

    imp <- mice(data, m=20, predictorMatrix=pred, post=post,
                method=c("", "", "", "", "", "norm", "norm", "norm", "norm", "norm", "norm"),
                maxit=20)

The final analysis should be on the dataset as a whole, so with observed and imputed values for subgroups 2 and 3, and with observed values only (and missing scores) for subgroup 1.

Thanks.

--
View this message in context: http://r.789695.n4.nabble.com/Multiple-imputation-on-subgroups-tp3889664p3889664.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] pos in panel.text
Hello,

To check the possible values of the pos parameter you need to review text(), as indicated in the lattice help for panel.text(). The help for text() says:

    pos: a position specifier for the text. If specified this overrides
    any adj value given. Values of 1, 2, 3 and 4, respectively indicate
    positions below, to the left of, above and to the right of the
    specified coordinates.

So, the coordinates should be x=4, y=2 for your case. Additionally, you can use the ltext() function, which is explained in the same panel.text() help.

Regards,
Carlos Ortega
www.qualityexcellence.es

2011/10/10 Allan Sikk a.s...@ucl.ac.uk
> Hi,
>
> I need to vary the placement of data labels, but I cannot assign a vector to the pos option. Any vectors work fine with cex, for example. What could be the problem here?
>
>     xyplot(Npop~Narea, data=size,
>            scales=list(x=list(log=TRUE), y=list(log=TRUE)),
>            xlab=expression(N[A]), ylab=expression(N[P]),
>            panel=function(...) {
>              panel.lines(..., type="l", col.line="black", lwd=.25)
>              panel.xyplot(..., type="p", col="black", cex=.5, pch=20)
>              panel.text(..., lab=t, cex=.5, pos=c(4,2))
>            })
>
> Many thanks,
> Allan
[R] variable scope for deltavar function from emdbook
Dear all,

I want to use the deltavar() function from emdbook. I can use it directly from the command terminal, but within a function it behaves strangely. Working example:

    library(emdbook)
    fn <- function() {
      browser()
      y <- 2
      print(deltavar(y*b2, meanval=c(b2=3), Sigma=1))
    }
    x <- 2
    print(deltavar(x*b1, meanval=c(b1=3), Sigma=1))
    y <- 3
    fn()

Running this returns 4 for the first function call, which is fine. For the call of deltavar in fn(), I get 9, i.e. the function uses y <- 3 instead of the local y <- 2. If the global y <- 3 is commented out, deltavar returns an error. So why is the function not using the local variable, and how do I make it use it?

Many thanks
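The symptom described above (9 instead of 4) is what one would see if the captured expression is evaluated in an environment that does not include the caller's frame. The sketch below reproduces that mechanism in base R. toy_deltavar is a hypothetical stand-in written for this illustration, not emdbook's implementation, and its env argument is an assumption, not a documented deltavar parameter.

```r
# Hypothetical stand-in for a delta-method helper; NOT emdbook's code.
# It differentiates `fun` with respect to one named parameter and returns
# (d fun / d param)^2 * Sigma, evaluated at `meanval`.
toy_deltavar <- function(fun, meanval, Sigma, env = NULL) {
  expr <- substitute(fun)              # capture the unevaluated expression
  grad <- D(expr, names(meanval))      # symbolic derivative (one parameter only)
  # With env = NULL, free variables such as y are looked up from this
  # function's own frame and then from where toy_deltavar was defined,
  # never from the caller's frame.
  enclos <- if (is.null(env)) environment() else env
  g <- eval(grad, as.list(meanval), enclos)
  g^2 * Sigma
}

y <- 3                                 # y at the level where the helper is defined

fn_default <- function() {
  y <- 2                               # local y, invisible to the default lookup
  toy_deltavar(y * b2, meanval = c(b2 = 3), Sigma = 1)
}

fn_explicit <- function() {
  y <- 2
  # Hand over the calling frame so that the local y is found first
  toy_deltavar(y * b2, meanval = c(b2 = 3), Sigma = 1, env = environment())
}

res_default  <- fn_default()           # derivative evaluated with the outer y
res_explicit <- fn_explicit()          # derivative evaluated with the local y
```

If deltavar behaves like the default branch here, a workaround that needs no change to the package is to make the local value visible where the lookup happens, or to substitute the number into the expression before the call; which of these fits depends on how deltavar captures its argument, which I have not checked.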
Re: [R] pos in panel.text
Hi,

OK. Have you tried running your code without the pos parameter? Based on the help, pos should be just *one* value. pos offers a finer adjustment of the text, but in your case the first thing to check is that the text label is drawn at the specified coordinates.

Besides pos, you can try adj, which accepts two values (between 0 and 1).

Regards,
Carlos Ortega
www.qualityexcellence.es

2011/10/10 Allan Sikk a.s...@ucl.ac.uk
> Here's the code. The problem seems to be specific to lattice, as I can easily use a vector with pos in plot.
>
>     trellis.device(, width=600, height=400)
>     xyplot(Npop~Narea,
>            scales=list(x=list(log=TRUE, at=my.at, labels=formatC(my.at, big.mark=",", format="d")),
>                        y=list(log=TRUE, at=c(1,10,100,1000,1,10,100))),
>            panel=function(...) {
>              panel.xyplot(..., type="p", col="black", cex=.5, pch=20)
>              panel.text(x=log10(Narea), y=log10(Npop), lab=t, cex=.5, pos=c(4,2))
>            })
>
> On 10/10/2011 13:58, Carlos Ortega wrote:
>> Hi Allan,
>>
>> Could you please send the modified code, which should now include the x and y coordinates? I do not fully understand the error message you get.
>>
>> Regards,
>> Carlos Ortega
>> www.qualityexcellence.es
[R] Importing from Fortan
Hello all,

How do I import a Fortran file (F3.1) into one column in R? I've tried this (I'm a total beginner, as you can see):

    FortranData <- read.fwf("C:\\Users\\format3_1.txt", rep(3,20))

    Warning message:
    In readLines(file, n = thisblock) :
      incomplete final line found on 'C:\Users\format3_1.txt'

    FortranData
       V1  V2  V3  V4  V5  V6  V7  V8  V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20
    1 2.2 3.3 4.2 2.1 3.4 2.3 2.3 4.2 2.1 3.4 2.3 2.3 4.2 2.1 3.4 2.3 2.3 4.2 2.1 3.4

As you can see, each datum gets imported into a separate column, whereas I'd like to have everything stored under V1. I'm also unsure as to what the error message means.

Thanks in advance for your help!
Léa

--
View this message in context: http://r.789695.n4.nabble.com/Importing-from-Fortan-tp3889947p3889947.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Linear programming problem, RGPLK - no feasible solution.
In my post at https://stat.ethz.ch/pipermail/r-help/2011-October/292019.html I included an undefined term ej. The problem code should be as follows. It seems like a simple linear programming problem, but for some reason my code is not finding the solution.

    obj <- c(rep(0,3),1)
    col1 <- c(1,0,0,1,0,0,1,-2.330078923,0)
    col2 <- c(0,1,0,0,1,0,1,-2.057855981,0)
    col3 <- c(0,0,1,0,0,1,1,-1.885177032,0)
    col4 <- c(-1,-1,-1,1,1,1,0,0,1)
    mat <- cbind(col1, col2, col3, col4)
    dir <- c(rep("<=", 3), rep(">=", 3), rep("==", 2), ">=")
    rhs <- c(rep(0, 7), 1, 0)
    sol <- Rglpk_solve_LP(obj, mat, dir, rhs, types = NULL, max = FALSE,
                          bounds = c(-100,100), verbose = TRUE)

The R output says there is no feasible solution, but e.g. (-2.3756786, 0.3297676, 2.0459110, 2.3756786) is feasible. The output is

    GLPK Simplex Optimizer, v4.42
    9 rows, 4 columns, 19 non-zeros
          0: obj = 0.0e+000  infeas = 1.000e+000 (2)
    PROBLEM HAS NO FEASIBLE SOLUTION

One other thing, a possible bug: if I run this code with dir shorter than it should be, R crashes. My version of R is 2.131.56322.0, and I'm running it on Windows 7.

Regards,
Gareth
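One way to narrow this down without the solver is to check the quoted point against each constraint row by hand. The sketch below does that in base R. The comparison operators are an assumption: they are the only assignment of <=, >= and == under which the quoted point satisfies all nine rows. check_rows is a helper written for this note, not part of Rglpk.

```r
# Constraint data as posted; operators assumed as described in the note above
col1 <- c(1, 0, 0, 1, 0, 0, 1, -2.330078923, 0)
col2 <- c(0, 1, 0, 0, 1, 0, 1, -2.057855981, 0)
col3 <- c(0, 0, 1, 0, 0, 1, 1, -1.885177032, 0)
col4 <- c(-1, -1, -1, 1, 1, 1, 0, 0, 1)
mat <- cbind(col1, col2, col3, col4)
dir <- c(rep("<=", 3), rep(">=", 3), rep("==", 2), ">=")
rhs <- c(rep(0, 7), 1, 0)

# Candidate point quoted in the post
x <- c(-2.3756786, 0.3297676, 2.0459110, 2.3756786)

# Compare every row of mat %*% x with its right-hand side; the tolerance
# allows for the point being given to only seven decimals
check_rows <- function(mat, dir, rhs, x, tol = 1e-6) {
  lhs <- as.vector(mat %*% x)
  ok <- ifelse(dir == "<=", lhs <= rhs + tol,
        ifelse(dir == ">=", lhs >= rhs - tol,
               abs(lhs - rhs) <= tol))
  data.frame(lhs = lhs, dir = dir, rhs = rhs, ok = ok)
}

rows      <- check_rows(mat, dir, rhs, x)
rows_ok   <- all(rows$ok)     # do the nine constraint rows hold?
nonneg_ok <- all(x >= 0)      # would the point survive a lower bound of 0?
```

The rows hold but the point has a negative first coordinate. As far as I recall from the Rglpk documentation, variables default to the range 0 to Inf and the bounds argument is expected as a list with lower and upper components (each with ind and val) rather than a plain vector, so it may be worth checking ?Rglpk_solve_LP to see whether c(-100,100) is actually being applied.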
Re: [R] pos in panel.text
Hi Allan,

Could you please send the modified code, which should now include the x and y coordinates? I do not fully understand the error message you get.

Regards,
Carlos Ortega
www.qualityexcellence.es

2011/10/10 Allan Sikk a.s...@ucl.ac.uk
> Thanks, Carlos,
>
> Tried that, but no success, still getting this error message:
>
>     Warning messages:
>     1: In if (pos == 1) { :
>       the condition has length > 1 and only the first element will be used
>     2: In if (pos == 2) { :
>       the condition has length > 1 and only the first element will be used
>
> Thanks,
> Allan
>
> --
> Dr Allan Sikk
> Lecturer in Baltic Politics
> University College London, School of Slavonic and East European Studies
> 16 Taviton St, London WC1H 0BW, United Kingdom
> tel: +44 (0)20 7679 4872
> http://www.homepages.ucl.ac.uk/~tjmsasi/
>
> Latest research:
> - 'Newness as a Winning Formula for New Political Parties', /Party Politics/, forthcoming.
> - 'Parties and Populism', Centre for European Politics, Security and Integration (CEPSI) Working Paper (2010), http://bit.ly/partiespopulism.
> - (with Rein Taagepera) 'Parsimonious Model for Predicting Mean Cabinet Duration on the Basis of Electoral System', /Party Politics/, 16(2), 2010, 261-81.
> - 'Force Mineure? The Effects of the EU on Party Politics in a Small Country: The Case of Estonia', /Journal of Communist Studies and Transition Politics/, 25(4), 2009, 468-90.
> - (with Rune Andersen) 'Without a Tinge of Red: The Fall and Rise of Estonian Greens, 1987-2007', /Journal of Baltic Studies/, 40(3), 2009, 349-73.
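The warnings quoted in this thread suggest that the lattice version in use tests pos with a scalar if(), so a vector is not handled. A possible workaround, sketched below, is to call panel.text() once per distinct pos value. The data frame size here is made up for illustration, pos_groups is a helper written for this note, and newer lattice versions may accept a vector directly (I have not verified that).

```r
# Group point indices by the pos value wanted for each label
# (1 = below, 2 = left, 3 = above, 4 = right), recycling a short pattern.
pos_groups <- function(n, pos_pattern) {
  pos <- rep_len(pos_pattern, n)
  split(seq_len(n), pos)
}

# Hypothetical data standing in for the poster's `size` data frame
size <- data.frame(Narea = c(10, 100, 1000, 10000),
                   Npop  = c(5, 40, 300, 2500),
                   lab   = c("A", "B", "C", "D"))

groups <- pos_groups(nrow(size), c(4, 2))

if (requireNamespace("lattice", quietly = TRUE)) {
  plt <- lattice::xyplot(
    Npop ~ Narea, data = size,
    scales = list(x = list(log = TRUE), y = list(log = TRUE)),
    panel = function(x, y, subscripts, ...) {
      lattice::panel.xyplot(x, y, type = "p", col = "black", cex = .5, pch = 20)
      g <- pos_groups(length(x), c(4, 2))
      for (k in names(g)) {
        i <- g[[k]]
        # One call per pos value, so pos is always a single number
        lattice::panel.text(x[i], y[i], labels = size$lab[subscripts][i],
                            cex = .5, pos = as.integer(k))
      }
    }
  )
  # print(plt) would draw the plot on the active graphics device
}
```

Splitting by pos keeps the labels attached to the right points because the same index vector is used for x, y and the labels.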
Re: [R] Handling Time in R
Thanks a lot for your answer. What I do not really understand with R is how the wrapper functions work. As an example, the code below

    TimeDiffInSeconds <- diff((ISOdate(timeMatrix[,1], timeMatrix[,2], timeMatrix[,3], timeMatrix[,4], timeMatrix[,5], timeMatrix[,6])), units="secs");

returns an error, even when I change it to

    TimeDiffInSeconds <- difftime((ISOdate(timeMatrix[,1], timeMatrix[,2], timeMatrix[,3], timeMatrix[,4], timeMatrix[,5], timeMatrix[,6])), units="secs");

I would like to thank you in advance for your help.

B.R.
Alex

From: Jeff Newmiller jdnew...@dcn.davis.ca.us
Cc: R-help@r-project.org
Sent: Monday, October 10, 2011 12:23 PM
Subject: Re: [R] Handling Time in R

> No, read ?difftime and look at as.double. There is a units parameter that you must set if you want predictable results.
>
> Jeff Newmiller, Research Engineer (Solar/Batteries/Software/Embedded Controllers)
> DCN: jdnew...@dcn.davis.ca.us
> Sent from my phone. Please excuse my brevity.
>
>> Do you mean something like that?
>>
>>     as.double(diff(c(ISOdatetime(2011,6,1,11,59,1.09), ISOdatetime(2011,6,5,11,59,1.09))), length=20)
>>     [1] 345600
>>
>> From: Jeff Newmiller jdnew...@dcn.davis.ca.us
>> Cc: R-help@r-project.org
>> Sent: Monday, October 10, 2011 10:42 AM
>> Subject: Re: [R] Handling Time in R
>>
>>> Difftime doesn't report things. When you print it, it automatically selects an appropriate human-readable unit to display in, but that does not change its internal representation. If you must convert to seconds, you can do so using the as.double generic (as.double.difftime) with a units parameter.
>>>
>>>> Thanks a lot. That helped. One thing now is to have difftime(y,x) always report seconds. There are times when there is a change in the day, and then the diff will report a few days' difference. How can it always report only seconds?
>>>>
>>>> I would like to thank you in advance for your help.
>>>> B.R.
>>>> Alex
>>>>
>>>> From: jim holtman jholt...@gmail.com
>>>> Cc: R-help@r-project.org
>>>> Sent: Friday, October 7, 2011 5:34 PM
>>>> Subject: Re: [R] Handling Time in R
>>>>
>>>>> ?ISOdatetime
>>>>>
>>>>>     x <- ISOdatetime(2011,10,6,16,23,30.539)
>>>>>     str(x)
>>>>>      POSIXct[1:1], format: "2011-10-06 16:23:30"
>>>>>     y <- ISOdatetime(2011,10,6,16,23,30.939)
>>>>>     difftime(y,x)
>>>>>     Time difference of 0.399 secs
>>>>>
>>>>>> Dear all,
>>>>>>
>>>>>> I would like to ask your help regarding handling time stamps in R. I think first I need a reference to read about their logic and how I should handle them. For example, this is a struct I have:
>>>>>>
>>>>>>     str(MyStruct$TimeStamps)
>>>>>>      num [1:100, 1:6] 2011 2011 2011 2011 2011 ...
>>>>>>     MyStruct$TimeStamps[1,]
>>>>>>     [1] 2011.000   10.000    6.000   16.000   23.000   30.539
>>>>>>
>>>>>> The last field contains seconds.milliseconds. How can I, for example, make calculations with time stamps, like seeing whether MyStruct$TimeStamps[1,] and MyStruct$TimeStamps[2,] differ by more than 300 milliseconds, or whether 3 days have passed?
>>>>>>
>>>>>> I would like to thank you in advance for your suggestions.
>>>>>> B.R.
>>>>>> Alex
>>>>>
>>>>> --
>>>>> Jim Holtman
>>>>> Data Munger Guru
>>>>> What is the problem that you are trying to solve?
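To tie the advice in this thread together, here is a minimal base R sketch that goes from a six-column numeric matrix of time stamps to differences in seconds. The matrix values are made up in the shape described in the original question, and tz = "UTC" is an assumption chosen so that daylight-saving changes do not affect the result.

```r
# Time stamps as a numeric matrix: year, month, day, hour, min, sec
timeMatrix <- rbind(
  c(2011, 10,  6, 16, 23, 30.539),
  c(2011, 10,  6, 16, 23, 30.939),
  c(2011, 10, 10, 16, 23, 30.939)
)

# Build POSIXct values column by column (ISOdatetime keeps fractional seconds)
stamps <- ISOdatetime(timeMatrix[, 1], timeMatrix[, 2], timeMatrix[, 3],
                      timeMatrix[, 4], timeMatrix[, 5], timeMatrix[, 6],
                      tz = "UTC")

# diff() returns a difftime whose display unit is chosen automatically;
# as.numeric() with an explicit units argument always yields seconds
gaps_sec <- as.numeric(diff(stamps), units = "secs")

# difftime() itself needs two time arguments; units can be fixed there too
first_gap <- difftime(stamps[2], stamps[1], units = "secs")

# The two checks asked about in the original question
more_than_300ms <- gaps_sec > 0.3
more_than_3days <- gaps_sec > 3 * 24 * 3600
```

This also explains the second error reported at the top of this message: difftime() takes two time arguments, so passing it a single vector of times fails regardless of the units setting.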
Re: [R] help with statistics in R - how to measure the effect of users in groups
Assuming your data are in a data frame, yourdat, as:

    User Group Value
    u1   1     10
    u2   2     5
    u3   3     NA
    ...(etc)

where Group is **explicitly coerced to be a factor**, then you want the User x Group interaction, obtained from

    lm(Value ~ Group*User, data = yourdat)

However,
a) you'll get some kind of warning message if not all Group x User combinations are present in the data;
b) moreover, no statistics can be calculated if there are no replicates of User x Group combinations.

If you do not know why either of these is the case, get local help or study any linear models (regression) text or online tutorial, as these last issues have nothing to do with R.

-- Bert

On Mon, Oct 10, 2011 at 3:48 AM, gj gaw...@gmail.com wrote:
> Thanks Petr. I will try it on the real data. But that will only show whether the groups are different or not. Is there any way I can test if the users are different when they are in different groups?
>
> Regards
> Gawesh
>
> On Mon, Oct 10, 2011 at 11:17 AM, Petr PIKAL petr.pi...@precheza.cz wrote:
>>> Hi Petr,
>>>
>>> It's not an equation. It's my mistake; the * are meant to be field separators for the example data. I should have just used blank spaces, as follows:
>>>
>>> users Group1 Group2 Group3
>>> u1    10     5      N/A
>>> u2    6      N/A    4
>>> u3    5      2      3
>>>
>>> Regards
>>> Gawesh
>>
>> OK. You shall transform your data to long format to use lm:
>>
>>     test <- read.table("clipboard", header=T, na.strings="N/A")
>>     test.m <- melt(test)
>>     Using users as id variables
>>     fit <- lm(value~variable, data=test.m)
>>     summary(fit)
>>
>>     Call:
>>     lm(formula = value ~ variable, data = test.m)
>>
>>     Residuals:
>>        1    2    3    4    6    8    9
>>      3.0 -1.0 -2.0  1.5 -1.5  0.5 -0.5
>>
>>     Coefficients:
>>                    Estimate Std. Error t value Pr(>|t|)
>>     (Intercept)       7.000      1.258   5.563  0.00511 **
>>     variableGroup2   -3.500      1.990  -1.759  0.15336
>>     variableGroup3   -3.500      1.990  -1.759  0.15336
>>     ---
>>     Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>
>>     Residual standard error: 2.179 on 4 degrees of freedom
>>       (2 observations deleted due to missingness)
>>     Multiple R-squared: 0.525, Adjusted R-squared: 0.2875
>>     F-statistic: 2.211 on 2 and 4 DF,  p-value: 0.2256
>>
>> No difference among groups, but I am not sure if this is the correct way to evaluate.
>>
>>     library(ggplot2)
>>     p <- ggplot(test.m, aes(x=variable, y=value, colour=users))
>>     p + geom_point()
>>
>> There is some sign that user3 has the lowest value in each group. However, there is not enough data to include users in the fit.
>>
>> Regards
>> Petr
>>
>>> On Mon, Oct 10, 2011 at 9:32 AM, Petr PIKAL petr.pi...@precheza.cz wrote:
>>>> Hi
>>>>
>>>> I do not understand much about your equations. I think you shall look at Practical Regression and Anova Using R by J. Faraway. Having a data frame DF with columns users, groups, results, you could do
>>>>
>>>>     fit <- lm(results~groups, data = DF)
>>>>
>>>> Regards
>>>> Petr
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm a newbie to R. My knowledge of statistics is mostly self-taught. My problem is how to measure the effect of users in groups. I can calculate a particular attribute for a user in a group. But my hypothesis is that the users' attributes are not independent of each other and that a user's attribute depends on the group, i.e. that a user's behaviour changes based on the group.
>>>>>
>>>>> Let me give an example:
>>>>>
>>>>> users*Group 1*Group 2*Group 3
>>>>> u1*10*5*n/a
>>>>> u2*6*n/a*4
>>>>> u3*5*2*3
>>>>>
>>>>> For example, I want to be able to prove that u1's behaviour is different in Group 1 than in other groups, and the particular thing about Group 1 is that users in Group 1 tend to have a higher value of the attribute under measurement.
>>>>>
>>>>> Hence, can I use R to test my hypothesis? I'm willing to learn; so if this is very simple, just point me in the direction of any online resources about it. At the moment, I don't even know how to define this class of problems. That will be a start.
>>>>>
>>>>> Regards
>>>>> Gawesh
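The points made in this thread can be checked on the three-user example using base R only. The sketch below reshapes the wide table with stack() instead of melt() (a substitution, so that no extra package is needed) and fits the group-only, additive and interaction models. Column names value and group are my own choices.

```r
# Example data from the thread, in wide format (NA where the post has N/A)
test <- data.frame(users  = c("u1", "u2", "u3"),
                   Group1 = c(10, 6, 5),
                   Group2 = c(5, NA, 2),
                   Group3 = c(NA, 4, 3))

# Wide to long: stack() returns the columns `values` and `ind`
long <- data.frame(users = factor(rep(test$users, times = ncol(test) - 1)),
                   stack(test[, -1]))
names(long) <- c("users", "value", "group")

# Group effect only, as in Petr's reply; rows with NA are dropped by lm()
fit_group <- lm(value ~ group, data = long)
coef(fit_group)   # intercept = Group1 mean, the others = differences from it

# Additive user + group model
fit_add <- lm(value ~ group + users, data = long)

# User-by-group interaction: with one observation per cell there are no
# residual degrees of freedom, so nothing can be tested
fit_int <- lm(value ~ group * users, data = long)
df.residual(fit_int)
```

With a single measurement per user and group, the interaction model reproduces the data exactly, which is the practical meaning of point b) above: replicates within User x Group cells are needed before any user-specific group effect can be assessed.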
Re: [R] help with statistics in R - how to measure the effect of users in groups
I should have added...

If your design is not nearly balanced, main effects and interactions will not have any natural interpretation, because they will be (partially) confounded. (I realize "nearly" is not a very useful characterization, but I do not know a better one, as it probably depends on the scientific context of your data.)

Again, if you do not know what this means, get statistical help as I previously suggested. Or you might want to try the stats.stackexchange.com website.

-- Bert

On Mon, Oct 10, 2011 at 7:06 AM, Bert Gunter bgun...@gene.com wrote:
> Assuming your data are in a data frame, yourdat, as:
>
>     User Group Value
>     u1   1     10
>     u2   2     5
>     u3   3     NA
>     ...(etc)
>
> where Group is **explicitly coerced to be a factor**, then you want the User x Group interaction, obtained from
>
>     lm(Value ~ Group*User, data = yourdat)
>
> However,
> a) you'll get some kind of warning message if not all Group x User combinations are present in the data;
> b) moreover, no statistics can be calculated if there are no replicates of User x Group combinations.
>
> If you do not know why either of these is the case, get local help or study any linear models (regression) text or online tutorial, as these last issues have nothing to do with R.
>
> -- Bert
Re: [R] help with statistics in R - how to measure the effect of users in groups
Groups are different treatments given to Users for your Outcome (measurement) of interest. Take this idea forward and you will have an answer.

Anupam.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter
Sent: Monday, October 10, 2011 7:36 PM
To: gj
Cc: r-help@r-project.org
Subject: Re: [R] help with statistics in R - how to measure the effect of users in groups

Assuming your data are in a data frame, yourdat, as:

    User Group Value
    u1   1     10
    u2   2     5
    u3   3     NA
    ...(etc)

where Group is **explicitly coerced to be a factor,** then you want the User x Group interaction, obtained from

    lm(Value ~ Group*User, data = yourdat)

However, you'll get some kind of warning message if
a) Not all Group x User combinations are present in the data
b) Moreover, no statistics can be calculated if there are no replicates of User x Group combinations.

If you do not know why either of these is the case, get local help or study any linear models (regression) text or online tutorial, as these last issues have nothing to do with R.

-- Bert

On Mon, Oct 10, 2011 at 3:48 AM, gj gaw...@gmail.com wrote:

Thanks Petr. I will try it on the real data. But that will only show whether the groups are different or not. Is there any way I can test if the users are different when they are in different groups?

Regards
Gawesh

On Mon, Oct 10, 2011 at 11:17 AM, Petr PIKAL petr.pi...@precheza.cz wrote:

Hi Petr,

It's not an equation. It's my mistake; the * are meant to be field separators for the example data. I should have just used blank spaces as follows:

    users Group1 Group2 Group3
    u1    10     5      N/A
    u2    6      N/A    4
    u3    5      2      3

Regards
Gawesh

OK. You should transform your data to long format to use lm:

    test <- read.table("clipboard", header=T, na.strings="N/A")
    test.m <- melt(test)
    Using users as id variables
    fit <- lm(value~variable, data=test.m)
    summary(fit)

    Call:
    lm(formula = value ~ variable, data = test.m)

    Residuals:
       1    2    3    4    6    8    9
     3.0 -1.0 -2.0  1.5 -1.5  0.5 -0.5

    Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
    (Intercept)       7.000      1.258   5.563  0.00511 **
    variableGroup2   -3.500      1.990  -1.759  0.15336
    variableGroup3   -3.500      1.990  -1.759  0.15336
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 2.179 on 4 degrees of freedom
      (2 observations deleted due to missingness)
    Multiple R-squared: 0.525,  Adjusted R-squared: 0.2875
    F-statistic: 2.211 on 2 and 4 DF,  p-value: 0.2256

No difference among groups, but I am not sure if this is the correct way to evaluate.

    library(ggplot2)
    p <- ggplot(test.m, aes(x=variable, y=value, colour=users))
    p + geom_point()

There is some sign that user3 has the lowest value in each group. However, there is not enough data to include users in the fit.

Regards
Petr

On Mon, Oct 10, 2011 at 9:32 AM, Petr PIKAL petr.pi...@precheza.cz wrote:

Hi

I do not understand much about your equations. I think you should look at "Practical Regression and Anova using R" by J. Faraway.

Having a data frame DF with columns users, groups, results, you could do

    fit <- lm(results~groups, data = DF)

Regards
Petr

Hi,

I'm a newbie to R. My knowledge of statistics is mostly self-taught. My problem is how to measure the effect of users in groups. I can calculate a particular attribute for a user in a group. But my hypothesis is that the users' attributes are not independent of each other and that a user's attribute depends on the group, i.e. that a user's behaviour changes based on the group.

Let me give an example:

    users*Group 1*Group 2*Group 3
    u1*10*5*n/a
    u2*6*n/a*4
    u3*5*2*3

For example, I want to be able to prove that u1's behaviour is different in Group 1 than in other groups, and the particular thing about Group 1 is that users in Group 1 tend to have a higher value of the attribute under measurement.

Hence, can I use R to test my hypothesis? I'm willing to learn; so if this is very simple, just point me in the direction of any online resources about it. At the moment, I don't even know how to define this class of problems. That will be a start.

Regards
Gawesh
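The two warnings in Bert's message (missing combinations, no replicates) can be seen on the toy data from this thread. What follows is only a base-R sketch, not code posted by anyone in the thread: the object names (long, fit_grp, fit_add, fit_int) are made up for the illustration, and the long format is typed in by hand instead of being produced with melt().

```r
# Toy data from the thread, already in long format (one row per user x group cell).
long <- data.frame(
  users = factor(rep(c("u1", "u2", "u3"), times = 3)),
  group = factor(rep(c("Group1", "Group2", "Group3"), each = 3)),
  value = c(10, 6, 5,   5, NA, 2,   NA, 4, 3)
)

# Group effect only (the same model as in Petr's reply); lm() drops the NA rows.
fit_grp <- lm(value ~ group, data = long)

# Additive model: group and user main effects, no interaction.
fit_add <- lm(value ~ group + users, data = long)

# Interaction model: at most one observation per cell, so no degrees of
# freedom are left over to estimate the error variance.
fit_int <- lm(value ~ group * users, data = long)

coef(fit_grp)          # the intercept is the Group1 mean
df.residual(fit_add)   # residual degrees of freedom, additive fit
df.residual(fit_int)   # residual degrees of freedom, interaction fit
```

With one observation per user x group cell the interaction fit has no residual degrees of freedom, which appears to be the situation point (b) describes; replicated cells would be needed before the interaction could be tested.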
Re: [R] Importing from Fortan
On Oct 10, 2011, at 7:45 AM, Spartina wrote:

> Hello all,
>
> how do I import a Fortran file (f3.1) into one column in R? I've tried this (I'm a total beginner as you can see):
>
>     FortranData <- read.fwf("C:\\Users\\format3_1.txt", rep(3,20))
>     Warning message:
>     In readLines(file, n = thisblock) :
>       incomplete final line found on 'C:\Users\format3_1.txt'
>     FortranData
>        V1  V2  V3  V4  V5  V6  V7  V8  V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20
>     1 2.2 3.3 4.2 2.1 3.4 2.3 2.3 4.2 2.1 3.4 2.3 2.3 4.2 2.1 3.4 2.3 2.3 4.2 2.1 3.4
>
> As you can see, each datum gets imported into a separate column, whereas I'd like to have everything stored under V1.

Have you considered transposing the result? If this is a multiple-line file that you want to straighten out completely, you could first convert to a matrix, transpose, and then wrap in c() to get a vector of values.

> I'm also unsure as to what the error message means.

It's not an error. It's only a warning. It just means there is no end-of-line marker on the last line.

> Thanks in advance for your help!
> Léa

David Winsemius, MD
West Hartford, CT
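The convert, transpose, and c() recipe above can be sketched as follows. This is an illustration under stated assumptions: the file is a temporary stand-in written by the example itself (two lines of twenty F3.1 fields), and the names vals, txt, path, wide and one_column are invented here.

```r
# Stand-in for the F3.1 file: 20 values per line, 3 characters each, no separators.
vals <- c(2.2, 3.3, 4.2, 2.1, 3.4, 2.3, 2.3, 4.2, 2.1, 3.4,
          2.3, 2.3, 4.2, 2.1, 3.4, 2.3, 2.3, 4.2, 2.1, 3.4,
          1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0,
          2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0)
txt <- apply(matrix(sprintf("%3.1f", vals), nrow = 2, byrow = TRUE),
             1, paste, collapse = "")
path <- tempfile(fileext = ".txt")
writeLines(txt, path)   # writeLines() ends the last line with a newline

# Read 20 fixed-width fields per line, as in the original attempt.
wide <- read.fwf(path, widths = rep(3, 20))

# Matrix -> transpose -> c(): values come out in file order, line by line.
one_column <- c(t(as.matrix(wide)))
```

The transpose matters because c() unrolls a matrix column by column; without t() the values from different lines of the file would be interleaved.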
Re: [R] Multiple imputation on subgroups
An ad hoc method is to impute the missing scores for the whole data set, including subgroup 1, and then change the imputed scores in subgroup 1 back to NA.

Weidong Gu

On Mon, Oct 10, 2011 at 5:35 AM, Sarah s1327...@student.rug.nl wrote:
> Dear R-users,
>
> I want to multiply impute missing scores, but only for a few subgroups in my data (variable 'subgroups': only impute for subgroups 2 and 3). Does anyone know how to do this in MICE? This is my script for the multiple imputation:
>
>     imp <- mice(data, m=20, predictorMatrix=pred, post=post,
>                 method=c("", "", "", "", "", "norm", "norm", "norm", "norm", "norm", "norm"),
>                 maxit=20)
>
> The final analysis should be on the dataset as a whole, so with subgroups 2 and 3 with observed and imputed values, and for subgroup 1 with observed values only (and missing scores).
>
> Thanks.
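The "change back to NA" step of this ad hoc method might look like the sketch below. It deliberately does not call mice: the completed data set is faked by hand, and the data frame, column names and values are all hypothetical. With real output the same reset would presumably have to be repeated for each of the m completed data sets.

```r
# Hypothetical data: one score with missing values and a subgroup indicator.
original <- data.frame(subgroup = c(1, 1, 2, 2, 3, 3),
                       score    = c(5, NA, 7, NA, NA, 9))

# Stand-in for ONE completed data set (in practice produced by the imputation step).
completed <- original
completed$score[is.na(original$score)] <- c(5.5, 7.5, 8.5)

# Put NA back where the value was originally missing AND the row is in subgroup 1.
reset <- is.na(original$score) & original$subgroup == 1
completed$score[reset] <- NA
```

Rows in subgroups 2 and 3 keep their imputed values, while subgroup 1 ends up with observed values only.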
[R] calculate multiple means of one vector
Dear R-Users,

I have the following two vectors:

    data <- rnorm(40, 0, 2)
    positions <- c(3, 4, 5, 8, 9, 10, 20, 21, 22, 30, 31, 32)

Now I would like to calculate the mean of every chunk of data points (of the data vector) as defined by the positions vector. So I would like to get a vector with the mean of elements 3 to 5 of the data vector, 8 to 10, 20 to 22 and so on.

The gaps between the chunks are arbitrary. There is no pattern (meaning the gap from 5 to 8, 10 to 20, 22 to 30 etc.). But the chunks are always of length n (in this case 3).

Is there a convenient way to do this without using a for-loop?

thanks!
Re: [R] fast or space-efficient lookup?
Hi Ivo,

On Mon, Oct 10, 2011 at 10:58 AM, ivo welch ivo.we...@gmail.com wrote:
> hi steve---agreed...but is there any other computer language in which an expression in a [ . ] is anything except a tensor index selector?

Sure, it's a type specifier in Scala generics: http://www.scala-lang.org/node/113

There is something similar to the Scala usage in Haskell. Also, in MATLAB (ugh) it's not even a tensor selector (they use normal parens there).

But I'm not sure what that has to do w/ the price of tea in China.

With data.table, [ is still tensor-selector-like, though. You can just pass in another data.table to use as the keys to do your selection through the `i` argument (like selecting rows), which I guess will likely be your most common use case if you're moving to data.table (presumably you are trying to take advantage of its quickness over big-table-like objects). You can use the `j` param to further manipulate columns. If you pass in a data.table as `i`, it will add its columns to `j`.

I'll grant you that it is different from your standard rectangular object selection in R, but the motivation isn't so strange, as both the i and j params in normal calls to 'xxx[i,j]' are for selecting (ok, not manipulating) rows and columns on other rectangular-like objects, too.

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
[R] how to extent the improveProb for survival data
Dear R users,

The function improveProb in the rms library calculates NRI and IDI for predictions of a binary outcome. Has anyone extended its use to survival data?

Many thanks.

Yao Zhu
Department of Urology
Fudan University Shanghai Cancer Center
Shanghai, China
[R] How to draw 4 random weights that sum up to 1?
Hey list,

This might be a more general question and not that R-specific. Sorry for that.

I'm trying to draw a random vector of 4 numbers that sum up to 1. My first approach was something like

    a <- runif(1)
    b <- runif(1, max=1-a)
    c <- runif(1, max=1-a-b)
    d <- 1-a-b-c

but this kind of distorts the results, right? Would the following be a good approach?

    w <- sample(1:100, 4, replace=TRUE)
    w <- w/sum(w)

I'd prefer a general algorithm-kind of answer to a specific R function (if there is any). Although a function name would help too, if I can sourcedive.

Thanks in advance,
Alex
[R] Superposing mean line to xyplot
Dear R-users,

I'm using the lattice package and the function xyplot for the first time, so you will excuse me for my inexperience. I'm facing quite a simple problem but I'm having trouble solving it. I've read tons of old mails in the archives and looked at some slides from Deepayan Sarkar but still cannot get the point.

This is the context. I've got data on 9 microRNAs; each miRNA has been measured on three different arrays and on each array I have 4 replicates for each miRNA, which sums up to a total of 108 measurements. I suspect that measurements on the first array are systematically lower than the others, so I wanted to draw a line plot where each panel corresponds to a miRNA and each line corresponds to one of the four replicates (that is: the first replicate of miRNA A on array 1 must be connected to the first replicate of miRNA A on array 2 and so on), so that in each panel there are 4 series of three points connected by a line/segment. I've done this easily with lattice doing this:

    array = rep(c("A","B","C"),each = 36) # array replicate
    spot = rep(1:4,27) # miRNA replicate on each array
    miRNA = rep(rep(paste("miRNA",1:9,sep="."),each=4),3) # miRNA label
    exprs = rnorm(mean=2.8,n = 108) # intensity
    data = data.frame(miRNA,array,spot,exprs)
    xyplot(exprs ~ array|miRNA,data=data,type="b",groups=spot,xlab="Array",ylab = "Intensity",col="black",lty=2:5,scales = list(y = list(relation = "free")))

Now, I want to superpose on each panel another series of three points connected by a line, where each point represents the mean of the four replicates of the miRNA on each array, a sort of mean line. I've tried using the following, but it's not working as expected:

    xyplot(exprs ~ array|miRNA,data=array,type="b",groups=spot,xlab="Array",ylab = "Intensity",col="black",lty=2:5,scales = list(y = list(relation = "free")),
           panel = function(x,y,groups,subscripts){
             panel.xyplot(x,y,groups=groups,subscripts=subscripts)
             panel.superpose(x,y,panel.groups=panel.average,groups=groups,subscripts=subscripts)
           })

This is maybe a silly question and possibly there's a trivial way to do it, but I cannot figure it out.

Thanx for any help.
niccolò
Re: [R] fast or space-efficient lookup?
Ivo,

Also, perhaps FAQ 2.14 helps: "Can you explain further why data.table is inspired by A[B] syntax in base?"
http://datatable.r-forge.r-project.org/datatable-faq.pdf
And 2.15 and 2.16.

Matthew

Steve Lianoglou mailinglist.honey...@gmail.com wrote in message news:CAHA9McPQ4P-a2imjm=szgjfxyx0faw0j79fwq2e87dqkf9j...@mail.gmail.com...
> Hi Ivo,
>
> On Mon, Oct 10, 2011 at 10:58 AM, ivo welch ivo.we...@gmail.com wrote:
> hi steve---agreed...but is there any other computer language in which an expression in a [ . ] is anything except a tensor index selector?
>
> Sure, it's a type specifier in Scala generics: http://www.scala-lang.org/node/113
>
> There is something similar to the Scala usage in Haskell. Also, in MATLAB (ugh) it's not even a tensor selector (they use normal parens there).
>
> But I'm not sure what that has to do w/ the price of tea in China.
>
> With data.table, [ is still tensor-selector-like, though. You can just pass in another data.table to use as the keys to do your selection through the `i` argument (like selecting rows), which I guess will likely be your most common use case if you're moving to data.table (presumably you are trying to take advantage of its quickness over big-table-like objects). You can use the `j` param to further manipulate columns. If you pass in a data.table as `i`, it will add its columns to `j`.
>
> I'll grant you that it is different from your standard rectangular object selection in R, but the motivation isn't so strange, as both the i and j params in normal calls to 'xxx[i,j]' are for selecting (ok, not manipulating) rows and columns on other rectangular-like objects, too.
>
> -steve
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
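For readers who have not met the base-R "A[B] syntax" that the FAQ entry refers to, here is a small sketch. It uses base R only and nothing from data.table itself; the matrices A and B and the result names are invented for the illustration. Reading X[Y] in data.table as a keyed generalisation of this idea is my interpretation of the FAQ title, not something stated in this thread.

```r
# A small matrix with row names that can act as lookup keys.
A <- matrix(1:12, nrow = 3, dimnames = list(c("a", "b", "c"), NULL))

# B is a two-column matrix: each ROW of B is one (row, column) address in A.
B <- cbind(c(1, 3, 2), c(4, 1, 2))

# Indexing a matrix by a matrix picks one element per row of B:
# A[1, 4], A[3, 1] and A[2, 2].
picked <- A[B]

# Indexing by name: look up rows "c" and "a" (in that order) in column 1.
by_key <- A[c("c", "a"), 1]
```

In both calls the object inside [ ] is itself a table of addresses or keys, which seems to be the sense in which passing a data.table as `i` is still "selector-like".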
Re: [R] calculate multiple means of one vector
Hi:

Here's one approach:

    dat <- rnorm(40, 0, 2)
    positions <- matrix(c(3, 4, 5, 8, 9, 10, 20, 21, 22, 30, 31, 32), ncol = 3, byrow = TRUE)

    # Subdata
    t(apply(positions, 1, function(x) dat[x]))
              [,1]      [,2]       [,3]
    [1,] 0.5679765  1.429396  2.9050931
    [2,] 4.0878845 -2.569012  2.1209280
    [3,] 4.0295221 -2.659358 -1.3566887
    [4,] 1.3109707 -1.745255 -0.9462857

    # means of subdata
    apply(positions, 1, function(x) mean(dat[x]))
    [1]  1.63415529  1.21326693  0.00449164 -0.46019018

    # or
    colMeans(apply(positions, 1, function(x) dat[x]))

HTH,
Dennis

On Mon, Oct 10, 2011 at 7:56 AM, Martin Batholdy batho...@googlemail.com wrote:
> Dear R-Users,
>
> I have the following two vectors:
>
>     data <- rnorm(40, 0, 2)
>     positions <- c(3, 4, 5, 8, 9, 10, 20, 21, 22, 30, 31, 32)
>
> Now I would like to calculate the mean of every chunk of data points (of the data vector) as defined by the positions vector. So I would like to get a vector with the mean of elements 3 to 5 of the data vector, 8 to 10, 20 to 22 and so on.
>
> The gaps between the chunks are arbitrary. There is no pattern (meaning the gap from 5 to 8, 10 to 20, 22 to 30 etc.). But the chunks are always of length n (in this case 3).
>
> Is there a convenient way to do this without using a for-loop?
>
> thanks!
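Because every chunk has the same length n, a variant of the approach above is to index once and let matrix() do the chunking. This is a sketch rather than part of Dennis's answer; the seed and the names n, chunk_means, chunk_id and chunk_means2 are added for the illustration.

```r
set.seed(1)   # only so the example is repeatable
dat <- rnorm(40, 0, 2)
positions <- c(3, 4, 5, 8, 9, 10, 20, 21, 22, 30, 31, 32)
n <- 3        # chunk length

# matrix() fills column by column, so each column holds one chunk of n values.
chunk_means <- colMeans(matrix(dat[positions], nrow = n))

# Equivalent grouping approach: label every position with its chunk number.
chunk_id <- rep(seq_len(length(positions) / n), each = n)
chunk_means2 <- tapply(dat[positions], chunk_id, mean)
```

Both versions rely on positions listing the chunks one after another; the tapply() form would also extend to chunks of unequal length if chunk_id were built differently.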
Re: [R] help with statistics in R - how to measure the effect of users in groups
Hi Bert,

The real situation is like what you suggested: user x group interactions. The users can be in more than one group. In fact, the data that I am trying to analyse consist of users, online forums as groups, and the attribute being measured is the number of posts made by each user in a particular forum. My hypothesis is that the number of posts a user makes to a forum depends on the forum. For example, if the user is in a forum that is active, he contributes more than when he is in a forum that is less active. I guess there will be some users who contribute the same irrespective of the forum. I hope this makes sense.

Regards
Gawesh

On Mon, Oct 10, 2011 at 4:50 PM, Bert Gunter gunter.ber...@gene.com wrote:
> Yes, of course. But then one gets into additional problems with carryover effects, etc. Also, one then has a repeated measures problem (User is the experimental unit) and my previous advice is nonsense. Like you, I have no idea what his real situation is.
>
> -- Bert
>
> On Mon, Oct 10, 2011 at 8:39 AM, Anupam anupa...@gmail.com wrote:
> It is possible to give multiple treatments, one at a time, to the same pool of patients. You are correct that interactions may be important in this problem. I am only trying to help him frame the problem using an analogy.
>
> Anupam.
>
> From: Bert Gunter [mailto:gunter.ber...@gene.com]
> Sent: Monday, October 10, 2011 8:21 PM
> To: Anupam
> Cc: gj
> Subject: Re: [R] help with statistics in R - how to measure the effect of users in groups
>
> If that is the case, and each user can appear in only one group, there is no group x user interaction, the poster's question was nonsense, and one analyzes the group effect only, as originally shown.
>
> -- Bert
>
> On Mon, Oct 10, 2011 at 7:43 AM, Anupam anupa...@gmail.com wrote:
> Groups are different treatments given to Users for your Outcome (measurement) of interest. Take this idea forward and you will have an answer.
>
> Anupam.
Re: [R] How to draw 4 random weights that sum up to 1?
You probably want to generate data from a Dirichlet distribution. There are some functions in packages that will do this and give you more background, or you can just generate 4 numbers from an exponential (or gamma) distribution and divide them by their sum.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Alexander Engelhardt
Sent: Monday, October 10, 2011 10:11 AM
To: r-help
Subject: [R] How to draw 4 random weights that sum up to 1?

> Hey list,
>
> This might be a more general question and not that R-specific. Sorry for that.
>
> I'm trying to draw a random vector of 4 numbers that sum up to 1. My first approach was something like
>
>     a <- runif(1)
>     b <- runif(1, max=1-a)
>     c <- runif(1, max=1-a-b)
>     d <- 1-a-b-c
>
> but this kind of distorts the results, right? Would the following be a good approach?
>
>     w <- sample(1:100, 4, replace=TRUE)
>     w <- w/sum(w)
>
> I'd prefer a general algorithm-kind of answer to a specific R function (if there is any). Although a function name would help too, if I can sourcedive.
>
> Thanks in advance,
> Alex
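The "exponentials divided by their sum" recipe can be written in two lines. The function name rdirichlet_flat below is hypothetical, chosen for this sketch only; it covers just the flat case in which all Dirichlet parameters equal 1, which is uniform over the set of weight vectors summing to 1.

```r
set.seed(42)   # only so the example is repeatable

# Draw one weight vector of length k, uniformly over all non-negative
# vectors that sum to 1 (Dirichlet with all parameters equal to 1).
rdirichlet_flat <- function(k) {
  g <- rexp(k)      # rexp(k) is the same distribution as rgamma(k, shape = 1)
  g / sum(g)
}

w <- rdirichlet_flat(4)
sum(w)   # 1, up to floating-point error
```

Replacing rexp(k) with rgamma(k, shape = alpha) for a vector alpha would give the general Dirichlet case, if unequal concentration across the four weights is wanted.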
Re: [R] How to draw 4 random weights that sum up to 1?
On 10.10.2011 18:10, Alexander Engelhardt wrote:
> Hey list,
>
> This might be a more general question and not that R-specific. Sorry for that.
>
> I'm trying to draw a random vector of 4 numbers that sum up to 1. My first approach was something like
>
>     a <- runif(1)
>     b <- runif(1, max=1-a)
>     c <- runif(1, max=1-a-b)
>     d <- 1-a-b-c
>
> but this kind of distorts the results, right? Would the following be a good approach?
>
>     w <- sample(1:100, 4, replace=TRUE)
>     w <- w/sum(w)

Yes, although better combine both ways to

    w <- runif(4)
    w <- w / sum(w)

Uwe Ligges

> I'd prefer a general algorithm-kind of answer to a specific R function (if there is any). Although a function name would help too, if I can sourcedive.
>
> Thanks in advance,
> Alex
Re: [R] help with statistics in R - how to measure the effect of users in groups
OK. So my original advice and warnings are correct. However, now there is an additional wrinkle because your response is a count, which is not a continuous measurement. For this, you'll need glm(..., family = poisson) instead of lm(...), where the ... is the stuff I gave you before. A backup approach, if there aren't too many small counts (below about 10, say), is to take the square root of the counts and analyze that via lm().

In either approach, your interpretation becomes more difficult -- e.g. have you any experience with GLMs (generalized linear models)? Moreover, if there are large numbers of users -- e.g. dozens (and you may have hundreds or thousands) -- of course the interaction will be significant, but so what? For this you'll need to re-frame the question.

So given all this and what appears to be your relative ignorance of statistics, I strongly recommend that you get local statistical help. Or just forget about formal statistical analysis altogether and do some sensible plotting.

Finally, that's it for me on this. I will offer you no more advice.

-- Bert

On Mon, Oct 10, 2011 at 9:40 AM, gj gaw...@gmail.com wrote:
> Hi Bert,
>
> The real situation is like what you suggested: user x group interactions. The users can be in more than one group. In fact, the data that I am trying to analyse consist of users, online forums as groups, and the attribute being measured is the number of posts made by each user in a particular forum. My hypothesis is that the number of posts a user makes to a forum depends on the forum. For example, if the user is in a forum that is active, he contributes more than when he is in a forum that is less active. I guess there will be some users who contribute the same irrespective of the forum. I hope this makes sense.
>
> Regards
> Gawesh
>
> On Mon, Oct 10, 2011 at 4:50 PM, Bert Gunter gunter.ber...@gene.com wrote:
> Yes, of course. But then one gets into additional problems with carryover effects, etc.
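Bert's two suggestions for count data, glm(..., family = poisson) and the square-root fallback, can be tried on simulated data. Everything in the sketch below is an assumption made for illustration: the data frame posts, its columns, the two forum labels, the four replicate weeks and the Poisson rates are invented and are not the poster's data.

```r
set.seed(123)   # only so the example is repeatable

# Simulated post counts: 6 users x 2 forums x 4 replicate weeks.
posts <- expand.grid(user  = factor(paste0("u", 1:6)),
                     forum = factor(c("quiet", "busy"), levels = c("quiet", "busy")),
                     week  = 1:4)
# Assumption for the demo: everyone posts more in the busy forum.
posts$n <- rpois(nrow(posts), lambda = ifelse(posts$forum == "busy", 8, 3))

# Poisson regression with and without the forum x user interaction.
fit_main <- glm(n ~ forum + user, family = poisson, data = posts)
fit_pois <- glm(n ~ forum * user, family = poisson, data = posts)

# Likelihood-ratio comparison: does the interaction add anything?
cmp <- anova(fit_main, fit_pois, test = "Chisq")

# Backup approach from the message: square-root transform, then lm().
fit_sqrt <- lm(sqrt(n) ~ forum * user, data = posts)
```

The replicate weeks are what make the interaction testable here; with a single count per user and forum, the interaction model would fit the data exactly and leave nothing to test it against.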
Re: [R] Superposing mean line to xyplot
Hi:

Here's one way to do it, adding the latticeExtra package:

    array = rep(c("A","B","C"),each = 36) # array replicate
    spot = rep(1:4,27) # miRNA replicate on each array
    miRNA = rep(rep(paste("miRNA",1:9,sep="."),each=4),3) # miRNA label
    exprs = rnorm(mean=2.8,n = 108) # intensity
    dat = data.frame(miRNA,array,spot,exprs)

    library(latticeExtra)
    p0 <- xyplot(exprs ~ array|miRNA, data=dat, type="b", groups = spot,
                 xlab="Array", ylab = "Intensity", col="black", lty = 2:5,
                 scales = list(y = list(relation = "free")))
    p1 <- xyplot(exprs ~ array|miRNA, data=dat, type="a",
                 xlab="Array", ylab = "Intensity", col="red", lty = 1, lwd = 2,
                 scales = list(y = list(relation = "free")))
    p0 + p1

You can also write a panel function to do this if you wish.

HTH,
Dennis

2011/10/10 Niccolò Bassani biostatist...@gmail.com:
> Dear R-users,
>
> I'm using the lattice package and the function xyplot for the first time, so you will excuse me for my inexperience. I'm facing quite a simple problem but I'm having trouble solving it. I've read tons of old mails in the archives and looked at some slides from Deepayan Sarkar but still cannot get the point.
>
> This is the context. I've got data on 9 microRNAs; each miRNA has been measured on three different arrays and on each array I have 4 replicates for each miRNA, which sums up to a total of 108 measurements. I suspect that measurements on the first array are systematically lower than the others, so I wanted to draw a line plot where each panel corresponds to a miRNA and each line corresponds to one of the four replicates (that is: the first replicate of miRNA A on array 1 must be connected to the first replicate of miRNA A on array 2 and so on), so that in each panel there are 4 series of three points connected by a line/segment. I've done this easily with lattice doing this:
>
>     array = rep(c("A","B","C"),each = 36) # array replicate
>     spot = rep(1:4,27) # miRNA replicate on each array
>     miRNA = rep(rep(paste("miRNA",1:9,sep="."),each=4),3) # miRNA label
>     exprs = rnorm(mean=2.8,n = 108) # intensity
>     data = data.frame(miRNA,array,spot,exprs)
>     xyplot(exprs ~ array|miRNA,data=data,type="b",groups=spot,xlab="Array",ylab = "Intensity",col="black",lty=2:5,scales = list(y = list(relation = "free")))
>
> Now, I want to superpose on each panel another series of three points connected by a line, where each point represents the mean of the four replicates of the miRNA on each array, a sort of mean line. I've tried using the following, but it's not working as expected:
>
>     xyplot(exprs ~ array|miRNA,data=array,type="b",groups=spot,xlab="Array",ylab = "Intensity",col="black",lty=2:5,scales = list(y = list(relation = "free")),
>            panel = function(x,y,groups,subscripts){
>              panel.xyplot(x,y,groups=groups,subscripts=subscripts)
>              panel.superpose(x,y,panel.groups=panel.average,groups=groups,subscripts=subscripts)
>            })
>
> This is maybe a silly question and possibly there's a trivial way to do it, but I cannot figure it out.
>
> Thanx for any help.
> niccolò
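The panel-function route mentioned in the reply could look like the sketch below, which needs only lattice. It is one possible way to write it, not Dennis's or the original poster's code: the seed is added for repeatability, array and miRNA are made factors explicitly (an assumption, since newer versions of R no longer convert strings automatically in data.frame()), and the object name p is arbitrary.

```r
library(lattice)

set.seed(1)   # only so the example is repeatable
dat <- data.frame(
  miRNA = factor(rep(rep(paste("miRNA", 1:9, sep = "."), each = 4), 3)),
  array = factor(rep(c("A", "B", "C"), each = 36)),
  spot  = rep(1:4, 27),
  exprs = rnorm(108, mean = 2.8)
)

p <- xyplot(exprs ~ array | miRNA, data = dat, groups = spot,
            type = "b", col = "black", lty = 2:5,
            xlab = "Array", ylab = "Intensity",
            scales = list(y = list(relation = "free")),
            panel = function(x, y, ...) {
              # Replicate lines: groups and subscripts arrive through "..."
              panel.xyplot(x, y, ...)
              # Mean line: no groups passed, type = "a" joins the means of y at each x
              panel.xyplot(x, y, type = "a", col = "red", lty = 1, lwd = 2)
            })
```

Calling print(p) draws the nine panels. The design choice is to call panel.xyplot() twice, once with the grouping information and once without it, so the second call averages over all four replicates.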
[R] Merge Data by time stamps
Dear all,

I have some device measurements, and the time stamps I get from the device have the format below:

    MyStruct$TimeStamps[1,]
    [1] 2011.000   10.000    6.000   16.000   23.000   30.539

I can convert them easily with ISOdate() to a number and do the calculations I need. One of my problems is that I want to gather my measurements into piles of a given duration, let's say 5 minutes. Afterwards I will apply a function to these piles.

As the device is not super-precise, please find below the time needed for one operation to complete (in seconds):

    1.10 1.90 1.34 1.23 1.56 1.22 1.34

As you can see, I cannot say that 5 minutes of measurements corresponds to a fixed number X of consecutive measurements; the number differs. How can I ask R to accumulate the durations and, whenever 5 minutes of data have been gathered, split them off so that I can apply a function to them?

I would like to thank you in advance for your help.

B.R
Alex
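One way to read the question is: label each measurement with the 5-minute window it falls into, counted from the first time stamp, and then apply a function per window. The sketch below follows that reading. The time stamps after the first one, the measured values, the UTC time zone and all object names are made up for the illustration.

```r
# Hypothetical time stamps in the six-column layout (year, month, day, hour, min, sec).
stamps <- rbind(c(2011, 10, 6, 16, 23, 30.539),
                c(2011, 10, 6, 16, 24, 50.100),
                c(2011, 10, 6, 16, 27, 59.900),
                c(2011, 10, 6, 16, 28, 31.000),
                c(2011, 10, 6, 16, 33,  5.250))
value <- c(1.10, 1.90, 1.34, 1.23, 1.56)   # one measurement per time stamp

when <- ISOdatetime(stamps[, 1], stamps[, 2], stamps[, 3],
                    stamps[, 4], stamps[, 5], stamps[, 6], tz = "UTC")

# Window index: number of whole 5-minute (300 s) periods since the first stamp.
elapsed <- as.numeric(difftime(when, when[1], units = "secs"))
bin <- floor(elapsed / 300)

# Apply any function per window; sum() is just an example.
per_bin <- tapply(value, bin, sum)
```

split(value, bin) gives the raw piles as a list if the function to apply is more involved than a single summary. Windows could alternatively be aligned to the clock rather than to the first measurement; the choice made here is only an assumption.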
[R] multicore by(), like mclapply?
dear r experts---Is there a multicore equivalent of by(), just like mclapply() is the multicore equivalent of lapply()? if not, is there a fast way to convert a data.table into a list based on a column that lapply and mclapply can consume? advice appreciated...as always.

regards,
/iaw

Ivo Welch (ivo.we...@gmail.com)
Re: [R] package dlm: dlmForecast()
I haven't tried this, but I am pretty confident that using dlmFilter() with fictitious future values of the observations set to NA should do the job.

Hope this helps,
Giovanni Petris

On Sat, 2011-10-08 at 13:21 +0000, YuHong wrote:
> May I have a question about dlmForecast() function in the package 'dlm'? This function 'dlmForecast()' currently only deals with constant models. May anyone suggest on how to predict using non-constant model? Thanks a lot!
> Regards,
> Hong Yu
Re: [R] How to draw 4 random weights that sum up to 1?
On Oct 10, 2011, at 12:44 PM, Uwe Ligges wrote:
> On 10.10.2011 18:10, Alexander Engelhardt wrote:
>> Hey list,
>> This might be a more general question and not that R-specific. Sorry for that. I'm trying to draw a random vector of 4 numbers that sum up to 1. My first approach was something like
>>   a <- runif(1)
>>   b <- runif(1, max=1-a)
>>   c <- runif(1, max=1-a-b)
>>   d <- 1-a-b-c
>> but this kind of distorts the results, right? Would the following be a good approach?
>>   w <- sample(1:100, 4, replace=TRUE)
>>   w <- w/sum(w)
>
> Yes, although better combine both ways to
>   w <- runif(4)
>   w <- w / sum(w)

For the non-statisticians in the audience like myself who didn't know what that distribution might look like (it being difficult to visualize densities on your 3-dimensional manifold in 4-space), here is my effort to get an appreciation:

  M4 <- matrix(runif(40000), ncol=4)
  M4 <- M4/rowSums(M4)   # just a larger realization of Ligges' advice
  colMeans(M4)
  [1] 0.2503946 0.2499594 0.2492118 0.2504342
  plot(density(M4[,1]))
  lines(density(M4[,2]), col="red")
  lines(density(M4[,3]), col="blue")
  lines(density(M4[,4]), col="green")
  plot(density(rowSums(M4[,1:2])))
  plot(density(rowSums(M4[,1:3])))
  plot(density(rowSums(M4[,2:4])))
  # rather kewl results, noting that these are a reflection around 0.5 of the single vector densities.

> Uwe Ligges
>
>> I'd prefer a general algorithm-kind of answer to a specific R function (if there is any). Although a function name would help too, if I can sourcedive.

--
David Winsemius, MD
West Hartford, CT
Re: [R] multicore by(), like mclapply?
Hi Ivo,
My suggestion would be to pass lapply() (or mclapply()) only the indices. That should be fast, subsetting with data.table should also be fast, and then you do whatever computations you want. For example:

  require(data.table)
  DT <- data.table(x=rep(c("a","b","c"), each=3), y=c(1,3,6), v=1:9)
  setkey(DT, x)
  lapply(as.character(unique(DT[,x])), function(i) DT[i])

The DT[i] object is the subset of the data table you want. You can pass this to whatever function you need for your computations.

Hope this helps,
Josh

--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
Re: [R] multicore by(), like mclapply?
Package plyr has .parallel. Searching datatable-help for "multicore", say on Nabble here,
http://r.789695.n4.nabble.com/datatable-help-f2315188.html
yields three relevant posts and examples. Please check the wiki do's and don'ts to make sure you didn't fall into one of those traps, though (we don't know the data or the task, so we're just guessing):
http://rwiki.sciviews.org/doku.php?id=packages:cran:data.table

HTH
Matthew
Re: [R] Merge Data by time stamps
On Oct 10, 2011, at 1:28 PM, Alaios wrote:
> Dear all, I have some device measurements and the time stamps I get from it have the below format:
>   MyStruct$TimeStamps[1,]
>   [1] 2011.000 10.000 6.000 16.000 23.000 30.539
> I can convert them easily with ISOdate() to a number and do the calculations I need. One of my problems is that I want to gather my measurements to piles of duration (let's say) 5 minutes. Afterwards I will apply a function to these piles. As the device is not super-precise please find below the time needed for one operation to complete (in seconds).
>   1.10 1.90 1.34 1.23 1.56 1.22 1.34

Assuming I understand your presentation, and lacking R-coded examples and desired output on which to test:

  ?cumsum
  ?cut

> as you understand I can not say that 5 minutes measurements are specific to X consecutive measurements but differ. How I can ask from R to do the summation and whenever there is a 5 minute data set to split it so to apply it into a function? I would like to thank you in advance for your help

--
David Winsemius, MD
West Hartford, CT
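The cumsum()/cut() hint can be sketched as follows. This is only an illustration under stated assumptions: the durations are the seven values from the post repeated to cover more than 5 minutes, `values` is a made-up stand-in for the measurements, and mean() is a placeholder for whatever function is to be applied to each pile.

```r
# Hypothetical durations (seconds) of consecutive operations: the seven
# values quoted in the post, repeated so that they span more than 5 minutes.
durations <- rep(c(1.10, 1.90, 1.34, 1.23, 1.56, 1.22, 1.34), times = 100)
values    <- seq_along(durations)   # stand-in for the actual measurements

elapsed <- cumsum(durations)                     # running time since the start
breaks  <- seq(0, max(elapsed) + 300, by = 300)  # 300 s = 5 minutes
pile    <- cut(elapsed, breaks = breaks)         # 5-minute pile of each measurement

# apply a function to each pile; mean() is just a placeholder
pile_means <- tapply(values, pile, mean)
pile_sizes <- table(pile)
```

The number of measurements per pile is not fixed in advance; it falls out of where the cumulative time crosses each 300-second boundary.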
Re: [R] How to draw 4 random weights that sum up to 1?
As an interesting extension to David's post, try:

  M4.e <- matrix(rexp(40000, 1), ncol=4)

instead of the uniform, and rerun the rest of the code (note the limits on the x-axis). With 3 dimensions and the restriction, we can plot in 2 dimensions to compare:

  library(TeachingDemos)
  m3.unif <- matrix(runif(3000), ncol=3)
  m3.unif <- m3.unif/rowSums(m3.unif)
  m3.exp <- matrix(rexp(3000, 1), ncol=3)
  m3.exp <- m3.exp/rowSums(m3.exp)
  dev.new()
  triplot(m3.unif)
  dev.new()
  triplot(m3.exp)

Now compare the 2 plots on the density of the points near the corners.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111
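To put numbers on the comparison between normalised uniforms and normalised exponentials, here is a small base-R sketch. The sample size and seed are arbitrary assumptions; the one fact relied on is that independent exponentials divided by their sum follow a flat Dirichlet(1,1,1,1), which is the uniform distribution on the simplex.

```r
set.seed(42)   # arbitrary, for reproducibility
n <- 10000     # arbitrary number of weight vectors

# normalised uniforms
w_unif <- matrix(runif(4 * n), ncol = 4)
w_unif <- w_unif / rowSums(w_unif)

# normalised exponentials: uniform on the simplex, i.e. Dirichlet(1,1,1,1)
w_exp <- matrix(rexp(4 * n, rate = 1), ncol = 4)
w_exp <- w_exp / rowSums(w_exp)

# both have rows summing to 1 and column means near 1/4,
# but the spread of the individual weights differs
range(rowSums(w_exp))
colMeans(w_exp)
sd_unif <- apply(w_unif, 2, sd)
sd_exp  <- apply(w_exp, 2, sd)
```

Comparing sd_unif with sd_exp shows how much more the uniform version pulls the weights toward the centre of the simplex.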
Re: [R] multicore by(), like mclapply?
hi josh---thx. I had a different version of this, and discarded it because I think it was very slow. the reason is that on each application, your version has to scan my (very long) data vector. (I have many thousand different cases, too.) I presume that by() has one scan through the vector that makes all splits.

regards,
/iaw

Ivo Welch (ivo.we...@gmail.com)
[R] pmml for random forest rules
Hi,
I am having some trouble using R 2.13.1 to generate a pmml object from an object of class c('randomForest.formula', 'randomForest'). I see that these methods are available:

  methods(pmml)
   [1] pmml.coxph*    pmml.hclust*   pmml.itemsets* pmml.kmeans*   pmml.ksvm*     pmml.lm*       pmml.multinom* pmml.nnet*     pmml.rpart*
  [10] pmml.rsf*      pmml.rules*    pmml.survreg*

However, the R Journal 1/1, pg 64, says there should be a method available ( http://journal.r-project.org/2009-1/RJournal_2009-1_Guazzelli+et+al.pdf ):

  Random Forest (and randomSurvivalForest) - randomForest (Breiman and Cutler. R port by A. Liaw and M. Wiener, 2009) and randomSurvivalForest (Ishwaran and Kogalur, 2009): PMML export of a randomSurvivalForest rsf object. This function gives the user the ability to export PMML containing the geometry of a forest.

However, if I run these lines of code:

  library(randomForest)
  (iris.rf <- randomForest(Species ~ ., data=iris))
  pmml(iris.rf)

I get this error:

  Error in UseMethod("pmml") : no applicable method for 'pmml' applied to an object of class "c('randomForest.formula', 'randomForest')"

Also, if I run these lines of code:

  data(Adult)
  ## Mine association rules.
  rules <- apriori(Adult, parameter = list(supp = 0.5, conf = 0.9, target = "rules"))
  pmml(rules)

I get this error:

  Error in function (classes, fdef, mtable) : unable to find an inherited method for function "size", for signature "itemMatrix"

With this traceback:

  traceback()
  5: stop("unable to find an inherited method for function \"", fdef@generic, "\", for signature ", cnames)
  4: function (classes, fdef, mtable) {
       methods <- .findInheritedMethods(classes, fdef, mtable)
       if (length(methods) == 1L) return(methods[[1L]])
       else if (length(methods) == 0L) {
         cnames <- paste("\"", sapply(classes, as.character), "\"", sep = "", collapse = ", ")
         stop("unable to find an inherited method for function \"", fdef@generic, "\", for signature ", cnames)
       }
       else stop("Internal error in finding inherited methods; didn't return a unique method")
     }(list("itemMatrix"), function (object) standardGeneric("size"), <environment>)
  3: size(is.unique)
  2: pmml.rules(rules)
  1: pmml(rules)

Thanks,
Patrick McCann
Re: [R] ANOVA from imported data has only 1 degree of freedom
Thanks bbolker, that's really helpful. I'll look out for the book too, it could be helpful!

Yours,
Sam
Re: [R] Question about ggplot2 and stat_smooth
Hi Tom,
Just wanted to chime in and let you know that the linked figures are really cool! Keep up the good work. On an unrelated note, any talk of future GRASS training sessions?

Cheers,
Dylan

On Tuesday, October 04, 2011, thomas.ad...@noaa.gov wrote:
> Hadley,
> Thanks for responding. No, not smoothed quantile regression. If you go here: http://www.erh.noaa.gov/mmefs/index.php and click on one of the colored squares, you can see we have 'boxplots'. What I want to express is the uncertainty as depicted in the example from my previous email where I can specify the limits calculated for the 'boxplots' using 5%, 25%, 75%, 95% limits as we have with the 'boxplots'.
> Tom
>
> - Original Message -
> From: Hadley Wickham had...@rice.edu
> Date: Tuesday, October 4, 2011 10:23 am
> Subject: Re: [R] Question about ggplot2 and stat_smooth
> To: Thomas Adams thomas.ad...@noaa.gov
> Cc: R-help forum r-help@r-project.org
>
>> On Mon, Oct 3, 2011 at 12:24 PM, Thomas Adams thomas.ad...@noaa.gov wrote:
>>> I'm interested in creating a graphic -like- this:
>>>   c <- ggplot(mtcars, aes(qsec, wt))
>>>   c + geom_point() + stat_smooth(fill="blue", colour="darkblue", size=2, alpha = 0.2)
>>> but I need to show 2 sets of bands (with different shading) using 5%, 25%, 75%, 95% limits that I specify and where the heavy blue line is the median. I don't understand how to do this with ggplot2.
>>
>> Exactly what sort of limits do you want? It sounds like maybe you are looking for smoothed quantile regression.
>> Hadley
>> --
>> Assistant Professor / Dobelman Family Junior Chair
>> Department of Statistics / Rice University

--
Dylan E. Beaudette
USDA-NRCS Soil Scientist
California Soil Resource Lab
http://casoilresource.lawr.ucdavis.edu/
[R] question about string to boor?
Hello!
I am handling a problem with some arrays grp1-grp7. I want to write a loop to avoid tedious work, but I don't know how to transform string to boor? For example, I used

  i=1
  paste("grp", i, sep="")

but I only got "grp1" instead of grp1, which can't be manipulated using mean() or other functions. I am not sure if I make myself clear...

THANKS
Nellie
[R] Dealing with missing data in ave() functions
Dear all,
I would be grateful if you could help me! I have one dataset already loaded into R with some missing data. My dataset is made of 19 columns, one of which is the sector (naf code). There are many rows, each relating to a single enterprise. What I want is the mean of each variable (column) by sector (naf code), so I type for example:

  col_5 = data.frame(ave(x[,5], naf, na.rm = TRUE))

col_5 refers to the first variable of interest. But since there are some missing data, the result is a series of NA all the way down the column (all the rows are NA). How can I discard these missing values and not take them into account? Alternatively, how could I replace missing values with a simple zero just for this analysis?

Thanks so much friends, bye!
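A likely cause, going by the documented signature ave(x, ..., FUN = mean): everything passed in `...` is treated as a grouping variable, so na.rm = TRUE ends up as one more (constant) grouping factor instead of reaching mean(). A minimal sketch of the usual workaround follows; the data frame and its column names are made up for illustration.

```r
# Hypothetical stand-in for the enterprise data: one numeric column and
# a sector (naf) code, with a missing value in sector "A".
x <- data.frame(naf   = c("A", "A", "A", "B", "B"),
                sales = c(10, NA, 30, 5, 15))

# sector mean repeated on every row, ignoring NAs:
# na.rm has to be passed to mean() inside FUN, not to ave() itself
x$sales_mean <- ave(x$sales, x$naf,
                    FUN = function(v) mean(v, na.rm = TRUE))

# one row per sector instead; extra arguments of aggregate() do reach FUN
sector_means <- aggregate(x["sales"], by = list(naf = x$naf),
                          FUN = mean, na.rm = TRUE)
```

Replacing the NAs with zero first would also run, but it would pull the sector means toward zero, so dropping them is usually what is wanted for a mean.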
[R] correlation matrix
Hello Gurus
I have two correlation matrices 'xa' and 'xb':

  set.seed(100)
  d=cbind(x=rnorm(20)+1, x1=rnorm(20)+1, x2=rnorm(20)+1)
  d1=cbind(x=rnorm(20)+2, x1=rnorm(20)+2, x2=rnorm(20)+2)
  xa=cor(d,use='complete')
  xb=cor(d1,use='complete')

I want to combine these two to get a third matrix which should have half its values from 'xa' and half its values from 'xb':

             x          x1          x2
  x  1.0000000 -0.15157123 -0.23085308
  x1 0.3466155  1.00000000 -0.01061675
  x2 0.1234507  0.01775527  1.00000000

I would like to generate a heatmap of the correlation values in the disease and non-disease phenotypes. I would appreciate it if someone could point me in the correct direction.

Thanks
sharad
[R] how to calculate the statistics of a yearly window with a rolling step as 1 day?
Hope someone can help me here. I have a daily time series, say

  2003-02-01 2003-02-03 2003-02-07 2003-02-09 2003-02-14 .. 2004-02-01 2004-02-04
   0.4914798 -1.1857653 -1.6982844 -0.3559572 -0.2333087 ...    0.44553   -0.45222

I need to calculate statistics over overlapping rolling yearly windows with a rolling step of 1 day, so for each of the intervals (2003-02-01 ~ 2004-02-01), (2003-02-03 ~ 2004-02-04), ... I need to calculate some statistics. Could you please help me work out how to extract these intervals? Right now I am using index(). But since the dates don't match exactly, I have to do it like:

  a(index(a) >= index(b) & index(a) <= index(b)+365)

which is very time-consuming since it's a long time series. Could someone help me? Really appreciate it!
Re: [R] question about string to boor?
On Oct 10, 2011, at 12:52 PM, song_gpqg wrote:
> Hello! So I am handling this problem with some arrays grp1-grp7, I want to write a loop to avoid tedious work, but I don't know how to transform string to boor? For example I used
>   i=1
>   paste("grp", i, sep="")

?get

e.g.

  get( paste("grp", i, sep="") )

> I only got "grp1" instead of grp1, which can't be manipulate using mean() or other function. I am not sure if I make myself clear...
> THANKS
> Nellie

David Winsemius, MD
West Hartford, CT
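A short sketch of how get() fits into the loop, plus an alternative that gathers the objects into a list first. The seven vectors are generated here only so the block runs on its own; their contents are made up.

```r
# Hypothetical stand-ins for the poster's objects grp1 ... grp7
for (i in 1:7) assign(paste("grp", i, sep = ""), rnorm(10, mean = i))

# look each object up by its name, as suggested above
means_get <- sapply(1:7, function(i) mean(get(paste("grp", i, sep = ""))))

# alternative: collect the objects into a named list once, then loop over it
grps <- mget(paste("grp", 1:7, sep = ""))
means_list <- sapply(grps, mean)
```

Keeping the data in a list from the start usually avoids the need for get() and assign() altogether.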
Re: [R] round() and negative digits
On 10/9/2011 6:18 AM, Prof Brian Ripley wrote:
> Sometimes it is better not to document things than try to give precise details which may get changed *and* there will be useRs who misread (and maybe even file bug reports on their misreadings). The source is the ultimate documentation.

I can't agree with this less. The source does the computation. The documentation says how to use it and what it should do. Corner cases can be trapped in code or mentioned in Notes. But the source is only useful if you can easily find it and then can understand what it is doing, particularly for a .Primitive like round(). The source is only the documentation of last resort.

-Michael
Re: [R] multicore by(), like mclapply?
On Tue, Oct 11, 2011 at 7:54 AM, ivo welch ivo.we...@gmail.com wrote:
> hi josh---thx. I had a different version of this, and discarded it because I think it was very slow. the reason is that on each application, your version has to scan my (very long) data vector. (I have many thousand different cases, too.) I presume that by() has one scan through the vector that makes all splits.

by.data.frame() is basically a wrapper for tapply(), and the key line in tapply() is

  ans <- lapply(split(X, group), FUN, ...)

which should be easy to adapt for mclapply.

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
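The adaptation hinted at above might look like the sketch below: split once, then hand the pieces to mclapply(). Assumptions are labelled in the comments: the data frame is invented, mean() stands in for the real per-group computation, and mclapply() is taken from the parallel package that ships with current R (the successor of the multicore package discussed in this thread).

```r
library(parallel)  # provides mclapply() in current R

# Hypothetical data: a grouping column and a value column
df <- data.frame(g = rep(c("a", "b", "c"), each = 4), v = 1:12)

# forking is not available on Windows, where mc.cores must be 1
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# one pass over the data makes all the splits ...
pieces <- split(df, df$g)

# ... and each piece is then processed in its own worker
res <- mclapply(pieces, function(d) mean(d$v), mc.cores = cores)
```

The result is a named list with one element per level of the grouping column, much like what by() or tapply() would return.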
Re: [R] how to calculate the statistics of a yearly window with a rolling step as 1 day?
Fill in the missing days with NAs using zoo FAQ 15 or otherwise, and then use rollapply(z, 365, f, ...whatever...) such that your function f first removes any NAs.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
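As an alternative to the zoo route suggested here (not a replacement for it), the window end points can also be found in base R with findInterval(), which avoids comparing every date against every other date. The series below is simulated and mean() is only a placeholder statistic; both are assumptions for illustration.

```r
set.seed(1)  # arbitrary
# Hypothetical irregular daily series: about two years of dates with gaps
all_days <- seq(as.Date("2003-02-01"), as.Date("2005-02-01"), by = "day")
dates <- sort(sample(all_days, 400))
vals  <- rnorm(length(dates))

# for each start date, the index of the last observation within 365 days;
# findInterval() needs the dates sorted and works in one vectorised pass
d     <- as.numeric(dates)
first <- seq_along(d)
last  <- findInterval(d + 365, d)

# keep only the windows for which a full year of data is available
full  <- d + 365 <= max(d)

# statistic per window; mean() is a placeholder
roll_mean <- mapply(function(i, j) mean(vals[i:j]), first[full], last[full])
```

Each element of roll_mean belongs to the window starting at the corresponding entry of dates[full].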
Re: [R] multicore by(), like mclapply?
I could be way off base here, but my concern about presplitting the data is that you will have your data, and a second copy of your data that is something like a list where each element contains the portion of the data for that split. Good speed-wise, bad memory-wise. My hope with the technique I showed (again, I may not have accomplished it) was to have at any one time only the original data and a copy of the particular elements being worked with. Of course this is not an issue if you have plenty of memory.
Re: [R] correlation matrix
What a pleasant post to respond to - with self-contained code. :)

  heat <- matrix(0, nrow=dim(xa)[1], ncol=dim(xa)[2])
  heat[lower.tri(heat)] <- xa[lower.tri(xa)]
  heat[upper.tri(heat)] <- xb[upper.tri(xb)]
  diag(heat) <- 1
  heat

HTH,
Daniel
Re: [R] correlation matrix
Tena koe Sharad
If I understand you correctly, you want the lower triangle of your combined matrix to be the lower triangle of one of the correlation matrices, and the upper triangle to be the upper triangle from the other. If so, check lower.tri() and upper.tri().

HTH
Peter Alspach
Re: [R] multicore by(), like mclapply?
This is the sort of thing that should be measured, rather than speculated about, but if you're using multicore all those subsets can be made at the same time, not sequentially, so they add up to a copy of the whole data. Using data.table rather than a data.frame would help, of course. I would guess that splitting, garbage collecting, and then forking would be most efficient -- reducing the chance that all the separate processes end up separately garbage collecting the results of the split. It's a pity that forking messes up the profilers; makes it harder to measure these things. -thomas On Tue, Oct 11, 2011 at 9:14 AM, Joshua Wiley jwiley.ps...@gmail.com wrote: I could be waay off base here, but my concern about presplitting the data is that you will have your data, and a second copy of our data that is something like a list where each element contains the portion of the data for that split. Good speed wise, bad memory wise. My hope with the technique I showed (again I may not have accomplished it) was to only have at anyone time, the original data and a copy of the particular elements being worked with. Of course this is not an issue if you have plenty of memory. On Oct 10, 2011, at 12:19, Thomas Lumley tlum...@uw.edu wrote: On Tue, Oct 11, 2011 at 7:54 AM, ivo welch ivo.we...@gmail.com wrote: hi josh---thx. I had a different version of this, and discarded it because I think it was very slow. the reason is that on each application, your version has to scan my (very long) data vector. (I have many thousand different cases, too.) I presume that by() has one scan through the vector that makes all splits. by.data.frame() is basically a wrapper for tapply(), and the key line in tapply() is ans - lapply(split(X, group), FUN, ...) which should be easy to adapt for mclapply. 
-- Thomas Lumley Professor of Biostatistics University of Auckland
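A minimal sketch of the "split once, then fork" idea discussed in this thread. It assumes the parallel package (which provides mclapply() in current R, where the thread used multicore); the data frame, the column names and the choice of mean() are made up for illustration, and nothing here has been timed.

```r
library(parallel)

set.seed(1)
# Hypothetical data: four groups of 25 observations each
d <- data.frame(group = rep(letters[1:4], each = 25), x = rnorm(100))

# Forking is not available on Windows, so fall back to a single core there
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# One pass over the data makes all the splits ...
pieces <- split(d$x, d$group)

# ... and the list elements are then handed out to the worker processes
res_par <- mclapply(pieces, mean, mc.cores = cores)

# Sequential reference, analogous to the lapply(split(X, group), FUN, ...) line in tapply()
res_seq <- lapply(pieces, mean)
```

Whether this is faster or lighter on memory than by() for a given data set is, as the post says, something to measure rather than guess.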
Re: [R] correlation matrix
And you might also consider packages like corrplot, corrgram etc. for other plotting options of a correlation matrix. They can be more informative than simply invoking image(heat).

What a pleasant post to respond to - with self-contained code. :)

    heat <- matrix(0, nrow = dim(xa)[1], ncol = dim(xa)[2])
    heat[lower.tri(heat)] <- xa[lower.tri(xa)]
    heat[upper.tri(heat)] <- xb[upper.tri(xb)]
    diag(heat) <- 1
    heat

HTH, Daniel

1Rnwb wrote:

Hello Gurus, I have two correlation matrices 'xa' and 'xb':

    set.seed(100)
    d = cbind(x = rnorm(20) + 1, x1 = rnorm(20) + 1, x2 = rnorm(20) + 1)
    d1 = cbind(x = rnorm(20) + 2, x1 = rnorm(20) + 2, x2 = rnorm(20) + 2)
    xa = cor(d, use = 'complete')
    xb = cor(d1, use = 'complete')

I want to combine these two to get a third matrix which should have half its values from 'xa' and half from 'xb':

        x          x1           x2
    x   1.000      -0.15157123  -0.23085308
    x1  0.3466155  1.           -0.01061675
    x2  0.1234507  0.01775527   1.

I would like to generate a heatmap for correlation values in the disease and non-disease phenotypes. I would appreciate it if someone could point me in the correct direction. Thanks, sharad

-- View this message in context: http://r.789695.n4.nabble.com/correlation-matrix-tp3891085p3891685.html Sent from the R help mailing list archive at Nabble.com.
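A self-contained sketch of the triangle-combining approach described in this message, using only base R. It reuses the poster's simulated data; starting from a copy of xa (rather than a zero matrix) is a small variation chosen here so that the row and column names carry over.

```r
set.seed(100)
d  <- cbind(x = rnorm(20) + 1, x1 = rnorm(20) + 1, x2 = rnorm(20) + 1)
d1 <- cbind(x = rnorm(20) + 2, x1 = rnorm(20) + 2, x2 = rnorm(20) + 2)
xa <- cor(d,  use = "complete.obs")
xb <- cor(d1, use = "complete.obs")

# Start from xa: this keeps its dimnames and its lower triangle
heat <- xa
# Overwrite the upper triangle with the corresponding entries of xb
heat[upper.tri(heat)] <- xb[upper.tri(xb)]
# A correlation matrix has ones on the diagonal
diag(heat) <- 1

heat
```

Because heat begins as a copy of a numeric matrix, it should remain one, so it can be passed on to image(heat) or to the plotting packages mentioned above.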
[R] How to test if two C statistics are significantly different?
Hey all, In order to test whether a marker is a risk factor, I built two models (using the Cox proportional hazards model). One model included this marker, and the other did not. Then I used the R package risksetROC to test how much predictive value the marker added to the model. I get two C statistics by passing the linear predictors of the two models to this package. The question is: how do I test whether the two C statistics are significantly different? Your help will be greatly appreciated! Yujie
Re: [R] correlation matrix
Okay, so I fixed what I needed to do this way:

    finit = 0
    for (ri in 1:dim(xa)[1]) {
      finit = finit + 1
      xc[ri, 1:finit] <- xa[ri, 1:finit]
      xc[1:finit, ri] <- xb[1:finit, ri]
    }

but I am getting an error in heatmap.2:

    mycol <- colorpanel(n = 40, low = "red", mid = "white", high = "blue")
    heatmap.2(xc, breaks = pairs.breaks, col = mycol, Rowv = FALSE, symm = TRUE,
              key = TRUE, symkey = FALSE, density.info = "none", trace = "none",
              cexRow = 0.5, scale = "none", dendrogram = "none")
    Error in heatmap.2(xc, breaks = pairs.breaks, col = mycol, Rowv = FALSE, : `x' must be a numeric matrix

Any pointers are appreciated.

-- View this message in context: http://r.789695.n4.nabble.com/correlation-matrix-tp3891085p3891584.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] question about string to boor?
Hi Nellie, hope I got you right. I guess you want something like this:

    for (i in 1:7) {
      oneOfNelliesArray <- eval(parse(text = paste("grp", i, sep = "")))
      anyFunction(oneOfNelliesArray)
    }

paste() just returns a string, but you want R to evaluate the expression. So you have to parse it and tell R to evaluate it.

Christoph

2011/10/10 song_gpqg song_g...@126.com

Hello! I am handling a problem with some arrays grp1-grp7. I want to write a loop to avoid tedious work, but I don't know how to transform string to boor? For example, I used

    i = 1
    paste("grp", i, sep = "")

I only got "grp1" instead of grp1, which can't be manipulated using mean() or other functions. I am not sure if I make myself clear... THANKS, Nellie

-- View this message in context: http://r.789695.n4.nabble.com/question-about-string-to-boor-tp3890983p3890983.html Sent from the R help mailing list archive at Nabble.com.
[R] invalid or not-yet-implemented 'Matrix' subsetting
I have this error and I can't figure out what's wrong:

    invalid or not-yet-implemented 'Matrix' subsetting

It pops up when I try to run this line of code:

    S <- B[indices.mod, union(mir.e.nc, mir.negatives.nc)]

-- View this message in context: http://r.789695.n4.nabble.com/invalid-or-not-yet-implemented-Matrix-subsetting-tp3891550p3891550.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Linear programming problem, RGPLK - no feasible solution.
Liu Evans, Gareth Gareth.Liu-Evans at liverpool.ac.uk writes:

In my post at https://stat.ethz.ch/pipermail/r-help/2011-October/292019.html I included an undefined term ej. The problem code should be as follows. It seems like a simple linear programming problem, but for some reason my code is not finding the solution.

    obj <- c(rep(0, 3), 1)
    col1 <- c(1, 0, 0, 1, 0, 0, 1, -2.330078923, 0)
    col2 <- c(0, 1, 0, 0, 1, 0, 1, -2.057855981, 0)
    col3 <- c(0, 0, 1, 0, 0, 1, 1, -1.885177032, 0)
    col4 <- c(-1, -1, -1, 1, 1, 1, 0, 0, 1)
    mat <- cbind(col1, col2, col3, col4)
    dir <- c(rep("<=", 3), rep(">=", 3), rep("==", 2), ">=")
    rhs <- c(rep(0, 7), 1, 0)
    sol <- Rglpk_solve_LP(obj, mat, dir, rhs, types = NULL, max = FALSE, bounds = c(-100, 100), verbose = TRUE)

The R output says there is no feasible solution, but e.g. (-2.3756786, 0.3297676, 2.0459110, 2.3756786) is feasible. The output is

    GLPK Simplex Optimizer, v4.42
    9 rows, 4 columns, 19 non-zeros
    0: obj = 0.0e+000 infeas = 1.000e+000 (2)
    PROBLEM HAS NO FEASIBLE SOLUTION

Please have a closer look at the help page ?Rglpk_solve_LP. The way to define the bounds is a bit clumsy, but then it works:

    sol <- Rglpk_solve_LP(obj, mat, dir, rhs, types = NULL, max = FALSE,
                          bounds = list(lower = list(ind = 1:4, val = rep(-100, 4)),
                                        upper = list(ind = 1:4, val = rep(100, 4))),
                          verbose = TRUE)

    GLPK Simplex Optimizer, v4.42
    9 rows, 4 columns, 19 non-zeros
    0: obj = -1.0e+02 infeas = 1.626e+03 (2)
    *10: obj = 1.0e+02 infeas = 0.000e+00 (0)
    *13: obj = 2.247686558e+00 infeas = 0.000e+00 (0)
    OPTIMAL SOLUTION FOUND

    sol
    $optimum
    [1] 2.247687
    $solution
    [1] -2.247687e+00 -6.446292e-31 2.247687e+00 2.247687e+00

One other thing, a possible bug - if I run this code with dir shorter than it should be, R crashes. My version of R is 2.131.56322.0, and I'm running it on Windows 7.

If you can reproduce that R crashes -- which it shall never do -- inform the maintainer of this package. On Mac it doesn't crash, it goes into an infinite loop with "Execution aborted. Error detected in file glplib03.c at line 83."

Regards, Hans Werner

Regards, Gareth
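The claim that (-2.3756786, 0.3297676, 2.0459110, 2.3756786) is feasible can be checked without a solver. The sketch below is base R only and rests on an assumption: the constraint directions are taken to be "<=" for rows 1-3, ">=" for rows 4-6, "==" for rows 7-8 and ">=" for row 9, which is the reading consistent with the feasible point and the reported optimum. The tolerance tol is an arbitrary choice made here to absorb the rounding of the printed point.

```r
col1 <- c(1, 0, 0, 1, 0, 0, 1, -2.330078923, 0)
col2 <- c(0, 1, 0, 0, 1, 0, 1, -2.057855981, 0)
col3 <- c(0, 0, 1, 0, 0, 1, 1, -1.885177032, 0)
col4 <- c(-1, -1, -1, 1, 1, 1, 0, 0, 1)
mat  <- cbind(col1, col2, col3, col4)

# Assumed constraint directions (see the note above)
dir <- c(rep("<=", 3), rep(">=", 3), rep("==", 2), ">=")
rhs <- c(rep(0, 7), 1, 0)

# Candidate point quoted in the post
x <- c(-2.3756786, 0.3297676, 2.0459110, 2.3756786)

lhs <- drop(mat %*% x)   # left-hand side of each of the 9 constraints
tol <- 1e-4              # slack for the 7-digit rounding of x

ok <- ifelse(dir == "<=", lhs <= rhs + tol,
      ifelse(dir == ">=", lhs >= rhs - tol,
             abs(lhs - rhs) <= tol))

data.frame(lhs = round(lhs, 6), dir = dir, rhs = rhs, ok = ok)
```

Note that the candidate point has negative components, so it can only be found by the solver if the variable bounds allow negative values, which is what the bounds list in the reply provides.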
Re: [R] multicore by(), like mclapply?
On Mon, Oct 10, 2011 at 4:14 PM, Joshua Wiley jwiley.ps...@gmail.com wrote:

I could be way off base here, but my concern about presplitting the data is that you will have your data, and a second copy of your data that is something like a list where each element contains the portion of the data for that split. Good speed-wise, bad memory-wise. My hope with the technique I showed (again, I may not have accomplished it) was to have, at any one time, only the original data and a copy of the particular elements being worked with. Of course this is not an issue if you have plenty of memory.

That's exactly what plyr does behind the scenes.

Hadley

-- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/
Re: [R] question about string to boor?
On 11/10/11 08:21, Christoph Molnar wrote:

Hi Nellie, hope I got you right. I guess you want something like this:

    for (i in 1:7) {
      oneOfNelliesArray <- eval(parse(text = paste("grp", i, sep = "")))
      anyFunction(oneOfNelliesArray)
    }

paste() just returns a string, but you want R to evaluate the expression. So you have to parse it and tell R to evaluate it.

But using get() is so much simpler and safer.

cheers, Rolf Turner
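A short sketch of the get() approach suggested here, with three made-up vectors standing in for grp1-grp7. The names grp1..grp3, grp_means and grps are illustrative only; mget() is shown as one possible way to collect the objects into a list in a single call.

```r
# Hypothetical stand-ins for the poster's arrays
grp1 <- c(1, 2, 3)
grp2 <- c(4, 5, 6)
grp3 <- c(7, 8, 9)

grp_means <- numeric(3)
for (i in 1:3) {
  # paste() builds the name as a string; get() looks up the object with that name
  x <- get(paste("grp", i, sep = ""), envir = environment())
  grp_means[i] <- mean(x)
}

# Alternative: fetch all objects at once into a named list, then loop over the list
grps <- mget(paste("grp", 1:3, sep = ""), envir = environment())
list_means <- sapply(grps, mean)
```

Keeping the arrays in a list from the start (rather than as separately numbered variables) usually removes the need for name lookups altogether.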
[R] Fwd: WHO Anthro growth curve macros and R
Hi all, some years ago, I sent a question to the mailing list regarding the WHO anthro macros. Since I've now received three mails asking how I solved it, I thought I'd cc R-help in for future reference. I am attaching a zip file with the relevant code parts that I used, though I'm not sure it gets through (if anyone has recommendations on how to manage such files for the list, I'd be grateful). What I ended up doing was importing the data in SPSS format, and adapting the Splus function igrowup.standard slightly. igrowup.standard2.R is the adapted function, while the ssc files are the original Splus functions. Let me know if anyone has problems figuring out how to use the files. best regards, Gustaf -- Gustaf Rydevik, M.Sci. tel: +44(0)704 253 760 42 address:St John's hill 18/5 EH8 9UQ Edinburgh, UK skype:gustaf_rydevik
Re: [R] Scheirer-Ray-Hare
I have been trying to use this test recently, following the text from this link: http://books.google.com/books?id=1eTyuMDND94Cpg=PA145lpg=PA145dq=nonparam#v=onepageqf=false

I ordered my data based on ranks, and ran a type III ANOVA from the car package - something like

    Anova(lm(var1~var2*var3,contrasts=list(var2='contr.sum',var3='contr.sum')),type='III')

Then I calculate SStotal (the sum of the SS for the factors, interaction, and residual). I calculate MStotal, which is SStotal divided by the degrees of freedom (add up the DF for factors, interaction, and residual). Then I calculate SS/MStotal for each factor and combination of factors. The p value for each is calculated as follows: 1-pchisq(the SS/MStotal, the degrees of freedom).

I'm pretty sure there is an error in the text, as the first example they give calculates the SS as 1496, which includes the intercept-SS, and their math doesn't work out then (1496/16 is not = to 22). The second example makes more sense, and they don't include the intercept-SS.

Anyhow, this seems like a useful test, but I think it should be used with caution. Hopefully, this helps, and if I'm doing something wrong here, that would be great to know (:

-- View this message in context: http://r.789695.n4.nabble.com/Scheirer-Ray-Hare-tp3818476p3891860.html Sent from the R help mailing list archive at Nabble.com.
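The steps described in this message can be sketched in base R as follows. This is not the poster's code: the data are simulated, the names dat, A, B, y and r are made up, and base anova() (sequential sums of squares) is swapped in for car::Anova(type = 'III'). For a balanced design such as this one the two should give the same sums of squares; for unbalanced data they generally will not.

```r
set.seed(1)
# Balanced two-way layout: 2 x 3 cells, 5 replicates each (hypothetical data)
dat <- expand.grid(rep = 1:5,
                   A = factor(c("a1", "a2")),
                   B = factor(c("b1", "b2", "b3")))
dat$y <- rnorm(nrow(dat)) + as.integer(dat$B)

# Step 1: replace the response by its ranks
dat$r <- rank(dat$y)

# Step 2: ordinary two-way ANOVA on the ranks
tab <- anova(lm(r ~ A * B, data = dat))
ss  <- tab[["Sum Sq"]]   # rows: A, B, A:B, Residuals
df  <- tab[["Df"]]

# Step 3: MStotal = SStotal / total degrees of freedom (no intercept SS involved)
ms_total <- sum(ss) / sum(df)

# Step 4: H = SS / MStotal for each effect, referred to a chi-square distribution
H <- ss[1:3] / ms_total
p <- pchisq(H, df[1:3], lower.tail = FALSE)   # same as 1 - pchisq(H, df)

data.frame(effect = rownames(tab)[1:3], Df = df[1:3], H = H, p = p)
```

With no ties among the ranks, MStotal reduces to n(n + 1)/12, which gives a quick sanity check on the sums of squares.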
[R] Text Mining with Facebook Reviews (XML and FQL)
Hello, I am trying to use the XML package to download Facebook reviews in the following way:

    require(XML)
    mydata.vectors <- character(0)
    Qword <- URLencode('#IBM')
    QUERY <- paste('SELECT review_id, message, rating from review where message LIKE %', Qword, '%', sep = '')
    Facebook_url = paste('https://api.facebook.com/method/fql.query?query= ', QUERY, sep = '')
    mydata.xml <- xmlParseDoc(Facebook_url, asText = F)
    mydata.vector <- xpathSApply(mydata.xml, '//s:entry/s:title', xmlValue, namespaces = c('s' = 'http://www.w3.org/2005/Atom'))

mydata.xml is NULL, so no further step can be executed. I am not very familiar with XML or FQL. Any suggestion will be appreciated. Thank you! Best regards, Kenneth
Re: [R] Fwd: WHO Anthro growth curve macros and R
On Oct 10, 2011, at 4:48 PM, Gustaf Rydevik wrote:

Hi all, some years ago, I sent a question to the mailing list regarding the WHO anthro macros. Since I've now received three mails asking how I solved it, I thought I'd cc R-help in for future reference. I am attaching a zip file with the relevant code parts that I used, though I'm not sure it gets through (if anyone has recommendations on how to manage such files for the list, I'd be grateful). What I ended up doing was importing the data in SPSS format, and adapting the Splus function igrowup.standard slightly. igrowup.standard2.R is the adapted function, while the ssc files are the original Splus functions. Let me know if anyone has problems figuring out how to use the files.

The only files that reach the readership are .pdf and .txt files. I do not know how carefully these get inspected, so it is possible that a zip file named something.txt might make it through.

best regards, Gustaf

David Winsemius, MD West Hartford, CT
[R] SLOW split() function
dear R experts: apologies for all my speed and memory questions. I have a bet with my coauthors that I can make R reasonably efficient through R-appropriate programming techniques. this is not just for kicks, but for work. for benchmarking, my [3 year old] Mac Pro has 2.8GHz Xeons, 16GB of RAM, and R 2.13.1.

right now, it seems that 'split()' is why I am losing my bet. (split is an integral component of *apply() and by(), so I need split() to be fast. its resulting list can then be fed, e.g., to mclapply().) I made up an example to illustrate my ills:

    library(data.table)
    N <- 1000
    T <- N*10
    d <- data.table(data.frame(key = rep(1:T, rep(N, T)), val = rnorm(N*T)))
    setkey(d, key); gc() ## force a garbage collection
    cat("N=", N, ". Size of d=", object.size(d)/1024/1024, "MB\n")
    print(system.time(s <- split(d, d$key)))

My ordered input data table (or data frame; doesn't make a difference) is 114MB in size. it takes about a second to create. split() only needs to reshape it. this simple operation takes almost 5 minutes on my computer. with a data set that is larger, this explodes further. am I doing something wrong? is there an alternative to split()?

sincerely, /iaw Ivo Welch (ivo.we...@gmail.com)
[R] Rolling optimization
Hello everyone, I would like assistance with a snippet I have written to do a recursive portfolio optimization given time-varying return forecasts. In my case, I have forecast the monthly returns for nearly 55 years out on 8 asset classes. I need to calculate the weights for the optimal (tangency) portfolio based on my monthly forecasts and an arbitrary covariance matrix. Getting these weights has proven difficult.

    # these are forecast (out of sample) returns; each is a 648x1 matrix
    cash_forecast2 = as.ts(cash_forecast)
    larg_forecast2 = as.ts(larg_forecast)
    valu_forecast2 = as.ts(valu_forecast)
    grow_forecast2 = as.ts(grow_forecast)
    smal_forecast2 = as.ts(smal_forecast)
    tres_forecast2 = as.ts(tres_forecast)
    cred_forecast2 = as.ts(cred_forecast)
    comm_forecast2 = as.ts(comm_forecast)
    # make a matrix of all expected returns
    # each line corresponds to forecast monthly returns for each asset class; this is a 648x8 matrix
    asset_forecast = ts.intersect(cash_forecast2, larg_forecast2, valu_forecast2, grow_forecast2, smal_forecast2, tres_forecast2, cred_forecast2, comm_forecast2)
    # make a covariance matrix based on the entire data
    actual_ret = cbind(cash_ret, larg_ret, valu_ret, grow_ret, smal_ret, tres_ret, cred_ret, comm_ret)
    cov_matrix = cov(actual_ret)
    opt_port = ts(matrix(, nrow = 648, ncol = 8))
    for (i in 1:648) opt_port[i, ] = portfolio.optim(asset_forecast[i, ], riskless = FALSE, shorts = TRUE, covmat = cov_matrix, by.column = FALSE, by = 1, align = "right")

I get the following error message:

    Error in portfolio.optim.default(asset_forecast[i, ], shorts = TRUE, covmat = cov_matrix, : x is not a matrix

So clearly, asset_forecast[i,] is not a matrix, and I need another method to do this. Can anyone suggest a solution that would allow me to set sail in the right direction?

Many thanks, Bond, James... sorry, that's my screen name... Darius :)
Re: [R] Rolling optimization
Not having played with portfolio.optim() much, I can't guarantee this will fix it, but if it requires a matrix rather than a vector and you are sure about the rest of the syntax, this might do the trick:

    asset_forecast[i, , drop = FALSE]

This is because:

    R> x = matrix(1:9, 3)
    R> is.matrix(x[,1])
    FALSE
    R> is.matrix(x[,1,drop=FALSE])
    TRUE

Michael

On Mon, Oct 10, 2011 at 9:33 PM, Darius H xeno...@hotmail.com wrote:

Hello everyone, I would like assistance with a snippet I have written to do a recursive portfolio optimization given time-varying return forecasts. In my case, I have forecast the monthly returns for nearly 55 years out on 8 asset classes. I need to calculate the weights for the optimal (tangency) portfolio based on my monthly forecasts and an arbitrary covariance matrix. Getting these weights has proven difficult.

    # these are forecast (out of sample) returns; each is a 648x1 matrix
    cash_forecast2 = as.ts(cash_forecast)
    larg_forecast2 = as.ts(larg_forecast)
    valu_forecast2 = as.ts(valu_forecast)
    grow_forecast2 = as.ts(grow_forecast)
    smal_forecast2 = as.ts(smal_forecast)
    tres_forecast2 = as.ts(tres_forecast)
    cred_forecast2 = as.ts(cred_forecast)
    comm_forecast2 = as.ts(comm_forecast)
    # make a matrix of all expected returns
    # each line corresponds to forecast monthly returns for each asset class; this is a 648x8 matrix
    asset_forecast = ts.intersect(cash_forecast2, larg_forecast2, valu_forecast2, grow_forecast2, smal_forecast2, tres_forecast2, cred_forecast2, comm_forecast2)
    # make a covariance matrix based on the entire data
    actual_ret = cbind(cash_ret, larg_ret, valu_ret, grow_ret, smal_ret, tres_ret, cred_ret, comm_ret)
    cov_matrix = cov(actual_ret)
    opt_port = ts(matrix(, nrow = 648, ncol = 8))
    for (i in 1:648) opt_port[i, ] = portfolio.optim(asset_forecast[i, ], riskless = FALSE, shorts = TRUE, covmat = cov_matrix, by.column = FALSE, by = 1, align = "right")

I get the following error message:

    Error in portfolio.optim.default(asset_forecast[i, ], shorts = TRUE, covmat = cov_matrix, : x is not a matrix

So clearly, asset_forecast[i,] is not a matrix, and I need another method to do this. Can anyone suggest a solution that would allow me to set sail in the right direction?

Many thanks, Bond, James... sorry, that's my screen name... Darius :)
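The drop = FALSE behaviour mentioned in this reply, as a small runnable sketch in base R. The matrix m and its column names are made up; whether a one-row matrix is then acceptable input for portfolio.optim() is not checked here.

```r
# Hypothetical 4 x 3 matrix standing in for a forecast matrix
m <- matrix(1:12, nrow = 4,
            dimnames = list(NULL, c("cash", "larg", "valu")))

row_vec <- m[2, ]                 # default drop = TRUE: result is a plain named vector
row_mat <- m[2, , drop = FALSE]   # drop = FALSE: result stays a 1 x 3 matrix

is.matrix(row_vec)
is.matrix(row_mat)
dim(row_mat)
```

The same argument works for column extraction, e.g. m[, 1, drop = FALSE] keeps a 4 x 1 matrix.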
Re: [R] SLOW split() function
Instead of splitting the entire data frame, split the indices and then use these to access your data. Try

    system.time(s <- split(seq(nrow(d)), d$key))

This should be faster and less memory intensive. You can then use the indices to access the subset:

    result <- lapply(s, function(.indx) {
      doSomething <- sum(d$someCol[.indx])
    })

On Oct 10, 2011, at 21:01, ivo welch ivo.we...@gmail.com wrote:

dear R experts: apologies for all my speed and memory questions. I have a bet with my coauthors that I can make R reasonably efficient through R-appropriate programming techniques. this is not just for kicks, but for work. for benchmarking, my [3 year old] Mac Pro has 2.8GHz Xeons, 16GB of RAM, and R 2.13.1.

right now, it seems that 'split()' is why I am losing my bet. (split is an integral component of *apply() and by(), so I need split() to be fast. its resulting list can then be fed, e.g., to mclapply().) I made up an example to illustrate my ills:

    library(data.table)
    N <- 1000
    T <- N*10
    d <- data.table(data.frame(key = rep(1:T, rep(N, T)), val = rnorm(N*T)))
    setkey(d, key); gc() ## force a garbage collection
    cat("N=", N, ". Size of d=", object.size(d)/1024/1024, "MB\n")
    print(system.time(s <- split(d, d$key)))

My ordered input data table (or data frame; doesn't make a difference) is 114MB in size. it takes about a second to create. split() only needs to reshape it. this simple operation takes almost 5 minutes on my computer. with a data set that is larger, this explodes further. am I doing something wrong? is there an alternative to split()?

sincerely, /iaw Ivo Welch (ivo.we...@gmail.com)
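A small, self-contained sketch of the index-splitting idea from this reply, in base R. The data frame is much smaller than the one in the original question, and the column names key and val and the use of sum() are illustrative assumptions. The sketch only checks that both routes give the same answer; it makes no timing claim.

```r
set.seed(42)
# Hypothetical data: 100 groups of 50 rows each
d <- data.frame(key = rep(1:100, each = 50), val = rnorm(5000))

# Split the row indices rather than the data frame itself
idx <- split(seq_len(nrow(d)), d$key)

# Use each index vector to pull out just the values needed for that group
sums_idx <- vapply(idx, function(i) sum(d$val[i]), numeric(1))

# Reference: split the value column directly
sums_ref <- vapply(split(d$val, d$key), sum, numeric(1))

isTRUE(all.equal(sums_idx, sums_ref))
```

The list idx holds only integer vectors, so it stays small even when the data frame has many columns.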