Re: [R] R on netbooks et al?
Johannes Huesing wrote:
> chaogai <chao...@xs4all.nl> [Thu, Mar 05, 2009 at 07:04:19PM CET]:
> > I'm having similar experiences on my Acer Aspire One. Everything works
> > well. The only thing that takes a lot of time is compiling R, if you
> > are in the habit of doing so.
> On the Fedora version that came with my Acer Aspire One, I am even
> thinking of compiling R myself, as the current R version is 2.6.0 ...
> Otherwise everything seems fine, and the keyboard is indeed the
> greatest letdown so far (the tiny left mouse button a close second).

I did do that. The most practical route is to install R-devel from the
repositories. It is the wrong version, but it pulls in the other
dependencies you need for building. Then remove R-devel and build the
2.8.1 sources from CRAN. I am not sure about the exact names of the
packages involved. Now happy on Suse 11.1 after a brief fling with
Fedora 10.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Interaction term not significant when using glm???
I think the interaction is not so strong anymore if you do what glm
does: use a logit transformation.

testdata <- matrix(c(rep(0:1, times=4),
                     rep(c("FLC","FLC","free","free"), times=2),
                     rep(c("no","yes"), each=4),
                     3,42,1,44,27,20,3,42), ncol=4)
colnames(testdata) <- c("spot","constr","vernalized","Freq")
testdata <- as.data.frame(testdata)
testdata$Freq <- as.numeric(as.character(testdata$Freq))
testdata$spot <- as.numeric(as.character(testdata$spot))
T2 <- reshape(testdata, v.names='Freq', timevar='spot',
              idvar=names(testdata)[c(2,3)], direction='wide')
T2$Prop <- T2$Freq.0/(T2$Freq.0 + T2$Freq.1)
plot(log(T2$Prop/(1-T2$Prop)), x=interaction(T2$constr, T2$vernalized))

Kees

joris meys wrote:
> Dear all, I have a dataset where the interaction is more than obvious,
> but I was asked to give a p-value, so I ran a logistic regression using
> glm. Funny enough, in the output the interaction term is NOT
> significant, although that is completely counterintuitive. There are 3
> variables: spot (binary response), constr (gene construct) and
> vernalized (growth conditions). Only for the FLC construct after
> vernalization should the chance of spots be lower. So one would expect
> the interaction term in the model to be significant. Yet only the two
> main terms are significant here. Can it be that my data are too sparse
> for these models? Am I using the wrong method?
>
> # data generation
> testdata <- matrix(c(rep(0:1, times=4),
>                      rep(c("FLC","FLC","free","free"), times=2),
>                      rep(c("no","yes"), each=4),
>                      3,42,1,44,27,20,3,42), ncol=4)
> colnames(testdata) <- c("spot","constr","vernalized","Freq")
> testdata <- as.data.frame(testdata)
> # model
> T0fit <- glm(spot ~ constr*vernalized, weights=Freq, data=testdata,
>              family=binomial)
> anova(T0fit)
>
> Kind regards
> Joris
Re: [R] merge data frames with same column names of different lengths and missing values
Steven Lubitz <slubitz1 at yahoo.com> writes:
> x <- data.frame(item1=c(NA,NA,3,4,5), item2=c(1,NA,NA,4,5), id=1:5)
> y <- data.frame(item1=c(NA,2,NA,4,5,6), item2=c(NA,NA,3,4,5,NA), id=1:6)
> merge(x,y,by=c("id","item1","item2"),all.x=TRUE,all.y=TRUE)
> # my rows are duplicated and the NA values are retained - I instead
> # want one row per ID
>   id item1 item2
> 1  1    NA     1
> 2  1    NA    NA
> 3  2     2    NA
> 4  2    NA    NA
> 5  3     3    NA
> 6  3    NA     3
> 7  4     4     4
> 8  5     5     5
> 9  6     6    NA

I think you only picked the wrong (too complex) function. Try

rbind(x, y)

Dieter
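If the goal is really one row per id, with the non-missing value kept
from whichever frame has it, a coalescing step after the rbind() does
that. A sketch (not from the thread; it assumes at most one non-NA
value per id/item combination):

```r
# Stack the two frames, then collapse to one row per id, keeping the
# first non-NA value seen in each item column.
x <- data.frame(item1 = c(NA, NA, 3, 4, 5), item2 = c(1, NA, NA, 4, 5), id = 1:5)
y <- data.frame(item1 = c(NA, 2, NA, 4, 5, 6), item2 = c(NA, NA, 3, 4, 5, NA), id = 1:6)
z <- rbind(x, y)
m <- aggregate(z[c("item1", "item2")], by = list(id = z$id),
               FUN = function(v) if (all(is.na(v))) NA else v[!is.na(v)][1])
m   # one row per id; NA only where both frames were missing
```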
Re: [R] NonLinear Programming in R - QUERY
Lars Bishop <lars52r at gmail.com> writes:
> I'll appreciate your help on this. Do you know of any package that can
> be used to solve optimization problems subject to general *non-linear*
> equality constraints?

Package DEoptim.

Dieter
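DEoptim itself only handles box (lower/upper) bounds directly, so a
general nonlinear equality is usually folded into the objective as a
penalty. A minimal sketch of that idea, shown here with base R's
optim() so it runs without extra packages; the same penalized objective
is what one would pass to DEoptim(fn, lower, upper):

```r
# Minimize f(p) = p1^2 + p2^2 subject to the nonlinear equality
# p1 * p2 = 1, via a quadratic penalty on the constraint violation.
obj <- function(p) {
  h <- p[1] * p[2] - 1             # equality constraint h(p) = 0
  p[1]^2 + p[2]^2 + 1e4 * h^2      # penalized objective
}
res <- optim(c(2, 2), obj)         # Nelder-Mead; constrained minimum at (1, 1)
res$par                            # near c(1, 1)
res$value                          # near the constrained optimum f = 2
```

Larger penalty weights enforce the constraint more tightly at the cost
of a harder optimization surface; a common refinement is to increase
the weight over successive restarts.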
Re: [R] c() converts real numbers to integers?
John Poulsen <jpoulsen at ufl.edu> writes:
> I know I am forgetting to do something silly. I typed coordinates in
> vectors (as below), but when I call them in R they come out as
> integers, and I want them to be real numbers. I have tried using
> as.numeric, as.real, etc., but they are still read by R as integers.
>
> STX <- c(16.0962, 16.1227, 16.0921, 16.1498)
> STY <- c(2.0387, 2.0214, 1.9877, 1.9846)

You should tell us what "come out" means. If str(STX) gives integer,
then I am lost and you probably have a pre-0.01 version of R. If you see

> STX
[1] 16 16 16 16

then somewhere in your code you have set options(digits=1).

Dieter
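Dieter's diagnosis is easy to check on your own machine (a sketch, not
from the original post): the vector is stored as double either way, and
only the printing changes with options(digits = ...).

```r
STX <- c(16.0962, 16.1227, 16.0921, 16.1498)
is.integer(STX)          # FALSE: the vector is stored as double
old <- options(digits = 1)
print(STX)               # displays as 16 16 16 16, *looking* like integers
options(old)             # restore the previous printing precision
print(STX)               # full values again
```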
Re: [R] R and vim (gvim) on ubuntu
I'd recommend using this script instead. It uses screen to connect R
and vim, and it works well:

http://www.vim.org/scripts/script.php?script_id=2551

Best,
-Jose
--
Jose Quesada, PhD.
Max Planck Institute, Center for Adaptive Behavior and Cognition, Berlin
http://www.josequesada.name/

Hi Jose,

Thanks for that script. It works very well as a substitute/replacement
for r-plugin. Only two things:

1) I can't send a selection of text. When I try with :VicleSend I get:
E481: No range allowed.

2) When I send the whole file to my R session screen, all the text
disappears from vim... I can of course get it back with u, but still.

Hope you have a solution,
Sincerely
[R] Download and Import xls files in R
Dear List,

I am trying to solve a problem: I have approximately 100 Excel
workbooks, each with approximately 4 sheets, that I would like to
download and import into R for analysis. Unfortunately I realized (I
also sent an email to the author of xlsReadWrite) that read.xls() does
not allow importing the file into R from the internet. Here is the code:

ciao <- read.xls("http://www.giustizia.it/statistiche/statistiche_dap/det/seriestoriche/corsi_proff.xls")

This doesn't work. How would you solve the problem in an automated way?
I would rather not manually download each one, open it with Excel and
save it as csv.

Thanks,
Francesco
[R] multivariate integration and partial differentiation
Could somebody share some tips on implementing multivariate integration
and partial differentiation in R? For example, for a trivariate joint
cumulative distribution function F(x,y,z), how to differentiate with
respect to x and get the bivariate probability density function f(y,z),
or integrate f(x,y,z) with respect to x to get the bivariate
distribution of (y,z)? Your sharing is appreciated.

Wei-han Liu
Re: [R] Download and Import xls files in R
Jacopo Anselmi wrote:
> Dear List,
> I am trying to solve a problem: I have approximately 100 Excel
> workbooks, each with approximately 4 sheets, that I would like to
> download and import into R for analysis. Unfortunately I realized (I
> also sent an email to the author of xlsReadWrite) that read.xls() does
> not allow importing the file into R from the internet. Here is the code:
>
> ciao <- read.xls("http://www.giustizia.it/statistiche/statistiche_dap/det/seriestoriche/corsi_proff.xls")
>
> This doesn't work. How would you solve the problem in an automated way?

Two candidate solutions:

1. use the read.xls function in the gdata package (where the file can
   be a URL);
2. use the download.file function to download the file, and then use
   the read.xls function in the xlsReadWrite package.

Ciao,
domenico

PS: please check the settings of your mail program (I saw the same mail
three times).
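For the roughly 100 workbooks, solution 2 can be wrapped in a small
helper and applied over the list of URLs. A sketch (the helper name,
the 4-sheet default, and the use of gdata::read.xls are my assumptions,
not code from the thread):

```r
# Hypothetical helper: download one workbook into a temp file (binary
# mode matters on Windows), then read its first `n_sheets` sheets with
# gdata::read.xls. Assumes the gdata package is installed when called.
read_remote_xls <- function(u, n_sheets = 4) {
  destfile <- file.path(tempdir(), basename(u))
  download.file(u, destfile, mode = "wb")
  lapply(seq_len(n_sheets), function(s) gdata::read.xls(destfile, sheet = s))
}

# urls <- c("http://www.giustizia.it/.../corsi_proff.xls", ...)  # the ~100 files
# all_data <- lapply(urls, read_remote_xls)
```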
Re: [R] merge data frames with same column names of different lengths and missing values
Steven Lubitz wrote:
> Hello, I'm switching over from SAS to R and am having trouble merging
> data frames. The data frames have several columns with the same name,
> and each has a different number of rows. Some of the values are
> missing from cells with the same column names in each data frame. I
> had hoped that when I merged the data frames, every column with the
> same name would be merged, with the value in a complete cell
> overwriting the value in an empty cell from the other data frame. I
> cannot seem to achieve this result, though I've tried several merge
> adaptations:
>
> x <- data.frame(item1=c(NA,NA,3,4,5), item2=c(1,NA,NA,4,5), id=1:5)
> y <- data.frame(item1=c(NA,2,NA,4,5,6), item2=c(NA,NA,3,4,5,NA), id=1:6)
>
> merge(x,y,by="id")
> # I lose observations here (n=1 in this example), and my items are
> # duplicated - I do not want this result
>   id item1.x item2.x item1.y item2.y
> 1  1      NA       1      NA      NA
> 2  2      NA      NA       2      NA
> 3  3       3      NA      NA       3
> 4  4       4       4       4       4
> 5  5       5       5       5       5
>
> merge(x,y,by=c("id","item1","item2"))
> # again I lose observations (n=4 here) and do not want this result
>   id item1 item2
> 1  4     4     4
> 2  5     5     5
>
> merge(x,y,by=c("id","item1","item2"),all.x=TRUE,all.y=TRUE)
> # my rows are duplicated and the NA values are retained - I instead
> # want one row per ID
>   id item1 item2
> 1  1    NA     1
> 2  1    NA    NA
> 3  2     2    NA
> 4  2    NA    NA
> 5  3     3    NA
> 6  3    NA     3
> 7  4     4     4
> 8  5     5     5
> 9  6     6    NA

You should obtain the desired solution using:

merge(y, x, by=c("id","item1","item2"), all=TRUE)

In database terminology all=TRUE corresponds to the full outer join,
all.x to the left outer join and all.y to the right outer join.

Ciao,
domenico

> In reality I have multiple data frames with numerous columns, all with
> this problem. I can do the merge seamlessly in SAS, but am trying to
> learn and stick with R for my analyses. Any help would be greatly
> appreciated.
Steve Lubitz
Cardiovascular Research Fellow, Brigham and Women's Hospital and
Massachusetts General Hospital
Re: [R] multivariate integration and partial differentiation
The adapt package has multivariate integration. However, I am not sure
you need multivariate integration for the example you describe: you
only need one-dimensional integration. For this, you can check out

?integrate

For differentiation, depending on how well behaved the cdf is, you
could use ?diff to calculate approximations to the slope.

HTH
Andrew

On Mar 7, 5:08 pm, Wei-han Liu <weihanliu2...@yahoo.com> wrote:
> Could somebody share some tips on implementing multivariate
> integration and partial differentiation in R? For example, for a
> trivariate joint cumulative distribution function F(x,y,z), how to
> differentiate with respect to x and get the bivariate probability
> density function f(y,z), or integrate f(x,y,z) with respect to x to
> get the bivariate distribution of (y,z)? Your sharing is appreciated.
>
> Wei-han Liu
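For the differentiation half of the question, ?diff works on
already-tabulated values; on a smooth CDF a central difference quotient
is often cleaner. A univariate sketch (using pnorm/dnorm so the answer
is checkable; the same quotient applies coordinate-wise to a trivariate
F):

```r
# Central finite difference: d/dx F(x) evaluated numerically should
# recover the density f(x).
num_deriv <- function(F, x, h = 1e-5) (F(x + h) - F(x - h)) / (2 * h)
num_deriv(pnorm, 0.5)   # numerical derivative of the normal CDF at 0.5
dnorm(0.5)              # exact density for comparison
```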
Re: [R] Interaction term not significant when using glm???
On Fri, 6 Mar 2009, joris meys wrote:
> Dear all, I have a dataset where the interaction is more than obvious,
> but I was asked to give a p-value, so I ran a logistic regression
> using glm. Funny enough, in the output the interaction term is NOT
> significant, although that is completely counterintuitive. There are 3
> variables: spot (binary response), constr (gene construct) and
> vernalized (growth conditions). Only for the FLC construct after
> vernalization should the chance of spots be lower. So one would expect
> the interaction term in the model to be significant. Yet only the two
> main terms are significant here. Can it be that my data are too sparse
> for these models? Am I using the wrong method?

The point estimate for the interaction term is large: 1.79, or an odds
ratio of nearly 6. The data are very strongly overdispersed (the
variance is 45 times larger than it should be), so they don't fit a
binomial model well. If you used a quasibinomial model you would get no
statistical significance for any of the terms.

I would say the problem is partly a combination of the overdispersion
and the sample size. It doesn't help that the situation appears to be a
difference between the FLC:yes cell and the other three cells, a
difference that is spread out over the three parameters.

-thomas

> # data generation
> testdata <- matrix(c(rep(0:1, times=4),
>                      rep(c("FLC","FLC","free","free"), times=2),
>                      rep(c("no","yes"), each=4),
>                      3,42,1,44,27,20,3,42), ncol=4)
> colnames(testdata) <- c("spot","constr","vernalized","Freq")
> testdata <- as.data.frame(testdata)
> # model
> T0fit <- glm(spot ~ constr*vernalized, weights=Freq, data=testdata,
>              family=binomial)
> anova(T0fit)
>
> Kind regards
> Joris

Thomas Lumley
Assoc. Professor, Biostatistics
tlum...@u.washington.edu
University of Washington, Seattle
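Thomas's quasibinomial remark is easy to verify directly. A sketch
refitting the same data with family = quasibinomial (identical point
estimates; the standard errors are scaled up by the estimated
dispersion, and the terms lose significance):

```r
# Same data as in the question, rebuilt as a data frame with numeric
# columns so no as.character() round-trip is needed.
testdata <- data.frame(
  spot       = rep(0:1, times = 4),
  constr     = rep(c("FLC", "FLC", "free", "free"), times = 2),
  vernalized = rep(c("no", "yes"), each = 4),
  Freq       = c(3, 42, 1, 44, 27, 20, 3, 42)
)
Tqfit <- glm(spot ~ constr * vernalized, weights = Freq,
             data = testdata, family = quasibinomial)
summary(Tqfit)$dispersion   # 45.5: exactly Thomas's "45 times larger"
```

(The interaction model is saturated on the four cells, so the Pearson
statistic here is just the total weight, 182, divided by the 4 residual
degrees of freedom.)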
Re: [R] Download and Import xls files in R
Try this:

library(gdata)
ciao <- read.xls(pattern = "TOTALE",
    "http://www.giustizia.it/statistiche/statistiche_dap/det/seriestoriche/corsi_proff.xls")

Downloading...
trying URL 'http://www.giustizia.it/statistiche/statistiche_dap/det/seriestoriche/corsi_proff.xls'
Content type 'application/vnd.ms-excel' length 33280 bytes (32 Kb)
opened URL
downloaded 32 Kb
Done.
Converting xls file to csv file... Done.
Searching for lines containing pattern TOTALE ... Done.
Reading csv file... Done.

head(ciao)
           X X.1 UOMINI DONNE TOTALE X.2 UOMINI.1 DONNE.1 TOTALE.1 UOMINI.2 DONNE.2 TOTALE.2
1  I sem. 91 185     nd    nd  1,926  nd       nd      nd       nd       nd      nd       nd
2 II sem. 91 275     nd    nd  2,470  89       nd      nd       nd       nd      nd       nd
3   I sem.92 230  3,265   432  3,697 133    1,524     200    1,724      543      88      631
4  II sem.92 205  2,581   417  2,998  83      864     115      979      413      66      479
5   I sem.93 241  3,165   439  3,604 105    1,171     222    1,393      661      91      752
6  II sem.93 256  2,844   395  3,239  94      986     102    1,088      516      79      595

On Fri, Mar 6, 2009 at 10:04 PM, Francesco Petrarca
<francesco.petrarc...@gmail.com> wrote:
> Dear List,
> I am trying to solve a problem: I have approximately 100 Excel
> workbooks, each with approximately 4 sheets, that I would like to
> download and import into R for analysis. Unfortunately I realized (I
> also sent an email to the author of xlsReadWrite) that read.xls() does
> not allow importing the file into R from the internet. Here is the code:
>
> ciao <- read.xls("http://www.giustizia.it/statistiche/statistiche_dap/det/seriestoriche/corsi_proff.xls")
>
> This doesn't work. How would you solve the problem in an automated
> way? I would rather not manually download each one, open it with Excel
> and save it as csv.
>
> Thanks,
> Francesco
[R] Recode factor into binary factor-level vars
How do I recode a factor into a binary data frame according to the
factor levels?

### example:start
set.seed(20)
l <- sample(rep.int(c("locA", "locB", "locC", "locD"), 100), 10, replace=TRUE)
l
# [1] "locD" "locD" "locD" "locD" "locB" "locA" "locA" "locA" "locD" "locA"
### example:end

What I want in the end is the following:

m$locA: 0, 0, 0, 0, 0, 1, 1, 1, 0, 1
m$locB: 0, 0, 0, 0, 1, 0, 0, 0, 0, 0
m$locC: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
m$locD: 1, 1, 1, 1, 0, 0, 0, 0, 1, 0

Instead of 0, NA's would also be fine.

Thanks, Sören

--
Sören Vogel, PhD-Student, Eawag, Dept. SIAM
http://www.eawag.ch, http://sozmod.eawag.ch
Re: [R] how to omit NA without using ifelse
Hi Manli. Try the replace() function as below:

replace(a, is.na(a), 0)  # where a is the name of your 50 x 50 matrix

Below is an example:

a <- matrix(sqrt(-2:3), nrow=2)
# produces a 2 x 3 matrix, some of whose elements are NaN
# due to the square root of negative integers
replace(a, is.na(a), 0)
     [,1] [,2]     [,3]
[1,]    0    0 1.414214
[2,]    0    1 1.732051

bartjoosen wrote:
> ?is.na

Manli Yan wrote:
> I have a 50*50 matrix; some entries are NAs. I want to replace these
> NAs by 0. Is there some syntax to do so other than using ifelse? I
> tried replace(a, NA, 0), but it didn't work (a is the matrix name).
> Thanks~
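Equivalent to replace(), and probably the most common idiom, is logical
indexing with assignment. (The reason replace(a, NA, 0) failed is that
the second argument is an index vector, not a value to match.)

```r
a <- matrix(sqrt(-2:3), nrow = 2)   # first column is NaN (sqrt of negatives)
a[is.na(a)] <- 0                    # is.na() is TRUE for NaN as well as NA
a                                   # no missing cells remain
```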
Re: [R] Recode factor into binary factor-level vars
one way is:

set.seed(20)
l <- sample(rep.int(c("locA", "locB", "locC", "locD"), 100), 10, replace=TRUE)
f <- factor(l, levels = paste("loc", LETTERS[1:4], sep = ""))
m <- as.data.frame(model.matrix(~ f - 1))
names(m) <- levels(f)
m

I hope it helps.

Best,
Dimitris

soeren.vo...@eawag.ch wrote:
> How do I recode a factor into a binary data frame according to the
> factor levels?
>
> set.seed(20)
> l <- sample(rep.int(c("locA", "locB", "locC", "locD"), 100), 10, replace=TRUE)
> # [1] "locD" "locD" "locD" "locD" "locB" "locA" "locA" "locA" "locD" "locA"
>
> What I want in the end is the following:
> m$locA: 0, 0, 0, 0, 0, 1, 1, 1, 0, 1
> m$locB: 0, 0, 0, 0, 1, 0, 0, 0, 0, 0
> m$locC: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
> m$locD: 1, 1, 1, 1, 0, 0, 0, 0, 1, 0
>
> Instead of 0, NA's would also be fine.
>
> Thanks, Sören

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Re: [R] Recode factor into binary factor-level vars
Sören;

You need to somehow add back to the information that is in l the fact
that it was sampled from a set with 4 elements. Since you didn't sample
from a factor, the level information was lost. Otherwise, you could
create that list with unique(l), which in this case only returns 3
elements:

set.l <- c("locA", "locB", "locC", "locD")
sapply(set.l, function(x) l == x)
       locA  locB  locC  locD
 [1,] FALSE FALSE FALSE  TRUE
 [2,] FALSE FALSE FALSE  TRUE
 [3,] FALSE FALSE FALSE  TRUE
 [4,] FALSE FALSE FALSE  TRUE
 [5,] FALSE  TRUE FALSE FALSE
 [6,]  TRUE FALSE FALSE FALSE
 [7,]  TRUE FALSE FALSE FALSE
 [8,]  TRUE FALSE FALSE FALSE
 [9,] FALSE FALSE FALSE  TRUE
[10,]  TRUE FALSE FALSE FALSE

It's in the wrong orientation because l is effectively a column vector,
so t() fixes that, and adding 0 to TRUE/FALSE returns 0/1:

t(sapply(set.l, function(x) x == l)) + 0
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
locA    0    0    0    0    0    1    1    1    0     1
locB    0    0    0    0    1    0    0    0    0     0
locC    0    0    0    0    0    0    0    0    0     0
locD    1    1    1    1    0    0    0    0    1     0

m <- as.data.frame(t(sapply(set.l, function(x) l == x)) + 0)
m

The one-liner would be:

m <- as.data.frame(t(sapply(c("locA", "locB", "locC", "locD"),
                            function(x) l == x)) + 0)

You can also use mapply, but the result does not have the desired row
names, and the column names are the result of the sampling, which seems
to me potentially confusing:

mapply(function(x) x == set.l, l) + 0
     locD locD locD locD locB locA locA locA locD locA
[1,]    0    0    0    0    0    1    1    1    0    1
[2,]    0    0    0    0    1    0    0    0    0    0
[3,]    0    0    0    0    0    0    0    0    0    0
[4,]    1    1    1    1    0    0    0    0    1    0

I see that Dimitris has already given you a perfectly workable
solution, but these seem to tackle the problem from a different angle.

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

On Mar 7, 2009, at 8:39 AM, soeren.vo...@eawag.ch wrote:
> How do I recode a factor into a binary data frame according to the
> factor levels?
>
> set.seed(20)
> l <- sample(rep.int(c("locA", "locB", "locC", "locD"), 100), 10, replace=TRUE)
> # [1] "locD" "locD" "locD" "locD" "locB" "locA" "locA" "locA" "locD" "locA"
>
> What I want in the end is the following:
> m$locA: 0, 0, 0, 0, 0, 1, 1, 1, 0, 1
> m$locB: 0, 0, 0, 0, 1, 0, 0, 0, 0, 0
> m$locC: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
> m$locD: 1, 1, 1, 1, 0, 0, 0, 0, 1, 0
>
> Instead of 0, NA's would also be fine.
>
> Thanks, Sören
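The sapply over set.l can also be written with outer(), which returns
the indicator matrix already in the observations-by-levels orientation
(a variation on David's approach, not from the thread):

```r
set.seed(20)
l <- sample(rep.int(c("locA", "locB", "locC", "locD"), 100), 10, replace = TRUE)
set.l <- c("locA", "locB", "locC", "locD")
m <- as.data.frame(+outer(l, set.l, "=="))   # unary + coerces logical to 0/1
names(m) <- set.l
m            # 10 rows (observations) x 4 columns (levels)
rowSums(m)   # each draw matches exactly one level, so every row sums to 1
```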
[R] piecewise linear regression
Hi - I'd like to construct and plot the percents by year in a small
data set (d) that has values between 1988 and 2007. I'd like to have a
breakpoint (but no discontinuity) at 1996. Is there a better way to do
this than the code below?

d
   year percent   se
1  1988    30.6 0.32
2  1989    31.5 0.31
3  1990    30.9 0.30
4  1991    30.6 0.28
5  1992    29.3 0.25
6  1994    30.3 0.26
7  1996    29.9 0.24
8  1998    28.4 0.22
9  2000    27.8 0.22
10 2001    26.1 0.20
11 2002    25.1 0.19
12 2003    24.4 0.19
13 2004    23.7 0.19
14 2005    25.1 0.18
15 2006    23.9 0.20
16 2007    23.9 0.18

dput(d)
structure(list(year = c(1988L, 1989L, 1990L, 1991L, 1992L, 1994L,
1996L, 1998L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L,
2007L), percent = c(30.6, 31.5, 30.9, 30.6, 29.3, 30.3, 29.9,
28.4, 27.8, 26.1, 25.1, 24.4, 23.7, 25.1, 23.9, 23.9), se = c(0.32,
0.31, 0.3, 0.28, 0.25, 0.26, 0.24, 0.22, 0.22, 0.2, 0.19, 0.19,
0.19, 0.18, 0.2, 0.18)), .Names = c("year", "percent", "se"),
class = "data.frame", row.names = c(NA, -16L))

with(d, plot(year, percent, pch=16, xlim=c(1988,2007)))
m <- lm(percent ~ year + I(year-1996):I(year >= 1996), weights=1/se,
        subset = year >= 1988, data=d)
points(d$year, predict(m, data.frame(year=d$year)), type='l', lwd=2, col='red')

thanks very much
David Freedman
Re: [R] Goldbach partitions code
Hi, I have been meaning to get back to you sooner on this. I have
posted goldbach5, which is a bit faster, on my blog:

http://romainfrancois.blog.free.fr/

Any takers for the next step?

Cheers,
Romain

> Folks, I put up a brief note describing my naive attempts to compute
> Goldbach partitions, starting with a brute-force approach and refining
> progressively:
>
> http://jostamon.blogspot.com/2009/02/goldbachs-comet.html
>
> I'd welcome your suggestions on improvements, alternatives, and other
> optimisations, especially to do with space vs. time tradeoffs. Is this
> an example interesting enough for pedagogical purposes, do you think?
> Please advise.
>
> Cheers, MM

--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
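For readers without the blog posts at hand, a minimal brute-force
sketch of the problem being optimized (this is my own baseline, not
Romain's goldbach5): count the ways an even n can be written as a sum
of two primes, using a sieve of Eratosthenes.

```r
# Count Goldbach partitions of an even n >= 4: pairs (p, n - p) with
# both p and n - p prime and p <= n/2 (to avoid double counting).
goldbach <- function(n) {
  stopifnot(n %% 2 == 0, n >= 4)
  is_prime <- rep(TRUE, n); is_prime[1] <- FALSE
  for (p in 2:floor(sqrt(n)))                 # sieve of Eratosthenes
    if (is_prime[p]) is_prime[seq(p * p, n, by = p)] <- FALSE
  cand <- which(is_prime)                     # primes up to n
  cand <- cand[cand <= n / 2]
  sum(is_prime[n - cand])                     # n - p also prime?
}
goldbach(10)    # 10 = 3 + 7 = 5 + 5, so 2 partitions
```

The vectorized inner test (`is_prime[n - cand]`) is where the speed-ups
in the refined versions typically come from.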
Re: [R] piecewise linear regression
It actually looked reasonably economical, but the output certainly is
ugly. I see a variety of approaches in the r-help archives. This thread
discusses two other approaches, degree-one splines from Berry and
hard-coded coefficients from Lumley:

http://finzi.psych.upenn.edu/R/Rhelp08/archive/118046.html

The Lumley solution has the advantage, which he articulates, that the
slopes are more directly interpretable, and in this case you can see
that your version's year slope agrees with Lumley's suggested
parametrization:

m <- lm(percent ~ year + pmax(year, 1996) + pmin(year, 1996),
        weights=1/se, subset = year >= 1988, data=d)
m

Call:
lm(formula = percent ~ year + pmax(year, 1996) + pmin(year, 1996),
    data = d, subset = year >= 1988, weights = 1/se)

Coefficients:
     (Intercept)              year  pmax(year, 1996)  pmin(year, 1996)
       1161.3126           -0.2177           -0.3494                NA

More compact output to boot.

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

On Mar 7, 2009, at 9:54 AM, David Freedman wrote:
> Hi - I'd like to construct and plot the percents by year in a small
> data set (d) that has values between 1988 and 2007. I'd like to have a
> breakpoint (but no discontinuity) at 1996. Is there a better way to do
> this than the code below?
>
> [data and code as in the original post]
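Putting the pmax/pmin parametrization together with the plotting from
the original question (a sketch; note that year = pmax(year, 1996) +
pmin(year, 1996) - 1996, which is why one coefficient above came back
NA, so the redundant year term is dropped here):

```r
d <- data.frame(
  year = c(1988, 1989, 1990, 1991, 1992, 1994, 1996, 1998, 2000, 2001,
           2002, 2003, 2004, 2005, 2006, 2007),
  percent = c(30.6, 31.5, 30.9, 30.6, 29.3, 30.3, 29.9, 28.4, 27.8, 26.1,
              25.1, 24.4, 23.7, 25.1, 23.9, 23.9),
  se = c(0.32, 0.31, 0.3, 0.28, 0.25, 0.26, 0.24, 0.22, 0.22, 0.2, 0.19,
         0.19, 0.19, 0.18, 0.2, 0.18)
)
# Weighted fit with a kink (but no jump) at 1996; the two slope
# coefficients are the pre- and post-1996 slopes directly.
m <- lm(percent ~ pmin(year, 1996) + pmax(year, 1996), weights = 1/se, data = d)
with(d, plot(year, percent, pch = 16))
lines(d$year, fitted(m), lwd = 2, col = "red")   # continuous, kinked at 1996
coef(m)   # intercept, pre-1996 slope, post-1996 slope
```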
[R] Week value function
Hi R users, I am looking for a date function that will give the following: - The number-of-week value is in the range 01-53 - Weeks begin on a Monday and week 1 of the year is the week that includes both January 4th and the first Thursday of the year. If the first Monday of January is the 2nd, 3rd, or 4th, the preceding days are part of the last week of the preceding year. This is similar to SAS's week function with option V. I am currently using:

date <- strptime(DATE, "%d%B%Y")
week <- format(date, "%W")

but I could not find an option for doing the above description automatically. Can anyone help? Thanks in advance for any help. -- View this message in context: http://www.nabble.com/Week-value-function-tp22389878p22389878.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] using a noisy variable in regression (not an R question)
Hi, This is not an R question, but I've seen opinions given on non R topics, so I wanted to give it a try. :) How would one treat a variable that was measured once, but is known to fluctuate a lot? For example, I want to include a hormone in my regression as an explanatory variable. However, this hormone varies in its levels throughout a day. Nevertheless, its levels differ substantially between individuals so that there is information there to use. One simple thing to try would be to form categories, but I assume there are better ways to handle this. Has anyone worked with such data, or could anyone suggest some keywords that may be helpful in searching for this topic. Thanks for your input. Regards, Juliet __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Week value function
I am not seeing anything but that proves nothing of course. You could write your own function and stick it in the .First of your .Rprofile file that gets loaded at startup. Details here: http://cran.r-project.org/doc/contrib/Lemon-kickstart/kr_first.html

week.dBY <- function(x) format(strptime(x, "%d%B%Y"), "%W")
dt <- "07JAN2009"
week.dBY(dt)
[1] "01"   # a character valued vector

Gives "00" for "01JAN2009" but you can adjust that behavior to your specifications. You could also convert to numeric if desired:

nweek.dBY <- function(x) as.integer(format(strptime(x, "%d%B%Y"), "%W"))
nweek.dBY(dt)
[1] 1

-- David Winsemius On Mar 7, 2009, at 12:34 PM, Pele wrote: Hi R users, I am looking for a date function that will give the following: - The number-of-week value is in the range 01-53 - Weeks begin on a Monday and week 1 of the year is the week that includes both January 4th and the first Thursday of the year. If the first Monday of January is the 2nd, 3rd, or 4th, the preceding days are part of the last week of the preceding year. This is similar to SAS's week function with option V. I am currently using: date <- strptime(DATE, "%d%B%Y") week <- format(date, "%W") but, I could not find an option for doing the above description automatically. Can anyone help? Thanks in advance for any help. -- View this message in context: http://www.nabble.com/Week-value-function-tp22389878p22389878.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
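A note on the original request: the rule described (Monday start; week 1 contains both January 4th and the first Thursday) is exactly the ISO 8601 week number. On platforms whose C library strftime() supports it, the "%V" format gives this directly -- a sketch, not guaranteed on every OS, and parsing month names like "JAN" assumes an English locale.

```r
## ISO 8601 week number via "%V" (OS-dependent support), wrapping the
## same ddMONyyyy input format used elsewhere in this thread:
iso.week <- function(x) format(as.Date(x, "%d%b%Y"), "%V")

format(as.Date("2009-01-01"), "%V")  # "01": Jan 1, 2009 was a Thursday
format(as.Date("2008-12-29"), "%V")  # "01": that Monday already opens week 1 of 2009
```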
Re: [R] using a noisy variable in regression (not an R question)
If you form categories, you add even more error, specifically, the variation in the distance between each number and the category boundary. What's wrong with just including it in the regression? Yes, the measure X1 will account for less variance than the underlying variable of real interest (T1, each individual's mean, perhaps), but X1 could still be useful in two ways. One, it might be a significant predictor of the dependent variable Y despite the error. Two, it might increase the sensitivity of the model to other predictors (X2, X3...) by accounting for what would otherwise be error. What you cannot conclude in this case (when you measure a predictor with error) is that the effect of (say) X2 is not accounted for by its correlation with T1. Some people try to conclude this when X2 remains a significant predictor of Y when X1 is included in the model. The trouble is that X1 is an error-prone measure of T1, so the full effect of T1 is not removed by inclusion of X1. Jon On 03/07/09 12:49, Juliet Hannah wrote: Hi, This is not an R question, but I've seen opinions given on non R topics, so I wanted to give it a try. :) How would one treat a variable that was measured once, but is known to fluctuate a lot? For example, I want to include a hormone in my regression as an explanatory variable. However, this hormone varies in its levels throughout a day. Nevertheless, its levels differ substantially between individuals so that there is information there to use. One simple thing to try would be to form categories, but I assume there are better ways to handle this. Has anyone worked with such data, or could anyone suggest some keywords that may be helpful in searching for this topic. Thanks for your input. Regards, Juliet __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- Jonathan Baron, Professor of Psychology, University of Pennsylvania Home page: http://www.sas.upenn.edu/~baron Editor: Judgment and Decision Making (http://journal.sjdm.org) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using a noisy variable in regression (not an R question)
Hi Juliet, Juliet Hannah schrieb: One simple thing to try would be to form categories Simple but problematic. Frank Harrell put together a wonderful page detailing all the issues with categorizing continuous data: http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/CatContinuous So: keep your data continuous. Apart from that, I would second John's recommendation to try to get samples at the same point in time (and, if it is cortisol, stay away from smokers etc.). Best wishes Stephan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using a noisy variable in regression (not an R question)
Thank you for your responses. I should have emphasized, I do not intend to categorize -- mainly because of all the discussions I have seen on R-help arguing against this. I just thought it would be problematic to include the variable by itself. Take other variables, such as a genotype or BMI. If we measure this variable the next day, it would be the same. However, a hormone's level would not be the same. I thought this error must be accounted for somehow. Thanks again! Regards, Juliet On Sat, Mar 7, 2009 at 1:21 PM, Jonathan Baron ba...@psych.upenn.edu wrote: If you form categories, you add even more error, specifically, the variation in the distance between each number and the category boundary. What's wrong with just including it in the regression? Yes, the measure X1 will account for less variance than the underlying variable of real interest (T1, each individual's mean, perhaps), but X1 could still be useful in two ways. One, it might be a significant predictor of the dependent variable Y despite the error. Two, it might increase the sensitivity of the model to other predictors (X2, X3...) by accounting for what would otherwise be error. What you cannot conclude in this case (when you measure a predictor with error) is that the effect of (say) X2 is not accounted for by its correlation with T1. Some people try to conclude this when X2 remains a significant predictor of Y when X1 is included in the model. The trouble is that X1 is an error-prone measure of T1, so the full effect of T1 is not removed by inclusion of X1. Jon On 03/07/09 12:49, Juliet Hannah wrote: Hi, This is not an R question, but I've seen opinions given on non R topics, so I wanted to give it a try. :) How would one treat a variable that was measured once, but is known to fluctuate a lot? For example, I want to include a hormone in my regression as an explanatory variable. However, this hormone varies in its levels throughout a day. 
Nevertheless, its levels differ substantially between individuals so that there is information there to use. One simple thing to try would be to form categories, but I assume there are better ways to handle this. Has anyone worked with such data, or could anyone suggest some keywords that may be helpful in searching for this topic. Thanks for your input. Regards, Juliet __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jonathan Baron, Professor of Psychology, University of Pennsylvania Home page: http://www.sas.upenn.edu/~baron Editor: Judgment and Decision Making (http://journal.sjdm.org) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Standardized coefficients (std.coef) in graphviz from path.diagram()
I was wondering if there was a way to add the standardized coefficients from SEM that I get from running std.coef() to my graph that I create with path.diagram() for graphviz? Right now the only way I know how is to edit the values in a text editor after creating the graph. Thanks, Chris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using a noisy variable in regression (not an R question)
Hi Juliet, Juliet Hannah schrieb: I should have emphasized, I do not intend to categorize -- mainly because of all the discussions I have seen on R-help arguing against this. Sorry that we all jumped on this ;-) I just thought it would be problematic to include the variable by itself. Take other variables, such as a genotype or BMI. If we measure this variable the next day, it would be the same. However, a hormone's level would not be the same. I thought this error must be accounted for somehow. You are quite correct that fluctuating hormone levels are a problem (although, strictly speaking, measuring BMI and even genotyping will not yield exactly the same results the next day, measurement error is always present). And there may be methods dealing with this, but I don't know of any. If you have any idea about the variability of your hormone, you could always take your data, perturb the hormone levels and run the analysis again to get a feeling for the stability of your results. This is quite ad hoc, but if I were the reviewer, a perturbation analysis like this would greatly reassure me. However, I recently worked with hormones and had exactly your problem, and we couldn't find any published data on day-to-day variability, so this was not an option - we finally went ahead and simply plugged the measurements into R. Good luck! Stephan Thanks again! Regards, Juliet On Sat, Mar 7, 2009 at 1:21 PM, Jonathan Baron ba...@psych.upenn.edu wrote: If you form categories, you add even more error, specifically, the variation in the distance between each number and the category boundary. What's wrong with just including it in the regression? Yes, the measure X1 will account for less variance than the underlying variable of real interest (T1, each individual's mean, perhaps), but X1 could still be useful in two ways. One, it might be a significant predictor of the dependent variable Y despite the error. 
Two, it might increase the sensitivity of the model to other predictors (X2, X3...) by accounting for what would otherwise be error. What you cannot conclude in this case (when you measure a predictor with error) is that the effect of (say) X2 is not accounted for by its correlation with T1. Some people try to conclude this when X2 remains a significant predictor of Y when X1 is included in the model. The trouble is that X1 is an error-prone measure of T1, so the full effect of T1 is not removed by inclusion of X1. Jon On 03/07/09 12:49, Juliet Hannah wrote: Hi, This is not an R question, but I've seen opinions given on non R topics, so I wanted to give it a try. :) How would one treat a variable that was measured once, but is known to fluctuate a lot? For example, I want to include a hormone in my regression as an explanatory variable. However, this hormone varies in its levels throughout a day. Nevertheless, its levels differ substantially between individuals so that there is information there to use. One simple thing to try would be to form categories, but I assume there are better ways to handle this. Has anyone worked with such data, or could anyone suggest some keywords that may be helpful in searching for this topic. Thanks for your input. Regards, Juliet __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jonathan Baron, Professor of Psychology, University of Pennsylvania Home page: http://www.sas.upenn.edu/~baron Editor: Judgment and Decision Making (http://journal.sjdm.org) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using a noisy variable in regression (not an R question)
On Sat, Mar 7, 2009 at 11:49 AM, Juliet Hannah juliet.han...@gmail.com wrote: Hi, This is not an R question, but I've seen opinions given on non R topics, so I wanted to give it a try. :) How would one treat a variable that was measured once, but is known to fluctuate a lot? For example, I want to include a hormone in my regression as an explanatory variable. However, this hormone varies in its levels throughout a day. Nevertheless, its levels differ substantially between individuals so that there is information there to use. One simple thing to try would be to form categories, but I assume there are better ways to handle this. Has anyone worked with such data, or could anyone suggest some keywords that may be helpful in searching for this topic. Thanks for your input. From teaching econometrics, I remember that if the truth is y=b0+b1x1+noise and then you do not have a correct measure of x1, but rather something else like ex1=x1+noise, then the regression estimate of b1 is biased, generally attenuated. As far as I understand it, the technical solutions are not too encouraging. You can try to get better data or possibly to build an instrumental variables model, where you could have other predictors of the true value of x1 in a first stage model. I don't recall that I was able to persuade myself that approach really solves anything, but many people recommend it. I suppose a key question is whether you can persuade your audience that ex1 = x1+noise and whether that noise is well behaved. As I was considering your problem, I was wondering if there might not be a mixed model approach to this problem. You hypothesize the truth is y=b0+b1x1+noise, but you don't have x1. So suppose you reconsider the truth as a random parameter, as in y=b0+c1*ex1+noise. ex1 is a fixed estimate of the hormone level for each observation. c1 is a random, varying coefficient because the effect of the hormone fluctuates in an unmeasurable way.
Then you could try to estimate the distribution of c1. You have an interesting problem, I think. pj -- Paul E. Johnson Professor, Political Science 1541 Lilac Lane, Room 504 University of Kansas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
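The attenuation Paul describes is easy to see in a small simulation (hypothetical numbers, not from the thread): when the measurement-noise variance equals the predictor's variance, the slope shrinks by the reliability ratio var(x1)/(var(x1)+var(noise)) = 0.5.

```r
## Attenuation (errors-in-variables) bias, simulated:
set.seed(42)
n   <- 10000
x1  <- rnorm(n)                 # true predictor, e.g. mean hormone level
y   <- 2 + 1.5 * x1 + rnorm(n)  # truth: b1 = 1.5
ex1 <- x1 + rnorm(n)            # one noisy measurement, noise sd = 1

coef(lm(y ~ x1))[["x1"]]    # close to the true 1.5
coef(lm(y ~ ex1))[["ex1"]]  # close to 0.75 = 1.5 * 0.5, i.e. attenuated
```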
[R] popular R packages
I would like to get some idea of which R-packages are popular, and what R is used for in general. Are there any statistics available on which R packages are downloaded often, or is there something like a package-survey? Something similar to http://popcon.debian.org/ maybe? Any tips are welcome! - Jeroen Ooms * Dept. of Methodology and Statistics * Utrecht University Visit http://www.jeroenooms.com to explore some of my current projects. -- View this message in context: http://www.nabble.com/popular-R-packages-tp22391260p22391260.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Standardized coefficients (std.coef) in graphviz from path.diagram()
Dear Christopher, -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Christopher David Desjardins Sent: March-07-09 1:38 PM To: r-help@r-project.org Subject: [R] Standardized coefficients (std.coef) in graphviz from path.diagram() I was wondering if there was a way to add the standardized coefficients from SEM that I get from running std.coef() to my graph that I create with path.diagram() for graphviz? The graphviz commands are created by the sem:::path.diagram.sem function, which is pretty simple. You could easily edit it so that it allows you to specify arbitrary labels for the edges of the graph. I hope this helps, John Right now the only way I know how is to edit the values in a text editor after creating the graph. Thanks, Chris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using a noisy variable in regression (not an R question)
On Sat, 7 Mar 2009, Juliet Hannah wrote: Hi, This is not an R question, but I've seen opinions given on non R topics, so I wanted to give it a try. :) How would one treat a variable that was measured once, but is known to fluctuate a lot? For example, I want to include a hormone in my regression as an explanatory variable. However, this hormone varies in its levels throughout a day. Nevertheless, its levels differ substantially between individuals so that there is information there to use. One simple thing to try would be to form categories, but I assume there are better ways to handle this. Has anyone worked with such data, or could anyone suggest some keywords that may be helpful in searching for this topic. Thanks for your input. Regards, Juliet

Try: correction for attenuation; measurement error models; errors-in-variables. Authors to look for: Wayne Fuller; LA Stefanski and RJ Carroll; William Cochran. HTH, Chuck

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry (858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] standard error for median
Dear all, is it possible to estimate a standard error for the median? And is there a function in R for this? I want to use it to describe a skewed distribution. Thanks in advance, Ralph [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] popular R packages
This function will show which other packages depend on a particular package:

dep <- function(pkg, AP = available.packages()) {
    pkg <- paste("\\b", pkg, "\\b", sep = "")
    cat("Depends:", rownames(AP)[grep(pkg, AP[, "Depends"])], "\n")
    cat("Suggests:", rownames(AP)[grep(pkg, AP[, "Suggests"])], "\n")
}

dep("zoo")
Depends: AER BootPR FinTS PerformanceAnalytics RBloomberg StreamMetabolism TSfame TShistQuote VhayuR dyn dynlm fda fxregime lmtest meboot party quantmod sandwich sde strucchange tripEstimation tseries xts
Suggests: TSMySQL TSPostgreSQL TSSQLite TSdbi TSodbc UsingR Zelig gsubfn playwith pscl tframePlus

On Sat, Mar 7, 2009 at 2:57 PM, Jeroen Ooms j.c.l.o...@uu.nl wrote: I would like to get some idea of which R-packages are popular, and what R is used for in general. Are there any statistics available on which R packages are downloaded often, or is there something like a package-survey? Something similar to http://popcon.debian.org/ maybe? Any tips are welcome! - Jeroen Ooms * Dept. of Methodology and Statistics * Utrecht University Visit http://www.jeroenooms.com to explore some of my current projects. -- View this message in context: http://www.nabble.com/popular-R-packages-tp22391260p22391260.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Question on Variabeles
Hi everyone, I'm quite new to R and I have the following question: I would like to define variables which I can add and multiply etc., and have R simplify the terms. The variables should stand for integers. For example, I would like to have an entry in an array with variable z, and if I add b+z the field should now contain 2z+b. Or if the field contains 1 and I add z, the field should contain 1+z. How can I solve this problem? I tried to set z=integer and then for example matrix[1,1]=z. But this only works if I assign a value for z first. I am very grateful for any help. Thank you very much! David -- View this message in context: http://www.nabble.com/Question-on-Variabeles-tp22388117p22388117.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
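Base R has no symbolic algebra built in, so an array cell cannot hold "z" and simplify "2z+b" by itself; a computer-algebra interface such as the Ryacas package would be needed for real simplification. What base R can do is store unevaluated expressions and evaluate them later once values exist -- a minimal sketch, with illustrative names:

```r
## Store an expression with free variables; no values are needed yet.
e  <- quote(z + b)
## "Adding z" builds a bigger unevaluated call: (z + b) + z (unsimplified).
e2 <- call("+", e, quote(z))
## Values are only required at evaluation time:
eval(e2, list(z = 2, b = 3))  # 7
```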
[R] ttest in R
Dear list, i am a biologist who needs to do some ttest between disease and non disease, sex, genotype and the serum levels of proteins on a large number of individuals. i have been using excel for a long time but it is very tedious and time consuming. i am posting the data below and ask your help in generating a code to get this analysis done in R. thanks gender disease genotypedata M N CC 3.206447188 F N CC 8.2 M N CC 15.78908629 M N CC 21.14311843 F N CC 21.48413647 M N CC 30.08028842 M N CC 30.11455009 F N CC 32.41258845 F N CT 6622.253065 M N CT 6763.6 M N CT 7342.023209 F N CT 7617.7 F N CT 7857.3 M N CT 8027.78692 F N CT 8755.950438 M N CT 9007.7 F N CT 9157.76987 M N CT 9398.270953 F N CT 9710.083037 F N CT 9896.887277 M N CT 10082.60082 F N CT 10137.05244 F N CT 10350.76186 M N CT 14629.34508 F N TT 4.614829254 F N TT 5.223593964 F N TT 6.7 M N TT 6.7 M N TT 7.735287229 F N TT 13.68084134 F N TT 14.5 M N TT 15.3 M N TT 16.16826703 M N TT 19.8 M N TT 24.51271254 M N TT 29.92459383 F N TT 30.3993842 M N TT 30.57161207 F N TT 30.72031553 F N TT 31.8 F N TT 34.72409961 M N TT 37 F N TT 38.94507607 M N TT 39.1 M N TT 40.9 M N TT 41.5 F N TT 42.36614019 F Y CC 338.2166757 M Y CC 345.8711007 M Y CC 347.4659528 F Y CC 356.3 F Y CC 358.4 F Y CC 360.184259 F Y CC 453.8 F Y CC 573.7342373 M Y CC 962.1232959 F Y CC 1055.9 F Y CC 1309.532621 F Y CC 2798.6 F Y CC 3568.794326 M Y CT 1.227348206 F Y CT 2.061944986 F Y CT 2.245592643 M Y CT 2.454696412 M Y CT 2.456716738 M Y CT 4.318447391 M Y CT 4.503098245 M Y CT 5.873088452 M Y CT 7.106930564 F Y CT 7.7 M Y CT 10.83537709 M Y CT 11.4 M Y CT 12.1 M Y CT 12.62002743 M Y CT 13.6 F Y CT 13.7 F Y CT 14.35562171 F Y CT 15.9 F Y TT 986.6755719 F Y TT 1206.475083 F Y TT 1237.9 M Y TT 1254.5 F Y TT 1303.6 F Y TT 1573.915019 M Y TT 1756.8 M Y TT 1895 M Y TT 2126.766565 F Y TT 2149.512866 M Y TT 3249.449945 F Y TT 6999.3 M Y TT 7172.479241 M Y TT 8268.909251 M Y TT 8544.229671 -- View this message in context: 
http://www.nabble.com/ttest-in-R-tp22390889p22390889.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] standard error for median
It is an example (both via asymptotic theory and the bootstrap) in chapter 5 of MASS (the book). The functions used are in the scripts of the MASS package, but you will need to understand the theory being used as described in the book. On Sat, 7 Mar 2009, Ralph Scherer wrote: Dear all, is it possible to estimate a standard error for the median? And is there a function in R for this? I want to use it to describe a skewed distribution. Thanks in advance, Ralph [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
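For what it's worth, the bootstrap half of that answer needs nothing beyond base R -- a minimal sketch with simulated skewed data standing in for the real sample (MASS and the boot package offer more refined machinery):

```r
## Bootstrap standard error of the median:
set.seed(1)
x <- rexp(200)   # a right-skewed sample; substitute the real data here
meds <- replicate(2000, median(sample(x, replace = TRUE)))
sd(meds)         # the bootstrap SE of the median
```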
Re: [R] standard error for median
Dear Prof. Ripley, thank you for your fast answer. But which book do you mean? I can't find an MASS book. Do you mean the R book? Best wishes, Ralph Am Samstag, den 07.03.2009, 21:24 + schrieb Prof Brian Ripley: It is an example (both via asymptotic theory and the bootstrap) in chapter 5 of MASS (the book). The functions used are in the scripts of the MASS package, but you will need to understand the theory being used as described in the book. On Sat, 7 Mar 2009, Ralph Scherer wrote: Dear all, is it possible to estimate a standard error for the median? And is there a function in R for this? I want to use it to describe a skewed distribution. Thanks in advance, Ralph [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] standard error for median
Ok, found it. Thanks. Am Samstag, den 07.03.2009, 22:34 +0100 schrieb Ralph Scherer: Dear Prof. Ripley, thank you for your fast answer. But which book do you mean? I can't find an MASS book. Do you mean the R book? Best wishes, Ralph Am Samstag, den 07.03.2009, 21:24 + schrieb Prof Brian Ripley: It is an example (both via asymptotic theory and the bootstrap) in chapter 5 of MASS (the book). The functions used are in the scripts of the MASS package, but you will need to understand the theory being used as described in the book. On Sat, 7 Mar 2009, Ralph Scherer wrote: Dear all, is it possible to estimate a standard error for the median? And is there a function in R for this? I want to use it to describe a skewed distribution. Thanks in advance, Ralph [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] standard error for median
Ralph Scherer scherer...@googlemail.com [Sat, Mar 07, 2009 at 10:34:28PM CET]: Dear Prof. Ripley, thank you for your fast answer. But which book do you mean? I can't find a MASS book. Try

library(MASS)
citation(package = "MASS")

-- Johannes Hüsing mailto:johan...@huesing.name http://derwisch.wikidot.com "There is something fascinating about science. One gets such wholesale returns of conjecture from such a trifling investment of fact." (Mark Twain, Life on the Mississippi) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ttest in R
Hello friend. I believe anova might be a better solution for you. You might have a look here: http://www.personality-project.org/r/r.anova.html A simple R session that will work for you is:

# getting the data in:
data1 <- read.table("enter the path of the file here")  # look at ?read.table for exact syntax
aov1 <- aov(data ~ gender + disease + genotype, data = data1)
summary(aov1)

If you insist on t.test, here is the way:

t.test(data ~ gender, data = data1)
t.test(data ~ disease, data = data1)
t.test(data ~ genotype, data = data1)

Cheers, Tal

On Sat, Mar 7, 2009 at 9:23 PM, 1Rnwb sbpuro...@gmail.com wrote: Dear list, i am a biologist who needs to do some ttest between disease and non disease, sex, genotype and the serum levels of proteins on a large number of individuals. i have been using excel for a long time but it is very tedious and time consuming. i am posting the data below and ask your help in generating a code to get this analysis done in R. thanks [data snipped -- identical to the table in the original post] -- View this message in context: http://www.nabble.com/ttest-in-R-tp22390889p22390889.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- My contact information: Tal Galili Phone number: 972-50-3373767 FaceBook: Tal Galili My Blogs: www.talgalili.com www.biostatistics.co.il [[alternative HTML version deleted]]
Re: [R] merge data frames with same column names of different lengths and missing values
Steve, I don't know if R has a built-in function to perform the task you were asking about, so I wrote one myself. Try the following to see if it works for you. The new function merge.new has one additional argument, col.ID, which is the column number of the ID column. To use your x and y as examples, type: merge.new(x, y, all = TRUE, col.ID = 3)

merge.new <- function(..., col.ID) {
  inter <- merge(...)
  inter <- inter[order(inter[[col.ID]]), ]  # merged data sorted by ID
  # total columns and rows for the target data frame
  total.row <- length(unique(inter[[col.ID]]))
  total.col <- dim(inter)[2]
  row.ID <- unique(inter[[col.ID]])
  target <- as.data.frame(matrix(NA, total.row, total.col))
  names(target) <- names(inter)
  for (i in 1:total.row) {
    # select all rows with the same ID
    inter.part <- inter[inter[[col.ID]] == row.ID[i], ]
    for (j in 1:total.col) {
      if (is.na(inter.part[1, j])) {
        if (is.na(inter.part[2, j])) target[i, j] <- NA
        else target[i, j] <- inter.part[2, j]
      } else {
        target[i, j] <- inter.part[1, j]
      }
    }
  }
  print(paste("total rows =", total.row))
  print(paste("total columns =", total.col))
  return(target)
}

-- Jun Shen PhD PK/PD Scientist BioPharma Services Millipore Corporation 15 Research Park Dr. St Charles, MO 63304 Direct: 636-720-1589

On Fri, Mar 6, 2009 at 11:02 PM, Steven Lubitz slubi...@yahoo.com wrote: Hello, I'm switching over from SAS to R and am having trouble merging data frames. The data frames have several columns with the same name, and each has a different number of rows. Some of the values are missing from cells with the same column names in each data frame. I had hoped that when I merged the data frames, every column with the same name would be merged, with the value in a complete cell overwriting the value in an empty cell from the other data frame.
I cannot seem to achieve this result, though I've tried several merge adaptations:

x <- data.frame(item1 = c(NA, NA, 3, 4, 5), item2 = c(1, NA, NA, 4, 5), id = 1:5)
y <- data.frame(item1 = c(NA, 2, NA, 4, 5, 6), item2 = c(NA, NA, 3, 4, 5, NA), id = 1:6)

merge(x, y, by = "id")
# I lose observations here (n=1 in this example), and my items are
# duplicated - I do not want this result
  id item1.x item2.x item1.y item2.y
1  1      NA       1      NA      NA
2  2      NA      NA       2      NA
3  3       3      NA      NA       3
4  4       4       4       4       4
5  5       5       5       5       5

merge(x, y, by = c("id", "item1", "item2"))
# again I lose observations (n=4 here) and do not want this result
  id item1 item2
1  4     4     4
2  5     5     5

merge(x, y, by = c("id", "item1", "item2"), all.x = TRUE, all.y = TRUE)
# my rows are duplicated and the NA values are retained - I instead want
# one row per ID
  id item1 item2
1  1    NA     1
2  1    NA    NA
3  2     2    NA
4  2    NA    NA
5  3     3    NA
6  3    NA     3
7  4     4     4
8  5     5     5
9  6     6    NA

In reality I have multiple data frames with numerous columns, all with this problem. I can do the merge seamlessly in SAS, but am trying to learn and stick with R for my analyses. Any help would be greatly appreciated. Steve Lubitz Cardiovascular Research Fellow, Brigham and Women's Hospital and Massachusetts General Hospital
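For what it's worth, the overwrite-on-merge behaviour described above can be approximated in base R by merging on id and then coalescing each duplicated column pair. This is only a sketch built on the x and y from the example; the loop and the preference for x's values over y's are my own choices, not an established idiom:

```r
x <- data.frame(item1 = c(NA, NA, 3, 4, 5), item2 = c(1, NA, NA, 4, 5), id = 1:5)
y <- data.frame(item1 = c(NA, 2, NA, 4, 5, 6), item2 = c(NA, NA, 3, 4, 5, NA), id = 1:6)

m <- merge(x, y, by = "id", all = TRUE)  # keeps all six ids; items become .x/.y pairs
for (v in c("item1", "item2")) {
  vx <- paste(v, ".x", sep = "")         # column coming from x
  vy <- paste(v, ".y", sep = "")         # column coming from y
  m[[v]] <- ifelse(is.na(m[[vx]]), m[[vy]], m[[vx]])  # take x's value, fall back to y's
  m[[vx]] <- NULL
  m[[vy]] <- NULL
}
m  # one row per id, with NAs filled in from whichever frame had a value
```

With these inputs the result has one row per id, e.g. item1 comes out as NA, 2, 3, 4, 5, 6.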
Re: [R] merge data frames with same column names of different lengths and missing values
Subject: Re: [R] merge data frames with same column names of different lengths and missing values To: Phil Spector spec...@stat.berkeley.edu Date: Saturday, March 7, 2009, 5:01 PM

Phil, Thank you - this is very helpful. However, I realized that with my real data sets (not the example I gave here), I also have different numbers of columns in each data frame. rbind doesn't seem to like this. Here's a modified example:

x <- data.frame(item1 = c(NA, NA, 3, 4, 5), item2 = c(1, NA, NA, 4, 5),
                item3 = c(NA, 2, NA, 4, NA), id = 1:5)
y <- data.frame(item1 = c(NA, 2, NA, 4, 5, 6), item2 = c(NA, NA, 3, 4, 5, NA), id = 1:6)
rbind(x, y)
Error in rbind(deparse.level, ...) : numbers of columns of arguments do not match

Any ideas? Thanks, Steve

--- On Sat, 3/7/09, Phil Spector spec...@stat.berkeley.edu wrote: From: Phil Spector spec...@stat.berkeley.edu Subject: Re: [R] merge data frames with same column names of different lengths and missing values To: Steven Lubitz slubi...@yahoo.com Date: Saturday, March 7, 2009, 1:56 AM

Steven - I believe this gives the output that you desire:

xy = rbind(x, y)
aggregate(subset(xy, select = -id), xy['id'], function(x) rev(x[!is.na(x)])[1])
  id item1 item2
1  1    NA     1
2  2     2    NA
3  3     3     3
4  4     4     4
5  5     5     5
6  6     6    NA

But I think what "merge x y; by id;" would give you is

aggregate(subset(xy, select = -id), xy['id'], function(x) x[length(x)])
  id item1 item2
1  1    NA    NA
2  2     2    NA
3  3    NA     3
4  4     4     4
5  5     5     5
6  6     6    NA

- Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu

On Fri, 6 Mar 2009, Steven Lubitz wrote: Hello, I'm switching over from SAS to R and am having trouble merging data frames. The data frames have several columns with the same name, and each has a different number of rows. Some of the values are missing from cells with the same column names in each data frame. I had hoped that when I merged the data frames, every column with the same name would be merged, with the value in a complete cell overwriting the value in an empty cell from the other data frame.
I cannot seem to achieve this result, though I've tried several merge adaptations: [...] Steve Lubitz Cardiovascular Research Fellow, Brigham and Women's Hospital and Massachusetts General Hospital
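One workaround for the rbind error above (a sketch using the frames from Steve's modified example) is to add the columns that y lacks, filled with NA, and put the columns in a common order before stacking:

```r
x <- data.frame(item1 = c(NA, NA, 3, 4, 5), item2 = c(1, NA, NA, 4, 5),
                item3 = c(NA, 2, NA, 4, NA), id = 1:5)
y <- data.frame(item1 = c(NA, 2, NA, 4, 5, 6), item2 = c(NA, NA, 3, 4, 5, NA), id = 1:6)

missing <- setdiff(names(x), names(y))  # columns y lacks ("item3" here)
y[missing] <- NA                        # create them, filled with NA
xy <- rbind(x, y[names(x)])             # put y's columns in x's order, then stack
```

After this, the aggregate() trick from Phil's reply applies to xy unchanged.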
Re: [R] ttest in R
P.S.: Since the data (the Y variable) are very much not normal (even after a log transform), I would consider going with a nonparametric test. Check ?wilcox.test (a nonparametric alternative to the t-test) or ?kruskal.test (a nonparametric one-way ANOVA).

On Sat, Mar 7, 2009 at 11:51 PM, Tal Galili tal.gal...@gmail.com wrote: Hello friend. I believe anova might be a better solution for you. [...]
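To make the pointer concrete, here is a minimal sketch on a small hand-typed subset of the posted data (the subset and object names are mine, purely for illustration):

```r
data1 <- data.frame(
  data     = c(3.21, 8.2, 6622.25, 6763.6, 4.61, 5.22, 338.22, 345.87),
  genotype = factor(c("CC", "CC", "CT", "CT", "TT", "TT", "CC", "CC")),
  disease  = factor(c("N", "N", "N", "N", "N", "N", "Y", "Y"))
)
kruskal.test(data ~ genotype, data = data1)  # nonparametric one-way test across 3 genotypes
wilcox.test(data ~ disease, data = data1)    # nonparametric two-sample test for disease status
```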
[R] Rdonlp2 -Query
Hi, Has anyone used this package? Could you please share your thoughts on it? Thanks! L.
Re: [R] Question on Variables
myArray[, 'z'] <- myArray[, 'z'] + b

Is this what you want?

On Sat, Mar 7, 2009 at 9:52 AM, David1234 danielth...@web.de wrote: Hi everyone, I'm quite new to R and I have the following question: I would like to define variables which I can add and multiply etc., and have R simplify the terms. The variables should stand for integers. For example, I would like to have an entry in an array with variable z, and if I add b+z the field should now contain 2z+b. Or if the field contains 1 and I add z the field should contain 1+z. How can I solve this problem? I tried to set z=integer and then, for example, matrix[1,1]=i. But this only works if I assign a value for z first. I am very grateful for any help. Thank you very much! David -- View this message in context: http://www.nabble.com/Question-on-Variables-tp22388117p22388117.html Sent from the R help mailing list archive at Nabble.com. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
Re: [R] Question on Variables
I fear that you are looking for a symbolic algebra system, and R is not that sort of platform. If I am correct and you still want to access a symbolic algebra system from R, then you should look at YACAS and the R interface to it, Ryacas. -- David Winsemius

On Mar 7, 2009, at 9:52 AM, David1234 wrote: Hi everyone, I'm quite new to R and I have the following question: I would like to define variables which I can add and multiply etc., and have R simplify the terms. The variables should stand for integers. [...] David -- View this message in context: http://www.nabble.com/Question-on-Variabeles-tp22388117p22388117.html Sent from the R help mailing list archive at Nabble.com. David Winsemius, MD Heritage Laboratories West Hartford, CT
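To make the Ryacas pointer concrete, a sketch of the symbolic route (this assumes the Ryacas package and its YACAS backend are installed; the call style below is the classic interface and may differ between Ryacas versions):

```r
library(Ryacas)                # interface to the YACAS computer algebra system
yacas("Simplify(z + b + z)")   # symbolic simplification; yields b + 2*z (up to term order)
```

Here z and b stay symbolic throughout, which is exactly what plain R arrays cannot do.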
Re: [R] popular R packages
When the question "How many R users are there?" arises, the consensus seems to be that there is no valid method to answer it. The thread "R-business case" from 2004 can be found here: https://stat.ethz.ch/pipermail/r-help/2004-March/047606.html I did not see any material revision to that conclusion during the recent discussion of the New York Times article on R's challenge to SAS. Gmane tracks r-help activity levels (I realize that is not what you asked for): http://www.gmane.org/info.php?group=gmane.comp.lang.r.general The distribution of R packages is, well ... distributed: http://cran.r-project.org/mirrors.html At least one of the participants in the 2004 thread suggested that it would be a good thing to track the numbers of downloads by package. I have not heard of any such system being installed in the mirror software, and I see nothing that suggests data gathering in the CRAN Mirror How-to: http://cran.r-project.org/mirror-howto.html On the other hand, I am not part of R-core, so you must await a more authoritative opinion, since a 5-year-old thread and amateur speculation is not much of a leg to stand on. There are lexicographic packages for R. One approach to a de novo analysis would be some sort of natural-language analysis of the r-help archives, counting up either package names with non-English names, or close proximity of the words "library" or "package" to package names that overlap the 30,000 common English words. That would have the danger of inflating counts of the packages with the least adequate documentation or a paucity of good worked examples, but there are many readers of this list who suspect that new users don't look at the documentation, so who knows? -- David Winsemius

On Mar 7, 2009, at 2:57 PM, Jeroen Ooms wrote: I would like to get some idea of which R packages are popular, and what R is used for in general. Are there any statistics available on which R packages are downloaded often, or is there something like a package survey? Something similar to http://popcon.debian.org/ maybe? Any tips are welcome! - Jeroen Ooms * Dept. of Methodology and Statistics * Utrecht University Visit www.jeroenooms.com to explore some of my current projects. -- View this message in context: http://www.nabble.com/popular-R-packages-tp22391260p22391260.html Sent from the R help mailing list archive at Nabble.com.

David Winsemius, MD Heritage Laboratories West Hartford, CT
Re: [R] popular R packages
I don't think the suggestion "that it would be a good thing to track the numbers of downloads by package" is reasonable, because I download R packages for 2 home computers (laptop and desktop) and 2 at work (1 Linux and 1 Mac). There must be many such cases…

Tom

David Winsemius wrote: When the question arises How many R-users there are?, the consensus seems to be that there is no valid method to address the question. [...]

-- Thomas E Adams National Weather Service Ohio River Forecast Center 1901 South State Route 134 Wilmington, OH 45177 EMAIL: thomas.ad...@noaa.gov VOICE: 937-383-0528 FAX: 937-383-0033
Re: [R] popular R packages
I agree with Thomas; over the years I have installed R on at least 5 computers. BTW: does anyone know how the website statistics of r-project.org are analyzed? Since I can't see Google Analytics or any other tracking code on the main website, I am guessing someone might be running a log-file analyzer - but I'd rather hear that than assume.

On Sun, Mar 8, 2009 at 12:45 AM, Thomas Adams thomas.ad...@noaa.gov wrote: I don't think At least one of the participants in the 2004 thread suggested that it would be a good thing to track the numbers of downloads by package. is reasonable because I download R packages for 2 home computers (laptop desktop) and 2 at work (1 Linux 1 Mac). There must be many such cases… Tom [...]

-- My contact information: Tal Galili Phone number: 972-50-3373767 FaceBook: Tal Galili My Blogs: www.talgalili.com www.biostatistics.co.il
Re: [R] popular R packages
Quite so. It certainly is the case that Dirk Eddelbuettel suggested it, and I think Dirk's track record speaks for itself. I never said (and I am sure Dirk never intended) that one could take the raw numbers as a basis for blandly asserting that some number of copies of the ttt package are currently installed. When I update packages, the automated process takes hold and I go for a cup of coffee. I have only two computers with R installed at the moment, and I have not updated any binary packages on Windoze in over a year. Nonetheless, I do think the relative numbers of package downloads might be interpretable, or at the very least, the basis for discussions over beer. -- David Winsemius

On Mar 7, 2009, at 5:45 PM, Thomas Adams wrote: I don't think At least one of the participants in the 2004 thread suggested that it would be a good thing to track the numbers of downloads by package. is reasonable because I download R packages for 2 home computers (laptop desktop) and 2 at work (1 Linux 1 Mac). There must be many such cases… Tom [...]

David Winsemius, MD Heritage Laboratories West Hartford, CT
Re: [R] popular R packages
I agree with Thomas, over the years I have installed R on at least 5 computers.

I don't see why per-machine statistics would not be useful. When you install a package on five machines, you probably use it a lot, and it is more important to you than a package you only installed once. Furthermore, I don't think the distributed hosting of packages has to be problematic. I guess downloads are only slightly related to the specific mirror, so download statistics from one of the popular mirrors would do for me. Of course such statistics are never perfect, but they could be informative...
Re: [R] popular R packages
i have kept r installed on more than ten computers during the past few years, some of them running win + more than one linux distro, all of them having r, most often installed from a separate download. i know of many cases where students download r for the purpose of a course in statistics -- often an introductory course for students who otherwise have little to do with stats. some of them do it more than once during the semester, and many of them never use r again. taking into account that basic statistics courses are taught to most university students and that r is surely the most popular free statistical computing environment, download-based usage estimates may be a bit optimistic, unless 'usage' is taken to include 'learn-pass-forget'. vQ

Tal Galili wrote: I agree with Thomas, over the years I have installed R on at least 5 computers. [...]
[R] plot confidence limits of a regression line
hi, is there an easy way to plot the confidence lines or confidence area of the beta weight in a scatterplot? like in this plot: http://www.ssc.wisc.edu/sscc/pubs/screenshots/4-25/4-25_4.png thanks!
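In base R (without extra packages) such a plot can be sketched with predict()'s interval = "confidence" plus matlines(); the data below are simulated purely for illustration:

```r
set.seed(1)
x <- runif(30, 0, 10)
y <- 2 + 0.5 * x + rnorm(30)
fit <- lm(y ~ x)

# evaluate the fitted line and its confidence band on a fine grid
newx <- data.frame(x = seq(min(x), max(x), length.out = 100))
ci <- predict(fit, newdata = newx, interval = "confidence", level = 0.95)

plot(x, y)                                                     # scatterplot
abline(fit)                                                    # fitted regression line
matlines(newx$x, ci[, c("lwr", "upr")], lty = 2, col = "red")  # 95% confidence band
```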
Re: [R] NonLinear Programming in R - QUERY
Try the package Rdonlp2, which can handle general, nonlinear, equality and inequality constraints for smooth optimization problems. Ravi. Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu - Original Message - From: Dieter Menne dieter.me...@menne-biomed.de Date: Saturday, March 7, 2009 4:05 am Subject: Re: [R] NonLinear Programming in R - QUERY To: r-h...@stat.math.ethz.ch Lars Bishop lars52r at gmail.com writes: I'll appreciate your help on this. Do you know of any package that can be used to solve optimization problems subject to general *non-linear* equality constraints. Package DEoptim Dieter
Re: [R] multivariate integration and partial differentiation
Hi, The adapt package might work, but note that it cannot handle infinite limits, so integrate() is your best bet. Since you need to integrate out only one dimension, integrate() would work just fine. As for differentiation, you might try the grad() function in the numDeriv package. Ravi. Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu - Original Message - From: Wei-han Liu weihanliu2...@yahoo.com Date: Saturday, March 7, 2009 9:38 am Subject: [R] multivariate integration and partial differentiation To: r-help@r-project.org Hi R Users: Could somebody share some tips on implementing multivariate integration and partial differentiation in R? For example, for a trivariate joint distribution (cumulative density function) F(x,y,z), how would you differentiate with respect to x to get the bivariate distribution (probability density function) f(y,z)? Or integrate f(x,y,z) with respect to x to get the bivariate distribution of (y,z)? Your sharing is appreciated. Wei-han Liu
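As a sketch of the integrate-one-dimension-out suggestion, here is an assumed toy density (three independent standard normals; the function names are mine, not from any package):

```r
# joint density of three independent standard normals
f <- function(x, y, z) dnorm(x) * dnorm(y) * dnorm(z)

# marginal density f(y, z): integrate x out over the whole real line
f.yz <- function(y, z) {
  integrate(function(x) f(x, y, z), lower = -Inf, upper = Inf)$value
}

f.yz(0, 0)  # for this toy density, equals dnorm(0)^2, about 0.159
```

For partial derivatives of a CDF, numDeriv::grad() applied to a wrapper function of one coordinate plays the analogous role.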
Re: [R] plot confidence limits of a regression line
How about

library(ggplot2)
qplot(wt, mpg, data = mtcars, geom = c("point", "smooth"), method = "lm")

On Sat, Mar 7, 2009 at 6:19 PM, Martin Batholdy batho...@googlemail.com wrote: hi, is there an easy way to plot the confidence lines or confidence area of the beta weight in a scatterplot? like in this plot: http://www.ssc.wisc.edu/sscc/pubs/screenshots/4-25/4-25_4.png thanks!

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
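The same plot can be built in base R without extra packages, which also makes explicit where the band comes from: predict() with interval = "confidence" over a grid of predictor values.

```r
## Base-R sketch of a regression line with a confidence band,
## using the same mtcars example as the ggplot2 one-liner.
fit  <- lm(mpg ~ wt, data = mtcars)
grid <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt),
                            length.out = 100))
ci   <- predict(fit, newdata = grid,
                interval = "confidence", level = 0.95)

plot(mpg ~ wt, data = mtcars)
## ci has columns fit, lwr, upr; draw all three at once
matlines(grid$wt, ci, lty = c(1, 2, 2), col = c("black", "red", "red"))
```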
[R] plot confidence limits of a regression line - problem
hi, I don't know what I am doing wrong, but with this code:

x1 <- c(1.60, 0.27, 0.17, 1.63, 1.37, 2.00, 0.90, 1.07, 0.89, 0.43, 0.37, 0.59, 0.47, 1.83, 1.79, 0.90, 0.72, 1.83, 0.23, 1.97, 2.03, 2.19, 2.03, 0.86)
x2 <- c(1.30, 0.24, 0.20, 0.50, 1.33, 1.87, 1.30, 0.75, 1.07, 0.43, 0.37, 0.87, 1.40, 1.37, 1.63, 0.80, 0.57, 1.60, 0.39, 2.03, 1.90, 2.07, 1.93, 0.93)
model <- lm(x1 ~ x2)
predict(model, newdata = data.frame(x = seq(0, 4)), interval = "confidence", level = 0.90, type = "response")

I get a message saying that 'newdata' has a different number of rows than the 24 in the variables found. What is wrong in the code? thanks!

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] popular R packages
I just did RSiteSearch("library(xxx)") with xxx = the names of 6 packages familiar to me, with the following numbers of hits:

hits  package
 169  lme4
 165  nlme
   6  fda
   4  maps
   2  FinTS
   2  DierckxSpline

Software could be written to (1) extract the names of current packages from CRAN, then (2) perform queries similar to this on all such packages and summarize the results. I don't have the time now to write code for this, but I've written similar code before for step (1); it can be found in scripts/TsayFiles.R in the FinTS package on CRAN. For step (2), Sundar Dorai-Raj wrote code that is included in the preliminary RSiteSearch package available from R-Forge via install.packages("RSiteSearch", repos = "http://r-forge.r-project.org"). Code to do this could probably be written (a) in a matter of seconds by many of those in the R Core team or (b) in a matter of hours by virtually any reader of this list using the examples I just cited. And it could provide numbers without a need to convince others to keep download statistics and make them available later. Hope this helps.

Spencer Graves

Wacek Kusnierczyk wrote: i have kept r installed on more than ten computers during the past few years, some of them running win + more than one linux distro, all of them having r, most often installed from a separate download. i know of many cases where students download r for the purpose of a course in statistics -- often an introductory course for students who otherwise have little to do with stats. some of them do it more than once during the semester, and many of them never use r again. taking into account that basic statistics courses are taught to most university students and that r is surely the most popular free statistical computing environment, download-based usage estimates may be a bit optimistic, unless 'usage' is taken to include 'learn-pass-forget'. vQ

Tal Galili wrote: I agree with Thomas, over the years I have installed R on at least 5 computers.
BTW: does anyone know how the website statistics of r-project are being analyzed? Since I can't see any Google Analytics or other tracking code in the main website, I am guessing someone might be running a log-file analyzer -- but I'd rather hear that than assume.

On Sun, Mar 8, 2009 at 12:45 AM, Thomas Adams thomas.ad...@noaa.gov wrote: I don't think "At least one of the participants in the 2004 thread suggested that it would be a good thing to track the numbers of downloads by package." is reasonable, because I download R packages for 2 home computers (laptop & desktop) and 2 at work (1 Linux, 1 Mac). There must be many such cases… Tom

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
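Spencer's step (1) is a one-liner against the CRAN package index. The sketch below only covers that step: RSiteSearch() itself opens results in a browser rather than returning a hit count, so automating step (2) would need a different approach (e.g. scraping the search results). The repository URL is the current CRAN mirror address, and the call needs network access.

```r
## Step (1) of the scheme described above: list all current CRAN
## packages. available.packages() reads the repository index.
pkgs <- rownames(available.packages(
  contriburl = contrib.url("https://cran.r-project.org")))

length(pkgs)   # number of packages currently on CRAN
head(pkgs)     # first few package names, alphabetically
```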
Re: [R] plot confidence limits of a regression line - problem
It is not an error but rather a warning. As you should have seen, R went ahead and returned estimates for 24 predicted values of x1, one for each value of x2 in the original data. In R, errors and warnings are very different things. You are expected to post full console messages to prevent this sort of confusion. In this case the warning puts you on notice that what you supplied to newdata= was not acceptable: you did not construct it with names that match those in the formula that was used to create model. Please follow the links offered and read the Posting Guide.

-- David Winsemius

On Mar 7, 2009, at 8:18 PM, Martin Batholdy wrote: hi, I don't know what I am doing wrong, but with this code:

x1 <- c(1.60, 0.27, 0.17, 1.63, 1.37, 2.00, 0.90, 1.07, 0.89, 0.43, 0.37, 0.59, 0.47, 1.83, 1.79, 0.90, 0.72, 1.83, 0.23, 1.97, 2.03, 2.19, 2.03, 0.86)
x2 <- c(1.30, 0.24, 0.20, 0.50, 1.33, 1.87, 1.30, 0.75, 1.07, 0.43, 0.37, 0.87, 1.40, 1.37, 1.63, 0.80, 0.57, 1.60, 0.39, 2.03, 1.90, 2.07, 1.93, 0.93)
model <- lm(x1 ~ x2)
predict(model, newdata = data.frame(x = seq(0, 4)), interval = "confidence", level = 0.90, type = "response")

I get a message saying that 'newdata' has a different number of rows than the 24 in the variables found. What is wrong in the code? thanks!

David Winsemius, MD Heritage Laboratories West Hartford, CT

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
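Concretely, the fix David describes is to name the newdata column after the predictor in the formula (x2), not x:

```r
## Same data as the question; only the newdata= argument changes.
x1 <- c(1.60, 0.27, 0.17, 1.63, 1.37, 2.00, 0.90, 1.07, 0.89, 0.43,
        0.37, 0.59, 0.47, 1.83, 1.79, 0.90, 0.72, 1.83, 0.23, 1.97,
        2.03, 2.19, 2.03, 0.86)
x2 <- c(1.30, 0.24, 0.20, 0.50, 1.33, 1.87, 1.30, 0.75, 1.07, 0.43,
        0.37, 0.87, 1.40, 1.37, 1.63, 0.80, 0.57, 1.60, 0.39, 2.03,
        1.90, 2.07, 1.93, 0.93)
model <- lm(x1 ~ x2)

## name the column x2, matching the formula -- no warning now
predict(model, newdata = data.frame(x2 = seq(0, 4)),
        interval = "confidence", level = 0.90)
## returns 5 rows (one per grid point) with fit, lwr, upr columns
```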
[R] statistical question: confidence interval of regression weight - significance
hi, at first: thanks for the help on getting confidence intervals in R. Now I have a purely statistical question; I hope you don't mind if I ask ... I have an expectation of how large my beta-weight in a regression should be -- so I have an ideal or expected regression line. Now the real beta-weight is less than the expected one, and when I draw the confidence interval lines above and below the estimated line, the expected regression line is outside of the confidence intervals at the near end and the beginning of the x-scale. Can I say now that the empirical beta-weight is significantly different from the expected beta-weight? Or is there a test (perhaps implemented in R) that can tell what the probability is that the estimated (empirical) beta-weight comes from a population where the beta-weight is equal to the expected one? thanks for any help! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
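The standard answer to this question is a t test of the slope against the hypothesized value, using the slope's standard error from the fitted model (summary.lm tests against zero; shifting the null value is a one-line change). The data and beta0 = 1 below are made-up for illustration.

```r
## Test H0: beta = beta0 for the slope of a simple regression.
set.seed(1)
x <- runif(30)
y <- 0.8 * x + rnorm(30, sd = 0.2)   # true slope 0.8, for illustration

fit   <- lm(y ~ x)
beta0 <- 1                           # hypothesized (expected) slope

est   <- coef(fit)["x"]              # estimated beta-weight
se    <- sqrt(vcov(fit)["x", "x"])   # its standard error
tstat <- (est - beta0) / se
pval  <- 2 * pt(-abs(tstat), df = fit$df.residual)
pval                                  # two-sided p-value for H0
```

This is generally more direct than eyeballing whether the expected line leaves the confidence band, since the band is pointwise and the question is about the slope as a whole.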
Re: [R] Week value function
Hi David - I will try that. Thanks for your suggestion!

David Winsemius wrote: I am not seeing anything, but that proves nothing of course. You could write your own function and stick it in the .First of your .Rprofile file that gets loaded at startup. Details here: http://cran.r-project.org/doc/contrib/Lemon-kickstart/kr_first.html

week.dBY <- function(x) format(strptime(x, "%d%B%Y"), "%W")
dt <- "07JAN2009"
week.dBY(dt)
[1] "01"   # a character-valued vector

Gives "00" for 01JAN2009, but you can adjust that behavior to your specifications. You could also convert to numeric if desired:

nweek.dBY <- function(x) as.integer(format(strptime(x, "%d%B%Y"), "%W"))
nweek.dBY(dt)
[1] 1

-- David Winsemius

On Mar 7, 2009, at 12:34 PM, Pele wrote: Hi R users, I am looking for a date function that will give the following: the number-of-week value is in the range 01-53; weeks begin on a Monday, and week 1 of the year is the week that includes both January 4th and the first Thursday of the year. If the first Monday of January is the 2nd, 3rd, or 4th, the preceding days are part of the last week of the preceding year. This is similar to SAS's WEEK function with option V. I am currently using:

date <- strptime(DATE, "%d%B%Y")
week <- format(date, "%W")

but I could not find an option for doing the above description automatically. Can anyone help? Thanks in advance for any help. -- View this message in context: http://www.nabble.com/Week-value-function-tp22389878p22389878.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD Heritage Laboratories West Hartford, CT

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
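The rule Pele describes (weeks start Monday; week 1 contains both January 4th and the first Thursday) is exactly the ISO 8601 week number, which format() exposes as "%V". Note that output support for "%V" is platform-dependent (see ?strptime), and "%B" parsing assumes an English-month locale, so treat this as a sketch to verify on your own system.

```r
## ISO 8601 week number (the SAS WEEK 'V' behavior) via "%V".
iso.week <- function(x) as.integer(format(strptime(x, "%d%B%Y"), "%V"))

iso.week("01JAN2009")   # 1 -- Jan 1 2009 (a Thursday) is in ISO week 1
iso.week("29DEC2008")   # 1 -- the Monday of that same ISO week
```

This removes the need to special-case the "00" result that "%W" gives for early-January dates.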
[R] Predictive Analytics Seminar: San Jose, NYC, Toronto, more
Hi all, I wanted to let you know about our training seminar on predictive analytics - coming April, May, Oct, and Nov in San Jose, NYC, Stockholm, Toronto and other cities. This is intensive training for marketers, managers and business professionals to make actionable sense of customer data by predicting buying behavior, churn, etc. Past attendees provided rave reviews. Here's more info: -- Event: Predictive Analytics for Business, Marketing and Web Dates: April 2-3, May 3-4, May 27-28, Oct 14-15, Oct 18-19, and Nov 11-12, 2009 Locations: Toronto (April), San Jose (May), NYC (May), Stockholm (Oct), DC (Oct), San Francisco (Nov) A two-day intensive seminar brought to you by Prediction Impact, Inc. 93% rate this program Excellent or Very Good. **The official training program of Predictive Analytics World** **Offered in conjunction with eMetrics events** (Also see our Online Training: Predictive Analytics Applied - immediate access at any time www.predictionimpact.com/predictive-analytics-online-training.html) --- ABOUT THIS SEMINAR: Business metrics do a great job summarizing the past. But if you want to predict how customers will respond in the future, there is one place to turn--predictive analytics. By learning from your abundant historical data, predictive analytics provides the marketer something beyond standard business reports and sales forecasts: actionable predictions for each customer. These predictions encompass all channels, both online and off, foreseeing which customers will buy, click, respond, convert or cancel. If you predict it, you own it. The customer predictions generated by predictive analytics deliver more relevant content to each customer, improving response rates, click rates, buying behavior, retention and overall profit. 
For online applications such as e-marketing and customer care recommendations, predictive analytics acts in real-time, dynamically selecting the ad, web content or cross-sell product each visitor is most likely to click on or respond to, according to that visitor's profile. This is AB selection, rather than just AB testing. Predictive Analytics for Business, Marketing and Web is a concentrated training program that includes interactive breakout sessions and a brief hands-on exercise. In two days we cover: • The techniques, tips and pointers you need in order to run a successful predictive analytics and data mining initiative • How to strategically position and tactically deploy predictive analytics and data mining at your company • How to bridge the prevalent gap between technical understanding and practical use • How a predictive model works, how it's created and how much revenue it generates • Several detailed case studies that demonstrate predictive analytics in action and make the concepts concrete • NEW TOPIC: Five Ways to Lower Costs with Predictive Analytics No background in statistics or modeling is required. The only specific knowledge assumed for this training program is moderate experience with Microsoft Excel or equivalent. For more information, visit www.predictionimpact.com/predictive-analytics-training.html, or e-mail us at train...@predictionimpact.com. You may also call (415) 683-1146. Cross-Registration Special: Attendees earn $250 off the Predictive Analytics World Conference SNEAK PREVIEW VIDEO: www.predictionimpact.com/predictive-analytics-times.html $100 off early registration, 3 weeks ahead -- View this message in context: http://www.nabble.com/Predictive-Analytics-Seminar%3A-San-Jose%2C-NYC%2C-Toronto%2C-more-tp22394338p22394338.html Sent from the R help mailing list archive at Nabble.com. 
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] popular R packages
Hi Spencer, XLSolutions is currently analyzing r-help archived questions to rank packages for the upcoming R-PLUS 3.3 Professional version, and we will be happy to share the outcome with interested parties. Please email d...@xlsolutions-corp.com

Regards - Sue Turner, Senior Account Manager, XLSolutions Corporation, North American Division, 1700 7th Ave Suite 2100, Seattle, WA 98101 Phone: 206-686-1578 Email: s...@xlsolutions-corp.com web: www.xlsolutions-corp.com

--- On Sat, 3/7/09, Spencer Graves spencer.gra...@prodsyse.com wrote: From: Spencer Graves spencer.gra...@prodsyse.com Subject: Re: [R] popular R packages To: Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no Cc: r-help@r-project.org, Jeroen Ooms j.c.l.o...@uu.nl, Thomas Adams thomas.ad...@noaa.gov Date: Saturday, March 7, 2009, 5:22 PM

I just did RSiteSearch("library(xxx)") with xxx = the names of 6 packages familiar to me, with the following numbers of hits:

hits  package
 169  lme4
 165  nlme
   6  fda
   4  maps
   2  FinTS
   2  DierckxSpline

Software could be written to (1) extract the names of current packages from CRAN, then (2) perform queries similar to this on all such packages and summarize the results. I don't have the time now to write code for this, but I've written similar code before for step (1); it can be found in scripts/TsayFiles.R in the FinTS package on CRAN. For step (2), Sundar Dorai-Raj wrote code that is included in the preliminary RSiteSearch package available from R-Forge via install.packages("RSiteSearch", repos = "http://r-forge.r-project.org"). Code to do this could probably be written (a) in a matter of seconds by many of those in the R Core team or (b) in a matter of hours by virtually any reader of this list using the examples I just cited. And it could provide numbers without a need to convince others to keep download statistics and make them available later. Hope this helps.
Spencer Graves

Wacek Kusnierczyk wrote: i have kept r installed on more than ten computers during the past few years, some of them running win + more than one linux distro, all of them having r, most often installed from a separate download. i know of many cases where students download r for the purpose of a course in statistics -- often an introductory course for students who otherwise have little to do with stats. some of them do it more than once during the semester, and many of them never use r again. taking into account that basic statistics courses are taught to most university students and that r is surely the most popular free statistical computing environment, download-based usage estimates may be a bit optimistic, unless 'usage' is taken to include 'learn-pass-forget'. vQ

Tal Galili wrote: I agree with Thomas, over the years I have installed R on at least 5 computers. BTW: does anyone know how the website statistics of r-project are being analyzed? Since I can't see any Google Analytics or other tracking code in the main website, I am guessing someone might be running a log-file analyzer -- but I'd rather hear that than assume.

On Sun, Mar 8, 2009 at 12:45 AM, Thomas Adams thomas.ad...@noaa.gov wrote: I don't think "At least one of the participants in the 2004 thread suggested that it would be a good thing to track the numbers of downloads by package." is reasonable, because I download R packages for 2 home computers (laptop & desktop) and 2 at work (1 Linux, 1 Mac). There must be many such cases… Tom
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] popular R packages
Hi all, I'm kind of amazed at the answers suggested for the relatively simple question, "How many times has each R package been downloaded?" Some have veered off in another direction, like working out how many packages a package depends upon, or whether someone downloads more than one copy. The response about ranking packages by the number of questions asked about them may be interesting, but may not relate very well at all to popularity in terms of downloads. If people were constantly asking questions about one of the packages I maintain, I would be working on the help pages to improve them, not basking in the inferred glory of having a popular package. There is one way that the download count would be very useful, for package maintainers if no one else. Take as an example the package concord, which has not been maintained for a year or more since its content was merged into the irr package. If I knew that no one downloaded concord any more, I would surely petition those in charge of the archive to remove it, or at least transfer it to the package museum. There is no point in having ever more packages on CRAN if they are never downloaded. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Rdonlp2 -Query
Leo Guelman leo.guelman at gmail.com writes: Hi, Has anyone used this package? Could you please share your thoughts on it?

What do you mean, exactly, by "share your thoughts on it"? It has its pros and cons, as always. Sure, Rdonlp2 has been used, and it has been requested and discussed several times here on the list. It is mentioned in the 'Optimization' task view, too. An RSiteSearch might help.

Thanks! L.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.