Re: [R] Generating contingency tables from the null
Hi all,

I have a 3x4 contingency table with row totals all being 100. I want to generate 3 x 4 tables from the null distribution. Which R function can do this?

--
Thanks,
Jim.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] Odp: all combinations of the elements of two vectors
Hi

> Dear R-help readers,
>
> I'm sure this problem has been answered but I can't find the solution.
> I have two vectors
>
>   v1 <- c("a", "b")
>   v2 <- c(1, 2, 3)
>
> I want an easy way to produce every possible combination of the elements of v1 and v2,
> i.e. I want to produce
>
>   c("a1", "a2", "a3", "b1", "b2", "b3")

Another option is

  x <- c("a", "b", "c")
  y <- 1:3
  z <- outer(x, y, paste, sep = "")
  as.vector(z)
  [1] "a1" "b1" "c1" "a2" "b2" "c2" "a3" "b3" "c3"

which gives the result in a different order, or

  as.vector(t(z))
  [1] "a1" "a2" "a3" "b1" "b2" "b3" "c1" "c2" "c3"

which gives you the desired order.

Regards
Petr

> regards
> Desmond
>
> Desmond Campbell
> Dept of Biostatistics and Computing, Institute of Psychiatry (KCL),
> PO Box 20, De Crespigny Park, Denmark Hill, London, SE5 8AF
> Tel 020 7848 0309
> Email d.campb...@iop.kcl.ac.uk
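The same result can also be built without outer(). The following is only a sketch using the v1 and v2 from the question; the column names `num` and `let` are made up for illustration. expand.grid() varies its first argument fastest, so the numbers go first to get the a1, a2, a3, b1, ... order.

```r
v1 <- c("a", "b")
v2 <- c(1, 2, 3)

# expand.grid() varies its first argument fastest, so put v2 first
# (column names 'num' and 'let' are illustrative only)
grid <- expand.grid(num = v2, let = v1, stringsAsFactors = FALSE)
combos <- paste0(grid$let, grid$num)

# Equivalent one-liner: repeat each letter once per number, recycle the numbers
combos2 <- paste0(rep(v1, each = length(v2)), v2)

print(combos)
```

The rep() version is shorter; the expand.grid() version extends more naturally to three or more vectors.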
Re: [R] parallel rbind
Steven Bauer <steven.bauer at gmail.com> writes:

> As I am sitting here waiting for some R scripts to run... I was wondering... is there any way to
> parallelize rbind in R? I wait for this call to complete frequently as I deal with large amounts of data.
>
>   do.call(rbind, LIST)

Perfectly reasonable question, but please don't cross-post to StackOverflow and R-help -- pick one or the other.

cheers
  Ben Bolker
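The thread does not give an answer, so what follows is only one possible sketch, not a recommendation from the list: split the list into contiguous chunks, rbind each chunk on a separate worker with parallel::mclapply(), then rbind the few partial results. The helper name `chunked_rbind` and its arguments are hypothetical. Whether this is actually faster depends on the data, since the partial results still have to be copied back; mclapply() only forks on Unix-alikes, so `cores` must stay at 1 on Windows.

```r
library(parallel)

# Hypothetical helper (not from the thread): rbind a list of data frames
# chunk by chunk, optionally on several cores.
chunked_rbind <- function(lst, n_chunks = 4L, cores = 1L) {
  if (length(lst) == 0L) return(NULL)
  n_chunks <- max(1L, min(as.integer(n_chunks), length(lst)))
  # contiguous chunk labels, so the original row order is preserved
  grp <- ceiling(seq_along(lst) * n_chunks / length(lst))
  chunks <- split(lst, grp)
  partial <- mclapply(chunks, function(ch) do.call(rbind, ch),
                      mc.cores = cores)
  do.call(rbind, unname(partial))
}

# Small made-up example
LIST <- lapply(1:10, function(i) data.frame(id = i, x = i * 2))
res <- chunked_rbind(LIST, n_chunks = 3L, cores = 1L)
ref <- do.call(rbind, LIST)
```

For very long lists of data frames, data.table::rbindlist() is a commonly used single-threaded alternative.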
Re: [R] R question: generating data using MASS
uf_mike <michael.parent at ufl.edu> writes:

> Hi, all! I'm new to R but need to use it to solve a little problem I'm having with a paper I'm writing.
> The question has a few components and I'd appreciate guidance on any of them.
>
> 1. The most essential thing is that I need to generate some multivariate normal data on a restricted
> integer range (1 to 7). I know I can use the MASS mvrnorm command to do this but have a couple of
> questions about that:
>
> - I can make the simulated data but I don't know how to issue a command that restricts the generated
>   data to be within a specific range (1 to 7), and integer-only.

This problem isn't uniquely defined. Are you willing to generate more samples than you need and then throw away extreme values? Or do you want to 'censor' extreme values (i.e. set values <= 1 to 1 and values >= 7 to 7)?

  x <- MASS::mvrnorm(1, ...)
  x2 <- x[x >= 1 & x <= 7]
  x3 <- x2[1:1000]  ## or however many you need
  x4 <- round(x3)

> - Is there a way to specify a single desired correlation between all the variables (i.e., I want, say,
>   five variables to all be correlated about .30 with each other), rather than input the entire
>   covariance matrix as sigma?

What's wrong with

  m <- matrix(0.3, nrow=5, ncol=5)
  diag(m) <- 1
  m <- m*variance

?

> 2. I need to introduce missing data (NA) AFTER generating the data set, and I need it to be random and
> at a specific prevalence (say, 5%). Is there a simple way to take the initial data set and randomly
> replace 5% of values with NA missing values?

  x4[sample(seq_along(x4), size=0.05*length(x4), replace=FALSE)] <- NA
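Putting the three pieces of the reply together, a minimal end-to-end sketch might look like the following. The mean (4), the standard deviation (1.5), the oversampling factor (5) and the object names are assumptions chosen for illustration, not values from the thread. Unlike the snippet above, this version discards whole rows that fall outside 1..7, so the variables stay aligned. Truncating and rounding will pull the realised correlations somewhat away from the nominal 0.3.

```r
library(MASS)  # recommended package shipped with R
set.seed(1)

p      <- 5      # number of variables
rho    <- 0.3    # common correlation
n_keep <- 1000   # rows wanted in the end

# Equicorrelation matrix scaled by an assumed common variance of 1.5^2
Sigma <- matrix(rho, nrow = p, ncol = p)
diag(Sigma) <- 1
Sigma <- Sigma * 1.5^2

# Oversample, then keep only rows that lie entirely inside [1, 7]
raw    <- mvrnorm(5 * n_keep, mu = rep(4, p), Sigma = Sigma)
inside <- apply(raw >= 1 & raw <= 7, 1, all)
kept   <- round(raw[inside, , drop = FALSE][seq_len(n_keep), ])

# Replace exactly 5% of the cells, chosen at random, with NA
n_na <- round(0.05 * length(kept))
kept[sample(length(kept), n_na)] <- NA

print(round(cor(kept, use = "pairwise.complete.obs"), 2))
```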
Re: [R] Placing a column name in a variable XXXX
Hi

or, if Dan prefers a data frame (which is also a list):

  CInew2 <- function(x, alpha){
    data.frame(variable = deparse(substitute(x)), mean = mean(x), alpha = alpha)
  }
  CInew2(JOBSTAT, 0.05)
    variable      mean alpha
  1  JOBSTAT 0.4567117  0.05

Regards
Petr

> In this case you want to use a 'list' since you want character and numerics returned:
>
>   JOBSTAT <- rnorm(10)
>   # new function that does not return 'x'
>   CInew <- function(x, alpha){
>     list(variable = deparse(substitute(x)), mean = mean(x), alpha = alpha)
>   }
>   CInew(JOBSTAT, 0.05)
>   $variable
>   [1] "JOBSTAT"
>
>   $mean
>   [1] -1.113034
>
>   $alpha
>   [1] 0.05
>
> On Sat, Aug 27, 2011 at 6:58 PM, Dan Abner <dan.abne...@gmail.com> wrote:
>> I want it to return:
>>
>>   Variable       Mean  alpha
>>   JOBSTAT  -0.1240675   0.05
>>
>> How do I get the function parameter x to equal the name of the object that is specified as x,
>> as a character string?
>>
>> On Sat, Aug 27, 2011 at 6:41 PM, jim holtman <jholt...@gmail.com> wrote:
>>> The function is doing exactly what you are telling it to do. You have 'cbind(x, mean(x), alpha)',
>>> which is creating a matrix where the first column is all the values in 'x' and the next two are
>>> the recycled values of mean and alpha. Is this what you want:
>>>
>>>   JOBSAT <- rnorm(10)
>>>   CI <- function(x, alpha){
>>>     cbind(x, mean = mean(x), alpha)
>>>   }
>>>   CI(JOBSAT, .05)
>>>                  x       mean alpha
>>>    [1,]  0.8592324 -0.1240675  0.05
>>>    [2,] -0.3128362 -0.1240675  0.05
>>>    [3,] -2.0042218 -0.1240675  0.05
>>>    [4,] -0.4675232 -0.1240675  0.05
>>>    [5,] -0.5776273 -0.1240675  0.05
>>>    [6,]  1.5696650 -0.1240675  0.05
>>>    [7,]  0.8070593 -0.1240675  0.05
>>>    [8,] -0.8257525 -0.1240675  0.05
>>>    [9,]  0.6167636 -0.1240675  0.05
>>>   [10,] -0.9054347 -0.1240675  0.05
>>>
>>>   # new function that does not return 'x'
>>>   CInew <- function(x, alpha){
>>>     c(mean = mean(x), alpha = alpha)
>>>   }
>>>   CInew(JOBSAT, .05)
>>>         mean      alpha
>>>   -0.1240675      0.050
>>>
>>> On Sat, Aug 27, 2011 at 5:38 PM, Dan Abner <dan.abne...@gmail.com> wrote:
>>>> Hi everyone,
>>>>
>>>> How does one place an object name (in this case a vector name) into another object
>>>> (while essentially masking the values of the first object)?
>>>>
>>>> For example:
>>>>
>>>>   JOBSAT <- rnorm(40)
>>>>   CI <- function(x, alpha){
>>>>     result <- cbind(x, mean = mean(x), alpha)
>>>>     print(result)
>>>>   }
>>>>   CI(JOBSAT, .05)
>>>>
>>>> I want this to return:
>>>>
>>>>   Variable       mean  alpha
>>>>   JOBSTAT  0.02844131   0.05
>>>>
>>>> Instead, I am getting:
>>>>
>>>>                   x       mean alpha
>>>>    [1,] -1.07694997 0.02844131  0.05
>>>>    [2,] -1.13910850 0.02844131  0.05
>>>>    [3,] -0.21922026 0.02844131  0.05
>>>>    [4,]  0.38618008 0.02844131  0.05
>>>>    [5,] -1.24303799 0.02844131  0.05
>>>>    [6,] -0.74903752 0.02844131  0.05
>>>>    [7,]  0.96136975 0.02844131  0.05
>>>>    [8,] -0.38891237 0.02844131  0.05
>>>>    [9,] -0.20195871 0.02844131  0.05
>>>>   [10,]  0.78104508 0.02844131  0.05
>>>>   [11,]  0.87468778 0.02844131  0.05
>>>>   [12,] -1.89131480 0.02844131  0.05
>>>>   [13,]  0.74377795 0.02844131  0.05
>>>>   [14,] -0.60006285 0.02844131  0.05
>>>>   [15,] -0.76661652 0.02844131  0.05
>>>>   [16,]  1.06005258 0.02844131  0.05
>>>>   [17,]  0.02173877 0.02844131  0.05
>>>>   [18,] -0.36558980 0.02844131  0.05
>>>>   [19,] -1.92481588 0.02844131  0.05
>>>>   [20,] -0.50337507 0.02844131  0.05
>>>>   [21,]  0.82205272 0.02844131  0.05
>>>>   [22,]  1.59277572 0.02844131  0.05
>>>>   [23,]  0.59965718 0.02844131  0.05
>>>>
>>>> Thank you!
>>>> Dan
>>>
>>> --
>>> Jim Holtman
>>> Data Munger Guru
>>> What is the problem that you are trying to solve?
[R] Odp: Extracting values in table
Hi

> Hi All,
>
> I am a beginner in programming in R, so please forgive me if my question seems silly or is hard to understand.
>
> 1. We have a list of elements, say:
>
>   ls <- list("N","E","E","N","P","E","M","Q","E","M")
>
> 2. We have another list, of tables, say:
>
>   n <- list("M","N","E","P","Q","M","N","E","Q","N")
>   tb <- lapply(1:10, function(i) matrix(sample(4), 2, 2,
>                dimnames=list(n[sample(10,2)], n[sample(2,2)])))
>
> 3. We need to extract values from each table in the list, where the column name is always "M" and the
> row name is the 1st element of the list ls for table 1 in the list tb, the 2nd element for table 2,
> and so on. For example:
>
>     M N
>   N 4 1
>   P 3 2
>
> In table 1, we need to extract the value 4.

I cannot provide you with a canned solution, but

  x = sapply(tb, function(x) which(dimnames(x)[[2]]=="M"))

gives you a vector of the positions of M in the column names, and

  for (i in seq_along(ls)) print(which(rownames(tb[[i]]) %in% ls[[i]]))
  # for (i in seq_along(ls)) y[i] <- which(rownames(tb[[i]]) %in% ls[[i]])
  # does not work as there is sometimes no match

gives you the positions of the row names (if they exist). You can then use these to select items from the list tb, e.g. for the first table:

  tb[[1]]
    M N
  N 3 2
  Q 1 4
  tb[[1]][y[1], x[1]]   # row position first, then column position
  [1] 3

Regards
Petr

> Thanks to all in advance.
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Extracting-values-in-table-tp3771272p3771272.html
> Sent from the R help mailing list archive at Nabble.com.
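The two position vectors above can be avoided by indexing each table by name. This is only a sketch: the helper `lookup_M` is hypothetical, and the two small tables are made up (the first one is the example from the question). It returns NA whenever the requested row or the "M" column is missing from a table, which covers the "sometimes no match" case.

```r
# Hypothetical helper: for table i, return the entry in row row_names[[i]]
# and column 'col', or NA if either name is absent from that table.
lookup_M <- function(tables, row_names, col = "M") {
  mapply(function(tab, rn) {
    if (rn %in% rownames(tab) && col %in% colnames(tab)) tab[rn, col] else NA
  }, tables, row_names)
}

# Made-up data: table 1 is the example from the question, table 2 has no "M"
tb <- list(
  matrix(c(4, 3, 1, 2), 2, 2, dimnames = list(c("N", "P"), c("M", "N"))),
  matrix(c(1, 2, 3, 4), 2, 2, dimnames = list(c("E", "Q"), c("N", "P")))
)
ls <- list("N", "E")

res <- lookup_M(tb, ls)
print(res)
```

If a table has duplicated row names, name-based indexing picks the first matching row.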
[R] Improving result from integrate
Hi all,

I am using integrate() to get the area under a function that is shaped like an exponential and becomes very small for higher values of the horizontal coordinate y. In the code below, I present the tricky way in which I can improve the result obtained. I compare this result with the one I get from Mathematica and also with the exact solution I have for this particular case.

  func <- function(y, a, rate){
    x <- function(n, rate) {
      rate*exp(-n*rate)
    }
    boi <- function(y, n, a){
      w <- y*log(a*n) - lfactorial(y) - a*n
      exp(w)
    }
    f <- function(n){
      boi(y, n, a)*x(n, rate)
    }
    r <- 0
    r1 <- 1
    x1 <- 0
    dx <- 20
    while(r1 > 10e-1000){
      r1 <- integrate(f, x1, x1+dx)$value
      r <- r + r1
      x1 <- x1 + dx
    }
    r + integrate(f, x1, Inf)$value
  }
  func(200, 0.1, 0.1)

Although I get better results, the value of dx must be carefully selected. So I ask: is there another method that can give me better results?
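One alternative that avoids tuning dx is to rescale the integrand on the log scale so that its peak has height 1, and to split the range at the peak so that integrate() cannot miss it. The sketch below is an assumption-laden illustration rather than an answer from the list: the names `log_f`, `g`, `peak` and `shift` are made up. The closed form used for comparison is not quoted from the post; it follows from the gamma integral, which gives rate * a^y / (a + rate)^(y + 1) for this integrand.

```r
y <- 200; a <- 0.1; rate <- 0.1

# log of the integrand: Poisson(y; a*n) mass times exponential(rate) density
log_f <- function(n) {
  y * log(a * n) - lfactorial(y) - a * n + log(rate) - rate * n
}

peak  <- y / (a + rate)     # where log_f is maximal (derivative y/n - (a + rate) = 0)
shift <- log_f(peak)

# Rescaled integrand with peak height 1, so nothing underflows near the mass
g <- function(n) exp(log_f(n) - shift)

area <- integrate(g, 0, peak, rel.tol = 1e-8)$value +
        integrate(g, peak, Inf, rel.tol = 1e-8)$value

log_result <- log(area) + shift   # log of the integral of the original f

# Closed form from the gamma integral, on the log scale
log_exact <- log(rate) + y * log(a) - (y + 1) * log(a + rate)

print(c(numerical = exp(log_result), exact = exp(log_exact)))
```

Working with the logarithm of the result also keeps the answer usable when the integral itself is too small to represent as a double.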
[R] defining id argument in geeglm
Hi all,

I am trying to fit a generalized estimating equation (GEE) model with the geepack package and I am not 100% sure what exactly the id argument means. It seems to be an important argument because results differ considerably when different clusters are defined.

I have a data set of counts (Poisson distribution): numbers of butterfly species counted every month during a period of one year (12 repeated measures) at seven sites, three of those being continuous forest sites and four of those being secondary forest sites. The aim is to compare continuous and secondary forests.

Would you define the sites or the forest type as the id argument:

  model1 <- geeglm(formula = number ~ type + month, family = poisson, id = site, corstr = "ar1")
  model2 <- geeglm(formula = number ~ type + month, family = poisson, id = type, corstr = "ar1")

or should almost every count even have its own id (e.g. id = interaction(month, site) or id = interaction(month, type))?

Thanks for your help...
Anna
Re: [R] question on silhouette colours
Gordon Robertson <grobert...@bcgsc.ca> on Wed, 24 Aug 2011 22:21:22 -0700 writes:

> I'm fairly new to the silhouette functionality in the cluster package, so apologize if I'm asking
> something naive. If I run the 'agnes(ruspini)' example from the silhouette section of the cluster
> package vignette, and assign colours to clusters, two clusters have what appear to be incorrect
> colours in the silhouette plot.
>
>   library(cluster)
>   data(ruspini)
>   ar <- agnes(ruspini)
>   si3 <- silhouette(cutree(ar, k = 5), daisy(ruspini))

Thank you, Gordon, for the simple reproducible example.

>   # 1. This gives a mid-gray silhouette plot, which does not show the problem
>   plot(si3, nmax = 80, cex.names = 0.5)
>
>   # 2. This gives a multicolour silhouette plot, but there are three black lines/bars in the
>   #    yellow cluster, and the cluster that should be black is actually yellow?
>   plot(si3, nmax = 80, cex.names = 0.5, col = c("red","blue","yellow","black","green"))
>
>   # 3. Check sorting by writing out sorted results to a file, then plotting from the file
>   si3.sorted <- sortSilhouette(si3)
>   write.table(si3.sorted, "/...myPath.../si3.sorted.txt", sep="\t")

well, just

  sortSilhouette(si3)

# printing to the console is sufficient to inspect ...

> Inspecting the si3.sorted.txt file, cluster numbers are ordered as expected (1's then 2's then...),
> and sil_width's within each cluster appear correctly sorted (descending). Given this, if I load the
> file into, say, Mathematica, and plot it with colours, I easily generate a graphic that is like the
> one from R, but in which all cluster colours are as expected, i.e. there are no black bars in the
> yellow region, and the cluster that should be black -is- black.
>
> Again, I apologize if I'm missing something simple. Thanks for your help in understanding this behaviour.

As a matter of fact, I'm pretty sure you found a bug.

Note that it would be better in such cases (a function in an R package) to first contact the package maintainer, in this case

  maintainer("cluster")
  [1] Martin Maechler maechler@stat

but I did see your message on R-help by luck and so have been able to act on it. The next version of cluster, '1.14.1', will have this buglet fixed.

Thank you for your question!
Best regards,
Martin Maechler, ETH Zurich
Re: [R] defining id argument in geeglm
You need to tell us why you want to use a GEE model. From your use of corstr = "ar1" I would surmise you think the counts are serially correlated during a year (despite the presence of a 'month' main effect), in which case the id is 'site'.

All 'id' does is to partition the data into clusters: counts for different clusters are independent, counts within a cluster are (potentially) dependent.

The common advice applies: you should talk to a statistician conversant with GEE models about your model formulation. (My field experience would suggest that there is no good reason to suppose that the counts are Poisson: visible occurrences of butterfly species do not behave independently.)

On Mon, 29 Aug 2011, Anna Mill wrote:

> Hi all,
>
> I am trying to do a generalized estimating equation (GEE) with the geepack package and I am not 100%
> sure what exactly the id argument means. It seems to be an important argument because results differ
> considerably defining different clusters.
>
> I have a data set of counts (poisson distribution): numbers of butterfly species counted every month
> during a period of one year (12 repeated measures) at seven sites, three of those being continuous
> forest sites and four of those being secondary forest sites. The aim is to compare continuous and
> secondary forests.
>
> Would you define the sites or the forest type as id argument:
>
>   model1 <- geeglm(formula = number ~ type + month, family = poisson, id = site, corstr = "ar1")
>   model2 <- geeglm(formula = number ~ type + month, family = poisson, id = type, corstr = "ar1")
>
> or should even almost every count have a special id (e.g. id = interaction(month, site) or
> id = interaction(month, type))
>
> Thanks for your help...
> Anna

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
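To make the role of 'id' concrete, here is a small sketch on simulated data; the data frame `d`, its rates (8 and 5) and the site labels are invented for illustration and say nothing about the real butterfly data. It assumes the geepack package is installed and skips the model fit otherwise. The rows are kept sorted so that each site's twelve months are contiguous, which is how geeglm() expects clusters to be laid out.

```r
set.seed(42)

# Simulated layout: 7 sites x 12 months; sites S1-S3 continuous, S4-S7 secondary
d <- expand.grid(month = 1:12, site = paste0("S", 1:7))
d$type   <- ifelse(d$site %in% c("S1", "S2", "S3"), "continuous", "secondary")
d$number <- rpois(nrow(d), lambda = ifelse(d$type == "continuous", 8, 5))

# Clusters must be contiguous: sort by site, then by month within site
d <- d[order(d$site, d$month), ]

if (requireNamespace("geepack", quietly = TRUE)) {
  # id = site: the 12 monthly counts of one site form one cluster,
  # with AR(1) working correlation along 'month' within each site
  fit <- geepack::geeglm(number ~ type + month, family = poisson,
                         data = d, id = site, waves = month, corstr = "ar1")
  print(summary(fit)$coefficients)
}
```

With only seven clusters the robust standard errors from a GEE fit should be read with caution.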
Re: [R] to represent color range on plot segment
On 08/28/2011 04:07 AM, karthicklakshman wrote:

> Dear R community,
>
> With the advantage of being NEW to R, I would like to post a very basic query here. I need to
> represent gene expression data, which ranges from -0.09 to +4, on plot segments. Please find the
> data "df" below; the expression values are in df[,2]. Kindly help me with the code, so that I can
> represent the values with a clear color gradient (something like -0.09 to 0 as a red gradient and
> 0 to +4 as a green gradient).
>
>              location       value
>   15 chr+:14001-15001  0.99749499
>   16 chr+:15001-16001  0.99957360
>   17 chr+:16001-17001  0.99166481
>   18 chr+:17001-18001  0.97384763
>   19 chr+:18001-19001  0.94630009
>   20 chr+:19001-20001  0.90929743
>   21 chr+:20001-21001  0.86320937
>   22 chr+:21001-22001  0.80849640
>   23 chr+:22001-23001  0.74570521
>   24 chr+:23001-24001  0.67546318
>   25 chr+:24001-25001  0.59847214
>   26 chr+:25001-26001  0.51550137
>   27 chr+:26001-27001  0.42737988
>   28 chr+:27001-28001  0.33498815
>   29 chr+:28001-29001  0.23924933
>   30 chr+:29001-30001  0.14112001
>   31 chr+:30001-31001  0.04158066
>   32 chr+:31001-32001 -0.05837414
>   33 chr+:32001-33001 -0.15774569
>   34 chr+:33001-34001 -0.25554110
>   35 chr+:34001-35001 -0.35078323
>   36 chr+:35001-36001 -0.44252044
>   37 chr+:36001-37001 -0.52983614
>   38 chr+:37001-38001 -0.61185789
>   39 chr+:38001-39001 -0.68776616
>   40 chr+:39001-40001 -0.75680250
>   41 chr+:40001-41001 -0.81827711
>   42 chr+:41001-42001 -0.87157577
>   43 chr+:42001-43001 -0.91616594
>   44 chr+:43001-44001 -0.95160207

Hi karthick,
Here's one way to do it:

  library(plotrix)
  df[,3] <- NA
  df[df[,2]<0,3] <- color.scale(df[df[,2]<0,2], 1, c(0,1), c(0,1))
  df[df[,2]>0,3] <- color.scale(df[df[,2]>0,2], c(1,0), 1, c(1,0))

df[,3] will then be a vector of colors that range from red at the minimum value to white at 0 to green at the maximum value.

Jim
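If plotrix is not available, the same red-white-green mapping can be sketched in base R with rgb(). The function name `diverging_colours` is hypothetical, and the scaling (most negative value maps to pure red, largest positive value to pure green, zero to white) is one assumption about what "a clear gradient" should mean here.

```r
# Hypothetical helper: map values to colours, red (most negative) through
# white (zero) to green (largest positive). Assumes no NA values in v.
diverging_colours <- function(v) {
  neg_scale <- if (any(v < 0)) -min(v) else 1
  pos_scale <- if (any(v > 0)) max(v) else 1
  t_neg <- pmin(pmax(-v, 0) / neg_scale, 1)  # 0..1, how far towards red
  t_pos <- pmin(pmax(v, 0) / pos_scale, 1)   # 0..1, how far towards green
  # start from white (1, 1, 1) and remove the channels that are not wanted
  rgb(1 - t_pos, 1 - t_neg, 1 - t_neg - t_pos)
}

cols <- diverging_colours(c(-1, 0, 2))
print(cols)
```

The resulting vector can be passed as the col argument of segments() or rect() when drawing one segment per genomic window.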
Re: [R] Asking Favor For Remove element with Particular Value In Vector
chuan_zl wrote:

> Dear All.
>
> I am Chuan. I am a beginner in R. I am facing a problem removing elements from a vector. I have a
> vector with 238 elements as follows (a part):
>
>   [1] 0 18 24 33 44 ..... [238] 255
>
> Let the vector be labelled x. I want to remove the elements 0 and 255. I tried to use this expression:
>
>   x[x0 x255]

Hi Chuan,
If you want to remove the specific values 0 and 255 from your vector, try:

  x <- x[-which(x %in% c(0,255))]

Jim
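One caveat with the -which() idiom is worth a small sketch: when none of the values occur, which() returns an empty index and negative indexing with it drops everything. Logical negation does not have this problem. The vectors below are made up for illustration.

```r
x <- c(0, 18, 24, 33, 44, 255)

# Logical negation keeps every element that is neither 0 nor 255
x_clean <- x[!(x %in% c(0, 255))]

# Edge case: a vector that contains neither value
y <- c(18, 24, 33)
y_which <- y[-which(y %in% c(0, 255))]   # empty: which() returned integer(0)
y_logic <- y[!(y %in% c(0, 255))]        # unchanged, as intended

print(x_clean)
```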
[R] how to start R script editor by default
Hi All,

1) Is it possible to set the options such that R opens a new script editor every time I start R, and 2) specify the size of the windows?

Thanks for the suggestion and best regards,
Krishna
Re: [R] defining id argument in geeglm
Thanks for your answer!

For the butterfly counts we used butterfly bait traps. They were not visible counts. I read several ecological papers that treat species or individual counts as Poisson, applying GLM rather than e.g. repeated measures ANOVA. I assumed that the monthly collection out of a species pool cannot be independent. Choosing GEE was my idea because of its advantages for repeated measures and the Poisson distribution...

2011/8/29 Prof Brian Ripley <rip...@stats.ox.ac.uk>

> You need to tell us why you want to use a GEE model. From your use of corstr = "ar1" I would surmise
> you think the counts are serially correlated during a year (despite the presence of a 'month' main
> effect), in which case the id is 'site'.
>
> All 'id' does is to partition the data into clusters: counts for different clusters are independent,
> counts within a cluster are (potentially) dependent.
>
> The common advice applies: you should talk to a statistician conversant with GEE models about your
> model formulation. (My field experience would suggest that there is no good reason to suppose that
> the counts are Poisson: visible occurrences of butterfly species do not behave independently.)
>
> On Mon, 29 Aug 2011, Anna Mill wrote:
>
>> Hi all,
>>
>> I am trying to do a generalized estimating equation (GEE) with the geepack package and I am not
>> 100% sure what exactly the id argument means. It seems to be an important argument because results
>> differ considerably defining different clusters.
>>
>> I have a data set of counts (poisson distribution): numbers of butterfly species counted every
>> month during a period of one year (12 repeated measures) at seven sites, three of those being
>> continuous forest sites and four of those being secondary forest sites. The aim is to compare
>> continuous and secondary forests.
>>
>> Would you define the sites or the forest type as id argument:
>>
>>   model1 <- geeglm(formula = number ~ type + month, family = poisson, id = site, corstr = "ar1")
>>   model2 <- geeglm(formula = number ~ type + month, family = poisson, id = type, corstr = "ar1")
>>
>> or should even almost every count have a special id (e.g. id = interaction(month, site) or
>> id = interaction(month, type))
>>
>> Thanks for your help...
>> Anna
>
> --
> Brian D. Ripley, rip...@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] Problem in writing a R data frame to Excel format using RODBC package
Hi Experts,

I was trying to write a data frame, which has a header row, from R to an Excel file on disk using the RODBC (RODBC_1.3-1) package. I ran into an issue: if I set the parameter colnames=FALSE in sqlSave(), I get the first row as a header in the Excel file. If I set colnames=TRUE, the first 2 rows of the Excel file are headers.

According to my understanding, FALSE should not write a header row to the Excel file, and TRUE should write a single header row. The data is OK; the problem is with the header row. sqlSave() is in the RODBC package.

Kindly suggest something. I need an option so that I can either save the header to the Excel file or drop the header and save only the data.

Thanks and Regards
SmartG
Re: [R] how to start R script editor by default
On 08/29/2011 08:03 PM, SNV Krishna wrote:

> Hi All,
>
> 1) Is it possible to set the options such that R opens a new script editor every time I start the R
> and 2) specify the size of windows.

Hi Krishna,
You can start an editor like this:

  system("my_editor", wait=FALSE)

where my_editor is the name of your favorite editor. Adding this line to your .First function will start that editor when you start R. Getting a particular window size depends upon whether you can specify the size on the command line. Say you're using NEdit. You could do something like this:

  cat("How many rows, Krishna? ")
  rows <- scan(n=1)
  cat("How many columns, Krishna? ")
  columns <- scan(n=1)
  system(paste("nedit -rows", rows, "-columns", columns, collapse=" "), wait=FALSE)

Jim
Re: [R] How download Yahoo Quote?
This can be of interest:

http://moderntoolmaking.blogspot.com/2011/08/25-more-ways-to-bring-data-into-r.html

On Sun, Aug 28, 2011 at 3:00 PM, R. Michael Weylandt <michael.weyla...@gmail.com> wrote:

>> I have simplified the code only to download the sp500 index.
>
> Perhaps you have, but you haven't provided any of that simplified code so I'm a little skeptical.
>
> I do have to say though, if you've managed to do it more efficiently than the 12 characters in
> getSymbols() you are a far better coder than I.
>
> Michael
>
> On Sun, Aug 28, 2011 at 12:46 PM, Yumin <zpx...@gmail.com> wrote:
>
>> Hi Michael:
>>
>> I have simplified the code only to download the sp500 index. How can I correct this simple code?
>>
>> --
>> View this message in context: http://r.789695.n4.nabble.com/How-download-Yahoo-Quote-tp3769563p3774672.html
>> Sent from the R help mailing list archive at Nabble.com.

--
Atenciosamente,
Raphael Saldanha
saldanha.plan...@gmail.com
Re: [R] Problem in writing a R data frame to Excel format using RODBC package
I recommend reading the posting guide and providing a reproducible example.

---
Jeff Newmiller <jdnew...@dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

Smart Guy <smartgu...@gmail.com> wrote:

> Hi Experts,
>
> I was trying to write a data frame which has a header row, from R to Excel disk file using RODBC
> (RODBC_1.3-1) package. I met with an issue: if in sqlSave() I set a parameter colnames=FALSE then I
> get first row as header in excel file. If 'colnames=TRUE' then it gives me first 2 rows as header
> in excel file.
>
> Actually, according to my understanding, for FALSE it should not write header row to Excel file and
> for TRUE it should write a single header row to Excel. Data is ok. Problem is with header row.
> sqlSave() is in RODBC package.
>
> Kindly, suggest something. I need an option so that whenever I want I can save header to excel file
> or else drop the header and can only save data to Excel.
>
> Thanks and Regards
> SmartG
[R] Surprising behaviour of fisher.test
I did a simple little simulation of a binary variable in a two-armed trial. I was quite surprised by the number of p-values delivered by the fisher.test function that were equal to 1(!). Of course, under the null hypothesis you expect a fair number of outcomes with the same number of events in both arms, but still? Is there some silly error in my crude code?

  ##
  niter <- 5
  ra <- rbinom(niter, 100, .05)
  rb <- rbinom(niter, 100, .05)
  pval <- rep(NA, niter)
  for (i in 1:niter){
    apa <- matrix(c(100-ra[i], ra[i], 100-rb[i], rb[i]), byrow=TRUE, ncol=2)
    pval[i] <- fisher.test(apa)$p.value
  }
  cbind(ra, rb, pval)[pval < 0.06 & pval > 0.04, ]
  hist(pval, probability=TRUE)
  summary(pval)
  table(pval < 0.05)/niter
  sum(pval == 1)/niter

Patrik Öhagen
Biostatistician
Efficacy and Safety Unit 4
Box 26, 751 03 Uppsala
Visiting address: Dag Hammarskjöldsväg 42
Telephone: 018 - 17 49 24, switchboard: 018 - 17 46 00
Fax: 018 - 54 85 66
patrik.oha...@mpa.se
www.lakemedelsverket.se
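The behaviour can be checked directly. With equal arm sizes the conditional (hypergeometric) distribution is symmetric, so the observed table is a most probable one, and the two-sided p-value is 1, whenever the event counts in the two arms differ by at most one. The sketch below is an illustration under assumptions: 2000 iterations and the seed are arbitrary choices, and tables with no events at all are assigned p = 1 without calling fisher.test(). It compares the simulated share of p-values equal to 1 with the exact probability that two Binomial(100, 0.05) counts differ by at most one.

```r
set.seed(2011)
niter <- 2000; n <- 100; p <- 0.05

ra <- rbinom(niter, n, p)
rb <- rbinom(niter, n, p)

pval <- rep(1, niter)                 # tables with no events at all keep p = 1
for (i in which(ra + rb > 0)) {
  tab <- matrix(c(n - ra[i], ra[i], n - rb[i], rb[i]), byrow = TRUE, ncol = 2)
  pval[i] <- fisher.test(tab)$p.value
}

close_arms <- abs(ra - rb) <= 1       # event counts equal or one apart
is_one     <- pval > 1 - 1e-8         # p-value equal to 1 up to rounding

# Exact probability that two independent Binomial(n, p) counts differ by <= 1
d <- dbinom(0:n, n, p)
p_close <- sum(d^2) + 2 * sum(d[-1] * d[-(n + 1)])

print(c(simulated = mean(is_one), exact = p_close))
```

So a large share of p-values equal to 1 is expected here and reflects the discreteness of the test rather than an error in the code.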
Re: [R] Generating contingency tables from the null
By the null distribution do you mean that each observation is equally likely to be assigned to any column? If so, the function sample() might serve your needs. For example:

  rows <- 3
  cols <- 4
  rowtot <- 100
  m <- matrix(NA, nrow=rows, ncol=cols)
  for(i in seq(rows)) {
    m[i, ] <- tabulate(sample(seq(cols), rowtot, replace=TRUE), nbins=cols)
  }
  m
       [,1] [,2] [,3] [,4]
  [1,]   27   24   25   24
  [2,]   19   24   26   31
  [3,]   26   26   31   17

Jean

Jim Silverton wrote on 08/29/2011 01:14:28 AM:

> Hi all,
>
> I have a 3x4 contingency table with row totals all being 100. I want to generate 3 x 4 tables from
> the null distribution. Which R function can do this?
>
> --
> Thanks,
> Jim.
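Two base R generators cover the most common readings of "the null", and the sketch below shows both. Which one applies depends on what is held fixed, which the question does not say: rmultinom() fixes only the row totals and uses equal column probabilities (an assumption, as in the reply above), while r2dtable() fixes both row and column totals. The column totals of 75 are invented for illustration.

```r
set.seed(123)
rows <- 3; cols <- 4; rowtot <- 100

# Row totals fixed, each column equally likely: one multinomial draw per row.
# rmultinom() returns one draw per column, hence the transpose.
m_rows <- t(rmultinom(rows, size = rowtot, prob = rep(1 / cols, cols)))

# Both margins fixed (independence null, as conditioned on in Fisher's test);
# the column totals here are made up and must sum to rows * rowtot
m_both <- r2dtable(1, r = rep(rowtot, rows), c = rep(75, cols))[[1]]

print(m_rows)
print(m_both)
```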
[R] gradient function in OPTIMX
Dear R users

When I use OPTIM with BFGS, I get a significant result without an error message. However, when I use OPTIMX with BFGS (or spg), I get the following error message:

  optimx(par=theta0, fn=obj.fy, gr=gr.fy, method="BFGS", control=list(maxit=1))
  Error: Gradient function might be wrong - check it!

I checked and rechecked my gradient function line by line. I could not find anything wrong. Is it a bug or something? I prefer OPTIMX, so I'd like to know why.

Thanks a lot in advance

Regards,
Kathryn Lord

--
View this message in context: http://r.789695.n4.nabble.com/gradient-function-in-OPTIMX-tp3775791p3775791.html
Sent from the R help mailing list archive at Nabble.com.
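The message suggests that the supplied gradient disagreed with a numerical approximation at the starting values. That comparison can be repeated by hand, which often shows which component is off. The sketch below uses a made-up objective `obj` and gradient `gr` as stand-ins for obj.fy and gr.fy, which were not posted; `num_grad` is a hypothetical central-difference helper, not part of optimx.

```r
# Made-up stand-ins for the poster's obj.fy and gr.fy
obj <- function(theta) sum((theta - c(1, 2))^2) + theta[1] * theta[2]
gr  <- function(theta) 2 * (theta - c(1, 2)) + rev(theta)

# Hypothetical helper: central-difference approximation to the gradient
num_grad <- function(fn, theta, h = 1e-6) {
  vapply(seq_along(theta), function(i) {
    e <- replace(numeric(length(theta)), i, h)   # step in coordinate i only
    (fn(theta + e) - fn(theta - e)) / (2 * h)
  }, numeric(1))
}

theta0 <- c(0.5, -1)
analytic  <- gr(theta0)
numerical <- num_grad(obj, theta0)
max_abs_diff <- max(abs(analytic - numerical))

print(rbind(analytic, numerical))
```

A large difference in one component usually points to a sign or indexing slip in that term; a difference in all components can mean the gradient belongs to a differently scaled or negated objective.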
Re: [R] Change color in forest.rma (metafor)
Thank you so much!!! Could you also tell me how to change the size of the chart? There is not enough space below the chart to add the arrows!

2011/8/28 Uwe Ligges:
> On 26.08.2011 15:50, Paola Tellaroli wrote:
>> I lied, that was not my last question: how can I add two arrows at the bottom with the words "in favor of A / B"? This is not specified in the pdf, and with text() I have the impression that I can't add text below the x-axis.
>
> You can, see ?par and its xpd argument.
>
> Uwe Ligges

2011/8/26 Paola Tellaroli:
> Dear Prof. Viechtbauer,
>
> thank you so much for your help and kindness. Clearly graphs are the minor problem in our work, and the parameters and options that can vary in R are so many that it is obvious that you cannot expect to change everything you want! Your suggestions are very helpful, but I have one last question. I'm trying to copy the style of a forest plot that I've seen and I like (the one in the attached file, page 1034): can I do this in R?
>
> Best wishes,
> Paola

2011/8/25 Viechtbauer Wolfgang (STAT):
> The color of the squares is also currently hard-coded. The thing is, there are so many different elements to a forest plot (squares, lines, polygons, text, axes, axis labels, etc.) that if I added arguments to set the color of each element, things would really get out of hand (as far as I am concerned, there are already too many arguments to begin with).
>
> I can think of one possibility: I could allow the col argument to accept a vector of colors and then apply the different elements of that vector to the various elements in the plot. Of course, there is also a limit to how far that can be taken. For example, what if somebody wants to have a different color for *one* of the squares and a different color for the other squares?
>
> Another possibility is to do some post-processing with other software. One can create the forest plot in R, save it for example as a postscript file, and then edit the plot in other software. Yes, I prefer it if I can create the plot in R and have it exactly the way I want it (without having to do any post-processing), but sometimes that may not be possible.
>
> Note that you can always add whatever you want to a plot created by the forest() function after it has been drawn. You can add text, lines, squares, polygons, whatever, in any color you desire (e.g., with the text(), segments(), points(), polygon() functions). So, you could also just plot over the squares with:
>
>     points(yi, 4:1, pch=15, col="red")
>
> To get rid of the black squares that are drawn by the forest function, add psize=0 as an argument in forest() (this will make the size of the squares equal to 0, so essentially they are invisible).
>
> If you want to make the size of the points inversely proportional to some function of the precision of the estimates, use points() together with the cex argument. For example:
>
>     wi    <- 1/sqrt(vi)
>     psize <- wi/sum(wi)
>     psize <- (psize - min(psize)) / (max(psize) - min(psize))
>     psize <- (psize * 1.0) + 0.5
>     points(yi, 4:1, pch=15, col="red", cex=psize)
>
> Best,
> Wolfgang

-----Original Message-----
From: Paola Tellaroli
Sent: Thursday, August 25, 2011 10:57
To: Viechtbauer Wolfgang (STAT)
Cc: Bernd Weiss
Subject: Re: [R] Change color in forest.rma (metafor)

> Thank you for your attention and help! In this way I get the diamond coloured, but actually I would like to have the squares representing the values of the individual studies coloured. Is it somehow possible?
>
> Paola

2011/8/24 Viechtbauer Wolfgang (STAT):
> Thank you, Bernd, for looking into this.
>
> Yes, at the moment, the color of the summary estimate for models without moderators is hard-coded (as black). I didn't think people might want to change that. I guess I was wrong =)
>
> A dirty solution for the moment is to add:
>
>     addpoly(dfs, efac=6, row=-1, col="red", border="red", annotate=FALSE, mlab="")
>
> after the call to forest(). You will get a warning message (since the border argument gets passed to the text() function inside addpoly() and that's not a par for text), but you can just ignore that.
>
> Best,
> --
> Wolfgang Viechtbauer
> Department of Psychiatry and Neuropsychology
> School for Mental Health and Neuroscience
> Maastricht University, P.O. Box 616
> 6200 MD Maastricht, The Netherlands
> Tel: +31 (43) 368-5248
> Fax: +31 (43) 368-8689
> Web: http://www.wvbauer.com

-----Original Message-----
From: Bernd Weiss
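The point-size rescaling quoted in this thread can be wrapped in a small helper, sketched below. The function name scale_psize and the variances in vi are made up for illustration; only the arithmetic (weights 1/sqrt(vi), rescaled to the range 0.5 to 1.5) follows Wolfgang's snippet.

```r
# Hypothetical sampling variances for four studies (illustration only)
vi <- c(0.04, 0.10, 0.02, 0.25)

# Rescale precision-based weights to a cex range of [min_cex, min_cex + range_cex]
scale_psize <- function(vi, min_cex = 0.5, range_cex = 1.0) {
  wi <- 1 / sqrt(vi)                      # weight: inverse standard error
  p  <- wi / sum(wi)                      # normalise to sum to 1
  p  <- (p - min(p)) / (max(p) - min(p))  # map to [0, 1]
  p * range_cex + min_cex                 # map to the desired cex range
}

psize <- scale_psize(vi)
psize
```

The result would then be passed as cex to points(), as in the quoted code; the most precise study (smallest vi) gets the largest square.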
[R] Consult for creating one-single row heatmap
I've tried to create a heatmap from a single row of data, but I get an error saying that the data must always have more than one row. Could you please suggest how to create a single-row heat map?

Thanks in advance for your help.

Thitipong
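One possible workaround, sketched here with base graphics only, is to skip heatmap() and draw the single row with image(), giving the cell boundaries explicitly. The helper name single_row_heatmap and the example values are hypothetical.

```r
# Draw one row of values as a strip of coloured cells.
# x and y are given as cell boundaries (one more than the number of cells).
single_row_heatmap <- function(values, labels = seq_along(values)) {
  n <- length(values)
  z <- matrix(values, nrow = n, ncol = 1)  # image() puts rows of z along x
  image(x = seq(0.5, n + 0.5, by = 1), y = c(0, 1), z = z,
        col = heat.colors(12), axes = FALSE, xlab = "", ylab = "")
  axis(1, at = seq_len(n), labels = labels)
  box()
  invisible(z)
}
```

For example, single_row_heatmap(c(3, 1, 4, 1, 5)) draws five cells in one row. Another option is to duplicate the row, e.g. heatmap(rbind(v, v), Rowv = NA, Colv = NA), if the heatmap() layout is wanted.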
Re: [R] to represent color range on plot segment
Dear Jim,

Thank you very much for your code. There is no problem with

    df[df[,2]>0,3] <- color.scale(df[df[,2]>0,2],c(1,0),1,c(0,1))

but the other call gives an error message if there is a negative value:

    df[df[,2]<0,3] <- color.scale(df[df[,2]<0,2],1,c(1,0),c(1,0))
    Error in rgb(reds, greens, blues) : color intensity -0.157746, not in [0,1]

Kindly let me know your comments.

Regards,
karthick
[R] Differences in SAS and R defaults
Hello all,

I am looking for theories and statistical analyses where the defaults employed in R and SAS are different, so that the outputs under the defaults differ (at least slightly) for the same input. Could anyone kindly point out any such instance?

Thanks,
Nikhil
[R] Lm gives different results depending on x-bit architecture
Dear all,

I have encountered a problem when developing an application. My linear regression gives different results depending on the architecture. The following example describes my problem:

    xxx <- data.frame(a=c(0.2,0.2,0.2,0.2,0.2), b=c(7,8,9,10,11))
    lm(a~b, xxx)
    summary(lm(a~b, xxx))$r.squared

returns on 32-bit R:

    Call:
    lm(formula = a ~ b, data = xxx)

    Coefficients:
    (Intercept)            b
      2.000e-01   -1.503e-18

    summary(lm(a~b,xxx))$r.squared
    [1] 0

and on 64-bit R:

    Call:
    lm(formula = a ~ b, data = xxx)

    Coefficients:
    (Intercept)            b
        2.0e-01            0

    summary(lm(a~b,xxx))$r.squared
    [1] NA

It is easy to see that the slope should be 0 in the above case. I also understand that this is related to the precision of 32- and 64-bit builds and depends on how those are written internally, but maybe someone has had this problem and found a solution.

With regards,
Adam.
[R] Error: Gradient function might be wrong ----- in OPTIMX
Dear R users,

When I use OPTIMX with BFGS, I get the following error message:

    optimx(par=theta0, fn=obj.fy, gr=gr.fy, method="BFGS")
    Error: Gradient function might be wrong - check it!

So I checked and checked my gradient function line by line. However, I could not find anything wrong. When I remove the gradient, I get:

    optimx(par=theta0, fn=obj.fy, method="BFGS")
      par                                                        fvalues     method fns grs itns conv KKT1 KKT2  xtimes
    1 0.4423958, 0.9665069, 0.7920856, 1.1952092, 0.3083377     -0.01733672 BFGS   35  22  NULL 0    TRUE FALSE 76.02

where the true theta is (0.5, 1.0, 0.8, 1.2, 0.6). However, I got better results when I tried OPTIM with the gradient:

    optim(par=theta0, fn=obj.fy, gr=gr.fy, method="BFGS")
    $par
    [1] 0.5004394 0.669 0.8035140 1.1996053 0.5989842
    $value
    [1] -0.01717598
    $counts
    function gradient 548
    $convergence
    [1] 0
    $message
    NULL

Of course, I tried several different data sets and received similar results. If the gradient function is really wrong, why are the results of OPTIM with the gradient better? Weird, isn't it? As far as I know, OPTIMX has better gradient computation. Would you please explain why these results happened?

Regards,
Kathryn Lord
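A message like the one above usually means that the supplied gradient disagrees with a numerical approximation at the starting values. The sketch below shows one way to run such a comparison by hand in base R with central differences. It does not reproduce optimx's internal check, whose exact tolerance is not stated in this thread; fn, gr_ok, gr_bad and theta0 are toy stand-ins for obj.fy, gr.fy and the poster's starting values.

```r
# Central-difference approximation to the gradient of fn at par
num_grad <- function(fn, par, h = 1e-6) {
  vapply(seq_along(par), function(i) {
    step <- replace(numeric(length(par)), i, h)
    (fn(par + step) - fn(par - step)) / (2 * h)
  }, numeric(1))
}

# Toy objective with a known minimum at (0.5, 1)  (illustration only)
fn     <- function(p) sum((p - c(0.5, 1))^2)
gr_ok  <- function(p) 2 * (p - c(0.5, 1))  # correct analytic gradient
gr_bad <- function(p) 2 * p                # deliberately wrong gradient

theta0 <- c(0.2, 0.3)
err_ok  <- max(abs(gr_ok(theta0)  - num_grad(fn, theta0)))
err_bad <- max(abs(gr_bad(theta0) - num_grad(fn, theta0)))
c(err_ok = err_ok, err_bad = err_bad)
```

If the discrepancy for the real gradient is large only at theta0 (for instance because the objective is badly scaled or not smooth there), optim() may still converge while a strict pre-check rejects the gradient, which could explain the behaviour described above.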
Re: [R] Change color in forest.rma (metafor)
See ?par and its mar argument.

> Could you tell me also how to change the size of the chart? There is not enough space below the chart to add the arrows!
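As a rough illustration of the two par() settings mentioned in this thread (mar for a larger bottom margin, xpd to allow drawing outside the plot region), here is a sketch using a plain scatterplot rather than a metafor forest plot. The function name, the coordinates and the label texts are placeholders.

```r
# Draw a plot with an enlarged bottom margin and annotate below the x-axis.
draw_with_footer <- function() {
  old <- par(mar = c(7, 4, 2, 2),  # bottom, left, top, right margins in lines
             xpd = NA)             # allow drawing in the margins
  on.exit(par(old))
  plot(-2:2, 1:5, xlab = "Effect", ylab = "Study")
  usr <- par("usr")
  y_below <- usr[3] - 0.25 * (usr[4] - usr[3])  # a position below the axis
  arrows(-0.2, y_below, -1.8, y_below, length = 0.1)
  arrows(0.2, y_below, 1.8, y_below, length = 0.1)
  text(-1, y_below, "in favor of A", pos = 1)
  text(1, y_below, "in favor of B", pos = 1)
  invisible(par("mar"))
}
```

Calling draw_with_footer() on an open graphics device draws the two arrows and labels in the bottom margin; the same par() call placed before forest() should leave room for them there as well.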
Re: [R] Problem in writing a R data frame to Excel format using RODBC package
Hi All,

Here is a short description of my problem.

    mydata   ### my data.frame
    age height weight
     12     97     30
     14     95     32
     17    120     50

I used the following call from the RODBC package, ver. 1.3-1, to save it as an Excel file:

    sqlSave(channel, mydata, tablename="Sheet1", colnames = TRUE)

I got two header rows in the Excel file:

    age height weight
    age height weight
     12     97     30
     14     95     32
     17    120     50

I need one header row if colnames = TRUE and no header row if colnames = FALSE. I would expect it to work like this. If anyone has come across the same issue, kindly help me.

Thanks,
SmartG

On 29 August 2011 16:14, Smart Guy wrote:
> Hi Experts,
>
> I was trying to write a data frame which has a header row from R to an Excel file on disk using the RODBC (RODBC_1.3-1) package. I ran into an issue: if in sqlSave() I set colnames=FALSE, I get the first row as a header in the Excel file. If colnames=TRUE, it gives me the first 2 rows as header in the Excel file. According to my understanding, for FALSE it should not write a header row to the Excel file, and for TRUE it should write a single header row. The data is OK; the problem is with the header row. sqlSave() is in the RODBC package.
>
> Kindly suggest something. I need an option so that I can either save the header to the Excel file or drop the header and save only the data.
>
> Thanks and Regards,
> SmartG
Re: [R] Problem in writing a R data frame to Excel format using RODBC package
On 29/08/2011 8:54 AM, Smart Guy wrote:
> Hi All,
>
> Here is a short description of my problem.
>
>     mydata   ### my data.frame
>     age height weight
>      12     97     30
>      14     95     32
>      17    120     50
>
> I used the following call from the RODBC package, ver. 1.3-1, to save it as an Excel file:
>
>     sqlSave(channel, mydata, tablename="Sheet1", colnames = TRUE)
>
> I got two header rows in the Excel file:
>
>     age height weight
>     age height weight
>      12     97     30
>      14     95     32
>      17    120     50
>
> I need one header row if colnames = TRUE and no header row if colnames = FALSE. I would expect it to work like this.

As the help page says, colnames=TRUE adds the column names as the first row of data. They also appear as column names. So you see them twice.

Complain to Microsoft (or Dan Bricklin) if you don't like the fact that you can't distinguish between column names and data in a spreadsheet.

Duncan Murdoch

> If anyone has come across the same issue, kindly help me.
>
> Thanks,
> SmartG
>
> On 29 August 2011 16:14, Smart Guy wrote:
>> Hi Experts,
>>
>> I was trying to write a data frame which has a header row from R to an Excel file on disk using the RODBC (RODBC_1.3-1) package. I ran into an issue: if in sqlSave() I set colnames=FALSE, I get the first row as a header in the Excel file. If colnames=TRUE, it gives me the first 2 rows as header in the Excel file. According to my understanding, for FALSE it should not write a header row to the Excel file, and for TRUE it should write a single header row. The data is OK; the problem is with the header row. sqlSave() is in the RODBC package.
>>
>> Kindly suggest something. I need an option so that I can either save the header to the Excel file or drop the header and save only the data.
>>
>> Thanks and Regards,
>> SmartG
Re: [R] Function rank() for data frames (or multiple vectors)?
Hi!

On 08/24/2011 07:46 PM, David Winsemius wrote:
>> I was looking for an elegant solution ;) In the real case I have double values and this would be quite inefficient then.
>
> Still no r-code: Then what about rank(order(...) , further-ties.method-argument) ?
>
>> I think that, as order() always gives a different value for each element, rank(order()) would return the same result as order() alone.
>
> Quite right. I didn't test it since there was no example provided. Do you not understand what is meant by a reproducible example.

Sorry, I thought I gave an example in my response to your response. I didn't know that you wanted an R example (which I didn't have at that time).

> Pretty much every solution I come up with leaves me (re-) asking the question: What's wrong with rank(paste(...))?

As said, this is rather inefficient and moreover doesn't work for floats, for which the lexical order of the string representation doesn't match the natural order (e.g., "3e-10" is lexically smaller than "1e-13", while 3e-10 is larger than 1e-13).

> Here's another possibility:
>
>     rr <- data.frame(a = c(1,1,1,1,2), b = c(1,2,2,3,1))
>     ave(order(rr$a, rr$b), rr$a, rr$b)
>     [1] 1.0 2.5 2.5 4.0 5.0

Actually, this may be the solution I was looking for! Note that it assumes rr to be sorted already (hence the first argument of ave could simply be 1:nrow(rr)). Also, by using FUN=min or FUN=max I can cover the other cases.

Thanks for this!

Bye,
Sebastian
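Since the ave(order(...)) idea assumes the rows are already sorted, here is a sketch of a variant that should also work for unsorted data and for double-valued columns. The function name rank_rows is hypothetical; the approach (sort, find runs of identical rows, average the positions within each run, then undo the sort) is one possible generalisation, not code from this thread.

```r
# Rank the rows of a data frame lexicographically by its columns.
# ties: function applied to the positions of tied rows (mean, min, max, ...)
rank_rows <- function(df, ties = mean) {
  n <- nrow(df)
  o <- do.call(order, unname(as.list(df)))  # sort by column 1, then 2, ...
  sorted <- df[o, , drop = FALSE]
  # TRUE where a sorted row differs from the previous one in any column
  changed <- Reduce(`|`, lapply(sorted, function(col) col[-1] != col[-n]))
  grp <- cumsum(c(TRUE, changed))           # run id of identical rows
  r_sorted <- ave(as.numeric(seq_len(n)), grp, FUN = ties)
  r <- numeric(n)
  r[o] <- r_sorted                          # back to the original row order
  r
}

rr <- data.frame(a = c(1, 1, 1, 1, 2), b = c(1, 2, 2, 3, 1))
rank_rows(rr)                         # sorted input, as in the thread
rank_rows(rr[c(5, 3, 1, 2, 4), ])     # same rows in a different order
rank_rows(rr, ties = min)
```

Rows are compared with !=, so ties are exact equality of the stored doubles; no string conversion is involved.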
[R] Basic question about re-writing for loop as a function
Hello,

Sorry to ask a basic question, but I've spent many hours on this now and seem to be missing something. I have a loop that looks like this:

    mainmat <- data.frame(matrix(data=0, ncol=92, nrow=length(predata$Words_MH)))
    for (i in 1:length(predata$Words_MH)) {
      for (j in 1:92) {
        mainmat[i,j] <- ifelse(j %in% as.numeric(unlist(strsplit(predata$Words_MH[i], split=","))), 1, 0)
      }
    }

What it's doing is creating a matrix with 92 columns (that's the number of different codes), and then for every row of my data it looks to see whether the code (code 1, code 2, etc.) is in the string; if it is, it puts a 1 in the relevant column (column 1 for code 1, column 2 for code 2, etc.).

There are 1000 rows in the database, and I have to run several versions of this code, so it just takes way too long. I have been trying to rewrite it using lapply. I tried this:

    myfunction <- function(x, y) ifelse(x %in% as.numeric(unlist(strsplit(predata$Words_MH[y], split=","))), 1, 0)
    for (j in 1:92) {
      mainmat[,j] <- lapply(predata$Words, myfunction)
    }

but I don't think I can use something that takes two inputs, and I can't seem to remove either. Here's a dput of the first 10 rows of the variable in case that's helpful:

    predata$Words <- c("1", "1", "1", "1", "2,3,4", "5", "1", "1", "6", "7,8,9,10")

Given these data, I want the function to return, for the first column, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0 (because those are the values of Words which contain a 1) and for the second column 0, 0, 0, 0, 1, 0, 0, 0, 0, 0 (because the fifth value is the only one that contains a 2).

Any suggestions gratefully received!

Chris Beeley
Institute of Mental Health, UK
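One way to avoid the double loop, sketched below, is to split each string once and build a whole row of indicators at a time with vapply(). The function name codes_to_indicator is hypothetical, and the sketch assumes the codes are comma-separated integers between 1 and the number of codes, as in the sample data above.

```r
# Turn strings such as "2,3,4" into rows of 0/1 indicators, one column per code.
codes_to_indicator <- function(words, n_codes) {
  parts <- strsplit(words, ",", fixed = TRUE)  # split each string only once
  ind <- vapply(parts,
                function(p) as.integer(seq_len(n_codes) %in% as.integer(p)),
                integer(n_codes))              # n_codes x length(words)
  t(ind)                                       # one row per input string
}

words <- c("1", "1", "1", "1", "2,3,4", "5", "1", "1", "6", "7,8,9,10")
mainmat <- codes_to_indicator(words, n_codes = 92)
mainmat[, 1]
mainmat[, 2]
```

The result is an integer matrix; wrap it in as.data.frame() if a data frame is required, as in the original loop.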
Re: [R] Asking Favor For Remove element with Particular Value In Vector
Jim et al.:

This is the second time I've seen this advice recently. Use logical indexing; which(), though not wrong, is superfluous:

    x[!x %in% c(0, 255)]

will do, rather than:

> If you want to remove the specific values 0 and 255 from your vector, try:
>
>     x <- x[-which(x %in% c(0,255))]
>
> Jim

-- Bert

--
"Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions."
-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics
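A small example may help show why logical indexing is also the safer habit, beyond being shorter: when none of the values is present, which() returns an empty index, and negative indexing with an empty index drops everything. The vectors below are made up for illustration.

```r
x <- c(0, 12, 255, 40)
y <- c(7, 12, 40)        # contains neither 0 nor 255

# Both forms agree when there is something to remove
a1 <- x[-which(x %in% c(0, 255))]
a2 <- x[!x %in% c(0, 255)]

# They differ when there is nothing to remove
b1 <- y[-which(y %in% c(0, 255))]   # empty vector: -integer(0) selects nothing
b2 <- y[!y %in% c(0, 255)]          # y unchanged

list(a1 = a1, a2 = a2, b1 = b1, b2 = b2)
```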
Re: [R] How do I get a weighted frequency table?
If you are talking about weights that are the frequencies in each cell, you can use xtabs():

    df <- data.frame(Var1=c("Absent", "Present", "Absent", "Present"),
                     Var2=c("Absent", "Absent", "Present", "Present"),
                     Freq=c(17, 6, 3, 12))
    df
    xtabs(Freq~Var1+Var2, data=df)

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Leandro Marino
Sent: Sunday, August 28, 2011 12:15 PM
To: Luca Meyer
Cc: r-help@r-project.org
Subject: Re: [R] How do I get a weighted frequency table?

> Luca,
>
> you may use the survey package. You have to declare the design with the design function, and then you can use the svytotal, svyby, and svymean functions to do your tabulations.
>
> Regards,
> Leandro Marino
> http://www.leandromarino.com.br (Fotógrafo)
> http://est.leandromarino.com.br/Blog (Estatístico)
> Cel.: + 55 21 9845-7707
> Cel.: + 55 21 8777-7907

2011/8/28 Luca Meyer lucam1...@gmail.com:
> Hello,
>
> I have to run a set of crosstabulations to which I need to apply some weights. I am currently doing an unweighted version of such crosstabs using table(x,y). In SPSS I am used to creating a weighting variable and using WEIGHT BY VAR before running the CTABLES; is there a similar procedure in R?
>
> Thanks,
> Luca
>
> Mr. Luca Meyer
> www.lucameyer.com
> R version 2.13.1 (2011-07-08)
> Mac OS X 10.6.8
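The xtabs() call above also covers case-level weights, which is closer to SPSS's WEIGHT BY: the left-hand side of the formula is simply summed within each cell. A small sketch, where the second data frame (cases, with weight column w) is invented for illustration:

```r
# Cell frequencies, as in the message above
df <- data.frame(Var1 = c("Absent", "Present", "Absent", "Present"),
                 Var2 = c("Absent", "Absent", "Present", "Present"),
                 Freq = c(17, 6, 3, 12))
tab <- xtabs(Freq ~ Var1 + Var2, data = df)

# Case-level weights (hypothetical data): one row per respondent
cases <- data.frame(x = c("a", "a", "b", "b", "b"),
                    y = c("u", "v", "u", "u", "v"),
                    w = c(1.5, 0.5, 2, 1, 1))
wtab <- xtabs(w ~ x + y, data = cases)  # sums the weights within each cell

tab
wtab
```

This gives weighted counts only; for standard errors that respect a sampling design, the survey package mentioned in the quoted reply is the more appropriate tool.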
Re: [R] R.oo data members / inheritance
Henrik,

Your last suggestion did not work for me. It seems like it does not allow me to create a ClassB object with 3 arguments:

    setConstructorS3("ClassA", function(A=15, x=NA) {
      extend(Object(), "ClassA",
        .size = A,
        .x = x
      )
    })
    setConstructorS3("ClassB", function(..., bData=NA) {
      extend(ClassA(...), "ClassB",
        .bData = bData
      )
    })
    b = ClassB(1,2,3)
    Error in ClassA(...) : unused argument(s) (3)

I got around it using your 'specific' suggestion:

    setConstructorS3("ClassA", function(A=15, x=NA) {
      extend(Object(), "ClassA",
        .size = A,
        .x = x
      )
    })
    setConstructorS3("ClassB", function(..., bData=NA) {
      extend(ClassA(A=15, x=NA), "ClassB",
        .bData = bData
      )
    })
    b = ClassB(1,2,3)
Re: [R] Surprising behaviour of fisher.test
On 29-Aug-11 11:44:28, Öhagen Patrik wrote:
> I did a simple little simulation of a binary variable in a two-armed trial. I was quite surprised by the number of p-values delivered by the fisher.test function which was 1(!). Of course, under the null hypothesis you expect a fair number of outcomes with the same number of events in both arms, but still? Is there some silly error in my crude code?
>
>     niter <- 50000
>     ra <- rbinom(niter, 100, .05)
>     rb <- rbinom(niter, 100, .05)
>     pval <- rep(NA, niter)
>     for (i in 1:niter) {
>       apa <- matrix(c(100 - ra[i], ra[i], 100 - rb[i], rb[i]), byrow = TRUE, ncol = 2)
>       pval[i] <- fisher.test(apa)$p.value
>     }
>     cbind(ra, rb, pval)[pval < 0.06 & pval > 0.04, ]
>     hist(pval, probability = TRUE)
>     summary(pval)
>     table(pval < 0.05)/niter
>     sum(pval > 1)/niter
>
> Patrik Öhagen

After reading your posting, and being puzzled by "I was quite surprised by the number of p-values delivered by the fisher.test function which was 1(!)", I ran your code.

In each of three runs I got sum(pval > 1) = 0. It would indeed be surprising to get any pval > 1, since the Fisher P-value is the sum of a subset of the probabilities possible for the table. This subset may sometimes be all of them, but even so (unless there was an unusual rounding error) one should not see pval > 1.

There are certainly many P-values equal to 1. On my third run (of 50000 iterations) I get

    sum(pval==1)
    # [1] 11520
    sum(pval==1)/niter
    # [1] 0.2304

The probability of pval = 1 in an iteration is *at least* the probability (in your setup of the tables) that ra[i] = rb[i], which is the probability that two independent binomial samples of size 100 with p = 0.05 give the same result:

    sum((dbinom((0:100),100,0.05))^2)

which = 0.1307316.

I say *at least* because, depending on the configuration of the random 2x2 table, there are other possibilities for the Fisher P-value to equal 1. So the large number of P-values equal to 1 (as can be clearly seen from the histogram) is not a surprise.

I am, therefore, wondering whether you really observed any pval > 1. Did you confuse pval == 1 with pval > 1 in your posting? If you really did get any pval > 1, how many did you get?

Ted.

E-Mail: (Ted Harding) ted.hard...@wlandres.net
Fax-to-email: +44 (0)870 094 0861
Date: 29-Aug-11 Time: 15:47:12
-- XFMail --
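The lower bound quoted above can be reproduced directly. The sketch below wraps Ted's expression in a small function (the name p_equal_counts is made up) so that the arm size and event probability are easy to vary.

```r
# Probability that two independent Binomial(n, p) counts are equal:
# sum over k of P(X = k)^2
p_equal_counts <- function(n, p) {
  sum(dbinom(0:n, n, p)^2)
}

lower_bound <- p_equal_counts(100, 0.05)
lower_bound
```

For n = 100 and p = 0.05 this gives the 0.1307316 stated in the message, which is consistent with the observed share of 0.2304, since tables with unequal counts can also yield a P-value of 1.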
Re: [R] Change color in forest.rma (metafor)
On 29.08.2011 13:11, Paola Tellaroli wrote:
> Thank you so much!!! Could you tell me also how to change the size of the chart? There is not enough space below the chart to add the arrows!

Please read the whole help page ?par. You will find a way to increase the size of the margins (using the argument mar) and many other useful things.

Uwe Ligges

> 2011/8/28 Uwe Ligges:
>> On 26.08.2011 15:50, Paola Tellaroli wrote:
>>> I lied, that was not my last question: how can I add two arrows at the bottom with the words "in favor of A / B"? This is not specified in the pdf, and with text() I have the impression that I can't add text below the x-axis.
>>
>> You can, see ?par and its xpd argument.
>>
>> Uwe Ligges
>
> 2011/8/26 Paola Tellaroli:
>> Dear Prof. Viechtbauer,
>>
>> thank you so much for your help and kindness. Clearly graphs are the minor problem in our work, and the parameters and options that can vary in R are so many that it is obvious that you cannot expect to change everything you want! Your suggestions are very helpful, but I have one last question. I'm trying to copy the style of a forest plot that I've seen and I like (the one in the attached file, page 1034): can I do this in R?
>>
>> Best wishes,
>> Paola
>
> 2011/8/25 Viechtbauer Wolfgang (STAT):
>> The color of the squares is also currently hard-coded. The thing is, there are so many different elements to a forest plot (squares, lines, polygons, text, axes, axis labels, etc.) that if I added arguments to set the color of each element, things would really get out of hand (as far as I am concerned, there are already too many arguments to begin with).
>>
>> I can think of one possibility: I could allow the col argument to accept a vector of colors and then apply the different elements of that vector to the various elements in the plot.
Re: [R] Lm gives different results depending on x-bit architecture
On 29.08.2011 13:54, AdamMarczak wrote:
> Dear all,
>
> I have encountered a problem when developing an application. My linear regression gives different results depending on the architecture. The following example describes my problem:
>
>     xxx <- data.frame(a=c(0.2,0.2,0.2,0.2,0.2), b=c(7,8,9,10,11))
>     lm(a~b, xxx)
>     summary(lm(a~b, xxx))$r.squared
>
> returns on 32-bit R:
>
>     Call:
>     lm(formula = a ~ b, data = xxx)
>
>     Coefficients:
>     (Intercept)            b
>       2.000e-01   -1.503e-18
>
>     summary(lm(a~b,xxx))$r.squared
>     [1] 0
>
> and on 64-bit R:
>
>     Call:
>     lm(formula = a ~ b, data = xxx)
>
>     Coefficients:
>     (Intercept)            b
>         2.0e-01            0
>
>     summary(lm(a~b,xxx))$r.squared
>     [1] NA
>
> It is easy to see that the slope should be 0 in the above case. I also understand that this is related to the precision of 32- and 64-bit builds

Not necessarily, since the same precision is used by R. It may even be the result of using different compilers (or just compiler versions) for producing the 32-bit and the 64-bit version.

> and depends on how those are written internally, but maybe someone has had this problem and found a solution.

If you want to see if b is numerically equal to zero, use all.equal().

Uwe Ligges

> With regards,
> Adam.
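A short sketch of the all.equal() suggestion, using the data from this thread. It treats a slope such as -1.503e-18 and an exact 0 alike, because all.equal() compares up to a tolerance (about 1.5e-8 by default) rather than bit for bit.

```r
xxx <- data.frame(a = c(0.2, 0.2, 0.2, 0.2, 0.2), b = c(7, 8, 9, 10, 11))
fit <- lm(a ~ b, data = xxx)

slope <- unname(coef(fit)["b"])

# all.equal() returns TRUE or a character description of the difference,
# so wrap it in isTRUE() before using it in a condition
slope_is_zero <- isTRUE(all.equal(slope, 0))
slope_is_zero
```

An application could use such a check to treat the regression as flat on either architecture and skip quantities like R-squared, which are not meaningful when the response is constant.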
Re: [R] R.oo data members / inheritance
Correction: my solution didn't work either. It didn't return the correct values. Can you post an example that takes three arguments? I'm working on how to do this now.

Thanks, and sorry. I'm new to R and R.oo.

Ben

On Mon, Aug 29, 2011 at 8:35 AM, Ben qant ccqu...@gmail.com wrote:
> Henrik,
>
> Your last suggestion did not work for me. It seems like it does not allow me to create a ClassB object with 3 arguments:
>
>     setConstructorS3("ClassA", function(A=15, x=NA) {
>       extend(Object(), "ClassA",
>         .size = A,
>         .x = x
>       )
>     })
>     setConstructorS3("ClassB", function(..., bData=NA) {
>       extend(ClassA(...), "ClassB",
>         .bData = bData
>       )
>     })
>     b = ClassB(1,2,3)
>     Error in ClassA(...) : unused argument(s) (3)
>
> I got around it using your 'specific' suggestion:
>
>     setConstructorS3("ClassA", function(A=15, x=NA) {
>       extend(Object(), "ClassA",
>         .size = A,
>         .x = x
>       )
>     })
>     setConstructorS3("ClassB", function(..., bData=NA) {
>       extend(ClassA(A=15, x=NA), "ClassB",
>         .bData = bData
>       )
>     })
>     b = ClassB(1,2,3)
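The "unused argument(s) (3)" error is likely a consequence of ordinary R argument matching rather than of R.oo itself: a formal that comes after ... can only be matched by name, so in ClassB(1,2,3) all three values go into ... and are forwarded to ClassA(), which accepts only two. The plain-R sketch below (no R.oo; make_b is a made-up stand-in for the ClassB constructor) illustrates this.

```r
# Stand-in for a constructor with the signature function(..., bData = NA)
make_b <- function(..., bData = NA) {
  list(parent_args = list(...),  # what would be forwarded to the parent
       bData = bData)
}

positional <- make_b(1, 2, 3)          # all three end up in ...
named      <- make_b(1, 2, bData = 3)  # bData must be given by name

length(positional$parent_args)
positional$bData
length(named$parent_args)
named$bData
```

Under that assumption, calling the constructor from the thread as ClassB(1, 2, bData=3) would pass two arguments to ClassA(...) and set .bData to 3; this has not been run against R.oo here.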
[R] maximum number of subdivisions reached
Why am I getting

    Error in integrate(f, x1, x1 + dx) : maximum number of subdivisions reached

and can I avoid this?

    func <- function(y, a, rate, sad){
      f3 <- function(z){
        f1 <- function(y,a,n){ dpois(y,a*n) }
        f2 <- function(n,rate){ dexp(n,rate) }
        f <- function(n){ f1(y,a,n)*f2(n,rate) }
        r <- 0
        r1 <- 1
        x1 <- 0
        dx <- 20
        while(r1 > 10e-500){
          r1 <- integrate(f,x1,x1+dx)$value
          r <- r + r1
          x1 <- x1 + dx
        }
        r + integrate(f,x1,Inf)$valu
      }
      sapply(y,f3)
    }
    func(200,0.1,0.1,sad="Exp")

Thanks in advance.
Re: [R] maximum number of subdivisions reached
Can't help, code runs fine on my machine once you change valu to value. Are you sure it fails in a vanilla run of R and isn't caused by any other choices you have made along the way?

Michael

PS -- Here's the code

    func <- function(y, a, rate, sad){
      f3 <- function(z){
        f1 <- function(y,a,n){ dpois(y,a*n) }
        f2 <- function(n,rate){ dexp(n,rate) }
        f <- function(n){ f1(y,a,n)*f2(n,rate) }
        r <- 0
        r1 <- 1
        x1 <- 0
        dx <- 20
        while(r1 > 10e-500){
          r1 <- integrate(f,x1,x1+dx)$value
          r <- r + r1
          x1 <- x1 + dx
        }
        r + integrate(f,x1,Inf)$value
      }
      sapply(y,f3)
    }
    V = func(200,0.1,0.1,sad="Exp")

On Mon, Aug 29, 2011 at 11:16 AM, . . xkzi...@gmail.com wrote:
> Why am I getting
>
>     Error in integrate(f, x1, x1 + dx) : maximum number of subdivisions reached
>
> and can I avoid this?
>
>     func <- function(y, a, rate, sad){
>       f3 <- function(z){
>         f1 <- function(y,a,n){ dpois(y,a*n) }
>         f2 <- function(n,rate){ dexp(n,rate) }
>         f <- function(n){ f1(y,a,n)*f2(n,rate) }
>         r <- 0
>         r1 <- 1
>         x1 <- 0
>         dx <- 20
>         while(r1 > 10e-500){
>           r1 <- integrate(f,x1,x1+dx)$value
>           r <- r + r1
>           x1 <- x1 + dx
>         }
>         r + integrate(f,x1,Inf)$valu
>       }
>       sapply(y,f3)
>     }
>     func(200,0.1,0.1,sad="Exp")
>
> Thanks in advance.
[R] Bayesian functions for mle2 object
Hi everybody,

I'm interested in evaluating the effect of a continuous variable on the mean and/or the variance of my response variable. I have written functions that make these effects explicit and used the 'mle2' function to estimate the coefficients, as follows:

    func.1 <- function(m=62.9, c0=8.84, c1=-1.6) {
      s <- c0+c1*(x)
      -sum(dnorm(y, mean=m, sd=s, log=T))
    }
    m1 <- mle2(func.1, method="SANN")

However, estimating the effect of x on the variance of y has usually caused trouble, resulting either in no convergence or in extremely large standard deviations of the estimates. I tried different optimizers, but I still faced the same problems.

When I had similar trouble in the 'GLMM' statistical universe, I used Bayesian functions to solve it, enjoying the flexibility of different starting points to reach the maximum likelihood estimates. However, I have no idea which package or which function to use for the specific problem I'm facing now.

Does anyone have a clue?

Thanks in advance

Gustavo Requena
PhD Student - Laboratory of Arthropod Behavior and Evolution
Universidade de Sao Paulo - Brazil

--
View this message in context: http://r.789695.n4.nabble.com/Bayesian-functions-for-mle2-object-tp3776442p3776442.html
Re: [R] Asking Favor For Remove element with Particular Value In Vector
Thank you, friend, for the suggestion.

--
View this message in context: http://r.789695.n4.nabble.com/Asking-Favor-For-Remove-element-with-Particular-Value-In-Vector-tp3772779p3776432.html
Re: [R] Help with levelplot color assignment in lattice
Hi,

Thank you Duncan, you showed me how to assign a specific color to NA values in the levelplot. However, I'm still not satisfied with the result of the code you provided. In the data frame I provided in the first post, there's one plant with level=0 (at x=8, y=1), and many other plants have level=1. In the resulting levelplot, levels 0 and 1 both get the same color (white), whereas I expected level=0 to get white and level=1 to get the next color. Also, in your code, one color is missing: there should be 10 colors for levels 0 to 9, plus black for the NA=10 value. But even after adding an extra color, I can't get the right result.

In fact, it looks like the problem is in the way levelplot assigns colors: they seem to be assigned between level values, instead of being centered on these values. There is probably a way of changing that, but so far I've had no success. I don't think I really understand how levelplot works.

-Sebastien

--
View this message in context: http://r.789695.n4.nabble.com/Help-with-levelplot-color-assignment-in-lattice-tp3774374p3776131.html
[R] Legend / bar order - ggplot2
Hi all,

I am trying to do a barplot in ggplot2 and want to make sure that the legend order is consistent with the bar order, that is, the legend order is orig and then match, and the bars are ordered in the same way. It seems to me that I can only control one of them. Any idea?

    library(ggplot2)
    df <- data.frame(value = rnorm(20),
                     name = factor(rep(letters[1:10], 2), levels = letters[1:10]),
                     type = factor(c(rep("orig", 10), rep("match", 10)),
                                   levels = c("orig", "match")))
    ggplot(df, aes(x = name, y = value, fill = type)) +
      geom_bar(position = position_dodge()) +
      coord_flip()

Thank you very much,
YL
Re: [R] Asking Favor For Remove element with Particular Value In Vector
Thank you very much, friend.

--
View this message in context: http://r.789695.n4.nabble.com/Asking-Favor-For-Remove-element-with-Particular-Value-In-Vector-tp3772779p3776427.html
Re: [R] all combinations of the elements of two vectors
Petr, Jorge, Daniel,

Yes, you could also use outer() instead of expand.grid(). This is quite useful to know. Also, I didn't know you could turn a matrix into a vector by setting its dimensions to NULL like that. I always used as.vector( m ). And (as I've just discovered) you can use it to reconfigure the matrix's shape to any that contains the same number of elements.

Thanks very much one and all.

Regards
Desmond

-----Original Message-----
From: Petr PIKAL [mailto:petr.pi...@precheza.cz]
Sent: 29 August 2011 07:24
To: Campbell, Desmond
Cc: r-help@R-project.org
Subject: Odp: [R] all combinations of the elements of two vectors

Hi

> Dear R-help readers,
>
> I'm sure this problem has been answered but I can't find the solution.
> I have two vectors
>
>     v1 <- c("a","b")
>     v2 <- c(1,2,3)
>
> I want an easy way to produce every possible combination of v1, v2
> elements, i.e. I want to produce
>
>     c("a1","a2","a3", "b1","b2","b3")

Another option is

    z <- outer(x, y, paste, sep="")
    dim(z) <- NULL
    z
    [1] "a1" "b1" "c1" "a2" "b2" "c2" "a3" "b3" "c3"

which gives the result in a different order, or

    z <- as.vector(t(z))
    z
    [1] "a1" "a2" "a3" "b1" "b2" "b3" "c1" "c2" "c3"

which gives you the desired order.

Regards
Petr

> regards
> Desmond
>
> Desmond Campbell
> Dept of Biostatistics and Computing, Institute of Psychiatry (KCL),
> PO Box 20, De Crespigny Park, Denmark Hill, London, SE5 8AF
> Tel 020 7848 0309
> Email d.campb...@iop.kcl.ac.uk
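The two approaches discussed in this thread can be compared side by side on the vectors from the original question. This is a small sketch; the names by_column, by_row and via_grid are introduced here for illustration only.

```r
v1 <- c("a", "b")
v2 <- c(1, 2, 3)

# outer() builds a 2 x 3 character matrix: rows follow v1, columns follow v2.
z <- outer(v1, v2, paste, sep = "")

# Dropping the dimensions reads the matrix column by column.
by_column <- as.vector(z)

# Transposing first reads it row by row, which is the order asked for.
by_row <- as.vector(t(z))

# expand.grid() varies its first argument fastest, so v2 goes first here.
g <- expand.grid(v2 = v2, v1 = v1)
via_grid <- paste(g$v1, g$v2, sep = "")

by_column
by_row
via_grid
```

Both by_row and via_grid give "a1" "a2" "a3" "b1" "b2" "b3"; by_column interleaves the letters instead.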
Re: [R] Help with levelplot color assignment in lattice
In fact, by fiddling with the at and colorkey options, I was able to get the result I expected. Now the colors are assigned correctly, as well as the colorkey. Here's my code:

    # see data in the original
    data$level[is.na(data$level)] <- 10  # assign a value above the scale to NA values
    levelplot(level ~ x * y, data = data, as.table=T,
      col.regions=c("#FF", "#C6", "#8D", "#55", "#1C", "#FFE200", "#FFAA00",
                    "#FF7100", "#FF3800", "#FF", "#00"),
      # 10 colors from white to red through yellow for levels 0 to 9,
      # plus black for level=10
      at=c(-0.5:10.5),  # this is how I got colors centered on integer values
      colorkey = list(at = c(-0.5:10.5),
                      labels=list(at=c(0:10), lab=c(as.character(c(0:9)), "NA"))),
      xlab="x", ylab="y",
      strip = strip.custom(factor.levels=c("date 3")))

I hope this can help somebody else in the future.

-Sebastien

--
View this message in context: http://r.789695.n4.nabble.com/Help-with-levelplot-color-assignment-in-lattice-tp3774374p3776206.html
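Why breakpoints at -0.5:10.5 separate the integer levels can be illustrated with base R's cut(). This is an analogy only, assuming bins that are closed on the right; it does not reproduce lattice's internal colour assignment, and the names levels_seen, on_integers and centred are introduced here.

```r
# Integer levels 0 to 9, plus 10 standing in for the recoded NA values.
levels_seen <- 0:10

# Breakpoints placed on the integers: 0 and 1 land in the same first bin,
# which mirrors the "levels 0 and 1 get the same colour" symptom.
on_integers <- cut(levels_seen, breaks = 0:10,
                   include.lowest = TRUE, labels = FALSE)

# Breakpoints halfway between integers, as in at = c(-0.5:10.5):
# 12 breakpoints give 11 bins, one per level.
centred <- cut(levels_seen, breaks = -0.5:10.5, labels = FALSE)

on_integers
centred
```

With centred breakpoints each level gets its own bin index, so 11 colours map one-to-one onto the 11 levels.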
[R] Configuring Proxy: Proxy Authentication Required with --internet2
Hi there,

I'm trying to configure R to get access to the internet. Internet Explorer uses a proxy .pac script. Reading some older threads, I found that I can use the --internet2 option. When choosing a mirror I get the error: 407 Proxy Authentication Required. This seems reasonable, since I have to log in when using IE as well. But where do I enter the username and password in R?

    Sys.setenv(http_proxy_user="ask")

or

    Sys.setenv(http_proxy_user="ask")

does not help. (And it is not needed for the internet2 option, as far as I am informed.)

Any help appreciated

--
View this message in context: http://r.789695.n4.nabble.com/Configuring-Proxy-Proxy-Authentication-Required-with-internet2-tp3776209p3776209.html
[R] MuMIn Problem getting adjusted Confidence intervals
Hello R users,

I'm using MuMIn, but for some reason I'm not getting the adjusted confidence intervals and unconditional SE when I use model.avg(). I followed the steps provided by Grueber et al (2011), Multimodel inference in ecology and evolution: challenges and solutions, in JEB.

I created a global model to see if malaria prevalence (binomial distribution) is related to any life history traits of 14 different bird species, while controlling for Family and Genus in a GLMM:

    global.model.Para <- lmer(cbind(Parahaemoproteus,FailPh) ~ factor(SS)+factor(NT)+NH+W+IT+factor(MS)+(1|Family/Genus),
                              family=binomial, data=malaria)

I then standardize the input variables using the function standardize from the arm package:

    stdz.model.Para <- standardize(global.model.Para, standardize.y=FALSE)

But I get this message:

    Warning messages lost: In is.na(thedata): is.na() applied to an object different from list or vector of type Null

    summary(stdz.model.Para)
    Generalized linear mixed model fit by the Laplace approximation
    Formula: cbind(Parahaemoproteus, FailPh) ~ factor(SS) + factor(NT) + z.NH + z.W + z.IT + factor(MS) + (1 | Family/Genus)
       Data: malaria
       AIC   BIC logLik deviance
     45.89 51.64 -13.95    27.89
    Random effects:
     Groups       Name        Variance Std.Dev.
     Genus:Family (Intercept) 1.4262   1.1942
     Family       (Intercept) 0.       0.
    Number of obs: 14, groups: Genus:Family, 12; Family, 5

    Fixed effects:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept)  -4.6494     1.1791  -3.943 8.04e-05 ***
    factor(SS)1   3.7793     2.0709   1.825    0.068 .
    factor(NT)1   1.8975     1.2793   1.483    0.138
    z.NH          0.4902     2.1099   0.232    0.816
    z.W          -1.6237     1.5957  -1.018    0.309
    z.IT         -0.7656     1.9598  -0.391    0.696
    factor(MS)1  -2.0603     1.3907  -1.481    0.138
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Correlation of Fixed Effects:
                (Intr) f(SS)1 f(NT)1 z.NH   z.W    z.IT
    factor(SS)1 -0.202
    factor(NT)1 -0.599  0.090
    z.NH         0.058 -0.790 -0.178
    z.W          0.232 -0.632  0.039  0.503
    z.IT         0.051  0.569  0.323 -0.851 -0.339
    factor(MS)1 -0.176 -0.632 -0.319  0.538  0.165 -0.567

I then proceed to use the dredge function:

    model.set.Para <- dredge(stdz.model.Para)
    model.set.Para
    Global model: glmer(formula = cbind(Parahaemoproteus, FailPh) ~ factor(SS) + factor(NT) + z.NH + z.W + z.IT + factor(MS) + (1 | Family/Genus), data = malaria, family = binomial)
    ---
    Model selection table
    (Int) fct(MS) fct(NT) fct(SS) z.I z.N z.W k Dev. AIC AICc delta weight
    4 -5.231 + 4 34.64 42.64 47.08 0. 0.290
    9 -4.750 + + 5 30.00 40.00 47.50 0.4142 0.236
    . . .
    Random terms: 1 | Family/Genus

I then select the models with a delta value up to 7:

    top.models.Para <- get.models(model.set.Para, subset = delta <= 7)
    top.models

But when I do the model averaging I do not seem to be getting the variance or unconditional SE, and I'm guessing that the confidence intervals are not unconditional either:

    model.avg(top.models.Para, method="NA")

    Model summary:
          Deviance  AICc Delta Weight
    3        34.64 47.08  0.00   0.30
    1+3      30.00 47.50  0.41   0.25
    4+5      31.49 48.99  1.90   0.12
    3+5      32.29 49.79  2.70   0.08
    3+6      33.02 50.52  3.44   0.05
    5        38.41 50.86  3.77   0.05
    3+4      33.77 51.27  4.19   0.04
    1+3+5    27.85 51.85  4.77   0.03
    3+4+5    27.86 51.86  4.78   0.03
    1+3+4    28.58 52.58  5.49   0.02
    1+5      35.33 52.83  5.75   0.02
    1+3+6    29.34 53.34  6.26   0.01
    1+2+3    30.02 54.02  6.93   0.01

    Variables:
             1          2          3    4    5   6
    factor(MS) factor(NT) factor(SS) z.IT z.NH z.W

    Averaged model parameters:
                Coefficient    SE Lower CI Upper CI
    (Intercept)       -4.75 1.410   -7.510  -1.9900
    factor(MS)1       -1.54 0.809   -3.120   0.0471
    factor(NT)1        2.28 1.310   -0.286   4.8500
    factor(SS)1        3.30 0.968    1.400   5.2000
    z.IT              -2.79 2.230   -7.160   1.5800
    z.NH               2.28 1.660   -0.968   5.5300
    z.W               -1.74 1.490   -4.650   1.1800

    Confidence intervals are unadjusted

    Relative variable importance:
    factor(SS) factor(MS)       z.NH       z.IT        z.W factor(NT)
          0.82       0.33       0.32       0.20       0.07       0.01

Does anyone know what I might be doing wrong?

Thanks for the help
Marcos

--
View this message in context: http://r.789695.n4.nabble.com/MuMIn-Problem-getting-adjusted-Confidence-intervals-tp3776500p3776500.html
Re: [R] all combinations of the elements of two vectors
On Aug 29, 2011, at 9:15 AM, Campbell, Desmond wrote:
> Petr, Jorge, Daniel,
>
> Yes, you could also use outer() instead of expand.grid(). This is quite
> useful to know. Also, I didn't know you could turn a matrix into a vector
> by setting its dimensions to NULL like that. I always used as.vector( m ).
> And (as I've just discovered) you can use it to reconfigure the matrix's
> shape to any that contains the same number of elements.

You can do that, but it requires that you understand the ordering of matrices if you want to avoid scrambling your indices. Since you seem new to that concept, you should work through several small examples to make sure you understand the effects of dimensional coercion. You should also look at the aperm function and the abind package.

-- David.

> Thanks very much one and all.
>
> Regards
> Desmond
>
> -----Original Message-----
> From: Petr PIKAL [mailto:petr.pi...@precheza.cz]
> Sent: 29 August 2011 07:24
> To: Campbell, Desmond
> Cc: r-help@R-project.org
> Subject: Odp: [R] all combinations of the elements of two vectors
>
> Hi
>
>> Dear R-help readers,
>>
>> I'm sure this problem has been answered but I can't find the solution.
>> I have two vectors
>>
>>     v1 <- c("a","b")
>>     v2 <- c(1,2,3)
>>
>> I want an easy way to produce every possible combination of v1, v2
>> elements, i.e. I want to produce
>>
>>     c("a1","a2","a3", "b1","b2","b3")
>
> Another option is
>
>     z <- outer(x, y, paste, sep="")
>     dim(z) <- NULL
>     z
>     [1] "a1" "b1" "c1" "a2" "b2" "c2" "a3" "b3" "c3"
>
> which gives the result in a different order, or
>
>     z <- as.vector(t(z))
>     z
>     [1] "a1" "a2" "a3" "b1" "b2" "b3" "c1" "c2" "c3"
>
> which gives you the desired order.
>
> Regards
> Petr
>
>> regards
>> Desmond
>>
>> Desmond Campbell
>> Dept of Biostatistics and Computing, Institute of Psychiatry (KCL),
>> PO Box 20, De Crespigny Park, Denmark Hill, London, SE5 8AF
>> Tel 020 7848 0309
>> Email d.campb...@iop.kcl.ac.uk

David Winsemius, MD
West Hartford, CT
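David's warning about scrambled indices can be seen in a small example of the kind he recommends working through. This sketch uses names introduced here (m0, reshaped); it shows that reassigning dim() re-reads the same column-major data rather than transposing it.

```r
# A 2 x 3 matrix; R fills it column by column.
m0 <- matrix(1:6, nrow = 2)

# Changing the dimensions keeps the underlying order 1,2,3,4,5,6 and simply
# cuts it into columns of length 3.
reshaped <- m0
dim(reshaped) <- c(3, 2)

# A true transpose swaps rows and columns instead.
transposed <- t(m0)

m0
reshaped      # columns are 1,2,3 and 4,5,6
transposed    # columns are 1,3,5 and 2,4,6

# For a matrix, aperm() with its default permutation is the same as t().
all(aperm(m0) == transposed)
```

The reshaped and transposed matrices have the same shape but different contents, which is the scrambling David refers to.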
Re: [R] Configuring Proxy: Proxy Authentication Required with --internet2
On 29/08/2011 9:23 AM, behave wrote:
> Hi there,
>
> I'm trying to configure R to get access to the internet. Internet
> Explorer uses a proxy .pac script. Reading some older threads, I found
> that I can use the --internet2 option. When choosing a mirror I get the
> error: 407 Proxy Authentication Required. This seems reasonable, since I
> have to log in when using IE as well. But where do I enter the username
> and password in R?
>
>     Sys.setenv(http_proxy_user="ask")
>
> or
>
>     Sys.setenv(http_proxy_user="ask")
>
> does not help. (And it is not needed for the internet2 option, as far as
> I am informed.)
>
> Any help appreciated

When using --internet2, all of the http work is handled by IE. So you need to find a way to tell IE to handle the authentication. I'd guess that starting an instance of IE would do it for you, but I don't use a proxy, so I can't try.

Duncan Murdoch
Re: [R] Differences in SAS and R defaults
It doesn't help to post this twice, but it may help to know why this is of interest.

Frank

n wrote:
> Hello all,
>
> I am looking for theories and statistical analyses where the defaults
> employed in R and SAS are different. As a result, the outputs under the
> defaults should (at least slightly) differ for the same input. Could
> anyone kindly point out any such instance?
>
> Thanks
> Nikhil

-
Frank Harrell
Department of Biostatistics, Vanderbilt University

--
View this message in context: http://r.789695.n4.nabble.com/Differences-in-SAS-and-R-defaults-tp3776102p3776621.html
Re: [R] Exception while using NeweyWest function with doMC
Simon,

Though we're pleased to see another use of bigmemory, it really isn't clear that it is gaining you anything in your example; anything like as.big.matrix(matrix(...)) still consumes full RAM for both the inner matrix() and the new big.matrix. Is the filebacking really necessary? It also doesn't appear that you are making use of shared memory, so I'm unsure what the gains are.

However, I don't have any particular insight into the subsequent problem with NeweyWest (which doesn't seem to be using the big.matrix objects).

Jay

--
> Message: 32
> Date: Sat, 27 Aug 2011 21:37:55 +0200
> From: Simon Zehnder simon.zehn...@googlemail.com
> To: r-help@r-project.org
> Subject: [R] Exception while using NeweyWest function with doMC
> Message-ID: cagqvrp_gk+t0owbv1ste-y0zafmi9s_zwqrxyxugsui18ms...@mail.gmail.com
> Content-Type: text/plain
>
> Dear R users,
>
> I am using R right now for a simulation of a model that needs a lot of
> memory. Therefore I use the *bigmemory* package and - to make it faster -
> the *doMC* package. See my code posted on http://pastebin.com/dFRGdNrG
>
> snip -

--
John W. Emerson (Jay)
Associate Professor of Statistics
Department of Statistics
Yale University
http://www.stat.yale.edu/~jay
Re: [R] R.oo data members / inheritance
Hi, comments below.

On Mon, Aug 29, 2011 at 8:12 AM, Ben qant ccqu...@gmail.com wrote:
> Correction. My solution didn't work either. It didn't return the correct
> values. Can you post an example that takes three arguments? I'm working
> on how to do this now. Thanks... sorry, I'm new to R and R.oo.
>
> Ben
>
> On Mon, Aug 29, 2011 at 8:35 AM, Ben qant ccqu...@gmail.com wrote:
>> Henrik,
>>
>> Your last suggestion did not work for me. It seems like it does not
>> allow me to create a ClassB object with 3 arguments:
>>
>>     setConstructorS3("ClassA", function(A=15, x=NA) {
>>       extend(Object(), "ClassA",
>>         .size = A,
>>         .x = x
>>       )
>>     })
>>     setConstructorS3("ClassB", function(..., bData=NA) {
>>       extend(ClassA(...), "ClassB",
>>         .bData = bData
>>       )
>>     })
>>     b = ClassB(1,2,3)
>>     Error in ClassA(...) : unused argument(s) (3)

I should have clarified that when putting '...' (= all arguments that do not match other arguments) at the beginning like this, you have to specify the arguments that you do not want to pass via '...' by name, i.e.

    b <- ClassB(1,2, bData=3)

I'd recommend always naming your arguments, especially for a piece of code that is not just a one-time call at the R prompt, i.e.

    b <- ClassB(A=1, x=2, bData=3);

>> I got around it using your 'specific' suggestion:
>>
>>     setConstructorS3("ClassA", function(A=15, x=NA) {
>>       extend(Object(), "ClassA",
>>         .size = A,
>>         .x = x
>>       )
>>     })
>>     setConstructorS3("ClassB", function(..., bData=NA) {
>>       extend(ClassA(A=15,x=NA), "ClassB",
>>         .bData = bData
>>       )
>>     })

That doesn't work, because arguments other than 'bData' that you pass to ClassB() will end up in '...', and that you don't pass along to ClassA(), i.e. such arguments are simply ignored. So, a solution that uses neither '...' nor missing() is:

    setConstructorS3("ClassB", function(A=15, x=NA, bData=NA) {
      extend(ClassA(A=A,x=x), "ClassB",
        .bData = bData
      )
    })

This code is very explicit (hence more readable). The downside is that if you change the default arguments in ClassA() and you wish those to also be in ClassB(), you have to update the defaults in ClassB() manually. With '...' you don't have to do that.

Finally, note that the above about '...', argument matching etc. is generic to R - it is not specific to R.oo. You can find more about the '...' argument(s) in 'An Introduction to R', which you find via help.start().

Hope this helps

Henrik

>> b = ClassB(1,2,3)
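Since Henrik notes that the behaviour of '...' is generic to R, the same argument matching can be reproduced without R.oo. The sketch below uses two hypothetical plain functions, classA() and classB(), that return lists; they stand in for the constructors in the thread and are not part of R.oo.

```r
# Hypothetical stand-ins for the R.oo constructors (plain lists, no R.oo needed).
classA <- function(A = 15, x = NA) list(size = A, x = x)
classB <- function(..., bData = NA) c(classA(...), list(bData = bData))

# Positional call: arguments placed after '...' in the signature can only be
# matched by name, so all three values go into '...' and are forwarded to
# classA(), which accepts only two. This reproduces the "unused argument" error.
res <- try(classB(1, 2, 3), silent = TRUE)
inherits(res, "try-error")

# Naming the argument that follows '...' routes it to bData as intended.
b <- classB(1, 2, bData = 3)
str(b)
```

The named call fills size and x from '...' and bData from the named argument, which is the behaviour Henrik describes for ClassB(1,2, bData=3).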
Re: [R] Asking Favor For Remove element with Particular Value In Vector
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter
> Sent: Monday, August 29, 2011 7:07 AM
> To: Jim Lemon
> Cc: r-help@r-project.org
> Subject: Re: [R] Asking Favor For Remove element with Particular Value In Vector
>
> Jim et. al:
>
> This is the second time I've seen this advice recently. Use logical
> indexing: which(), though not wrong, is superfluous:

which() will give the wrong answer if x does not contain any elements of the set which you want to omit. E.g.,

    > x <- 1:3
    > x[-which(x %in% c(0,255))] # bad
    integer(0)
    > x[!is.element(x, c(0,255))] # good
    [1] 1 2 3

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

>     x[ !x %in% c(0,255)]
>
> will do, rather than:
>
> > If you want to remove the specific values 0 and 255 from your vector,
> > try:
> >
> >     x <- x[-which(x %in% c(0,255))]
> >
> > Jim
>
> -- Bert
>
> --
> Men by nature long to get on to the ultimate truths, and will often be
> impatient with elementary studies or fight shy of them. If it were
> possible to reach the ultimate truths without the elementary studies
> usually prefixed to them, these would not be preparatory studies but
> superfluous diversions.
> -- Maimonides (1135-1204)
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
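The failure Bill describes comes from negating an empty index vector. A short sketch of the mechanism, using the values from the thread; the names hits, bad and good are introduced here for illustration.

```r
x <- 1:3

# No element of x is 0 or 255, so which() returns an empty integer vector.
hits <- which(x %in% c(0, 255))
length(hits)

# Negating an empty vector is still empty, and indexing with an empty
# vector selects nothing, so every element is dropped.
bad <- x[-hits]

# Logical indexing keeps every element whose test is FALSE.
good <- x[!x %in% c(0, 255)]

bad
good
```

When x does contain 0 or 255 both forms agree, which is why the which() version can appear to work until the day nothing matches.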
Re: [R] maximum number of subdivisions reached
Ooops, sorry! The problem occurs when

    func(1:2,0.1,0.1,sad="Exp")

On Mon, Aug 29, 2011 at 12:27 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote:
> Can't help, code runs fine on my machine once you change valu to value.
> Are you sure it fails in a vanilla run of R and isn't caused by any other
> choices you have made along the way?
>
> Michael
>
> PS -- Here's the code
>
>     func <- function(y, a, rate, sad){
>       f3 <- function(z){
>         f1 <- function(y,a,n){ dpois(y,a*n) }
>         f2 <- function(n,rate){ dexp(n,rate) }
>         f <- function(n){ f1(y,a,n)*f2(n,rate) }
>         r <- 0
>         r1 <- 1
>         x1 <- 0
>         dx <- 20
>         while(r1 > 10e-500){
>           r1 <- integrate(f,x1,x1+dx)$value
>           r <- r + r1
>           x1 <- x1 + dx
>         }
>         r + integrate(f,x1,Inf)$value
>       }
>       sapply(y,f3)
>     }
>     V = func(200,0.1,0.1,sad="Exp")
>
> On Mon, Aug 29, 2011 at 11:16 AM, . . xkzi...@gmail.com wrote:
>> Why am I getting
>>
>>     Error in integrate(f, x1, x1 + dx) : maximum number of subdivisions reached
>>
>> and can I avoid this?
>>
>>     func <- function(y, a, rate, sad){
>>       f3 <- function(z){
>>         f1 <- function(y,a,n){ dpois(y,a*n) }
>>         f2 <- function(n,rate){ dexp(n,rate) }
>>         f <- function(n){ f1(y,a,n)*f2(n,rate) }
>>         r <- 0
>>         r1 <- 1
>>         x1 <- 0
>>         dx <- 20
>>         while(r1 > 10e-500){
>>           r1 <- integrate(f,x1,x1+dx)$value
>>           r <- r + r1
>>           x1 <- x1 + dx
>>         }
>>         r + integrate(f,x1,Inf)$valu
>>       }
>>       sapply(y,f3)
>>     }
>>     func(200,0.1,0.1,sad="Exp")
>>
>> Thanks in advance.
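One possible explanation, offered as an assumption rather than a diagnosis of the poster's session: inside f3 the argument z is never used, so with y = 1:2 the integrand f sees the whole vector y and recycles it against the evaluation points chosen by integrate(), which makes the integrand erratic. The sketch below uses z inside the integrand, one count at a time. The name func2 is hypothetical, the unused sad argument is dropped, and the piecewise loop is replaced by a single call over (0, Inf), which is a simplification that may not suit large counts such as 200.

```r
# Sketch: integrate the Poisson-exponential mixture one count at a time.
func2 <- function(y, a, rate) {
  one_count <- function(z) {
    # Integrand for a single count z; n is the vector supplied by integrate().
    f <- function(n) dpois(z, a * n) * dexp(n, rate)
    integrate(f, 0, Inf)$value
  }
  sapply(y, one_count)
}

res <- func2(1:2, 0.1, 0.1)

# For this mixture the integral has a closed form,
# (rate / (a + rate)) * (a / (a + rate))^y, which serves as a check.
closed <- (0.1 / 0.2) * (0.1 / 0.2)^(1:2)

res
closed
```

If the numerical and closed-form values agree for small counts, the remaining question is only how to handle the far tail for large y, which is what the original piecewise loop was trying to address.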
Re: [R] Basic question about re-writing for loop as a function
You are somewhere in Circles 3 and 4 of 'The R Inferno'.

If you have a function to apply over more than one argument, then 'mapply' will do that. But you don't need to do that -- you can do the operation you want efficiently:

*) create your resulting matrix with all zeros; there is almost surely no reason for this to be a data frame.

    mainmat <- matrix(0, ncol=92, nrow=...)

*) create a subscripting matrix giving the row and column combinations to change to 1. Here is a small example:

    > ss <- strsplit(c("1", "2,3", "1"), split=",")
    > sr <- rep(1:length(ss), sapply(ss, length))
    > sr
    [1] 1 2 2 3
    > sc <- as.numeric(unlist(ss))
    > sc
    [1] 1 2 3 1

    mainmat[cbind(sr, sc)] <- 1

On 29/08/2011 14:55, Chris Beeley wrote:
> Hello-
>
> Sorry to ask a basic question, but I've spent many hours on this now and
> seem to be missing something. I have a loop that looks like this:
>
>     mainmat=data.frame(matrix(data=0, ncol=92, nrow=length(predata$Words_MH)))
>     for(i in 1:length(predata$Words_MH)){
>       for(j in 1:92){
>         mainmat[i,j]=ifelse(j %in% as.numeric(unlist(strsplit(predata$Words_MH[i], split=","))), 1, 0)
>       }
>     }
>
> What it's doing is creating a matrix with 92 columns, that's the number
> of different codes, and then for every row of my data it looks to see if
> the code (code 1, code 2, etc.) is in the string and if it is, returns a
> 1 in the relevant column (column 1 for code 1, column 2 for code 2, etc.)
>
> There are 1000 rows in the database, and I have to run several versions
> of this code, so it just takes way too long. I have been trying to
> rewrite it using lapply. I tried this:
>
>     myfunction=function(x, y) ifelse(x %in% as.numeric(unlist(strsplit(predata$Words_MH[y], split=","))), 1, 0)
>     for(j in 1:92){
>       mainmat[,j]= lapply(predata$Words, myfunction)
>     }
>
> but I don't think I can use something that takes two inputs, and I can't
> seem to remove either.
>
> Here's a dput of the first 10 rows of the variable in case that's helpful:
>
>     predata$Words=c("1", "1", "1", "1", "2,3,4", "5", "1", "1", "6", "7,8,9,10")
>
> Given these data, I want the function to return, for the first column,
> 1, 1, 1, 1, 0, 0, 1, 1, 0, 0 (because those are the values of Words which
> contain a 1) and for the second column return 0, 0, 0, 0, 1, 0, 0, 0, 0, 0
> (because the fifth value is the only one that contains a 2).
>
> Any suggestions gratefully received!
>
> Chris Beeley
> Institute of Mental Health, UK

--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner' and 'The R Inferno')
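Patrick's two steps can be put together and run on the ten values Chris posted. This is a sketch: the vector name words is introduced here in place of predata$Words, and the 92 columns come from the number of codes mentioned in the question.

```r
# The ten example values from the question.
words <- c("1", "1", "1", "1", "2,3,4", "5", "1", "1", "6", "7,8,9,10")

# Split each entry into its individual codes.
ss <- strsplit(words, split = ",")

# Row index: entry i is repeated once for every code it contains.
sr <- rep(seq_along(ss), sapply(ss, length))

# Column index: the codes themselves.
sc <- as.numeric(unlist(ss))

# One row per entry, one column per code; a two-column index matrix sets
# all the required cells to 1 in a single vectorised assignment.
mainmat <- matrix(0, nrow = length(words), ncol = 92)
mainmat[cbind(sr, sc)] <- 1

mainmat[, 1]
mainmat[, 2]
```

The first two columns match the output Chris asked for, and no loop over rows or columns is needed.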
Re: [R] Exception while using NeweyWest function with doMC
On Aug 27, 2011, at 3:37 PM, Simon Zehnder wrote:

> Dear R users,
>
> I am using R right now for a simulation of a model that needs a lot of memory. Therefore I use the *bigmemory* package and - to make it faster - the *doMC* package. See my code posted on http://pastebin.com/dFRGdNrG
>
> Now, if I use the foreach loop with the addon %do% (for sequential run) I have no problems at all - only here and there some singularities in regressor matrices, which should be ok. BUT if I run the loop on multiple cores I very often get a bad exception. I have posted the exception on http://pastebin.com/eMWF4cu0 The exception comes from the NeweyWest function loaded within the sandwich library.
>
> I have no clue what it is trying to tell me or why it is printed so weirdly to the terminal. I am used to receiving errors here and there, but the messages never look like this. Does anyone have a useful answer for me, where to look for the cause of this weird error?
>
> Here is some additional information:
> Hardware: MacBook Pro 2.66 GHz Intel Core Duo, 4 GB Memory 1067 MHz DDR3
> Software System: Mac OS X Lion 10.7.1 (11B26)
> Software App: R64 version 2.11.1 run via Mac terminal

Using the R64 version in a 4GB environment will reduce the effective memory capacity since the larger pointers take up more space, and using parallel methods is unlikely to improve performance very much with only two cores. It also seems likely that there have been several bug fixes in the last couple of years since that version of R was released, so the package authors are unlikely to be very interested in segfault errors thrown by outdated software.

> I hope someone has a good suggestion!

Update R. Don't use features that only reduce performance and make unstable a machine that has limited resources.

-- 
David Winsemius, MD
West Hartford, CT
Re: [R] How do I get a weighted frequency table?
Hi David,

Unfortunately I need the "should have been" frequencies, i.e. the frequencies I would have observed if the sample corresponded perfectly to the population in terms of some reference variables. That is, if in my sample I observe V1_R1=10%, V1_R2=50%, V3_R3=40% while the known population distribution is V1_R1=20%, V1_R2=30%, V3_R3=50%, then I would like to see what V2*V3, V2*V4, ... , V2*VN, V3*V4, ... , VN-1*VN would have been had the sample perfectly reflected the population in terms of V1. I hope that clarifies what I am trying to achieve...

Thanks, Luca

On 29 Aug 2011, at 16:29, David L Carlson wrote:

> If you are talking about weights that are the frequencies in each cell, you can use xtabs():
>
>     df <- data.frame(Var1=c("Absent", "Present", "Absent", "Present"),
>                      Var2=c("Absent", "Absent", "Present", "Present"),
>                      Freq=c(17, 6, 3, 12))
>     df
>     xtabs(Freq~Var1+Var2, data=df)
>
> --
> David L Carlson
> Associate Professor of Anthropology
> Texas A&M University
> College Station, TX 77843-4352
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Leandro Marino
> Sent: Sunday, August 28, 2011 12:15 PM
> To: Luca Meyer
> Cc: r-help@r-project.org
> Subject: Re: [R] How do I get a weighted frequency table?
>
> Luca,
>
> you may use the survey package. You have to declare the design with the design function, and then you can use the svytotal, svyby and svymean functions to do your tabulations.
>
> Regards, Leandro
>
> Sincerely,
> Leandro Marino
> http://www.leandromarino.com.br (Photographer)
> http://est.leandromarino.com.br/Blog (Statistician)
> Cel.: + 55 21 9845-7707
> Cel.: + 55 21 8777-7907
>
> 2011/8/28 Luca Meyer lucam1...@gmail.com
>
>> Hello,
>>
>> I have to run a set of crosstabulations to which I need to apply some weights. I am currently doing an unweighted version of such crosstabs using table(x,y).
>>
>> In SPSS I am used to creating a weighting variable and using WEIGHT BY VAR before running the CTABLES. Is there a similar procedure in R?
>>
>> Thanks, Luca
>>
>> Mr. Luca Meyer
>> www.lucameyer.com
>> R version 2.13.1 (2011-07-08)
>> Mac OS X 10.6.8
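One way to get the "should have been" frequencies Luca describes, sketched here in base R under stated assumptions: the data frame `d`, its columns and the population shares in `pop` are all invented for the illustration, and the weight for each case is simply the population share of its V1 level divided by the sample share. This is plain post-stratification on one variable, not necessarily what the survey package would do with a full design.

```r
set.seed(1)
n <- 200
# Hypothetical sample: V1 is the reference variable, V2 and V3 are to be crossed.
d <- data.frame(
  V1 = sample(c("R1", "R2", "R3"), n, replace = TRUE, prob = c(0.1, 0.5, 0.4)),
  V2 = sample(c("a", "b"), n, replace = TRUE),
  V3 = sample(c("x", "y"), n, replace = TRUE)
)

# Known population shares of V1 (assumed values).
pop  <- c(R1 = 0.20, R2 = 0.30, R3 = 0.50)
samp <- prop.table(table(d$V1))  # observed sample shares

# Weight = population share / sample share for the case's V1 level.
key <- as.character(d$V1)
d$w <- as.numeric(pop[key] / samp[key])

# After weighting, the V1 margin matches the population shares.
wt_V1 <- prop.table(xtabs(w ~ V1, data = d))

# Weighted crosstab of V2 by V3 (the analogue of WEIGHT BY in SPSS).
xtabs(w ~ V2 + V3, data = d)
```

Because the weights average to 1, the weighted table keeps the original sample size as its total.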
Re: [R] Differences in SAS and R defaults
Do you mean things like treatment of categorical variables in regression procedures (which have different defaults in different procedures in SAS), and different default as to the reference category in logistic regression? Jeremy On 29 August 2011 04:46, n nikhil.abhyan...@gmail.com wrote: Hello all, I am looking for theories and statistical analyses where the defaults employed in R and SAS are different. As a result, the outputs under the defaults should (at least slightly) differ for the same input. Could anyone kindly point any such instance? Thanks Nikhil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Legend / bar order - ggplot2
Hi:

The bars *are* ordered in the same way, but when you use coord_flip(), the left category goes on top and the right category goes on the bottom. Is this what you want?

    ggplot(df, aes(x = name, y = value, fill = type)) +
        geom_bar(position = position_dodge()) +
        coord_flip() +
        scale_fill_manual(breaks = rev(levels(df$type)),
                          values = c('orange', 'blue'))

HTH, Dennis

On Mon, Aug 29, 2011 at 6:18 AM, Yang Lu yang...@williams.edu wrote:

> Hi all,
>
> I am trying to do a barplot in ggplot2 and want to make sure that the legend order is consistent with the bar order, that is, the legend order is orig and match, and the bars are ordered in the same way. It seems to me that I can only control one of them. Any idea?
>
>     library(ggplot2)
>     df <- data.frame(value = rnorm(20),
>                      name = factor(rep(letters[1:10], 2), levels = letters[1:10]),
>                      type = factor(c(rep("orig", 10), rep("match", 10)),
>                                    levels = c("orig", "match")))
>     ggplot(df, aes(x = name, y = value, fill = type)) +
>         geom_bar(position = position_dodge()) +
>         coord_flip()
>
> Thank you very much, YL
Re: [R] separate mfrow region with line
The grconvertX and grconvertY functions may be helpful in finding the endpoints to use.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of David Winsemius
> Sent: Saturday, August 27, 2011 8:25 AM
> To: dood
> Cc: r-help@r-project.org
> Subject: Re: [R] separate mfrow region with line
>
> On Aug 27, 2011, at 5:01 AM, dood wrote:
>
>> Dear R users,
>>
>> I have six plots in one figure, created with par(mfrow=c(2,3)). I would like to add two lines to the figure outside the plotting regions, separating the figure into 3 columns. Is this possible?
>
> The xpd parameter used with the segments function should provide that. The tricky bit will be establishing the proper endpoints, but without an example that cannot be illustrated.
>
> --
> David Winsemius, MD
> West Hartford, CT
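Putting the two suggestions together, a minimal sketch might look like the following. The six placeholder plots and the choice of separators at one third and two thirds of the device width are assumptions made for the illustration; `pdf(NULL)` is only there so the sketch runs without opening a window.

```r
pdf(NULL)  # null device; replace with your usual device to actually see the figure
par(mfrow = c(2, 3))
for (i in 1:6) plot(1:10, main = paste("Panel", i))

# "ndc" coordinates run from 0 to 1 across the whole device. Convert the
# separator positions into the user coordinates of the current (last) panel.
x <- grconvertX(c(1/3, 2/3), from = "ndc", to = "user")
y <- grconvertY(c(0, 1), from = "ndc", to = "user")

# xpd = NA lets segments() draw outside the plot region, over the whole device.
segments(x, y[1], x, y[2], xpd = NA)
dev.off()
```

The conversion has to happen after the last panel is drawn, since the user coordinates refer to whichever panel is current.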
Re: [R] Function rank() for data frames (or multiple vectors)?
On Aug 29, 2011, at 15:39 , Sebastian Bauer wrote:

>>     > rr <- data.frame(a = c(1,1,1,1,2), b=c(1,2,2,3,1))
>>     > ave(order(rr$a, rr$b), rr$a, rr$b )
>>     [1] 1.0 2.5 2.5 4.0 5.0
>
> Actually, this may be a solution I was looking for! Note that it assumes rr to be sorted already (hence the first argument of ave could be simply 1:nrow(rr)). Also, by using FUN=min or FUN=max I can cover the other cases.
>
> Thanks for this!

Yes, order() and rank() are different beasts so you'd need the presort. You might consider this:

    > rr <- data.frame(a = c(1,1,1,2,2), b=c(2,2,1,3,1))
    > rr
      a b
    1 1 2
    2 1 2
    3 1 1
    4 2 3
    5 2 1
    > ave(order(rr$a, rr$b), rr$a, rr$b ) # WRONG!
    [1] 2 2 2 5 4
    > ave(order(order(rr$a, rr$b)), rr$a, rr$b )
    [1] 2.5 2.5 1.0 5.0 4.0

Figuring out why order(order(x)) == rank(x) if you ignore ties is left as an exercise (i.e., I can't recall the argument just now...).

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com
Døden skal tape! --- Nordahl Grieg
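For the exercise, one possible line of reasoning (a sketch, not necessarily the argument Peter had in mind): `order(x)` is the permutation `p` that sorts `x`, and applying `order()` to a permutation returns its inverse. The inverse tells you where each original element lands in the sorted vector, which is its rank when there are no ties. The variable names below are made up for the illustration.

```r
set.seed(42)
x <- sample(100, 10)   # distinct values, so no ties

p   <- order(x)        # x[p] is sorted
inv <- order(p)        # order() of a permutation is its inverse

# inv undoes p, and it equals the rank of each element.
stopifnot(all(inv[p] == seq_along(x)))
stopifnot(all(inv == rank(x)))

# With ties across two columns, averaging within tied groups gives the
# multi-column analogue of rank(), as in the example from the thread.
rr <- data.frame(a = c(1, 1, 1, 2, 2), b = c(2, 2, 1, 3, 1))
r2 <- ave(order(order(rr$a, rr$b)), rr$a, rr$b)
r2  # 2.5 2.5 1.0 5.0 4.0
```

Passing `FUN = min` or `FUN = max` to `ave()` would give the other tie-handling variants Sebastian mentions.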
Re: [R] control line break behavior of R output
If your main goal is to look at a data frame and you are ok with scrolling, then look at the View function (note capitalization) as an alternative to just printing the data frame. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Martin Batholdy Sent: Saturday, August 27, 2011 6:19 PM To: R Help Subject: [R] control line break behavior of R output Hi, Is it possible to define at which point a line-break is happening in R- output? I for example would rather like to scroll horizontally in a data-frame with a lot of columns instead of having a lot of breakpoints in the data.frame (to fit the screen). Can you control that? Can you tell R to do a line-break after x symbols of output for example? thanks! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] maximum number of subdivisions reached
Hi:

integrate() is not a vectorized function. This appears to work:

    > sapply(1:2, function(x) func(x, 0.1, 0.1, sad = Exp))
    [1] 0.250 0.125

In this case, sapply() is a disguised for loop.

HTH, Dennis

On Mon, Aug 29, 2011 at 9:45 AM, . . xkzi...@gmail.com wrote:

> Ooops, sorry! The problem occurs when
>
>     func(1:2,0.1,0.1,sad=Exp)
>
> On Mon, Aug 29, 2011 at 12:27 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote:
>
>> Can't help, code runs fine on my machine once you change valu to value. Are you sure it fails in a vanilla run of R and isn't caused by any other choices you have made along the way?
>>
>> Michael
>>
>> PS -- Here's the code
>>
>>     func <- function(y, a, rate, sad){
>>       f3 <- function(z){
>>         f1 <- function(y,a,n){ dpois(y,a*n) }
>>         f2 <- function(n,rate){ dexp(n,rate) }
>>         f <- function(n){ f1(y,a,n)*f2(n,rate) }
>>         r <- 0
>>         r1 <- 1
>>         x1 <- 0
>>         dx <- 20
>>         while(r1 > 10e-500){
>>           r1 <- integrate(f,x1,x1+dx)$value
>>           r <- r + r1
>>           x1 <- x1 + dx
>>         }
>>         r + integrate(f,x1,Inf)$value
>>       }
>>       sapply(y,f3)
>>     }
>>     V = func(200,0.1,0.1,sad=Exp)
>>
>> On Mon, Aug 29, 2011 at 11:16 AM, . . xkzi...@gmail.com wrote:
>>
>>> Why I am getting
>>>
>>>     Error in integrate(f, x1, x1 + dx) : maximum number of subdivisions reached
>>>
>>> and can I avoid this?
>>>
>>>     func <- function(y, a, rate, sad){
>>>       f3 <- function(z){
>>>         f1 <- function(y,a,n){ dpois(y,a*n) }
>>>         f2 <- function(n,rate){ dexp(n,rate) }
>>>         f <- function(n){ f1(y,a,n)*f2(n,rate) }
>>>         r <- 0
>>>         r1 <- 1
>>>         x1 <- 0
>>>         dx <- 20
>>>         while(r1 > 10e-500){
>>>           r1 <- integrate(f,x1,x1+dx)$value
>>>           r <- r + r1
>>>           x1 <- x1 + dx
>>>         }
>>>         r + integrate(f,x1,Inf)$valu
>>>       }
>>>       sapply(y,f3)
>>>     }
>>>     func(200,0.1,0.1,sad=Exp)
>>>
>>> Thanks in advance.
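To make the "integrate() is not vectorized" point concrete, here is a stripped-down sketch of the same Poisson-exponential integral with the loop over y made explicit. The function names `mix_prob_one` and `mix_prob` are invented for this illustration, and integrating straight from 0 to Inf is an assumption that is only reasonable for small y; the stepping scheme in the thread exists because that shortcut can miss the peak for large y such as 200.

```r
# P(Y = y) when Y ~ Poisson(a * n) and n ~ Exponential(rate), for a single y.
mix_prob_one <- function(y, a, rate) {
  integrand <- function(n) dpois(y, a * n) * dexp(n, rate)
  integrate(integrand, lower = 0, upper = Inf)$value
}

# integrate() needs a scalar y, so loop over y explicitly with vapply().
mix_prob <- function(y, a, rate) {
  vapply(y, mix_prob_one, numeric(1), a = a, rate = rate)
}

p <- mix_prob(1:2, a = 0.1, rate = 0.1)
p

# For this particular mixture the integral has a closed form (geometric),
# which gives a quick sanity check on the numerical result.
closed <- (0.1 / (0.1 + 0.1)) * (0.1 / (0.1 + 0.1))^(1:2)
closed
```

The values agree with the 0.250 and 0.125 that Dennis reports from the original function.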
[R] rpart: apply tree to new data to get counts
Hi, when I have made a decision tree with rpart, is it possible to apply this tree to a new set of data in order to find out the distribution of observations? Ideally I would like to plot my original tree, with the counts (at each node) of the new data. Reagards, Jay __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] reading tables from multiple HTML pages
Hi,

beginner to R and was having some problems scraping data from tables in html using the XML package. I have included some code below.

I am trying to loop through a series of html pages, each of which contains a single table from which I want to scrape data. However, some of the pages are blank, and so it throws me an error message when it gets to htmlParse(). The loop then closes out and I get the error message below:

    Error in htmlParse(url) : error in creating parser for http://www.szrd.gov.cn/viewcommondbfc.do?id=728

How might be best to go about keeping the loop running so I can parse the rest?

    library(XML)
    url_root <- "http://www.szrd.gov.cn/viewcommondbfc.do?id="
    for(i in 700:750){
      url = paste(url_root, i, sep="")
      doc = htmlParse(url)
      tableNodes = getNodeSet(doc, "//table")
      tbl = readHTMLTable(tableNodes[[3]])
    }

Steve Oliver
Department of Political Science
University of California at San Diego
9500 Gilman Dr.
La Jolla, CA 92092

--
View this message in context: http://r.789695.n4.nabble.com/reading-tables-from-multiple-HTML-pages-tp3776605p3776605.html
Sent from the R help mailing list archive at Nabble.com.
[R] splitting into multiple dataframes and then create a loop to work
Dear All Sorry for this simple question, I could not solve it by spending days. My data looks like this: # data set.seed(1234) clvar - c( rep(1, 10), rep(2, 10), rep(3, 10), rep(4, 10)) # I have 100 level for this factor var; yvar - rnorm(40, 10,6); var1 - rnorm(40, 10,4); var2 - rnorm(40, 10,4); var3 - rnorm(40, 5, 2); var4 - rnorm(40, 10, 3); var5 - rnorm(40, 15, 8) # just example df - data.frame(clvar, yvar, var1, var2, var3, var4, var5) # manual splitting df1 - subset(df, clvar == 1) df2 - subset(df, clvar == 2) df3- subset(df, clvar == 3) df4- subset(df, clvar == 4) df5- subset(df, clvar == 5) # i tried to mechanize it * for(i in 1:5) { df[i] - subset(df, clvar == i) } I know it should not work as df[i] is single variable, do it did. But I could not find away to output multiple dataframes from this loop. My limited R knowledge, did not help at all ! * # working on each of variable, just trying simple function a - 3:8 out1 - lapply(1:5, function(ind){ lm(df1$yvar ~ df1[, a[ind]]) }) p1 - lapply(out1, function(m)summary(m)$coefficients[,4][2]) p1 - do.call(rbind, p1) My ultimate objective is to apply this function to all the dataframes created (i.e. df1, df2, df3, df4, df5) and create five corresponding p-value vectors (p1, p2, p3, p4, p5). Then output would be a matrix of clvar and correponding p values clvar var1 var2 var3 var4 var5 1 2 3 4 Please help me ! Thanks NIL [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R question: generating data using MASS
Thanks!

> This problem isn't uniquely defined. Are you willing to generate more samples than you need and then throw away extreme values? Or do you want to 'censor' extreme values (i.e. set values <= 1 to 1 and values >= 7 to 7)?

I'd like to retain a normal distribution so I wouldn't want to delete the other values or truncate them. Can I use the cut command on the data that gets generated and retain a normal(ish, at least) distribution?

Oh, thanks for the help on the matrix, that is easier, and also the random missingness, I will try those!

Thanks, Mike

On Aug 29, 2011, at 2:29 AM, Ben Bolker wrote:

> uf_mike michael.parent at ufl.edu writes:
>
>> Hi, all! I'm new to R but need to use it to solve a little problem I'm having with a paper I'm writing. The question has a few components and I'd appreciate guidance on any of them.
>>
>> 1. The most essential thing is that I need to generate some multivariate normal data on a restricted integer range (1 to 7). I know I can use the MASS mvrnorm command to do this but have a couple of questions about that:
>>
>> -I can make the simulated data but I don't know how to issue a command that restricts the generated data to be between a specific range (1 to 7), and integer-only.
>
> This problem isn't uniquely defined. Are you willing to generate more samples than you need and then throw away extreme values? Or do you want to 'censor' extreme values (i.e. set values <= 1 to 1 and values >= 7 to 7)?
>
>     x <- MASS::mvrnorm(1,...)
>     x2 <- x[x>=1 & x<=7]
>     x3 <- x2[1:1000] ## or however many you need
>     x4 <- round(x3)
>
>> -Is there a way to specify a single desired correlation between all the variables (i.e., I want, say, five variables to all be correlated about .30 with each other), rather than input the entire covariance matrix as sigma?
>
> What's wrong with
>
>     m <- matrix(0.3,nrow=5,ncol=5)
>     diag(m) <- 1
>     m <- m*variance
>
> ?
>
>> 2. I need to introduce missing data (NA) AFTER generating the data set, and I need it to be random and at a specific prevalence (say, 5%). Is there a simple way to take the initial data set and randomly replace 5% of values with NA missing values?
>
>     x4[sample(seq(x4),size=0.05*length(x4),replace=FALSE)] <- NA
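The pieces suggested above can be strung together roughly as follows. This is only a sketch under assumptions that are not in the thread: a mean of 4 and a standard deviation of 1.5 for every variable, and rejection of whole rows (rather than single values) that fall outside 1 to 7, so that each row stays a complete observation. As Ben points out elsewhere in the thread, the rounded and truncated result is no longer normal, and the realized correlations will drift somewhat from 0.3.

```r
library(MASS)
set.seed(1)

p <- 5
# Equal correlation of 0.3 between all pairs, scaled to a common variance.
Sigma <- matrix(0.3, nrow = p, ncol = p)
diag(Sigma) <- 1
Sigma <- Sigma * 1.5^2           # assumed sd of 1.5

# Generate more rows than needed, then keep rows entirely inside [1, 7].
x <- mvrnorm(2000, mu = rep(4, p), Sigma = Sigma)
keep <- apply(x >= 1 & x <= 7, 1, all)
x <- round(x[keep, ][1:1000, ])  # first 1000 surviving rows, rounded to integers

# Replace a random 5% of the cells with NA.
miss <- sample(length(x), size = round(0.05 * length(x)))
x[miss] <- NA

dim(x)
mean(is.na(x))
```

If too few rows survive the range filter for a different mean or sd, the initial 2000 would need to be raised.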
Re: [R] reading tables from multiple HTML pages
?tryCatch HTH, Dennis On Mon, Aug 29, 2011 at 9:04 AM, s1oliver s1oli...@ucsd.edu wrote: Hi, beginner to R and was having some problems scraping data from tables in html using the XML package. I have included some code below. I am trying to loop through a series of html pages, each of which contains a single table from which I want to scrape data. However, some of the pages are blank - and so it throws me an error message when it gets to htmlParse(). The loop then closes out and I get the error message below: Error in htmlParse(url) : error in creating parser for http://www.szrd.gov.cn/viewcommondbfc.do?id=728 How might be best to go about keeping the loop running so I can parse the rest? library(XML) url_root-http://www.szrd.gov.cn/viewcommondbfc.do?id=; for(i in 700:750){ url = paste(url_root, i, sep=) doc = htmlParse(url) tableNodes = getNodeSet(doc, //table) tbl = readHTMLTable(tableNodes[[3]]) } Steve Oliver Department of Political Science University of California at San Diego 9500 Gilman Dr. La Jolla, CA 92092 -- View this message in context: http://r.789695.n4.nabble.com/reading-tables-from-multiple-HTML-pages-tp3776605p3776605.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
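To spell out the pointer to ?tryCatch a little: wrapping the failing call in tryCatch() lets the loop record a failure and move on. The sketch below does not touch the network; `fetch_table` is a made-up stand-in for the htmlParse()/readHTMLTable() steps that fails for some ids, the way the blank pages do.

```r
# Hypothetical stand-in for the parsing step: errors on every third id.
fetch_table <- function(i) {
  if (i %% 3 == 0) stop("error in creating parser for id ", i)
  data.frame(id = i, value = i^2)
}

ids <- 700:705

# lapply() keeps a NULL placeholder for each failed id instead of aborting.
results <- lapply(ids, function(i) {
  tryCatch(
    fetch_table(i),
    error = function(e) {
      message("skipping id ", i, ": ", conditionMessage(e))
      NULL
    }
  )
})

# Drop the failures and stack the tables that were read successfully.
ok <- !vapply(results, is.null, logical(1))
tables <- do.call(rbind, results[ok])
tables
```

In the real loop the body of `fetch_table` would be the three XML calls from the question; note that `results[[k]] <- NULL` inside a for loop would delete the element rather than store NULL, which is one reason lapply() is used here.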
[R] How to order based on the second two columns?
Hello All, I have a data frame consisting of 4 columns (id1, id2, y, pred) where pred is the predicted value based on the glm function and my data frame is called all. data is another data frame that has all data but I want to put together some important columns from my original data frame (data) into another data frame (all) as follows and I would like them to be sorted based on the id1 and id2. Here is what I do: all_data = cbind(oder(data[,2]), order(data[,3]), data[,1], pred) all = as.data.frame(all_data) colnames(all) = c(id1, id2, y , pred) when I do the ordering thing, I am not sure why I do not get the corresponding y and pred values for that specific row after ordering. Am I doing something wrong in here? Thanks a lot, ANDRA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rpart: apply tree to new data to get counts
? predict.rpart Weidong Gu On Mon, Aug 29, 2011 at 12:49 PM, Jay josip.2...@gmail.com wrote: Hi, when I have made a decision tree with rpart, is it possible to apply this tree to a new set of data in order to find out the distribution of observations? Ideally I would like to plot my original tree, with the counts (at each node) of the new data. Reagards, Jay __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to order based on the second two columns?
On Aug 29, 2011, at 2:40 PM, Andra Isan wrote:

> Hello All,
>
> I have a data frame consisting of 4 columns (id1, id2, y, pred) where pred is the predicted value based on the glm function, and my data frame is called all. data is another data frame that has all the data, but I want to put together some important columns from my original data frame (data) into another data frame (all) as follows, and I would like them to be sorted based on id1 and id2. Here is what I do:
>
>     all_data = cbind(order(data[,2]), order(data[,3]), data[,1], pred)
>     all = as.data.frame(all_data)
>     colnames(all) = c("id1", "id2", "y", "pred")
>
> When I do the ordering, I am not sure why I do not get the corresponding y and pred values for that specific row after ordering. Am I doing something wrong here?
>
> Thanks a lot, ANDRA

Your error is in not using 'merge' instead of 'cbind'.

-- 
David Winsemius, MD
West Hartford, CT
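A note on why the rows come apart in the posted code: order() returns row positions, not sorted values, so binding two separately computed order() vectors next to the unsorted y and pred scrambles the correspondence. If the goal is only to sort the assembled rows by id1 and then id2 (an assumption about what Andra wants; merge, as David suggests, is the tool when the pieces come from different tables), a minimal sketch with made-up data is:

```r
# Hypothetical stand-ins for the poster's objects; column order mirrors the
# thread (column 1 = y, column 2 = id1, column 3 = id2).
dat  <- data.frame(y = c(10, 20, 30, 40), id1 = c(2, 1, 2, 1), id2 = c(1, 2, 2, 1))
pred <- c(0.1, 0.2, 0.3, 0.4)

# Assemble the columns first, keeping each row intact ...
out <- data.frame(id1 = dat$id1, id2 = dat$id2, y = dat$y, pred = pred)

# ... then reorder whole rows with a single order() call on both keys.
out <- out[order(out$id1, out$id2), ]
out
```

Each y and pred value travels with its own id1 and id2 because the same index vector is applied to every column at once.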
[R] Lee-Carter in R package
Dear all, I'm forecasting health services utilization using Lee-Carter method. I have a routine to run LC method in R package, and I understood all steps to model and forecasting the rates by this method, except two things: 1) how to adjust the estimated admission rates by the total number of admissions in each year (similar to adjust specific mortality rates to number of deaths), 2) how to incorporate the error in bx in the estimate. I know it`s by a bootstraping method, but I can't understand how to deal with this in R package. I'm working with a short time series - the only available period - and because of this I think it's very important to incorporate this error in the estimate. Could anyone please help me with this, please? I thank a lot if anyone could help me. Sincerely, Cristina [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] replacing elements of a zoo object
Why doesn't this work? x = zoo(1:5, as.Date('2001-01-01')+1:5) x[as.Date('2001-01-05')] x[as.Date('2001-01-05')] = 0 x I think this is especially bad because it doesn't cause an error. It lets you do something to x, but then you can't see x again to see what it did. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Bayesian functions for mle2 object
Billy.Requena billy.requena at gmail.com writes:

> Hi everybody,
>
> I'm interested in evaluating the effect of a continuous variable on the mean and/or the variance of my response variable. I have built functions that express these explicitly and used the 'mle2' function to estimate the coefficients, as follows:
>
>     func.1 <- function(m=62.9, c0=8.84, c1=-1.6) {
>         s <- c0+c1*(x)
>         -sum(dnorm(y, mean=m, sd=s, log=T))
>     }
>     m1 <- mle2(func.1, method="SANN")
>
> However, estimating the effect of x on the variance of y has usually caused trouble, resulting in no convergence or extremely large standard deviations of the estimates. I tried using different optimizers, but I still faced the same problems.
>
> When I had similar troubles in the 'GLMM' statistical universe, I used Bayesian functions to solve the problem, enjoying the flexibility of different starting points to reach the maximum likelihood estimates. However, I have no idea which package or which function to use to solve the specific problem I'm facing now.
>
> Does anyone have a clue?
>
> Thanks in advance

Unless I'm missing something, you can fit this model (more easily) in gls() from the nlme package, which allows models for heteroscedasticity. See ?nlme::varConstPower

    gls(y~1,weights=varPower(power=1,form=~x),data)

This gives you a standard deviation proportional to (t1+|v|); that is, if the baseline residual standard deviation is S, then the standard deviation is S*(t1+|v|), so S would correspond to your c1 and S*t1 would correspond to your c0.

Ben Bolker
Re: [R] replacing elements of a zoo object
How exactly do you mean it doesn't work? Copied from my GUI: x = zoo(1:5, as.Date('2001-01-01')+1:5) x[as.Date('2001-01-05')] 2001-01-05 4 x[as.Date('2001-01-05')] = 0 x 2001-01-02 2001-01-03 2001-01-04 2001-01-05 2001-01-06 1 2 3 0 5 (Those actually line up correctly on my machine..) Michael Weylandt On Mon, Aug 29, 2011 at 2:45 PM, Gene Leynes gleyne...@gmail.com wrote: Why doesn't this work? x = zoo(1:5, as.Date('2001-01-01')+1:5) x[as.Date('2001-01-05')] x[as.Date('2001-01-05')] = 0 x I think this is especially bad because it doesn't cause an error. It lets you do something to x, but then you can't see x again to see what it did. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] replacing elements of a zoo object
On Aug 29, 2011, at 2:45 PM, Gene Leynes wrote: Why doesn't this work? x = zoo(1:5, as.Date('2001-01-01')+1:5) x[as.Date('2001-01-05')] x[as.Date('2001-01-05')] = 0 x I think this is especially bad because it doesn't cause an error. It lets you do something to x, but then you can't see x again to see what it did. It did exactly what I expected it to do. What was the this that you think was bad? I hope you are not asking that R ask users to confirm every assignment with a popup window. -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Rpart modelling a decisión tree and getting probability
Hello everyone,

I am working on a public health project and we have created a decision tree for categorical variables using the package rpart. Our goal is to develop a model (using the ROC tool) in order to predict presence/absence of diabetes and get a better understanding of what the important factors are in a particular Chilean population. We have found some important variables.

Now we want to apply this model to a big dataset in order to determine a possible outcome (the probability of getting the disease), but we only have the combination of predictive variables for each person. We have created this code:

    library(rpart)
    fit1 <- rpart(sickness ~ aetinghabit+gse+age+sex, method="class", data=data)
    prediccion <- predict(fit1, bigdatabase, type="prob")
    predictionsyes <- prediccion[,2]
    pred <- prediction(predictionsyes, datos$sickness) # but this is

My question is: how do I put the people's conditions into this model in order to get each person's probability of getting this disease? Is it possible to do a ROC curve using only this bigdatabase? We don't have the outcome, i.e. whether these people got the disease or not.

It would be very helpful if someone could shed some light on this. Any web source on how to do it will be very appreciated.

Thanks in advance.

Best Regards,
José Bustos
Escuela de Enfermeria
Pontificia Universidad Católica de Chile
Proyecto FONIS 2010
Celular 95939144
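On the first part of the question, predict() with type = "prob" already does what is asked: it takes a data frame with the predictor columns and returns one row of class probabilities per person. The sketch below uses the kyphosis data that ships with rpart rather than the poster's data, and the two "new people" are invented values. On the second part, a ROC curve compares predictions against known outcomes, so it cannot be computed from a dataset where the outcome is unknown; it has to come from labelled data such as a held-out part of the original sample.

```r
library(rpart)

# Fit a classification tree on the example data shipped with rpart.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# New cases: only the predictors are known (hypothetical values).
newpeople <- data.frame(Age = c(12, 100), Number = c(3, 6), Start = c(15, 5))

# One row per person, one column per class; each row sums to 1.
prob <- predict(fit, newdata = newpeople, type = "prob")
prob

# Probability of the "present" class for each new person.
prob[, "present"]
```

The column names of the probability matrix are the levels of the response, so the second column in the poster's code corresponds to the second level of sickness.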
Re: [R] splitting into multiple dataframes and then create a loop to work
Hi:

This is straightforward to do with the plyr package:

    # install.packages('plyr')
    library('plyr')
    set.seed(1234)
    df <- data.frame(clvar = rep(1:4, each = 10),
                     yvar = rnorm(40, 10, 6),
                     var1 = rnorm(40, 10, 4),
                     var2 = rnorm(40, 10, 4),
                     var3 = rnorm(40, 5, 2),
                     var4 = rnorm(40, 10, 3),
                     var5 = rnorm(40, 15, 8))
    mods <- dlply(df, .(clvar), function(d) lm(yvar ~ . - clvar, data = d))
    summary(mods[[1]])

mods is a list of model objects, one per subgroup defined by clvar. You can use extraction functions to pull out pieces from each model, e.g.,

    ldply(mods, function(m) summary(m)[['r.squared']])
    ldply(mods, function(m) coef(m))
    ldply(mods, function(m) resid(m))

The dlply() function reads a data frame as input and outputs to a list; conversely, the ldply() function reads from a list and outputs to a data frame. The functions you call inside have to be compatible with the input and output data types.

HTH, Dennis

On Mon, Aug 29, 2011 at 8:37 AM, Nilaya Sharma nilaya.sha...@gmail.com wrote:

> Dear All
>
> Sorry for this simple question; I could not solve it after spending days on it. My data looks like this:
>
>     # data
>     set.seed(1234)
>     clvar <- c( rep(1, 10), rep(2, 10), rep(3, 10), rep(4, 10)) # I have 100 level for this factor var;
>     yvar <- rnorm(40, 10,6); var1 <- rnorm(40, 10,4); var2 <- rnorm(40, 10,4); var3 <- rnorm(40, 5, 2); var4 <- rnorm(40, 10, 3); var5 <- rnorm(40, 15, 8) # just example
>     df <- data.frame(clvar, yvar, var1, var2, var3, var4, var5)
>
>     # manual splitting
>     df1 <- subset(df, clvar == 1)
>     df2 <- subset(df, clvar == 2)
>     df3 <- subset(df, clvar == 3)
>     df4 <- subset(df, clvar == 4)
>     df5 <- subset(df, clvar == 5)
>
>     # i tried to mechanize it
>     for(i in 1:5) {
>       df[i] <- subset(df, clvar == i)
>     }
>
> I know it should not work as df[i] is a single variable, and indeed it did not. But I could not find a way to output multiple dataframes from this loop. My limited R knowledge did not help at all!
* # working on each of variable, just trying simple function a - 3:8 out1 - lapply(1:5, function(ind){ lm(df1$yvar ~ df1[, a[ind]]) }) p1 - lapply(out1, function(m)summary(m)$coefficients[,4][2]) p1 - do.call(rbind, p1) My ultimate objective is to apply this function to all the dataframes created (i.e. df1, df2, df3, df4, df5) and create five corresponding p-value vectors (p1, p2, p3, p4, p5). Then output would be a matrix of clvar and correponding p values clvar var1 var2 var3 var4 var5 1 2 3 4 Please help me ! Thanks NIL [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
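For the p-value matrix asked for in the original question (one row per clvar level, one column per predictor), here is a base R sketch using split() and sapply(). It is an alternative to the plyr code, not a restatement of it: each predictor is fitted in its own simple regression, as in the poster's lapply() attempt. The helper name slope_pvalues is made up for this example.

```r
set.seed(1234)
df <- data.frame(clvar = rep(1:4, each = 10),
                 yvar  = rnorm(40, 10, 6),
                 var1  = rnorm(40, 10, 4), var2 = rnorm(40, 10, 4),
                 var3  = rnorm(40, 5, 2),  var4 = rnorm(40, 10, 3),
                 var5  = rnorm(40, 15, 8))
vars <- paste0("var", 1:5)

# For one subgroup: regress yvar on each predictor separately and
# return the slope p-values as a named vector (hypothetical helper)
slope_pvalues <- function(d) {
  sapply(vars, function(v) {
    fit <- lm(reformulate(v, response = "yvar"), data = d)
    summary(fit)$coefficients[2, 4]  # row 2 = slope, column 4 = Pr(>|t|)
  })
}

# split() gives one data frame per clvar level; t() puts groups in rows
pmat <- t(sapply(split(df, df$clvar), slope_pvalues))
print(round(pmat, 3))
```

Because split() keeps the subgroups in a list, there is no need to create df1, df2, ... as separate objects, which is what makes this scale to 100 levels.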
Re: [R] R question: generating data using MASS
Michael Parent michael.parent at ufl.edu writes:
Thanks!
This problem isn't uniquely defined. Are you willing to generate more samples than you need and then throw away extreme values? Or do you want to 'censor' extreme values (i.e. set values <= 1 to 1 and values >= 7 to 7)?
I'd like to retain a normal distribution, so I wouldn't want to delete the other values or truncate them. Can I use the cut command on the data that gets generated and retain a normal(ish, at least) distribution?
I don't quite understand how 'cut' (which transforms a continuous variable into a categorical one) is going to help ... by definition, a normal distribution is continuous (so discretizing the distribution will make it non-normal) and has the real numbers as its domain (so in theory you can't have a restricted domain and still have it be normal). If your standard deviation is small enough (say mean=3.5 and sd=0.1) then you will never have to worry about values beyond (1,7) in the lifetime of the universe, but if your sd is larger (and you can't allow it to be smaller) then you have to do *something* with the values that get generated outside your chosen bounds ...
[snip to make Gmane happy]
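The two options mentioned above (throwing away out-of-range draws versus censoring them at the bounds) can be sketched in base R as follows. This is a univariate illustration only; the thread is about data generated with MASS, and the mean, sd and function name rtrunc_norm are assumptions chosen for the example.

```r
set.seed(42)

# Option 1, truncation by rejection: draw more than needed and keep
# only the values inside [lower, upper] (hypothetical helper)
rtrunc_norm <- function(n, mean, sd, lower, upper) {
  out <- numeric(0)
  while (length(out) < n) {
    draw <- rnorm(n, mean, sd)
    out <- c(out, draw[draw >= lower & draw <= upper])
  }
  out[seq_len(n)]
}
x_trunc <- rtrunc_norm(1000, mean = 4, sd = 1.5, lower = 1, upper = 7)

# Option 2, censoring: values below 1 become 1, values above 7 become 7
x_cens <- pmin(pmax(rnorm(1000, mean = 4, sd = 1.5), 1), 7)

print(range(x_trunc))
print(mean(x_cens == 1 | x_cens == 7))  # share of values piled up at the bounds
```

Neither result is exactly normal: truncation removes the tails, and censoring piles probability mass onto the two bounds, which is the point made in the reply.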
Re: [R] splitting into multiple dataframes and then create a loop to work
You can do this using function lmList() from package nlme, without having to split the data frames, e.g.,
  library(nlme)
  mlis <- lmList(yvar ~ . - clvar | clvar, data = df)
  mlis
  summary(mlis)
I hope it helps.
Best, Dimitris
-- Dimitris Rizopoulos, Assistant Professor, Department of Biostatistics, Erasmus University Medical Center. Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 Web: http://www.erasmusmc.nl/biostatistiek/
Re: [R] replacing elements of a zoo object
Although I'm not sure what you're talking about with pop-up windows...
Weird, this is what I'm getting in either R 2.13.0 or R 2.12.0:
  library(zoo)
  Warning: package 'zoo' was built under R version 2.13.1
  x = zoo(1:5, as.Date('2001-01-01')+1:5)
  x[as.Date('2001-01-05')]
  2001-01-05
           4
  x[as.Date('2001-01-05')] = 0
  x
  Error in dimnames(x) <- dn : length of 'dimnames' [1] not equal to array extent
Thank you for any insight
On Mon, Aug 29, 2011 at 1:53 PM, David Winsemius dwinsem...@comcast.net wrote:
On Aug 29, 2011, at 2:45 PM, Gene Leynes wrote:
Why doesn't this work?
  x = zoo(1:5, as.Date('2001-01-01')+1:5)
  x[as.Date('2001-01-05')]
  x[as.Date('2001-01-05')] = 0
  x
I think this is especially bad because it doesn't cause an error. It lets you do something to x, but then you can't see x again to see what it did.
It did exactly what I expected it to do. What was the "this" that you think was bad? I hope you are not asking that R ask users to confirm every assignment with a popup window.
-- David Winsemius, MD West Hartford, CT
Re: [R] replacing elements of a zoo object
On Aug 29, 2011, at 3:02 PM, Gene Leynes wrote:
Although I'm not sure what you're talking about with pop-up windows...
I got (as expected) assignment, so I assumed you were not expecting assignment.
Weird, this is what I'm getting in either R 2.13.0 or R 2.12.0:
  library(zoo)
  Warning: package 'zoo' was built under R version 2.13.1
  x = zoo(1:5, as.Date('2001-01-01')+1:5)
  x[as.Date('2001-01-05')]
  2001-01-05
           4
  x[as.Date('2001-01-05')] = 0
  x
  Error in dimnames(x) <- dn : length of 'dimnames' [1] not equal to array extent
I get
  x
  2001-01-02 2001-01-03 2001-01-04 2001-01-05 2001-01-06
           1          2          3          0          5
As did another. So you are the odd man out and the burden is on you to show why updating to a current version does not solve your broken installation.
-- David Winsemius, MD West Hartford, CT
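A small sketch for checking an installation against the behaviour reported above: it prints the R and zoo versions and repeats the replacement by date. The requireNamespace() guard is only there so the snippet does nothing harmful when zoo is not installed; the expected result (the fourth value becoming 0) is taken from the output quoted in this thread.

```r
if (requireNamespace("zoo", quietly = TRUE)) {
  library(zoo)

  # Version information worth including in a bug report
  print(R.version.string)
  print(packageVersion("zoo"))

  x <- zoo(1:5, as.Date("2001-01-01") + 1:5)
  x[as.Date("2001-01-05")] <- 0   # replace the element indexed by that date
  print(x)

  # The series should still have 5 elements, with the 4th set to 0
  stopifnot(length(x) == 5, coredata(x)[4] == 0)
} else {
  message("zoo is not installed; run install.packages('zoo') first")
}
```

If the stopifnot() line fails on a given machine, that reproduces the problem described in the thread and points to the installed zoo or R version rather than to the code.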
Re: [R] splitting into multiple dataframes and then create a loop to work
Hi:
Dimitris' solution is appropriate, but it needs to be mentioned that the approach I offered earlier in this thread differs from the lmList() approach. lmList() uses a pooled measure of error MSE (which you can see at the bottom of the output from summary(mlis)), whereas the plyr approach subdivides the data into distinct sub-data frames and analyzes them as separate entities. As a result, the residual MSEs will differ between the two approaches, which in turn affects the significance tests on the model coefficients. You need to decide which approach is better for your purposes.
Cheers, Dennis
Re: [R] Bayesian functions for mle2 object
I would recommend using the new Bayesian package 'LaplacesDemon' available on CRAN.
Ben Bolker bbol...@gmail.com Sent by: r-help-boun...@r-project.org 08/29/2011 02:50 PM To: r-h...@stat.math.ethz.ch Subject: Re: [R] Bayesian functions for mle2 object
Billy.Requena billy.requena at gmail.com writes:
Hi everybody, I'm interested in evaluating the effect of a continuous variable on the mean and/or the variance of my response variable. I have written functions that make these explicit and used the 'mle2' function to estimate the coefficients, as follows:
  func.1 <- function(m=62.9, c0=8.84, c1=-1.6) {
    s <- c0+c1*(x)
    -sum(dnorm(y, mean=m, sd=s, log=TRUE))
  }
  m1 <- mle2(func.1, method="SANN")
However, estimating the effect of x on the variance of y has usually caused trouble, resulting in failures to converge or extremely large standard deviations of the estimates. I tried using different optimizers, but I still faced the same problems. When I had similar trouble in the 'GLMM' statistical universe, I used Bayesian functions to solve the problem, enjoying the flexibility of different starting points for reaching the maximum likelihood estimates. However, I have no idea which package or which function to use to solve the specific problem I'm facing now. Does anyone have a clue? Thanks in advance
Unless I'm missing something, you can fit this model (more easily) in gls() from the nlme package, which allows models for heteroscedasticity. See ?nlme::varConstPower
  gls(y~1,weights=varPower(power=1,form=~x),data)
This gives you a standard deviation proportional to (t1+|v|); that is, if the baseline residual standard deviation is S, then the standard deviation is S*(t1+|v|), so S would correspond to your c1 and S*t1 would correspond to your c0.
Ben Bolker
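A hedged sketch of the gls() suggestion on simulated data. The quoted call uses varPower(), but the described form S*(t1+|v|) matches the varConstPower() variance function with its power fixed at 1, so that is what is used here; treat that substitution as my reading of the reply, not as a correction confirmed by its author. The simulated values (mean 60, sd = 1 + 0.5*x) are arbitrary assumptions.

```r
library(nlme)  # recommended package, shipped with standard R installations

set.seed(1)
n <- 300
x <- runif(n, 0, 10)
# Simulated response: constant mean, sd = c0 + c1 * x with c0 = 1, c1 = 0.5
d <- data.frame(x = x, y = 60 + rnorm(n, sd = 1 + 0.5 * x))

# sd = S * (const + |x|^power) with power held fixed at 1
fit <- gls(y ~ 1, data = d,
           weights = varConstPower(form = ~ x, fixed = list(power = 1)))

print(fit)          # shows the intercept and the estimated 'const' (t1)
S <- fit$sigma      # baseline residual standard deviation, plays the role of c1
print(S)
```

In this parameterization S estimates c1 and S times the estimated const estimates c0, as described in the reply. The standard deviation is positive by construction, which a free linear form c0 + c1*x does not guarantee.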
[R] how to referee a dimension name via a variable?
hi, R-users
I have a data.frame, for example test$newdataday24 and test$newdataday48. I can plot them with
  plot(test$newdataday24)
but now I want to plot different columns by defining a variable that describes them:
  dayno <- c(24,48)
  newnam <- paste("test$newdataday", dayno, sep="")
  plot(newnam[1])
but it failed; the error message said that something was wrong with plot.window. What can I do to fix my script? Thanks.
- TANG Jie
Re: [R] replacing elements of a zoo object
Hmm, I don't know what this means as troubleshooting, but I get the following:
1) After library(zoo):
  Attaching package: 'zoo'
  The following object(s) are masked from 'package:base': as.Date
and then for the first str(x):
  zoo series from 2001-01-02 to 2001-01-06
    Data: int [1:5] 1 2 3 4 5
    Index: Date[1:5], format: 2001-01-02 2001-01-03 2001-01-04 2001-01-05 2001-01-06
There's obviously something afoot in how the Date class is handled, but I'm not sure how yet.
Michael
On Mon, Aug 29, 2011 at 3:24 PM, Gene Leynes gleyne...@gmail.com wrote:
This seems like a very strange error. In trying to troubleshoot this further I looked at the structure of x. The new x has the length of the Index (2001-01-05 = 11327).
  library(zoo)
  x = zoo(1:5, as.Date('2001-01-01')+1:5)
  str(x)
  zoo series from 2001-01-02 to 2001-01-06
    Data: int [1:5] 1 2 3 4 5
    Index: Class 'Date' num [1:5] 11324 11325 11326 11327 11328
  x[as.Date('2001-01-05')]
  2001-01-05
           4
  x[as.Date('2001-01-05')] = 0
  x
  Error in dimnames(x) <- dn : length of 'dimnames' [1] not equal to array extent
  str(x)
  zoo series from 2001-01-02 to 2001-01-06
    Data: num [1:11327] 1 2 3 4 5 NA NA NA NA NA ...
    Index: Class 'Date' num [1:5] 11324 11325 11326 11327 11328
Obviously this is hard for anyone to troubleshoot if you can't reproduce it. I get the same error in R versions 12.0 and 13.0 (although I don't get the "zoo was built under R 13.1" warning when I use zoo in R 12.0).
On Mon, Aug 29, 2011 at 2:07 PM, Gene Leynes gleyne...@gmail.com wrote:
Michael, By the way, although I replied to David's email, I was responding to you as well. Your results were exactly what I was expecting, but I didn't get your results.
On Mon, Aug 29, 2011 at 1:51 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote:
How exactly do you mean it doesn't work? Copied from my GUI:
  x = zoo(1:5, as.Date('2001-01-01')+1:5)
  x[as.Date('2001-01-05')]
  2001-01-05
           4
  x[as.Date('2001-01-05')] = 0
  x
  2001-01-02 2001-01-03 2001-01-04 2001-01-05 2001-01-06
           1          2          3          0          5
(Those actually line up correctly on my machine..)
Michael Weylandt
On Mon, Aug 29, 2011 at 2:45 PM, Gene Leynes gleyne...@gmail.com wrote:
Why doesn't this work?
  x = zoo(1:5, as.Date('2001-01-01')+1:5)
  x[as.Date('2001-01-05')]
  x[as.Date('2001-01-05')] = 0
  x
I think this is especially bad because it doesn't cause an error. It lets you do something to x, but then you can't see x again to see what it did.
Re: [R] how to referee a dimension name via a variable?
try:
  newnam <- paste('newdataday', dayno, sep='')
  plot(test[[newnam[1]]])
On Mon, Aug 29, 2011 at 12:29 PM, Jie TANG totang...@gmail.com wrote:
I have a data.frame for example test$newdataday24 and test$newdataday48 [...] now i want to plot different data by define a variable to describe them
Re: [R] splitting into multiple dataframes and then create a loop to work
well, if a pooled estimate of the residual standard error is not desirable, then you just need to set argument 'pool' of lmList() to FALSE, e.g.,
  mlis <- lmList(yvar ~ . - clvar | clvar, data = df, pool = FALSE)
  summary(mlis)
Best, Dimitris
-- Dimitris Rizopoulos, Assistant Professor, Department of Biostatistics, Erasmus University Medical Center. Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 Web: http://www.erasmusmc.nl/biostatistiek/
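To make the pooled versus separate distinction concrete, here is a small sketch on simulated data in the same shape as the thread's example. It uses an explicit formula with two predictors instead of the `. - clvar` shorthand, and a factor for clvar; both are choices made for this illustration. The coefficient estimates are the same whichever way the error is estimated; only the standard errors and tests in the summaries differ.

```r
library(nlme)  # recommended package, shipped with standard R installations

set.seed(1234)
df <- data.frame(clvar = factor(rep(1:4, each = 10)),
                 yvar  = rnorm(40, 10, 6),
                 var1  = rnorm(40, 10, 4),
                 var2  = rnorm(40, 10, 4))

# One lm per clvar level; residual error pooled across groups (the default)
m_pooled <- lmList(yvar ~ var1 + var2 | clvar, data = df)
# Same fits, but each group keeps its own residual standard error
m_separate <- lmList(yvar ~ var1 + var2 | clvar, data = df, pool = FALSE)

# Reference: an ordinary lm on the first subgroup only
lm_1 <- lm(yvar ~ var1 + var2, data = subset(df, clvar == "1"))

print(coef(m_pooled))   # one row of coefficients per group
print(coef(lm_1))       # matches the first row above

print(summary(m_pooled))    # standard errors based on the pooled error
print(summary(m_separate))  # standard errors based on per-group error
```

With pool = FALSE the tests should correspond to what separate lm() fits per subgroup (the plyr approach) would report.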
Re: [R] how to referee a dimension name via a variable?
thank you, it works. Another question: can I also define a variable that names the data.frame? For example:
  datanam <- c("newdata","newdata2")
  plot(datanam[1][[newnam[1]]])
-- TANG Jie Email: totang...@gmail.com Tel: 0086-2154896104 Shanghai Typhoon Institute, China
Re: [R] how to referee a dimension name via a variable?
On 29/08/2011 3:52 PM, Jie TANG wrote:
thank you , it works . another problem is if can could define a variable to express the data.frame? for example :
  datanam <- c("newdata","newdata2")
  plot(datanam[1][[newnam[1]]])
Use get():
  plot(get(datanam[1])[[newnam[1]]])
Duncan Murdoch
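Putting the two answers in this thread together, here is a self-contained sketch. The two small data frames are invented for the example; only the names newdata, newdata2, newdataday24 and newdataday48 come from the posts. It also shows a named list as an alternative to get(), which is a suggestion of mine rather than something proposed in the thread.

```r
# Invented example data with the column names used in the thread
newdata  <- data.frame(newdataday24 = c(1, 3, 2), newdataday48 = c(5, 4, 6))
newdata2 <- data.frame(newdataday24 = c(9, 8, 7), newdataday48 = c(2, 2, 3))

dayno   <- c(24, 48)
newnam  <- paste("newdataday", dayno, sep = "")   # column names as strings
datanam <- c("newdata", "newdata2")               # data frame names as strings

# get() looks up an object by name; [[ ]] extracts a column by name
col_a <- get(datanam[1])[[newnam[1]]]
plot(col_a)

# Alternative: keep the data frames in a named list and index it directly
frames <- list(newdata = newdata, newdata2 = newdata2)
col_b <- frames[[datanam[2]]][[newnam[2]]]
plot(col_b)
```

The original attempt failed because paste() produces a character string, and plot() was handed the text "test$newdataday24" rather than the column it names.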
Re: [R] replacing elements of a zoo object
On Mon, Aug 29, 2011 at 2:45 PM, Gene Leynes gleyne...@gmail.com wrote:
Why doesn't this work?
  x = zoo(1:5, as.Date('2001-01-01')+1:5)
  x[as.Date('2001-01-05')]
  x[as.Date('2001-01-05')] = 0
  x
Make sure you have the most recent version of zoo, which is this:
  packageVersion("zoo")
  [1] ‘1.7.4’
-- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
Re: [R] rpart: apply tree to new data to get counts
I tried that, although I find the documentation a bit short, but the only result I get from it is a probability distribution of my data (I'm building a tree with 2 classes). How do I plot a tree where the counts are shown at each step/node?
BR, Jay
On Aug 29, 9:40 pm, Weidong Gu anopheles...@gmail.com wrote:
? predict.rpart
Weidong Gu
On Mon, Aug 29, 2011 at 12:49 PM, Jay josip.2...@gmail.com wrote:
Hi, when I have made a decision tree with rpart, is it possible to apply this tree to a new set of data in order to find out the distribution of observations? Ideally I would like to plot my original tree, with the counts (at each node) of the new data.
Regards, Jay
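One possible way to get counts of new observations per terminal node, sketched with the kyphosis data that ships with rpart. It relies on a workaround rather than a documented rpart feature: the fitted values stored in a copy of the tree are overwritten with the node numbers, so that predict(type = "vector") reports which leaf each new row falls into. The new data here is just a resample of kyphosis, invented for the example, and the sketch covers leaves only, not the plot or the internal nodes.

```r
library(rpart)  # recommended package, shipped with standard R installations

fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# Stand-in for "new data": 50 rows resampled from kyphosis
set.seed(7)
new_obs <- kyphosis[sample(nrow(kyphosis), 50, replace = TRUE), ]

# Workaround (not an official rpart API): copy the tree and store each
# node's number as its fitted value, so predict() returns leaf numbers
node_fit <- fit
node_fit$frame$yval <- as.numeric(rownames(node_fit$frame))
leaf_id <- predict(node_fit, newdata = new_obs, type = "vector")

# Count new observations per leaf, keeping leaves that received none
leaves <- rownames(fit$frame)[fit$frame$var == "<leaf>"]
new_counts <- table(factor(leaf_id, levels = leaves))
print(new_counts)
```

The node numbers are the row names of fit$frame, which are the same numbers printed by print(fit), so the counts can be matched to the original tree by hand.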
Re: [R] replacing elements of a zoo object
This seems like a very strange error. In trying to troubleshoot this further I looked at the structure of x. The new x has the length of the Index (2001-01-05 = 11327).
  library(zoo)
  x = zoo(1:5, as.Date('2001-01-01')+1:5)
  str(x)
  zoo series from 2001-01-02 to 2001-01-06
    Data: int [1:5] 1 2 3 4 5
    Index: Class 'Date' num [1:5] 11324 11325 11326 11327 11328
  x[as.Date('2001-01-05')]
  2001-01-05
           4
  x[as.Date('2001-01-05')] = 0
  x
  Error in dimnames(x) <- dn : length of 'dimnames' [1] not equal to array extent
  str(x)
  zoo series from 2001-01-02 to 2001-01-06
    Data: num [1:11327] 1 2 3 4 5 NA NA NA NA NA ...
    Index: Class 'Date' num [1:5] 11324 11325 11326 11327 11328
Obviously this is hard for anyone to troubleshoot if you can't reproduce it. I get the same error in R versions 12.0 and 13.0 (although I don't get the "zoo was built under R 13.1" warning when I use zoo in R 12.0).
Re: [R] replacing elements of a zoo object
Michael,
By the way, although I replied to David's email, I was responding to you as well. Your results were exactly what I was expecting, but I didn't get your results.