Re: [R] matrices call a function element-wise
Hi,

I would recommend reformatting the data as a 2x2x1000 array and using apply.

Jonathan

On Mon, Jan 3, 2011 at 7:57 AM, zhaoxing731 zhaoxing...@yahoo.com.cn wrote:
> Hello
>
> I have four 1000*1000 matrices A, B, C, D. I want to use the corresponding elements of the four matrices, using a for loop as follows:
>
> E<-o
> for (i in 1:1000)
> {for (j in 1:1000)
> {
> E<-fisher.test(matrix(c(A[i][j],B[i][j],C[i][j],D[i][j]),2))#call fisher.test for every element
> }
> }
>
> It is so time-consuming. Need vectorization.
>
> Yours sincerely
> ZhaoXing
> Department of Health Statistics
> West China School of Public Health
> Sichuan University
> No.17 Section 3, South Renmin Road
> Chengdu, Sichuan 610041
> P.R.China

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
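A minimal sketch of the array-plus-apply idea, assuming small 3 x 3 stand-ins for the four count matrices (the simulated counts and the name `counts` are made up for illustration). Note that matrix elements are indexed as A[i, j], not A[i][j], and that apply() still calls fisher.test() once per cell, so this is tidier rather than truly vectorized.

```r
# Small stand-ins for the four count matrices (3 x 3 instead of 1000 x 1000).
set.seed(1)
n <- 3
A <- matrix(rpois(n * n, 10), n)
B <- matrix(rpois(n * n, 10), n)
C <- matrix(rpois(n * n, 10), n)
D <- matrix(rpois(n * n, 10), n)

# Stack the matrices: dimensions 1 and 2 index the cell, dimension 3 indexes A..D.
counts <- array(c(A, B, C, D), dim = c(n, n, 4))

# For every cell (i, j), build the 2 x 2 table and keep the Fisher p-value.
pvals <- apply(counts, c(1, 2), function(x) fisher.test(matrix(x, 2))$p.value)
```

The result `pvals` has the same shape as the input matrices, so pvals[i, j] belongs to the table built from A[i, j], B[i, j], C[i, j] and D[i, j].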
Re: [R] Converting data.frame from long to wide format
Matt,

library(reshape2)
wide.df <- dcast(df, y ~ x)

Works great for me.

Jonathan

On Wed, Dec 8, 2010 at 7:26 PM, Matthew Pettis matthew.pet...@gmail.com wrote:
> Hi,
>
> I was wondering if there is an easy way that I am missing for turning a long dataframe into a wide one. Below is sample code that will make what I have and, in comments, the form of what I want:
>
> # Have: dataframe like 'df'
> df <- expand.grid( x=LETTERS[1:3], y=LETTERS[4:6])
> df$z <- letters[1:length(df[,1])]
>
> # Want: data.frame that has following form:
> #   A B C
> # D a b c
> # E d e f
> # F g h i
>
> I looked at 'xtabs' and 'cast' from reshape/reshape2, but unless I'm misunderstanding something, these will work only for the 'z' column being numeric, not textual. Is there an easy way to do this with 'z' being textual rather than numeric?
>
> tia,
> Matt
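As a cross-check that a textual value column is not a problem, here is a small sketch that rebuilds the example and reshapes it with base R's reshape(); the dcast() call is run only if reshape2 happens to be installed. The names `wide.base` and `wide.df` are just illustrative, and passing `value.var = "z"` is my addition to avoid the guess that dcast() otherwise makes.

```r
# Rebuild the long data frame from the question.
df <- expand.grid(x = LETTERS[1:3], y = LETTERS[4:6])
df$z <- letters[seq_len(nrow(df))]

# Base R: one row per y, one column per level of x (named z.A, z.B, z.C).
wide.base <- reshape(df, idvar = "y", timevar = "x", direction = "wide")

# reshape2 equivalent, run only when the package is available.
if (requireNamespace("reshape2", quietly = TRUE)) {
  wide.df <- reshape2::dcast(df, y ~ x, value.var = "z")
}
```

Both routes keep `z` as text; neither needs a numeric value column when each (x, y) pair occurs exactly once.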
Re: [R] difference between linear model scatterplot matrix
Francesco,

My guess would be collinearity of the predictors. The linear model gives you the best fit to all of the predictors at once; unless the predictors are orthogonal (which they certainly are not here), there is no guarantee that the parameter estimates which give the best overall fit for the linear model will be similar to the regression coefficients you would get by regressing the response on each predictor individually.

There are various ways to check collinearity, such as variance inflation factors (VIF). You may want to look into them. It's very dangerous to try to interpret your parameter estimates in the presence of collinearity.

Jonathan

On Fri, Dec 3, 2010 at 7:42 AM, Francesco Nutini nutini.france...@gmail.com wrote:
> Dear R-users,
>
> I'm studying a DB, structured like this (just a little part of my dataset):
>
> Site        Latitude   Longitude   Year  Tot-Prod  Total_Density  dmp
> Dendoudi-1  15.441964  -13.540179  2005  3271.16   1007           16993.25
> Dendoudi-2  15.397321  -13.611607  2005  1616.84   250            25376.67
> …           …          …           …     …         …              …
>
> If I make a scatterplot matrix with the command shown below, I obtain a matrix (visible in the image) that shows which variables are more correlated with the dmp data (violet color). But if I fit a linear model between the dependent variable (dmp) and many independent variables, I get different information about the significance of the variables. I mean, variables that appear correlated with the dependent variable in the matrix turn out not to be significant in the summary of the linear model, and vice versa.
> Have I made a mistake in the interpretation of the result, or not?
>
> Thank you in advance,
> Francesco
>
> #command for matrix-plot
> dta <- senegal5[c( 2,4,5,6,7,8,9,13,15,17,21, 39,44,45)]
> dta.r <- abs(cor(dta))
> dta.col <- dmat.color(dta.r)
> dta.o <- order.single(dta.r)
> cpairs(dta, dta.o, panel.colors=dta.col, gap=.5, main="Variables Ordered and Colored by Correlation")
>
> #command for linear model and summary()
> a <- lm ( dmp ~ Latitude + Longitude + Year + Tot.Prod + Herbaceous.Prod.kg.ha. + Leaf.Prod + Tree.bio + Total_Density + X1st.SpecieDensity.trunk.ha.+ X2nd.SpecieDensity.trunk.ha.+ Herb_Specie_Index1 + iNDVI.JASO. + RFE.Cum.JASO., data=senegal5 )
> summary(a)
>
> Call:
> lm(formula = dmp ~ Latitude + Longitude + Year + Tot.Prod + Herbaceous.Prod.kg.ha. +
>     Leaf.Prod + Tree.bio + Total_Density + X1st.SpecieDensity.trunk.ha. +
>     X2nd.SpecieDensity.trunk.ha. + Herb_Specie_Index1 + iNDVI.JASO. +
>     RFE.Cum.JASO., data = senegal5)
>
> Residuals:
>     Min      1Q  Median      3Q     Max
> -676.49 -195.77  -33.06  113.34  816.17
>
> Coefficients:
>                                Estimate Std. Error t value Pr(>|t|)
> (Intercept)                  -3.283e+05  4.505e+04  -7.288 4.41e-11 ***
> Latitude                     -6.100e+01  1.990e+02  -0.307   0.7598
> Longitude                    -3.617e+02  8.639e+01  -4.187 5.60e-05 ***
> Year                          1.604e+02  2.300e+01   6.973 2.15e-10 ***
> Tot.Prod                     -4.893e+00  1.565e+02  -0.031   0.9751
> Herbaceous.Prod.kg.ha.        4.905e+00  1.565e+02   0.031   0.9751
> Leaf.Prod                     4.842e+00  1.565e+02   0.031   0.9754
> Tree.bio                     -4.241e+01  2.771e+02  -0.153   0.8786
> Total_Density                -1.930e+00  8.933e-01  -2.160   0.0329 *
> X1st.SpecieDensity.trunk.ha.  1.992e+00  9.246e-01   2.154   0.0333 *
> X2nd.SpecieDensity.trunk.ha.  3.416e+00  1.642e+00   2.080   0.0398 *
> Herb_Specie_Index1           -1.091e+00  1.844e+00  -0.592   0.5552
> iNDVI.JASO.                   8.914e+02  6.076e+01  14.670  < 2e-16 ***
> RFE.Cum.JASO.                 2.525e+00  4.529e-01   5.575 1.68e-07 ***
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 295.3 on 114 degrees of freedom
> Multiple R-squared: 0.9206, Adjusted R-squared: 0.9116
> F-statistic: 101.7 on 13 and 114 DF, p-value: < 2.2e-16
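To make the VIF suggestion concrete, here is a small sketch on simulated data (the senegal5 data set is not available, so x1..x4 are invented). It relies on the identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictors' correlation matrix, which is the same as 1 / (1 - R^2) from regressing predictor j on the others.

```r
set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2 + rnorm(n, sd = 0.1)  # nearly the sum of x1 and x2 (collinear)
x4 <- rnorm(n)                       # unrelated to the other predictors

X <- cbind(x1, x2, x3, x4)

# VIF for each predictor: diagonal of the inverse correlation matrix.
vif <- diag(solve(cor(X)))

# Same quantity for x3 via an explicit auxiliary regression.
r2.x3  <- summary(lm(x3 ~ x1 + x2 + x4))$r.squared
vif.x3 <- 1 / (1 - r2.x3)
```

In this simulated case the collinear predictors get very large VIFs while x4 stays close to 1. A common rule of thumb treats values above roughly 5 to 10 as a warning sign, though that threshold is a convention rather than a test.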
Re: [R] creating 'all' sum contrasts
Michael,

Let c_1 and c_2 be vectors representing contrasts. Then c_1 and c_2 are orthogonal if and only if the inner product is 0. In your example, you have vectors (1,0,-1) and (0,1,-1). The inner product is 1, so they are not orthogonal. It's impossible to have more orthogonal contrasts than you have levels in your factor, a result from basic linear algebra.

You can get all possible pairwise contrasts, which is different from orthogonal contrasts (in fact, it's only possible to have floor(n/2) orthogonal pairwise contrasts). This is probably not the easiest way, but it works:

n <- 10
M <- matrix(0,nrow=n,ncol=n*(n-1)/2)
comb <- combn(n,2)
M[cbind(comb[1,],1:(n*(n-1)/2))] <- 1
M[cbind(comb[2,],1:(n*(n-1)/2))] <- -1

M is then a matrix containing all pairwise contrasts for n levels of a factor.

Hope that helps,
Jonathan

On Fri, Oct 15, 2010 at 10:30 AM, Michael Hopkins hopk...@upstreamsystems.com wrote:
> On 15 Oct 2010, at 13:55, Berwin A Turlach wrote:
>> G'day Michael,
>
> Hi Berwin
>
> Thanks for the reply
>
>> On Fri, 15 Oct 2010 12:09:07 +0100 Michael Hopkins hopk...@upstreamsystems.com wrote:
>>> OK, my last question didn't get any replies so I am going to try and ask a different way. When I generate contrasts with contr.sum() for a 3 level categorical variable I get the 2 orthogonal contrasts:
>>>
>>> contr.sum( c(1,2,3) )
>>>   [,1] [,2]
>>> 1    1    0
>>> 2    0    1
>>> 3   -1   -1
>>
>> These two contrasts are *not* orthogonal.
>
> I'm surprised. Can you please tell me how you calculated that.
>
>>> This provides the contrasts 1-3 and 2-3 as expected. But I also want it to create 1-2 (i.e. 1-3 - 2-3). So in general I want all possible orthogonal contrasts - think of it as the contrasts for all pairwise comparisons between the levels.
>>
>> You have to decide what you want. The contrasts for all pairwise comparisons between the levels or all possible orthogonal contrasts?
>
> Well the pairwise contrasts are the most important as I am looking for evidence of whether they are zero (i.e. no difference between levels) or not. But see my above comment about orthogonality.
>
>> The latter is actually not well defined. For a factor with p levels, there would be p orthogonal contrasts, which are only identifiable up to rotation, hence infinitely many such sets. But there are p(p-1) pairwise comparisons. So unless p=2, you have to decide what you want
>
> Well of course the pairwise comparisons are bi-directional so in fact only p(p-1)/2 are of interest to me.
>
>>> Are there any options for contrast() or other functions/libraries that will allow me to do this automatically?
>>
>> Look at package multcomp, in particular functions glht and mcp, these might help.
>
> Thanks I will have a look. But I want to be able to do this transparently within lm() using regsubsets() etc as I am collecting large quantities of summary stats from all possible models to use with a model choice criterion based upon true Bayesian model probabilities.
>
>> Cheers,
>> Berwin
>>
>> == Full address
>> Berwin A Turlach                      Tel.: +61 (8) 6488 3338 (secr)
>> School of Maths and Stats (M019)            +61 (8) 6488 3383 (self)
>> The University of Western Australia   FAX : +61 (8) 6488 1028
>> 35 Stirling Highway
>> Crawley WA 6009                       e-mail: ber...@maths.uwa.edu.au
>> Australia                             http://www.maths.uwa.edu.au/~berwin
>
> Michael Hopkins
> Algorithm and Statistical Modelling Expert
> Upstream
> 23 Old Bond Street
> London W1S 4PZ
> Mob +44 0782 578 7220
> DL +44 0207 290 1326
> Fax +44 0207 290 1321
> hopk...@upstreamsystems.com
> www.upstreamsystems.com
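The orthogonality check and the pairwise-contrast construction from the reply can be tried on a small case. This sketch uses n = 4 levels (my choice, to keep the matrix readable) and the illustrative names `pairs` and `G`.

```r
# Inner product of the two contr.sum columns for a 3-level factor.
c1 <- c(1, 0, -1)
c2 <- c(0, 1, -1)
inner <- sum(c1 * c2)        # non-zero, so the columns are not orthogonal

# All pairwise contrasts for n levels, one column per pair of levels.
n     <- 4
pairs <- combn(n, 2)         # 2 x choose(n, 2) matrix of level pairs
k     <- ncol(pairs)
M     <- matrix(0, nrow = n, ncol = k)
M[cbind(pairs[1, ], seq_len(k))] <-  1
M[cbind(pairs[2, ], seq_len(k))] <- -1

# Gram matrix: an off-diagonal zero marks an orthogonal pair of contrasts.
G <- crossprod(M)
```

With n = 4 the columns are the pairs (1,2), (1,3), (1,4), (2,3), (2,4), (3,4). Only contrasts on disjoint pairs of levels, such as (1,2) and (3,4), have a zero entry in `G`, which is why at most floor(n/2) pairwise contrasts can be mutually orthogonal.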
Re: [R] extract rows of a matrix
Hannah,

a <- matrix(rnorm(1),nrow=500)
new.matrix <- a[seq(0,dim(a)[1],by=20),]

Jonathan

On Tue, Oct 12, 2010 at 1:59 PM, li li hannah@gmail.com wrote:
> Hi all,
> I want to extract every 20th row of a big matrix, say 1 by 1000. What is the simplest way to do this?
> Thank you very much!
> Hannah
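A small sketch of the same row selection, assuming a 500 x 4 matrix of integers so the selected rows are easy to recognise (the dimensions are my choice, not from the thread). Starting the sequence at 20 rather than 0 gives the same rows, because R silently drops a zero index, and `drop = FALSE` keeps the result a matrix even when there is only one column.

```r
# Example matrix: 500 rows, 4 columns, filled column by column with 1..2000.
a <- matrix(seq_len(500 * 4), nrow = 500)

# Row indices 20, 40, ..., 500.
idx <- seq(20, nrow(a), by = 20)

# Keep every 20th row; drop = FALSE preserves the matrix structure.
new.matrix <- a[idx, , drop = FALSE]
```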
Re: [R] Combinations
Hi,

On Tue, Oct 5, 2010 at 3:52 AM, Trying To learn again tryingtolearnag...@gmail.com wrote:
> Hi all,
>
> Reading more, I have found a partial solution to part of the problem. In some part of the code it should appear something like:
>
> # NC: All the potential combinations 3^15
> if NC[price(i,j)==1 price(i,j)==2]
> extract this column
> then save all the columns that contain this pre-requisite.

That's an empty set, so it's really easy to extract. Try something like

NC[,(NC[3,]==1)]

Note the commas, which control whether you are selecting rows or columns. I recommend reading An Introduction to R: http://cran.r-project.org/doc/manuals/R-intro.pdf

Jonathan

> 2010/10/4 Trying To learn again tryingtolearnag...@gmail.com
>> Hi all,
>>
>> I've been ill and have lost a lot of time away from the PC. I would like your help, if you can and if you want. I only need an initial guide. I've been out for a long time and I need some hope. It is only for job purposes.
>>
>> The problem: I want to simulate each of the possible combinations in a play. Imagine they play two games (football games) and you can choose 1, X, 2; you must choose this 15 times. So finally you will get a column 15x1, so you have (3^15) possible columns.
>>
>> I want to extract all the columns where in row 3 you can find a 2. Or in row 3 appears a 1 and row 3 x. And extract them and save them in a txt document.
>>
>> Sorry, I know that I only ask, but actually I feel very foolish. If you give an initial guide it will be sufficient.
>>
>> Many thanks for all in advance.
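A sketch of the column-selection idea on a reduced problem: 4 games instead of 15 (so 3^4 = 81 columns rather than 3^15, roughly 14.3 million), with outcomes stored as the strings "1", "X", "2". The matrix name `NC` follows the thread; `games`, `sel`, `either`, and the temporary output file are assumptions for illustration.

```r
games    <- 4
outcomes <- c("1", "X", "2")

# Every combination of outcomes: one row per combination, one column per game.
grid <- expand.grid(rep(list(outcomes), games), stringsAsFactors = FALSE)

# Transpose so that each COLUMN is one combination and each ROW is one game.
NC <- t(as.matrix(grid))

# Columns where game 3 has outcome "2" (the comma before the condition selects columns).
sel <- NC[, NC[3, ] == "2", drop = FALSE]

# Columns where game 3 is "1" OR "X". A cell cannot be both at once,
# so requiring both would select nothing.
either <- NC[, NC[3, ] %in% c("1", "X"), drop = FALSE]

# Save the selected combinations, one per line, to a text file.
out.file <- tempfile(fileext = ".txt")
write.table(t(sel), file = out.file, quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

For the full 15-game problem the same code applies in principle, but the matrix becomes large enough that memory use is worth checking first.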
Re: [R] Suppressing printing in the function
Dimitri,

Maybe ?invisible will help?

Jonathan

On Fri, Oct 1, 2010 at 4:27 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote:
> Hello!
>
> I wrote a function that returns a data frame. Nowhere in the function do I say print(my.data.frame), but when I run the function - the data frame is printed on the console. Is there any way to suppress it?
>
> Thank you!
> --
> Dimitri Liakhovitski
> Ninah Consulting
> www.ninah.com
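A short sketch of what invisible() does, using a made-up function `make_df`. R auto-prints the visible value of any top-level call that is not assigned, which is why the data frame appears even without print(); wrapping the return value in invisible() switches that off while still returning the value.

```r
# Hypothetical function that returns its data frame without auto-printing.
make_df <- function() {
  df <- data.frame(x = 1:3, y = c("a", "b", "c"))
  invisible(df)
}

make_df()          # nothing is printed at the console
res <- make_df()   # the value is still returned and can be used
vis <- withVisible(make_df())$visible  # FALSE: the result is flagged invisible
```

Assigning the result, as in the second call, also avoids printing even without invisible().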
Re: [R] how to to if a calculation is out range?
Perhaps use lgamma?

lgamma(220)
[1] 964.8206

Jonathan

On Wed, Sep 29, 2010 at 3:22 PM, song song rprojecth...@gmail.com wrote:
> for example, when I am calculating a posterior density, I need to calculate gamma(75*3+5)=gamma(220) which is out of the bound of gamma function. what shall I do for this condition
>
> Thank you
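A brief sketch of why the log scale helps, using 220 to match the reply (note that 75*3 + 5 actually evaluates to 230, which overflows in the same way). The ratio at the end is an illustrative assumption about how the value might be used in a density; the point is that gamma terms are combined as logs and only exponentiated at the end.

```r
# gamma() overflows double precision for arguments above roughly 171.6.
g <- suppressWarnings(gamma(220))   # Inf

# The log of the gamma function stays finite.
lg <- lgamma(220)                   # 964.8206, as in the reply

# Ratios of large gamma values: subtract on the log scale, then exponentiate.
# gamma(220) / gamma(218) = 219 * 218
ratio <- exp(lgamma(220) - lgamma(218))
```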
Re: [R] need help with ramdomly sampling some data
Mike,

It works for me:

data <- 1:8
sample(data,replace=TRUE)
[1] 6 4 5 2 5 8 7 2

Please provide a reproducible example, if possible, and the output of sessionInfo().

Jonathan

On Tue, Sep 28, 2010 at 7:22 PM, Michael Larkin mlar...@rsmas.miami.edu wrote:
> I am trying to get R to randomly select values from my dataset (i.e. bootstrapping) with replacement. However, my attempts at this have been unsuccessful. Here is a basic example of what I am doing:
>
> I have a data vector of 8 values (i.e. data= 2,5,9,4,5,6,7,8). I used the sample function and it worked. However, it only repeated my values in the exact same order as the dataset. It did not randomly sample them. Here is the code for what I did:
>
> sample(data, replace=TRUE)
>
> Any advice to randomly select data from my dataset would be greatly appreciated.
>
> Mike
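A minimal sketch of the resampling with the eight values from the question. The seed, the number of replicates, and the choice of the mean as the statistic are arbitrary assumptions for illustration.

```r
data <- c(2, 5, 9, 4, 5, 6, 7, 8)

set.seed(123)                        # only to make the draw repeatable

# One bootstrap resample: same length as data, drawn with replacement.
boot <- sample(data, replace = TRUE)

# Many resamples, summarised here by their means.
boot.means <- replicate(1000, mean(sample(data, replace = TRUE)))
```

Each call to sample() draws again, so repeated calls without set.seed() will normally return the values in a different order and with different repeats.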
Re: [R] next step in randomly sampling
Mike,

Try

growth[sample(nrow(growth)),]

to permute the rows.

Jonathan

On Tue, Sep 28, 2010 at 8:38 PM, Michael Larkin mlar...@rsmas.miami.edu wrote:
> Thanks to the people on this list I was able to fix my code for randomly sampling. Thanks. Now, I am moving on to the next step and I ran into another snag. I have a large dataset but I am starting with a small made-up dataset until I figure it out.
>
> I have two columns of data (age and length). I got R to read my data called growth which is the age and length for 10 fish:
>
> growth
>   Age Length
> 1   2    200
> 2   5    450
> 3   6    600
> 4   7    702
> 5   8    798
> 6   5    453
> 7   4    399
> 8   1    120
> 9   2    202
>
> Then I believe I converted my data to three vectors by:
>
> newgrowth<-c(growth)
>
> Now I want to randomly select the values from this dataset to create a new dataset. I want to do this many times, however, for now I am just trying to get it to randomly select from the dataset only once. The trick is that I need to keep the columns together. Each age corresponds to a length. For example, the 200 length fish has an age of 2 years. I tried to resample the data with this code:
>
> sample(newgrowth)
>
> However, I ended up getting the data listed as a row in the same order, not randomly selected. I pasted the result below.
>
> sample(newgrowth)
> $Age
> [1] 2 5 6 7 8 5 4 1 2
>
> $Length
> [1] 200 450 600 702 798 453 399 120 202
>
> Any advice on how I can randomly select from these 9 rows of data would be greatly appreciated.
>
> Mike
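A sketch of row-wise resampling with the nine rows from the question. It assumes `growth` is a data frame; for a data frame length() counts columns (2 here), so the row indices are drawn from nrow(). The names `perm` and `boot` are illustrative.

```r
growth <- data.frame(
  Age    = c(2, 5, 6, 7, 8, 5, 4, 1, 2),
  Length = c(200, 450, 600, 702, 798, 453, 399, 120, 202)
)

n.cols <- length(growth)   # number of columns
n.rows <- nrow(growth)     # number of fish

set.seed(1)                # only to make the draw repeatable

# Permute the rows: every fish appears once, Age and Length stay paired.
perm <- growth[sample(n.rows), ]

# Bootstrap resample: rows drawn with replacement, pairs still intact.
boot <- growth[sample(n.rows, replace = TRUE), ]
```

Sampling row indices rather than the columns themselves is what keeps each age attached to its length.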
Re: [R] Plotting multiple animal tracks against Date/Time
Include individual as a factor in your dataset, and use ggplot2:

library(ggplot2)
ggplot(aes(x=Date, y=Distance, color=Individual), data=data) + geom_line()

ought to do it.

Jonathan

On Thu, Sep 23, 2010 at 9:31 AM, Struve, Juliane j.str...@imperial.ac.uk wrote:
> Sorry for posting this question twice, but my previous question was accidentally sent unfinished.
>
> Dear list,
>
> I would like to create a time series plot in which the paths of several individuals are stacked above each other, with the x-axis being the total observation period of three years (1.1.2004 to 31.12.2007) and the y-axis being some defined range [min,max].
>
> My data consist of Date/Time information and the paths of 45 individuals as the distance from the location of release. An example data set for 2 individuals is given below. The observation period and frequency of observations vary between individuals.
>
> I believe stackplot() may be able to do this task, but I am not sure how to handle the variable time period and frequency of observations for different individuals. Could someone advise if stackplot() is suitable or if there is a better approach or package?
>
> Thank you very much for your time,
> Juliane
>
> Individual 1
> Date                 Distance [m]
> 2005-07-18 22:05:15  1815.798
> 2005-07-18 22:06:35  1815.798
> 2005-07-18 22:08:33  1815.798
> 2005-07-18 22:09:49  1815.798
> 2005-07-18 22:12:50  1815.798
> 2005-07-18 22:16:26  1815.798
>
> Individual 2
> Date                 Distance [m]
> 2006-08-18 09:53:20  0.0
> 2006-08-18 09:59:07  0.0
> 2006-08-18 10:09:20  0.0
> 2006-08-18 10:21:14  100.5
>
> Dr. Juliane Struve
> Imperial College London
> Department of Life Sciences
> Ascot, Berkshire, SL5 7PY, UK
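A sketch of the long-format data that the ggplot2 call expects, built from a few of the example observations. The data frame name `tracks` and the UTC time zone are assumptions, and the plot is only constructed when ggplot2 is installed.

```r
# One row per observation; Individual is a factor identifying the animal.
tracks <- data.frame(
  Individual = factor(c("1", "1", "1", "2", "2", "2", "2")),
  Date = as.POSIXct(c("2005-07-18 22:05:15", "2005-07-18 22:06:35",
                      "2005-07-18 22:08:33", "2006-08-18 09:53:20",
                      "2006-08-18 09:59:07", "2006-08-18 10:09:20",
                      "2006-08-18 10:21:14"), tz = "UTC"),
  Distance = c(1815.798, 1815.798, 1815.798, 0, 0, 0, 100.5)
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # One coloured line per individual on a shared date axis.
  p <- ggplot(tracks, aes(x = Date, y = Distance, color = Individual)) +
    geom_line()
  # print(p) draws the plot
}
```

Because each individual contributes its own rows, differing observation periods and sampling frequencies need no special handling. If separate stacked panels are preferred over overlaid lines, adding `facet_grid(Individual ~ .)` to the plot is one option.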
Re: [R] Setting scales for ggplot2 with facets
Swen,

facet_grid forces the scale for plots along an axis to be shared. Try facet_wrap instead.

Jonathan

On Sat, Sep 11, 2010 at 2:21 PM, Sven Laur s...@math.ut.ee wrote:
> Faceting in ggplot2 seems to permit different scales for different facets, but I fail to see how one could control ylim and xlim ranges for each facet separately. For instance, I would like to set ylim = c(0,10) for facet A and ylim = c(42,102) for facet B. Since the data is out of these ranges, setting
>
> facet_grid(factor ~ ., scales = "free_y")
>
> does not achieve the goal. Is there a decent way to achieve this or not? Or do I have to drop data points outside the y-ranges as a quick hack?
>
> Swen Laur
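A sketch of the "drop points outside the range" workaround mentioned in the question, combined with facet_wrap and free y scales. All names here (`d`, `limits`, `kept`, `grp`) and the simulated values are assumptions; the free scale adapts to the remaining data, so the panel limits only approximate the requested ranges rather than matching them exactly.

```r
# Simulated data for two facets, deliberately extending beyond the wanted ranges.
d <- data.frame(
  grp = rep(c("A", "B"), each = 50),
  x   = rep(1:50, times = 2),
  y   = c(seq(-5, 20, length.out = 50), seq(30, 120, length.out = 50))
)

# Wanted y-range for each facet.
limits <- data.frame(grp = c("A", "B"), lo = c(0, 42), hi = c(10, 102))

# Attach the limits to each row and keep only the points inside them.
d2   <- merge(d, limits, by = "grp")
kept <- d2[d2$y >= d2$lo & d2$y <= d2$hi, ]

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Each panel gets its own y scale, fitted to the retained points.
  p <- ggplot(kept, aes(x = x, y = y)) +
    geom_point() +
    facet_wrap(~ grp, scales = "free_y")
}
```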
Re: [R] How to run R on Emacs+ESS
Hi Stephen,

Just to check: when you say you type M-x R, are you typing the letter M? M-x in Emacs-speak means Meta-x, i.e., Alt-x.

Jonathan

On Mon, Sep 6, 2010 at 7:01 PM, Stephen Liu sati...@yahoo.com wrote:
> Hi Dirk,
>
> Thanks for your advice. Emacs and ESS already installed.
>
> $ apt-cache policy emacs
> emacs:
>   Installed: 22.2+2-5
>   Candidate: 22.2+2-5
>   Version table:
>      23.1+1-4~bpo50+1 0
>           1 http://backports.org lenny-backports/main Packages
>  *** 22.2+2-5 0
>         500 http://ftp.hk.debian.org lenny/main Packages
>         100 /var/lib/dpkg/status
>
> $ apt-cache policy ess
> ess:
>   Installed: 5.3.8~svn3917-1
>   Candidate: 5.3.8~svn3917-1
>   Version table:
>  *** 5.3.8~svn3917-1 0
>         500 http://ftp.hk.debian.org lenny/main Packages
>         100 /var/lib/dpkg/status
>
> On terminal:
> $ emacs
> starts Emacs with 2 boxes;
>
> Upper box:
> Welcome to GNU Emacs ...
> To quit a partially entered command, type Control-g
> (I can't type here)
>
> Lower box:
> -u:%% *GNU Emacs* (tab)
> For information about GNU Emacs and the GNU system, type C-h C-a
> (also I can't type here)
>
> Clicking *GNU Emacs* (tab) starts another upper box (a big box):
> [ess-site.el]: ess-customize-alist=nil
> [ess-site.el _2_]: ess-customize-alist=nil
> (S): ess-s-versions-create making M-x defuns for
> (R): ess-r-versions-create making M-x defuns for
>
> Type "M-x R" (without quotes) and hit [Enter]
> there is no response. Please advise. TIA
>
> B.R.
> Stephen L
>
> - Original Message
> From: Dirk Eddelbuettel e...@debian.org
> To: Stephen Liu sati...@yahoo.com
> Cc: r-help@r-project.org
> Sent: Tue, September 7, 2010 12:32:39 AM
> Subject: Re: [R] How to run R on Emacs+ESS
>
> On 6 September 2010 at 09:18, Stephen Liu wrote:
> | Hi folks,
> |
> | Debian 504 64-bit
>
> Good. All you need is
>
>    sudo apt-get install ess
>
> | I found following document;
> | http://www.biostat.wisc.edu/~kbroman/Rintro/
> |
> | Whether it is the right document for installing Emacs+ESS and R so that R can
> | run on Emacs?
>
> There is nothing else to do. Restart (X)Emacs, whichever variant you use on Debian, and type M-x R. You now run R inside Emacs. After that, see http://ess.r-project.org, esp the Documentation tab.
>
> Dirk
>
> --
> Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com
Re: [R] ggplot2 multiple group barchart
Greg,

Try this:

library (ggplot2)
v1 <- c(1,2,3,3,4)
v2 <- c(4,3,1,1,9)
v3 <- c(3,5,7,2,9)
gender <- c("m","f","m","f","f")
d.data <- data.frame (v1, v2, v3, gender)
d.data

#library(reshape)
#library(plyr)
# These are already loaded by ggplot2, but for your reference: reshape provides melt(), plyr provides ddply().

d.data <- melt(d.data, id.var="gender")
new.data <- ddply(d.data, .(gender, variable), summarize, means = mean(value))
plot <- ggplot(data=new.data, aes(variable, y=means)) + geom_bar(aes(fill=gender), stat="identity", position="dodge") + coord_flip()
plot

I took the numbers out because it's not easy to make them fit the dodged bars.

Jonathan

On Wed, Sep 1, 2010 at 9:15 AM, Waller Gregor (wall) w...@zhaw.ch wrote:
> hi there.. i got a problem with ggplot2. here my example:
>
> library (ggplot2)
> v1 <- c(1,2,3,3,4)
> v2 <- c(4,3,1,1,9)
> v3 <- c(3,5,7,2,9)
> gender <- c("m","f","m","f","f")
> d.data <- data.frame (v1, v2, v3, gender)
> d.data
> x <- names (d.data[1:3])
> y <- mean (d.data[1:3])
> pl <- ggplot (data=d.data, aes (x=x,y=y))
> pl <- pl + geom_bar()
> pl <- pl + coord_flip()
> pl <- pl + geom_text (aes(label=round(y,1)),vjust=0.5, hjust=4,colour="white", size=7)
> pl
>
> this gives me a nice barchart to compare the means of my variables v1, v2 and v3.
>
> my question: how do i have to proceed if i want this barchart split by the variable gender. so i get two small bars for v1, one for female and one for male, two bars for v2 etc. i need them all in one chart. fill=gender, position="dodge" do not work...
>
> any ideas? thanks a lot
> greg
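The reply relies on a 2010-era ggplot2 that attached reshape and plyr; current versions do not. As a hedged alternative, this sketch computes the same per-gender means with base R only (stack() and aggregate()), then builds the dodged bar chart if ggplot2 is installed. The names `long`, `values`, and `ind` come from stack() and my own choices, not from the thread.

```r
d.data <- data.frame(
  v1 = c(1, 2, 3, 3, 4),
  v2 = c(4, 3, 1, 1, 9),
  v3 = c(3, 5, 7, 2, 9),
  gender = c("m", "f", "m", "f", "f")
)

# Long format: stack() returns columns 'values' and 'ind' (the variable name).
long <- data.frame(
  gender = rep(d.data$gender, times = 3),
  stack(d.data[c("v1", "v2", "v3")])
)

# Mean of each variable within each gender.
new.data <- aggregate(values ~ gender + ind, data = long, FUN = mean)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Two side-by-side bars per variable, one per gender.
  p <- ggplot(new.data, aes(x = ind, y = values, fill = gender)) +
    geom_bar(stat = "identity", position = "dodge") +
    coord_flip()
}
```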
Re: [R] Dealing with data
Your second fit makes no sense, as you can easily tell if you look at the regression summaries. Fitting with spray as a categorical variable gives you an overall p-value of less than 2.2e-16, while fitting with as.numeric(spray) gives an overall p-value of .2118. The fit you've done with as.numeric induces a completely invalid model, as others have tried to point out.

Jonathan

On Fri, Aug 13, 2010 at 1:55 PM, TGS cran.questi...@gmail.com wrote:
> # I wasn't trying to do ANOVA. I was simply trying to figure out how to regress count on sprays (this is after I saw another poster asking an unrelated question with the InsectSprays dataset).
> #
> # Anyhow, David clarified this but also, thanks for your explanation as well.
>
> rm(list = ls()); sprays <- as.numeric(InsectSprays$spray)
> lm(formula = count ~ 0 + spray, data = InsectSprays)
> lm(formula = count ~ 0 + sprays, data = InsectSprays)
>
> # besides the point, in the ANOVA problem the degrees of freedom would be 5, not 1.
>
> On Aug 13, 2010, at 12:27 PM, Greg Snow wrote:
>> So you want 1 degree of freedom for InsectSprays? You believe that the difference between A and B is exactly the same as between B and C which is exactly the same as between D and E (etc.)? That seems an odd assumption, but you can get that by using as.numeric (as I and others have already stated).
>>
>> If on the other hand you want InsectSprays to be treated correctly with the correct number of degrees of freedom, but have the output on a single line testing the overall effect, then you want to use the aov function rather than lm (internally they do the same thing, but the default summary output for aov is 1 line per term).
>>
>> Hope this helps,
>>
>> --
>> Gregory (Greg) L. Snow Ph.D.
>> Statistical Data Center
>> Intermountain Healthcare
>> greg.s...@imail.org
>> 801.408.8111
>>
>> -Original Message-
>> From: TGS [mailto:cran.questi...@gmail.com]
>> Sent: Friday, August 13, 2010 11:51 AM
>> To: Greg Snow
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Dealing with data
>>
>>> # Greg, if R automatically does that then I don't know why it's treating each indicator
>>> # as a different regressor. In other words, I am interested in treating 'spray' as one
>>> # independent variable.
>>> #
>>> # Erik, which book do you suggest I read? Thanks.
>>>
>>> data(InsectSprays)
>>> lm(InsectSprays$count ~ 0 + InsectSprays$spray)
>>>
>>> On Aug 13, 2010, at 10:34 AM, Greg Snow wrote:
>>>> R/S does all of that automatically for you, you do not need to manually create the indicator variables. If you do something like:
>>>>
>>>> fit <- lm( Sepal.Width ~ Species, data=iris, x=TRUE)
>>>>
>>>> Then look at the matrix actually used:
>>>>
>>>> fit$x
>>>>
>>>> Or the output:
>>>>
>>>> summary(fit)
>>>>
>>>> You will see that Species was automatically converted into indicator variables and those were used in the regression. If you really need the indicator variables yourself, look at the model.matrix function, e.g.:
>>>>
>>>> model.matrix( ~Species, data=iris )
>>>>
>>>> Or
>>>>
>>>> model.matrix( ~Species - 1, data=iris )
>>>>
>>>> If you really want 1 for A, 2 for B, etc. then look at as.numeric on a factor variable (e.g. as.numeric(iris$Species) ).
>>>>
>>>> Hope this helps,
>>>>
>>>> --
>>>> Gregory (Greg) L. Snow Ph.D.
>>>> Statistical Data Center
>>>> Intermountain Healthcare
>>>> greg.s...@imail.org
>>>> 801.408.8111
>>>>
>>>> -Original Message-
>>>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of TGS
>>>> Sent: Friday, August 13, 2010 11:22 AM
>>>> To: David Winsemius
>>>> Cc: r-help@r-project.org
>>>> Subject: Re: [R] Dealing with data
>>>>
>>>>> To clarify, I'd like to create a column of indicators for the respective letters so that I could maybe do regression on indicators, etc. For instance, A gets 1, B gets 2, and so on.
>>>>>
>>>>> On Aug 13, 2010, at 10:19 AM, David Winsemius wrote:
>>>>>> On Aug 13, 2010, at 1:03 PM, TGS wrote:
>>>>>>> # how would I code in R to look at the letter of the alphabet
>>>>>>> # in the second column and create a indicator column for the
>>>>>>> # corresponding letter?
>>>>>>> data(InsectSprays)
>>>>>>> InsectSprays$spray
>>>>>>
>>>>>> It's already what most people mean when they say indicator column, i.e., a factor variable (and not a character vector) so, what do _you_ mean?
>>>>>>
>>>>>> --
>>>>>> David Winsemius, MD
>>>>>> West Hartford, CT
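The difference between the two codings discussed in this thread can be seen directly in the degrees of freedom. This sketch uses the built-in InsectSprays data and fits both models with an intercept (my choice, for a like-for-like comparison; the thread's fits drop it with `0 +`). The object names are illustrative.

```r
# spray as a factor: one mean per spray, so 5 degrees of freedom for 6 levels.
fit.factor <- lm(count ~ spray, data = InsectSprays)

# spray coded 1..6: a single slope, which assumes equal steps A->B->C->...
fit.numeric <- lm(count ~ as.numeric(spray), data = InsectSprays)

df.factor  <- anova(fit.factor)$Df[1]
df.numeric <- anova(fit.numeric)$Df[1]

# The indicator columns that lm() builds automatically from the factor.
X <- model.matrix(~ spray, data = InsectSprays)

# One-line-per-term summary of the factor model, as suggested in the thread.
aov.table <- summary(aov(count ~ spray, data = InsectSprays))
```

The numeric coding is only defensible if the sprays sit on an equally spaced scale, which unordered labels A to F do not.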
Re: [R] How to run this video link
R is a program for doing statistics, not for playing videos. I recommend you try something else.

Jonathan

On Thu, Jul 29, 2010 at 10:43 AM, Velappan Periasamy veepsi...@gmail.com wrote:
> Pls tell me how to run this video in R
> http://nptel.iitm.ac.in/video.php?courseId=1083p=4
Re: [R] error: arguments imply differing number
Hi,

Thanks for including code and data so that we could reproduce what you're doing.

Your problem is that you tell ddply to split the dataset by runNumber and cat1, which results in four groups. ddply then applies my.summary() to each of these groups. One of them (cat1 = 1 and runNumber = 1) has both start.loc and end.loc, because it contains a row with start = TRUE and a row with end = TRUE. This group will work fine. The other three groups, however, are broken. The group with cat1 = 2 and runNumber = 1 has neither start.loc nor end.loc, while the two groups with runNumber = 2 each have only one of the two. In those groups data.frame() is handed columns of different lengths (0 and 1), which is exactly what the error message reports.

The error disappears if you split the dataset only by runNumber, as then each group has both start.loc and end.loc. If you want to apply my.summary() to each of these four groups, you're going to have to fix the earlier code that assigns the start and end variables.

Jonathan

On Wed, Jul 28, 2010 at 7:59 AM, jd6688 jdsignat...@gmail.com wrote:

    mydata <- read.table(textConnection("
    Id   cat1 location item_values p-values sequence
    a111 1    3002737  100         0.01     1
    a112 1    3017821  102         0.05     2
    a113 2    3027730  103         0.02     3
    a114 2    3036220  104         0.04     4
    a115 1    3053984  105         0.03     5
    a118 1    3090500  106         0.02     8
    a119 1    3103304  107         0.03     9
    a120 2    3090500  106         0.02     10
    a121 2    3103304  107         0.03     11
    "), header = TRUE)
    closeAllConnections()

    first <- function(x) c(TRUE, diff(x) != 1)
    last <- function(x) c(diff(x) != 1, TRUE)
    mydata$start <- first(mydata$sequence)
    mydata$end <- last(mydata$sequence)
    mydata$runNumber <- cumsum(first(mydata$sequence))

    #load library
    library(plyr)
    ddply(mydata[, -1], .(runNumber, cat1), function(x) {max(x$item_values)})

    my.summary <- function(x) {
      start.loc <- x$location[which(x$start == TRUE)]
      end.loc <- x$location[which(x$end == TRUE)]
      peak <- max(x$item_values)
      output <- data.frame(
        start_of_the_location = start.loc,
        end_of_the_location = end.loc,
        peak_value = peak)
      return(output)
    }
    ddply(mydata[, -1], .(runNumber, cat1), my.summary)

Why did ddply return the following error?

    Error in data.frame(start_of_the_location = start.loc, end_of_the_location = end.loc, :
      arguments imply differing number of rows: 0, 1

    mydata[,-1]
      cat1 location item_values p.values sequence start   end runNumber
    1    1  3002737         100     0.01        1  TRUE FALSE         1
    2    1  3017821         102     0.05        2 FALSE FALSE         1
    3    2  3027730         103     0.02        3 FALSE FALSE         1
    4    2  3036220         104     0.04        4 FALSE FALSE         1
    5    1  3053984         105     0.03        5 FALSE  TRUE         1
    6    1  3090500         106     0.02        8  TRUE FALSE         2
    7    1  3103304         107     0.03        9 FALSE FALSE         2
    8    2  3090500         106     0.02       10 FALSE FALSE         2
    9    2  3103304         107     0.03       11 FALSE  TRUE         2

--
View this message in context: http://r.789695.n4.nabble.com/error-arguments-imply-differing-number-tp2305014p2305014.html
Sent from the R help mailing list archive at Nabble.com.
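To make the failure concrete, here is a small base-R sketch (no plyr needed) that reproduces the error for one of the broken groups. The data frame grp is a hand-copied stand-in for the cat1 = 2, runNumber = 1 group. The helper first_or_na is hypothetical and shows only one possible workaround (returning NA for a missing endpoint); it is not the fix recommended in the reply, which is to repair how start and end are assigned.

```r
# Stand-in for the group with cat1 = 2 and runNumber = 1:
# it contains neither a start row nor an end row.
grp <- data.frame(location    = c(3027730, 3036220),
                  item_values = c(103, 104),
                  start       = c(FALSE, FALSE),
                  end         = c(FALSE, FALSE))

start.loc <- grp$location[grp$start]  # length 0
end.loc   <- grp$location[grp$end]    # length 0
peak      <- max(grp$item_values)     # length 1

# data.frame() cannot combine columns of length 0 and 1, so it fails;
# tryCatch() captures the message instead of stopping the script.
msg <- tryCatch(
  data.frame(start = start.loc, end = end.loc, peak = peak),
  error = function(e) conditionMessage(e)
)

# Hypothetical workaround: fall back to NA when an endpoint is missing,
# so every group yields exactly one row.
first_or_na <- function(v) if (length(v) > 0) v[1] else NA_real_
safe <- data.frame(start = first_or_na(start.loc),
                   end   = first_or_na(end.loc),
                   peak  = peak)
```

Whether NA is an acceptable placeholder depends on what the summary is for; if every group is supposed to have both endpoints, the grouping or the start/end flags need fixing instead.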
Re: [R] how to code it??
On Wed, Jul 28, 2010 at 2:18 PM, Henrique Dallazuanna www...@gmail.com wrote:

You've tried:

    diff(c(0, x))

?

This is clever, but not quite what he's asking for; it converts a sequence of 1's into a 1 followed by zeroes.

Jonathan

On Wed, Jul 28, 2010 at 3:10 PM, Raghu r.raghura...@gmail.com wrote:

Hi

I have, say, a large vector of 3500 digits. Initially the digits are 0s and 1s. I need to check for a rule to change some of the 0s to -1s in this vector. But once I change a 0 to -1, I need to start applying the rule to change the next 0 only after I see the next 1 in the vector.

Say for example

    x = (0,0,0,0,0,0,0,1,0,0,1,1,0,0,1,0,0,0,1)

I need to traverse from the 9th element to the last (because the first occurrence of 1 is at 8). Let us assume that according to our rule we change the 13th element (only 0s can be changed) to -1. Now we need to go to the next occurrence of 1 (which is 15) and begin the rule application from the 16th till the end of the vector, and once a 0 has been replaced by a -1, start again from the next 1.

How do we code this? I 'feel' recursion is the best possible solution but I am not a programmer and will await experts' views. If this is not a typical R-forum question then my advance apologies.

Many thx

--
'Raghu'

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O
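Since the actual rule was never stated in the thread, the following is only a sketch of the control flow, written as a plain loop rather than recursion. The function apply_rule and the example rule prev_two_are_ones are both hypothetical; the rule was chosen purely so that, on the posted vector, it fires at element 13 as in Raghu's description.

```r
# Walk the vector once. 'armed' is TRUE only after a 1 has been seen
# since the last replacement (or since the start of the vector).
apply_rule <- function(x, rule) {
  armed <- FALSE
  for (i in seq_along(x)) {
    if (x[i] == 1) {
      armed <- TRUE
    } else if (armed && x[i] == 0 && rule(x, i)) {
      x[i] <- -1      # replace this 0 ...
      armed <- FALSE  # ... and wait for the next 1 before trying again
    }
  }
  x
}

# Hypothetical stand-in rule: the two preceding elements are both 1.
prev_two_are_ones <- function(x, i) i > 2 && x[i - 1] == 1 && x[i - 2] == 1

x <- c(0,0,0,0,0,0,0,1,0,0,1,1,0,0,1,0,0,0,1)
result <- apply_rule(x, prev_two_are_ones)
```

Any function of the form rule(x, i) returning TRUE or FALSE can be swapped in; a single pass over 3500 elements is fast enough that recursion is not needed.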
Re: [R] Hydrology plots in R
Sam,

I recommend taking a look at the ggplot2 package. This page from the author's website contains an example of what I think you are trying to achieve: http://had.co.nz/ggplot2/geom_segment.html

Obviously, this would require doing the whole plot in ggplot2, but that's not at all unpleasant. There's even a mailing list (on Google Groups) for ggplot2 with lots of friendly people to help.

Best of luck,

Jonathan

On Thu, Jul 22, 2010 at 8:56 AM, Sam Albers tonightstheni...@gmail.com wrote:

Hello,

I am trying to create a plot often seen in hydrodynamic work that includes a contour plot representing the water speed with arrows pointing in the direction of flow. Does anyone have any idea how I might add arrows based on wf$angle (in the example below) to the plot below?

Thanks in advance!

Sam

    library(lattice)
    speed <- runif(100, 0, 20)
    wf <- data.frame(speed)
    wf$width <- (1:10)
    wf$length <- rep(1:10, each=10)
    wf$angle <- runif(100, 0, 360)

    #How do I add arrows based on wf$angle within each coloured box to represent the direction of flow?
    #i don't have to use lattice. Just using it as an example.
    with(wf, contourplot(speed ~ width*length, region=TRUE, contour=FALSE))

--
*
Sam Albers
Geography Program
University of Northern British Columbia
University Way
Prince George, British Columbia
Canada, V2N 4Z9
phone: 250 960-6777
*
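A rough sketch of how the geom_segment idea could look for data shaped like Sam's. Several things here are assumptions rather than anything stated in the thread: the angle is taken to be in degrees, measured counterclockwise from the positive x axis; the arrow length of 0.4 grid units is arbitrary; and the xend/yend column names are made up. The plotting part is skipped if ggplot2 is not installed.

```r
set.seed(42)
wf <- data.frame(speed  = runif(100, 0, 20),
                 width  = rep(1:10, times = 10),
                 length = rep(1:10, each = 10),
                 angle  = runif(100, 0, 360))

# Convert each angle into the end point of a short arrow that starts
# at the centre of the cell (assumed convention: degrees, 0 = +x axis).
arrow_len <- 0.4
rad <- wf$angle * pi / 180
wf$xend <- wf$width  + arrow_len * cos(rad)
wf$yend <- wf$length + arrow_len * sin(rad)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  # Coloured tiles for speed, one arrow per tile for direction.
  p <- ggplot2::ggplot(wf, ggplot2::aes(x = width, y = length)) +
    ggplot2::geom_tile(ggplot2::aes(fill = speed)) +
    ggplot2::geom_segment(
      ggplot2::aes(xend = xend, yend = yend),
      arrow = grid::arrow(length = grid::unit(0.15, "cm")))
  # print(p) draws the plot on the current device.
}
```

If the angles are compass bearings instead (0 = north, clockwise), the sin and cos terms would need to be swapped.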
Re: [R] Historical Libor Rates
You might try asking on the R-SIG-Finance group, if nobody here can answer your question (https://stat.ethz.ch/mailman/listinfo/r-sig-finance).

Jonathan

On Mon, Jul 19, 2010 at 1:21 PM, Aaditya Nanduri aaditya.nand...@gmail.com wrote:

Hello All,

Does anyone know how to download historical LIBOR rates of different currencies into R? Or if anyone knows of a website that holds all this data... I only need up to January of 2000.

Also, how can we make the row names the index of a plot (the names of the x values)?
Re: [R] write.csv() : attempt to set 'append' ignored... Why?
Out of curiosity, is this a change in 2.11? I'm still running 2.10.1, where ?write.csv mentions the other options being ignored, but not append. This might also explain why John Kane believes he has successfully used append with write.csv in the past.

Jonathan

On Thu, Jul 15, 2010 at 9:36 AM, Marc Schwartz marc_schwa...@me.com wrote:

On Jul 15, 2010, at 9:41 AM, Cliff Clive wrote:

I'm running R 2.11.0 on a 32-bit Windows XP machine. Whenever I try to write a csv file with 'append' set to TRUE, I get this message: attempt to set 'append' ignored. Obviously, this is no good, since R is deleting my previously saved data files, rather than appending to them. What can I do to fix this?

Read ?write.csv more carefully. In the CSV files section:

"These wrappers are deliberately inflexible: they are designed to ensure that the correct conventions are used to write a valid file. Attempts to change append, col.names, sep, dec or qmethod are ignored, with a warning."

If you want to use 'append', you will need to use write.table() and adjust the other arguments as you require.

HTH,

Marc Schwartz
Re: [R] write.csv() : attempt to set 'append' ignored... Why?
Never mind, I found the answer to my own question. From the 2.11.0 change log:

"write.csv[2] no longer allow 'append' to be changed: as ever, direct calls to write.table() give more flexibility as well as more room for error."

Jonathan

On Thu, Jul 15, 2010 at 2:01 PM, Jonathan Christensen dzhona...@gmail.com wrote:

Out of curiosity, is this a change in 2.11? I'm still running 2.10.1, where ?write.csv mentions the other options being ignored, but not append.
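A minimal sketch of the write.table() route suggested in this thread, assuming the goal is simply to add rows to an existing comma-separated file. The file is a temporary one and the data frames (first_batch, more_rows) are made up for illustration. The header is written once by write.csv() and suppressed on the appended rows.

```r
out <- tempfile(fileext = ".csv")

first_batch <- data.frame(id = 1:2, value = c(0.5, 1.5))
more_rows   <- data.frame(id = 3:4, value = c(2.5, 3.5))

# Initial file, with a header row.
write.csv(first_batch, out, row.names = FALSE)

# Append further rows: same separator, no second header.
write.table(more_rows, out, sep = ",", append = TRUE,
            col.names = FALSE, row.names = FALSE)

combined <- read.csv(out)
```

This is the "more room for error" part of the change log entry: nothing checks that the appended rows have the same columns, in the same order, as the file already on disk.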
Re: [R] Storing processed results back into original objects
Steven,

You can do it with assign() if you keep the names when you put the items in the list:

    Dlist <- list(D1=D1, D2=D2) # put the names of the objects in the list
    Newlist <- lapply(Dlist, function(x) x[, columns]) # create a new list with the output
    for(i in seq(length(Newlist))) {
      assign(names(Newlist)[i], Newlist[[i]]) # assign the new objects to the original names
    }

You may want to keep in mind, though:

    fortune(236)

    The only people who should use the assign function are those who fully understand why you should never use the assign function.
       -- Gregory L. Snow
          R-help (July 2009)

You might want to ask yourself whether this is really the best way to achieve what you want to do.

Jonathan

On Thu, Jul 15, 2010 at 7:18 PM, Steven Kang stochastick...@gmail.com wrote:

Hi all,

There are matrices with the same column names but arranged in different orders, and I would like the columns of these matrices to have the same order.

For example, below are 2 arbitrary data sets with columns arranged in different order. I require the columns of these to have the order specified in the columns object, and the results stored under the original object names.

I know this can be done simply by:

    D1 <- D1[, columns]

But if there are hundreds of matrices, then a more efficient method is required.

    columns <- c("A", "B", "C")
    D1 <- matrix(rnorm(6), nrow = 2, dimnames = list(c("R1", "R2"), c("C", "A", "B")))
    D2 <- matrix(rnorm(6), nrow = 2, dimnames = list(c("R1", "R2"), c("C", "B", "A")))

    D1
                        C                 A                B
    R1 -0.653978178594122 -0.15910510749630 0.90507729153852
    R2  0.015557641181675 -0.73944224596032 0.23484927168787

    D2
                      C                 B                  A
    R1 0.18843559757623 0.207589297797905 -0.018884844424975
    R2 1.87387725184456 0.050349118287824 -1.796404635019739

    Dlist <- list(D1, D2)
    lapply(Dlist, function(x) x[, columns])
    [[1]]
                      A                B                  C
    R1 -0.15910510749630 0.90507729153852 -0.653978178594122
    R2 -0.73944224596032 0.23484927168787  0.015557641181675

    [[2]]
                       A                 B                C
    R1 -0.018884844424975 0.207589297797905 0.18843559757623
    R2 -1.796404635019739 0.050349118287824 1.87387725184456

How can the results from the lapply function be stored back into the original object names?

Many thanks.

--
Steven
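One way to sidestep assign() altogether, in the spirit of the fortune quoted above, is to leave the matrices in a named list and refer to them by name from there. This is only a sketch of that alternative, not what was proposed in the thread; the matrices are regenerated with a fixed seed so the example is self-contained.

```r
set.seed(1)
columns <- c("A", "B", "C")

# Keep the matrices in one named list from the start.
Dlist <- list(
  D1 = matrix(rnorm(6), nrow = 2,
              dimnames = list(c("R1", "R2"), c("C", "A", "B"))),
  D2 = matrix(rnorm(6), nrow = 2,
              dimnames = list(c("R1", "R2"), c("C", "B", "A")))
)

# Reorder every matrix in place; the list names are preserved.
# drop = FALSE keeps the result a matrix even for a single column.
Dlist <- lapply(Dlist, function(x) x[, columns, drop = FALSE])

# Individual matrices are still reachable by name.
D1_sorted <- Dlist$D1
```

With hundreds of matrices this keeps the workspace tidy, and later steps can loop over Dlist instead of over variable names.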
Re: [R] Need help on index for time series object
Megh,

I don't know whether this is the best way, but it works:

    seq(1, length(dat1))[!is.na(dat1)]
    [1]  1  2  4  5  6  9 10

Jonathan

On Tue, Jul 13, 2010 at 1:58 PM, Megh Dal megh700...@yahoo.com wrote:

Dear all,

Please forgive me if this is a duplicate post; my previous mail perhaps didn't reach the list...

Let's say I have the following time series:

    library(zoo)
    dat1 <- zooreg(rnorm(10), start=as.Date("2010-01-01"), frequency=1)
    dat1[c(3, 7, 8)] = NA
    dat1
     2010-01-01  2010-01-02  2010-01-03  2010-01-04  2010-01-05
     0.31244288 -2.49383257          NA  0.38975582 -1.23040380
     2010-01-06  2010-01-07  2010-01-08  2010-01-09  2010-01-10
    -0.09697926          NA          NA -0.63171455  0.15867246

Now I want to get the indices of the non-NA elements of dat1. That is, I want to get a vector like: 1,2,4,5,6,9,10

Given a time series vector like dat1, is there any straightforward approach to get that?

Thanks and regards,
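For a plain numeric vector the same result can be had a little more directly with which(). The sketch below uses an ordinary vector as a stand-in for the values of dat1 (rounded copies of the numbers in the post); for a zoo object one would presumably extract the underlying values first, for example with zoo's coredata(), which is an assumption not tested here.

```r
# Plain-vector stand-in for the values of dat1, with NAs at 3, 7 and 8.
x <- c(0.31, -2.49, NA, 0.39, -1.23, -0.10, NA, NA, -0.63, 0.16)

# Positions of the non-missing elements.
idx <- which(!is.na(x))
```

which() returns integer positions, so idx can be used directly for subsetting, e.g. x[idx].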
Re: [R] left end or right end
Hi,

You need to define what you want more exactly: what are the possible conclusions (hypotheses) you want to reach? Based on what you've said, I can think of several different approaches you might want, but I'm not sure which one of them you're actually after. For example:

Hypothesis A: The distance between the left endpoints of P and Q is less than (or equal to) the distance between the right endpoints.
Hypothesis B: The distance between the right endpoints is smaller.

This is a simple binomial test, as David Winsemius suggested. In your most recent email, though, it sounds like you want to take into account how much smaller one distance is than the other. This is more complicated.

Another option occurred to me: maybe you don't care which end P is close to, you just want to know whether it's close to one of the ends, or somewhere in the middle.

Without knowing what exactly you are trying to test, it's very hard for us to help you.

Jonathan

On Thu, Jul 1, 2010 at 7:45 AM, ravikumar sukumar ravikumarsuku...@gmail.com wrote:

Sorry for posting to the R list.

    P         Q
    12, 28    10, 42
    2, 5      1, 55
    32, 50    22, 63
    .

There are 1 points of P and Q. The number of points of P and Q are equal (i.e. 1). The interval P always overlaps with Q, i.e. start1 > start2 and end1 < end2.

Merely checking whether the points satisfy start1 > start2 and end1 < end2 will not be meaningful, because the length of P, that is (end1 - start1), and the length of Q, i.e. (end2 - start2), differ.

Example

Case A: start2 - start1 = 2, end2 - end1 = 3
Case B: start2 - start1 = 100, end2 - end1 = 2

Of the above two cases, P is falling on the right end of Q in case B. But it depends on the length (end2 - start2). If the length (end2 - start2) = 15000 in case B, then it is almost at the middle point.

Is there any test or function in R to reach a statistically significant conclusion on whether the midpoint of P, or P itself, is falling on the left end or right end of Q?

Sorry once again for posting in this list.

Regards
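To illustrate the simplest of the options above, here is a hedged sketch of the binomial (sign-type) test on made-up intervals. All numbers are invented for illustration, apart from the first three rows, which copy the example pairs from the post. The variable names are hypothetical. The last step is one possible way to account for the length of Q (a relative midpoint position), offered as an assumption about what the poster might want, not as an established test.

```r
# Hypothetical interval pairs; each P lies inside its Q.
p_start <- c(12,  2, 32, 40, 15)
p_end   <- c(28,  5, 50, 48, 92)
q_start <- c(10,  1, 22, 10,  5)
q_end   <- c(42, 55, 63, 50, 100)

left_gap  <- p_start - q_start   # distance between left endpoints
right_gap <- q_end - p_end       # distance between right endpoints

# Hypothesis A vs. B: is P closer to the left end more often than
# chance (probability 0.5) would suggest? Ties would need a decision.
closer_left <- left_gap < right_gap
res <- binom.test(sum(closer_left), length(closer_left), p = 0.5)

# Length-adjusted alternative: where does the midpoint of P fall
# within Q, on a 0 (left end) to 1 (right end) scale?
rel_pos <- ((p_start + p_end) / 2 - q_start) / (q_end - q_start)
```

With rel_pos in hand, "left end", "middle" and "right end" become statements about a value between 0 and 1, which could then be compared against 0.5 with a test suited to the actual hypothesis.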
Re: [R] ggplot qplot bar removing bars when truncating scale
Matthew,

The ggplot documentation pages (http://had.co.nz/ggplot2) have the following to say under geom_bar:

"A bar chart maps the height of the bar to a variable, and so the base of the bar must always been shown to produce a valid visual comparison."

Thus, I suspect what you are trying to do may be intentionally (whether by omission or commission) broken. Of course, there are ways around it: you could make your own bar chart using geom_rect, for example.

Jonathan

On Wed, Jun 30, 2010 at 9:12 AM, ml692787 matthew.lester@gmail.com wrote:

I'm having problems with this example. It is posted with reproducible code below, both with the normal 0-6 scale and the desired 3-6 scale (with bars removed). How can I get the graph to have the desired 3-6 scale without removing the bars? Thanks!

    #Data
    mean=as.numeric(c(5.117647059,5,4.947368421,4.85,4.6875,4.545454545,4.473684211,4.470588235,4.428571429,4.08333,3.421052632,3.235294118))
    data=as.data.frame(cbind(mean,
      c("Achievement","Achievement","Achievement","Impact","Achievement","Achievement","Achievement","Impact","Impact","Impact","Impact","Impact"),
      c("Update knowledge and skills","Meet requirements for current position","Discover new job opportunities","Discover new job opportunities","Transition to a new job","Meet requirements for certificaiton","Personal enrichment","Update knowledge and skills","Meet requirements for current position","Meet requirements for certificaiton","Personal enrichment","Transition to a new job")))
    colnames(data)=c("mean","variable","Q")
    data[,1]=mean

    #Plot
    p=qplot(data=data,data$Q,data$mean,fill=data$variable,geom="bar",stat="identity",position="dodge",binwidth=2,ylab=NULL,xlab=NULL,width=.75)

    #With 0-6 Scale
    p + scale_x_discrete(expand=c(0,0)) +
      scale_y_continuous(limits=c(0,7),breaks=seq(from=0,to=6,by=.5),expand=c(0,0)) +
      coord_flip() +
      scale_fill_manual(values=c("darkmagenta","lightgoldenrod1")) +
      opts(
        panel.background = theme_rect(colour = NA),
        panel.background = theme_blank(),
        panel.grid.minor = theme_blank(),
        axis.title.x = theme_blank(),
        axis.title.y = theme_blank(),
        axis.text.y = theme_text(size=12,hjust=1),
        legend.text = theme_text(size=14)
      )

    #With 3-6 Scale (Bars Deleted)
    p + scale_x_discrete(expand=c(0,0)) +
      scale_y_continuous(limits=c(3,6),breaks=seq(from=3,to=6,by=.5),expand=c(0,0)) +
      coord_flip() +
      scale_fill_manual(values=c("darkmagenta","lightgoldenrod1")) +
      opts(
        panel.background = theme_rect(colour = NA),
        panel.background = theme_blank(),
        panel.grid.minor = theme_blank(),
        axis.title.x = theme_blank(),
        axis.title.y = theme_blank(),
        axis.text.y = theme_text(size=12,hjust=1),
        legend.text = theme_text(size=14)
      )

There is probably an option I'm missing or maybe my data should be set up differently; any help would be much appreciated!!

--
View this message in context: http://r.789695.n4.nabble.com/ggplot-qplot-bar-removing-bars-when-truncating-scale-tp2272735p2272735.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Why the variation when creating .pdf file output for my plots?
Karl,

dev2bitmap runs its output through Ghostscript, and I assume that the difference is somehow due to that. I can't say whether Ghostscript is decreasing the file quality or just doing something clever, though.

Jonathan

On Wed, Jun 30, 2010 at 10:30 AM, Karl Brand k.br...@erasmusmc.nl wrote:

Thank you Erik! That works nicely now. The file size (in kilobytes) is equal to that of the File > Save As > PDF method.

Still curious why the file sizes (in KB) differ by a factor of ~2 between the two methods:

    pdf()
    dev2bitmap(method = "pdf")

I'm just *assuming* here that file size is indicative of image quality. Is this assumption correct? If so, how would one increase .pdf quality within the dev2bitmap() function?

With thanks for any further thoughts on this,

cheers,

Karl

On 6/30/2010 4:58 PM, Erik Iverson wrote:

Method 3:

    pdf(file="my_plot.pdf", paper="a4")
    dev.off()

The `pdf` function opens a *new* graphics device; you then send output to the device before calling dev.off(), e.g.,

    pdf(file = "my_plot.pdf")
    plot(1:10, 1:10)
    dev.off()

-yields a .pdf file of 1kb (same plot example) and returns the following error when attempting to open with Adobe Acrobat: "There was an error opening this document. This file cannot be opened because it has no pages."

--
Karl Brand k.br...@erasmusmc.nl
Department of Genetics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
P +31 (0)10 704 3409 | F +31 (0)10 704 4743 | M +31 (0)642 777 268
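The open-device, draw, close-device sequence Erik describes can be checked quickly with a temporary file. This is just a sketch using base graphics; the file name is generated by tempfile() and the size check is only there to confirm that something was actually drawn before the device was closed.

```r
out <- tempfile(fileext = ".pdf")

pdf(file = out, paper = "a4")  # open a new PDF graphics device
plot(1:10, 1:10)               # draw on it
dev.off()                      # close the device, which writes the page

size_bytes <- file.info(out)$size
```

Calling dev.off() immediately after pdf(), with no plotting in between, is what produces the tiny file with no pages mentioned in the thread.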
Re: [R] Embed function strips out date index
Manussawee,

What type of object is series? We could help you better if we could reproduce exactly what you are trying to do, which requires more information (you made a good start by including data and code, though).

The output of diff is a vector (time series, ...) with length one less than the input. embed(..., 2) also returns an object with length one less than its input. This is why you noticed that series.d had a different length than series (shorter by exactly 2, I bet). You should be able to figure out what you want to do from there. Since I don't know how you want the dates to line up, I can't really help you any more from here.

Jonathan

On Wed, Jun 30, 2010 at 2:32 PM, Manussawee Sukunta msuku...@illinoisalumni.org wrote:

Hi,

I'm having an especially hard time today and couldn't find any clue/answer through the internet. I hope you can help.

I'm in the process of writing a script to estimate an error correction model, and I was following an example in Bernhard Pfaff's Analysis of Integrated and Cointegrated Time Series with R.

I have the following price data:

    head(series,15)
               PX_SETTLE PX_SETTLE.1
    2009-01-02    4515.0      925.50
    2009-01-05    4540.5      927.50
    2009-01-06    4603.5      930.50
    2009-01-07    4470.5      905.25
    2009-01-08    4474.5      906.75
    2009-01-09    4430.5      885.50
    2009-01-12    4402.0      868.00
    2009-01-13    4343.5      868.50
    2009-01-14    4130.5      839.75
    2009-01-15    4070.5      839.25
    2009-01-16    4129.5      848.50
    2009-01-20    4032.0      806.00
    2009-01-21    4018.0      836.75
    2009-01-22    4011.0      825.50
    2009-01-23    3998.0      823.50

Then I defined

    series.d = embed(diff(series),dim=2)

which resulted in

    head(series.d,15)
            [,1]   [,2]   [,3]   [,4]
     [1,]   25.5   2.00     NA     NA
     [2,]   63.0   3.00   25.5   2.00
     [3,] -133.0 -25.25   63.0   3.00
     [4,]    4.0   1.50 -133.0 -25.25
     [5,]  -44.0 -21.25    4.0   1.50
     [6,]  -28.5 -17.50  -44.0 -21.25
     [7,]  -58.5   0.50  -28.5 -17.50
     [8,] -213.0 -28.75  -58.5   0.50
     [9,]  -60.0  -0.50 -213.0 -28.75
    [10,]   59.0   9.25  -60.0  -0.50
    [11,]  -97.5 -42.50   59.0   9.25
    [12,]  -14.0  30.75  -97.5 -42.50
    [13,]   -7.0 -11.25  -14.0  30.75
    [14,]  -13.0  -2.00   -7.0 -11.25
    [15,]  169.0   7.25  -13.0  -2.00

The new data series.d now has no date index. I'm not sure how to get it back. I tried to xts -- order.by = index(series), but the vector lengths are now not the same.

I feel like the answer might be obvious, but I just can't see it. Again, I tried searching various forums and sites, but I couldn't find my answer. I feel like I'm just going around in circles. I hope someone can help me and shed some light on this problem.

Thank you,

Manussawee
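The length bookkeeping described in the reply can be seen on a small example. The sketch below assumes a plain numeric matrix plus a separate vector of dates (the first five rows of the posted data), not an xts object; the posted output suggests the poster's diff() kept a leading NA row, in which case the counts differ. Labelling each row with the date of its most recent difference is one possible alignment, chosen here as an assumption.

```r
dates <- as.Date(c("2009-01-02", "2009-01-05", "2009-01-06",
                   "2009-01-07", "2009-01-08"))
m <- cbind(PX_SETTLE   = c(4515.0, 4540.5, 4603.5, 4470.5, 4474.5),
           PX_SETTLE.1 = c(925.50, 927.50, 930.50, 905.25, 906.75))

d <- diff(m)      # 4 rows: one fewer than m
e <- embed(d, 2)  # 3 rows: one fewer than d, two fewer than m

# Each row of e holds the current difference followed by the previous
# one, so row t belongs to the (t + 2)-th original date.
rownames(e) <- format(dates[-(1:2)])
```

With the dates attached as row names (or kept as a separate vector of the same length), the embedded matrix can be turned back into a dated series with whatever time-series class is in use.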
Re: [R] Vertical subtraction in dataframes
Hello,

On Fri, Mar 12, 2010 at 3:27 PM, Sam Albers tonightstheni...@gmail.com wrote:

Hello all,

I have not been able to find an answer to this problem. I feel like it might be so simple though that it might not get a response.

Suppose I have a dataframe like the one I have copied below (minus the 'calib' column). I wish to create a column like calib where I am subtracting the 'Count' when 'stain' is 'none' from all other 'Count' data for every value of 'rep'. This is sort of analogous to putting a $ in front of the number that identifies a cell in a spreadsheet environment. Specifically I need something like this:

    mydataframe$calib <- Count - (Count when stain = none for each value rep)

Any thoughts on how I might accomplish this?

Here's one way:

    b <- a[(a$stain == "none"), "Count"]
    a$calib <- a$Count - b[a$rep]

Note that it only works if the values of rep are integers starting with 1 and increasing sequentially (1, 2, 3, ...).

Jonathan

Thanks in advance.

Sam

Note: I've already calculated the calib column in gnumeric for clarity.

    rep   Count   stain   calib
      1    1522    none       0
      1     147    syto   -1375
      1   544.8 sytolec  -977.2
      1  2432.6 sytolec   910.6
      1   234.6 sytolec -1287.4
      2  5699.8    none       0
      2   265.6    syto -5434.2
      2   329.6 sytolec -5370.2
      2     383 sytolec -5316.8
      2   968.8 sytolec   -4731
      3  2466.8    none       0
      3    1303    syto -1163.8
      3  1290.6 sytolec -1176.2
      3   110.2 sytolec -2356.6
      3 15086.8 sytolec   12620

--
*
Sam Albers
Geography Program
University of Northern British Columbia
University Way
Prince George, British Columbia
Canada, V2N 4Z9
phone: 250 960-6777
*
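If the values of rep are not guaranteed to be 1, 2, 3, ..., a variant of the same idea using match() looks up each row's baseline by its rep value instead of by position. This is a sketch on a four-row excerpt of the posted data (reps 1 and 3 only, to show that gaps are handled); it assumes exactly one "none" row per rep, and the name baseline is illustrative.

```r
a <- data.frame(rep   = c(1, 1, 3, 3),
                Count = c(1522, 147, 2466.8, 1303),
                stain = c("none", "syto", "none", "syto"))

# One baseline row per rep: the Count where stain is "none".
baseline <- a[a$stain == "none", c("rep", "Count")]

# For each row, find the baseline with the same rep and subtract it.
a$calib <- a$Count - baseline$Count[match(a$rep, baseline$rep)]
```

A rep with no "none" row would get NA in calib, which at least makes the missing baseline visible.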
Re: [R] see the example and help me
Hi,

On Thu, Mar 11, 2010 at 3:46 AM, chinna durgache...@gmail.com wrote:

Hi Peter konings,

Sorry man, the forecasted values I gave were wrong. Please see my question once again and give me the answer.

<snip>

This is the forecasted report that I get using the reporting tool Cognos (BI reporting tool). Is this possible with the R project? If possible, can you please tell me the way?

Certainly. Here's a really simple solution:

Load the data into R using read.table (this may involve cleaning up the dollar amounts). It looks linear (scatterplot of revenue and quarter_index), so fit a linear model (Revenue ~ quarter_index) with lm(). Use the object created and a dataframe of the values you want to predict (probably quarter_index=seq(1,16)) with the predict() command.

For a bit more information, see http://cran.r-project.org/doc/manuals/R-intro.html#Linear-models and the following section. Also try ?lm, ?predict.

(On a side note, my predicted values are consistently about $50,000 higher than the ones you got from what you were using. Since I don't know what exactly your tool is doing, I can't tell you why that is.)

If you want to take into account the fact that it's actually a time series, R has plenty of tools for that too.

Jonathan

Thanks in advance

chinna.

--
View this message in context: http://n4.nabble.com/see-the-example-and-help-me-tp1587229p1588761.html
Sent from the R help mailing list archive at Nabble.com.
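The steps in the reply (fit with lm(), then predict() on a data frame of new quarter indices) can be sketched as follows. The revenue figures are invented for illustration, since the original data were snipped from the thread; only the column names Revenue and quarter_index and the seq(1, 16) range come from the reply.

```r
# Hypothetical quarterly revenue: a linear trend plus fixed "noise".
noise <- c(1200, -800, 500, -1500, 900, -300,
           700, -1100, 400, 1000, -600, -400)
sales <- data.frame(quarter_index = 1:12,
                    Revenue = 100000 + 5000 * (1:12) + noise)

# Fit the straight-line model.
fit <- lm(Revenue ~ quarter_index, data = sales)

# Predict for quarters 1 to 16: 12 fitted values plus 4 forecasts.
new_quarters <- data.frame(quarter_index = seq(1, 16))
forecast <- predict(fit, newdata = new_quarters)
```

Passing interval = "prediction" to predict() would additionally return lower and upper bounds around each forecast.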