[R] how to convert a table to adjacency matrix used in social network analysis?
Hi guys,

Does anyone know how to convert a long-format table to an adjacency matrix for use in sna? The long table looks like this:

p1  p2  counts
a   b   100
a   c   200
a   d   100
b   c   80
b   d   90
b   e   100
c   d   100
c   e   40
d   e   60

I want to convert it to an adjacency matrix that can be used in sna. Any method will be appreciated!

By the way, besides the sna package, is there a better package for social network analysis, especially one that is good at plotting?

Thanks in advance!

Regards,
--
Samuel Wu
http://webclipping.com.cn

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Loading user defined functions automatically each time I start R
Hi,

An easy way is to save your functions in a file, keep it in your working directory, and source it, i.e.

    source("myfile.r")

Otherwise you can try to build your own package; see ?package.skeleton.

Hope it helps.

Regards,
A

----- Original message -----
From: Arun Kumar Saha [EMAIL PROTECTED]
To: [EMAIL PROTECTED] [EMAIL PROTECTED]
Sent: Wednesday, 27 February 2008, 8:03:26
Subject: [R] Loading user defined functions automatically each time I start R

> Hi all,
> I wrote some user-defined functions for my own use. Now I want a mechanism so that every time I start R, those functions are automatically loaded without manual copying and pasting. Can the gurus here please tell me how to do that? Or do I have to build my own package bundling those functions? I am not yet familiar with writing R packages.
> Regards,
Re: [R] Plot Principal component analysis
SNN wrote:
> Hi,
> I have a matrix of 300,000 x 115 (SNPs x individuals). I ran the PCA on the covariance matrix, which has dimension 115 x 115. The first 100 individuals are from group A and the remaining 15 are from group B. I need to plot the data in two dimensions (PC1 and PC2) and in three dimensions (PC1, PC2 and PC3). I do not know how to plot the first 100 points, corresponding to group A, in red (for example) and the remaining 15 points in blue, i.e. I want each group in a different color in the same plot. I would appreciate it if someone could help.

Hi Nancy,
(if indeed you are a Nancy and that is not a webnym)

Say that your groups really are coded "A" and "B", and the group coding variable is called group. You can define a color vector like this:

    colorvector <- ifelse(group == "A", "red", "blue")

and then pass it as the col argument of the plotting call.

Jim
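For illustration, here is a minimal sketch of how the color vector could be used, assuming simulated data in place of the real SNP matrix. The object names x, e and scores are made up for this example, and pairs() is only a base-graphics stand-in for a true 3D plot, which would need an add-on package.

```r
set.seed(1)
# Simulated stand-in for the SNP matrix: rows are SNPs, columns are individuals
x <- matrix(rnorm(500 * 115), nrow = 500, ncol = 115)

# Group coding: the first 100 individuals are in A, the remaining 15 in B
group <- rep(c("A", "B"), times = c(100, 15))
colorvector <- ifelse(group == "A", "red", "blue")

# PCA via the 115 x 115 covariance matrix between individuals
e <- eigen(cov(x))
scores <- e$vectors[, 1:3]
colnames(scores) <- c("PC1", "PC2", "PC3")

# Two-dimensional plot, one color per group
plot(scores[, "PC1"], scores[, "PC2"], col = colorvector, pch = 19,
     xlab = "PC1", ylab = "PC2")
legend("topright", legend = c("A", "B"), col = c("red", "blue"), pch = 19)

# All pairwise views of the first three components
pairs(scores, col = colorvector, pch = 19)
```

Because colorvector has one entry per individual, in the same order as the rows of scores, each point gets the color of its group.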
[R] glm binomial with no successes
Dear all,

I have a question on glm with family binomial. I do not see significant differences between the levels of a factor (treatment) if all data for one level are 0; when I replace a 0 with a 1 (which in fact reduces the difference), I detect the significant difference that I expected. Is there a way to overcome this problem, or is this expected behaviour? Here is an example:

    s <- c(2, 4, 4, 5, 0, 0, 0, 0)
    f <- c(31, 28, 28, 28, 32, 37, 34, 35)
    tr <- gl(2, 4)
    sf <- cbind(s, f)  # numbers of successes and failures
    summary(glm(sf ~ tr, family = binomial))  # tr not significant
    sf[8, 1] <- 1
    summary(glm(sf ~ tr, family = binomial))  # tr significant **

Thanks for any suggestion.

Juli
--
http://www.ceam.es/pausas
Re: [R] ggplot2 boxplot confusion
Chris,

1. This code will give you the boxplot that you want.

    library(ggplot2)
    series <- c('C2','C4','C8','C10','C15','C20')
    ids <- c('ID1','ID2','ID3')
    mydata <- data.frame(SERIES = rep(series, 30), ID = rep(ids, 60), VALUE = rnorm(180))
    ggplot(mydata, aes(y = VALUE, x = factor(1))) + geom_boxplot() + scale_x_discrete()

But the real power of ggplot2 shows when you want a boxplot for each category:

    ggplot(mydata, aes(y = VALUE, x = SERIES)) + geom_boxplot()

2. Overlaying boxplots and density plots seems a bad idea to me, as the two plots are likely to have different scales.

HTH,

Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
[EMAIL PROTECTED]
www.inbo.be

Do not put your faith in what statistics say until you have carefully considered what they do not say. ~William W. Watt
A statistical analysis, properly conducted, is a delicate dissection of uncertainties, a surgery of suppositions. ~M.J.Moroney

-----Original message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On behalf of Chris Friedl
Sent: Wednesday 27 February 2008 5:58
To: r-help@r-project.org
Subject: [R] ggplot2 boxplot confusion

> Ultimately my aim is to get a plot of density faceted by 2 factors, with a horizontal boxplot overlaid on each density plot in the grid to indicate summary stats. So I've been experimenting with creating boxplots and density plots. Here's some representative data.
>
>     series = c('C2','C4','C8','C10','C15','C20')
>     ids = c('ID1','ID2','ID3')
>     mydata <- data.frame(SERIES=rep(cases,30),ID=rep(ids,60),VALUE=rnorm(180))
>
> 1. Using R default graphics I can create a boxplot of the data independent of factors as follows:
>
>     boxplot(mydata$VALUE)
>
> But I can't see how to do this with ggplot2. All the examples in the help show x and y aesthetics. How do I boxplot a single vector? (I saw a reference to a group parameter in R-help somewhere but can't find it in the ggplot2 help pages. Is this a case of group = identity?)
>
> 2. I've read the density plot help and noticed the reference to ..density.. as a means to pass density data instead of the original data. But I can't seem to get a boxplot to overlay a density plot. This is what I've got so far, with the consequent error message:
>
>     m <- ggplot(mydata, aes(x=VALUE))
>     m + geom_density() + geom_boxplot(aes(x=..density..))
>     Error in data.frame(..., check.names = FALSE) :
>       arguments imply differing number of rows: 0, 180
>
> I've tried y=..density.., both x= and y=..density.., and neither, and all fail somehow. The problem is that I don't really understand what I'm doing at this point. Can anyone help me out with this?
>
> thanks
> --
> View this message in context: http://www.nabble.com/ggplot2-boxplot-confusion-tp15706116p15706116.html
> Sent from the R help mailing list archive at Nabble.com.
[R] periodic term
Hello,

I'm trying to build a model for a time-series analysis with a periodic term for seasonality. I've tried both harmonic (package spatstat) and periodicSpline (package splines). The former doesn't allow a periodic constraint, while I have not been able to use the latter within a data frame with several repetitions of the month series across the years. Can you suggest some options or other commands?

Thanks a lot

Antonio Gasparrini
Public and Environmental Health Research Unit (PEHRU)
London School of Hygiene & Tropical Medicine
Keppel Street, London WC1E 7HT, UK
Office: 0044 (0)20 79272406 - Mobile: 0044 (0)79 64925523
www.lshtm.ac.uk/pehru/
Re: [R] how to specify ggplot2 facet plot order
Chris,

The order of the facet rows or columns depends on the order of the levels in the associated factor. The code below is what you want. Note that I have changed 'cases' to 'series' because your example was not reproducible: a definition of 'cases' was missing.

    library(ggplot2)
    series <- c('C2','C4','C8','C10','C15','C20')
    series <- factor(series, levels = series)
    ids <- c('ID1','ID2','ID3')
    mydata <- data.frame(SERIES = rep(series, 30), ID = rep(ids, 60), VALUE = rnorm(180))
    qplot(VALUE, data = mydata, geom = "density", facets = SERIES ~ ID)

Thierry

-----Original message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On behalf of Chris Friedl
Sent: Wednesday 27 February 2008 4:39
To: r-help@r-project.org
Subject: [R] how to specify ggplot2 facet plot order

> Hi, new to R and ggplot2. I've been trying to get a facet plot in which the order of the facets is as I require, rather than ordered numerically, alphabetically, by Roman numerals, or by mean (answers to these were posted here after much searching). Here's some test code to demonstrate what I get.
>
>     series = c('C2','C4','C8','C10','C15','C20')
>     ids = c('ID1','ID2','ID3')
>     mydata <- data.frame(SERIES=rep(cases,30),ID=rep(ids,60),VALUE=rnorm(180))
>     qplot(VALUE, data = mydata, geom="density", facets=SERIES ~ ID)
>
> The facet rows are plotted in alphabetical order, namely C10, C15, C2, C20, C4, C8. I want them plotted in the order specified by series.
>
> I've looked at reorder to reorder the factor called SERIES, but that requires a vector of the same length upon which the ordering is defined through some function. I guess my noobness with all things R has brought me to a grinding halt. I can conceive an algorithm but don't know how to implement it.
>
> 1. Create myordervector of length(SERIES) comprising integers in a mapping C2: 1, C4: 2, C8: 3 ...
> 2. Reorder using this vector as follows:
>        mydata <- with(mydata, reorder(SERIES, myordervector, as.numeric))
> 3. Then plot as above.
>
> Is this remotely sensible? Perhaps the order is determined at plot time rather than from the data.frame. In that case I guess reordering before plotting is moot. I'm stuck. Can anyone help out, please?
>
> thanks.
> --
> View this message in context: http://www.nabble.com/how-to-specify-ggplot2-facet-plot-order-tp15705404p15705404.html
> Sent from the R help mailing list archive at Nabble.com.
Re: [R] Multiple lines with a different color assigned to each line (corrected code)
Judith Flores wrote:
> Sorry, I just realized I didn't type in the correct names of the variables I am working with. This is how it should be:
>
>     plot(1, 1, type = "n")
>     for (i in summ$tx) {
>       points(summ$timep[summ$tx == i], summ$mn[summ$tx == i])
>       lines(summ$timep[summ$tx == i], summ$mn[summ$tx == i])
>     }

Hi Judith,

I think this might help:

    plot(1, 1, type = "n")
    # define your colors here
    # you can generate the vector in many ways
    ncolors <- length(unique(summ$tx))
    colorvector <- rainbow(ncolors)
    colorindex <- 1
    # loop over each distinct treatment once, so that each gets one color
    for (i in unique(summ$tx)) {
      points(summ$timep[summ$tx == i], summ$mn[summ$tx == i],
             type = "b", col = colorvector[colorindex])
      colorindex <- colorindex + 1
    }

This may also answer the query from Valentin Bellassen.

Jim
Re: [R] Multiple linear regression with for loop
Markus Mühlbacher wrote:
> Hi everyone!
> I have an array containing the following fields for over a hundred compounds: cpd, activity, fixterm, energy1, energy2, energy3, ... I want to run a multiple linear regression on all entries of the array, so I tried to do this with a for loop. (Maybe there is a direct way of calculating it using apply, but I don't know that either.) I tried the following code:
>
>     ...
>     attach(data)

Now, I guess data is a data.frame, not an array as mentioned above. I'd suggest supplying it to the data argument of lm() rather than attaching it.

>     for (i in 1:length(cpd)) {
>       fitted.model <- lm(activity ~ fixterm + i)

"+ i" does not make any sense here: i is just the loop counter, not a variable in your data.

>       coef(fitted.model)

If you want the coefficients to be printed, you have to print() them, because values are not auto-printed inside a loop. Probably you want to store the results from coef() in some object in order to use them further on...

Uwe Ligges

>     }
>     ...
>
> Unfortunately this loop doesn't give the intended correlation coefficients of each regression. If I insert a line print(i) into the loop, the desired values for i are printed correctly. Only the coefficient outputs are missing. Probably the solution is very near, but I just can't see it.
> Many thanks in advance,
> Markus
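The advice above could be sketched as follows. This is only an illustration under assumptions: it guesses that the intent was one regression per energy column, and the data frame dat with its simulated values is a made-up stand-in for the poster's data.

```r
set.seed(42)
# Hypothetical stand-in for the poster's data frame
dat <- data.frame(
  cpd      = paste0("cpd", 1:20),
  activity = rnorm(20),
  fixterm  = rnorm(20),
  energy1  = rnorm(20),
  energy2  = rnorm(20),
  energy3  = rnorm(20)
)

# Loop over the energy columns by name, not over row indices
energy_terms <- grep("^energy", names(dat), value = TRUE)
coefs <- vector("list", length(energy_terms))
names(coefs) <- energy_terms

for (term in energy_terms) {
  # Build activity ~ fixterm + <term> and fit it with the data argument
  fml <- reformulate(c("fixterm", term), response = "activity")
  fit <- lm(fml, data = dat)
  print(coef(fit))  # explicit print() is needed inside a loop
  # Store the coefficients under common names so they can be stacked
  coefs[[term]] <- setNames(coef(fit), c("(Intercept)", "fixterm", "energy"))
}

# One row of coefficients per energy term
coef_table <- do.call(rbind, coefs)
coef_table
```

Storing the results in a list and binding them afterwards keeps every fit's coefficients available after the loop has finished.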
Re: [R] Loading user defined functions automatically each time I start R
> I wrote some user-defined functions for my own use. Now I want a mechanism so that every time I start R, those functions are automatically loaded without manual copying and pasting. Can the gurus here please tell me how to do that? Or do I have to build my own package bundling those functions?

These instructions are for Windows; there may be slight differences on other platforms. In R_HOME\etc you should have a file named Rprofile.site. Inside this file, you can define a .First function which sources your functions, e.g.

    .First <- function() {
      source("c:/myfunction.r")
    }

See also: Section 10.8 of the Intro to R manual, and http://cran.r-project.org/doc/contrib/Lemon-kickstart/kr_first.html

Regards,
Richie.

Mathematical Sciences Unit
HSL
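If the functions are spread over several files, the same idea can be extended along these lines. This is a sketch only: load_my_functions is a hypothetical helper name and the directory in the commented-out .First is a placeholder, not something from the thread.

```r
# Hypothetical helper: source every .R or .r file found in `dir`
load_my_functions <- function(dir) {
  files <- list.files(dir, pattern = "\\.[Rr]$", full.names = TRUE)
  for (f in files) {
    source(f)  # source() evaluates in the global environment by default
  }
  invisible(files)
}

# In Rprofile.site or a personal .Rprofile one could then write
# (the path is a placeholder):
# .First <- function() load_my_functions("~/R/myfunctions")
```

Keeping the helper separate from .First makes it possible to call it by hand as well, for example after editing one of the files.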
Re: [R] Sweave produces gibberish instead of apostrophe in pdf
On Wed, 2008-02-27 at 11:35 +0100, Paul Hiemstra wrote:
> Dear All,
> I try to use Sweave to make a document. But when I use the Sweave() command on it and build a pdf with pdflatex (3.141592-1.40.3), my apostrophes are replaced by some gibberish (an 'a' with a hat on it, a capital A with an arc pointing upwards on it and a capital Y with two points on it). If I manually replace the apostrophes using the keyboard, I get a different looking apostrophe and the output is correct. I'm using Debian Linux (Lenny) with TexLive, Kile and R 2.6.1.

I presume that R is running in a UTF-8 locale on your Debian box (or some other locale that is different from the one pdflatex is working in); the fancy quotes used in some print methods in R aren't available in all locale/font encodings, and these get interpreted as the gibberish you are seeing.

Stick this in your preamble and see if it works (you might need to install a TexLive package from your usual Debian repository to get this LaTeX package installed):

    \usepackage[utf8x]{inputenc}

It did for me on my Fedora box when I first came across this issue.

HTH

G

--
Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
Re: [R] how to specify ggplot2 facet plot order
Chris,

You can use the as.is or stringsAsFactors argument of read.csv to prevent strings from being converted into factors. You can then create the factor yourself with the levels in the order you need. See ?read.csv for the details.

Thierry

-----Original message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On behalf of Chris Friedl
Sent: Wednesday 27 February 2008 11:23
To: r-help@r-project.org
Subject: Re: [R] how to specify ggplot2 facet plot order

> Hi Thierry
>
> thanks for your help. I've been searching the R-help archives for posts by you and Hadley as a way to learn ggplot details, so I appreciate your help to the R community.
>
> I wasn't aware of the levels option in the factor function. In my real application I get the data using read.csv and factor assignment happens automagically. Is there a way to control the level assignments at the input stage? I can't see anything to that effect in the help.
>
> Sorry for the cut and paste error ... getting used to Xemacs on Windows. I used emacs years ago on Unix but now my environment is Windows. So far I find the Xemacs/ESS combo to be very powerful and flexible. I just need to get used to the C-x C-c etc. differences.
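A minimal sketch of that approach, assuming a small inline CSV in place of the real file (the csv_text contents and the name series_order are made up for illustration):

```r
# Inline CSV standing in for the real input file
csv_text <- "SERIES,ID,VALUE
C2,ID1,0.5
C10,ID2,1.2
C4,ID3,-0.3
C20,ID1,0.8
C8,ID2,0.1
C15,ID3,-1.1"

# Read strings as plain character vectors instead of factors
mydata <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Build the factor explicitly, with the levels in the required order
series_order <- c("C2", "C4", "C8", "C10", "C15", "C20")
mydata$SERIES <- factor(mydata$SERIES, levels = series_order)

levels(mydata$SERIES)
```

With a real file, read.csv("myfile.csv", stringsAsFactors = FALSE) would take the place of the text argument; the file name here is again only a placeholder.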
[R] Sweave produces gibberish instead of apostrophe in pdf
Dear All,

I try to use Sweave to make a document. But when I use the Sweave() command on it and build a pdf with pdflatex (3.141592-1.40.3), my apostrophes are replaced by some gibberish (an 'a' with a hat on it, a capital A with an arc pointing upwards on it and a capital Y with two points on it). If I manually replace the apostrophes using the keyboard, I get a different looking apostrophe and the output is correct. I'm using Debian Linux (Lenny) with TexLive, Kile and R 2.6.1.

This is a sample; the problem is in the lm output, in the "Signif. codes" line:

\documentclass[a4paper,10pt]{article}
\title{Spam}
\author{F. Bar}
\begin{document}
<<reg>>=
n <- 50
x <- seq(1, n)
a.true <- 3
b.true <- 1.5
y.true <- a.true + b.true * x
s.true <- 17.3
y <- y.true + s.true * rnorm(n)
out1 <- lm(y ~ x)
summary(out1)
@
\end{document}

And the resulting .tex file:

\documentclass[a4paper,10pt]{article}
\title{Spam}
\author{F. Bar}
\usepackage{Sweave}
\begin{document}
\begin{Schunk}
\begin{Sinput}
> n <- 50
> x <- seq(1, n)
> a.true <- 3
> b.true <- 1.5
> y.true <- a.true + b.true * x
> s.true <- 17.3
> y <- y.true + s.true * rnorm(n)
> out1 <- lm(y ~ x)
> summary(out1)
\end{Sinput}
\begin{Soutput}
Call:
lm(formula = y ~ x)

Residuals:
     Min       1Q   Median       3Q      Max
-31.9565  -9.4745  -0.1708   7.3759  44.6538

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.07386    4.64712  -0.016    0.987
x            1.57405    0.15860   9.924 3.25e-13 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 16.18 on 48 degrees of freedom
Multiple R-Squared: 0.6723, Adjusted R-squared: 0.6655
F-statistic: 98.49 on 1 and 48 DF,  p-value: 3.245e-13
\end{Soutput}
\end{Schunk}
\end{document}

cheers and thanks for any help,
Paul

--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone: +31302535773
Fax: +31302531145
http://intamap.geo.uu.nl/~paul
Re: [R] how to specify ggplot2 facet plot order
Hi Thierry

thanks for your help. I've been searching the R-help archives for posts by you and Hadley as a way to learn ggplot details, so I appreciate your help to the R community.

I wasn't aware of the levels option in the factor function. In my real application I get the data using read.csv and factor assignment happens automagically. Is there a way to control the level assignments at the input stage? I can't see anything to that effect in the help.

Sorry for the cut and paste error ... getting used to Xemacs on Windows. I used emacs years ago on Unix but now my environment is Windows. So far I find the Xemacs/ESS combo to be very powerful and flexible. I just need to get used to the C-x C-c etc. differences.

ONKELINX, Thierry wrote:
> Chris,
>
> The order of the facet rows or columns depends on the order of the levels in the associated factor. The code below is what you want. Note that I have changed 'cases' to 'series' because your example was not reproducible: a definition of 'cases' was missing.
>
>     library(ggplot2)
>     series <- c('C2','C4','C8','C10','C15','C20')
>     series <- factor(series, levels = series)
>     ids <- c('ID1','ID2','ID3')
>     mydata <- data.frame(SERIES = rep(series, 30), ID = rep(ids, 60), VALUE = rnorm(180))
>     qplot(VALUE, data = mydata, geom = "density", facets = SERIES ~ ID)
>
> Thierry

--
View this message in context: http://www.nabble.com/how-to-specify-ggplot2-facet-plot-order-tp15705404p15710275.html
Sent from the R help mailing list archive at Nabble.com.
[R] Calculating monthly var-cov matrix on non-overlapping rolling window basis
Let us create a 'zoo' object:

    library(zoo)
    date.data = seq(as.Date("01/01/01", format = "%m/%d/%y"),
                    as.Date("06/25/02", format = "%m/%d/%y"), by = 1)
    len = length(date.data)
    data1 = zoo(matrix(rnorm(2*len), nrow = len), date.data)
    head(data1)

Now I want to create a 3-dimensional array (say, named var.cov) where var.cov[,,i] gives the variance-covariance matrix for the i-th month of data1. That is, I want to calculate the monthly variance-covariance matrix on a non-overlapping rolling window basis. Any suggestions?
Re: [R] glm binomial with no successes
On Wed, 27 Feb 2008, juli pausas wrote:
> Dear all,
> I have a question on glm, family binomial. I do not see significant differences between the levels of a factor (treatment) if all data for a level is 0; and replacing a 0 for a 1 (in fact reducing the difference), then I detect the significant difference that I expected.

This is because you are using the wrong test, one with negligible power. See MASS4 pp.197-8 -- you need to use the LRT, as in

    > drop1(glm(sf ~ tr, family=binomial), test="Chisq")
    Single term deletions

    Model:
    sf ~ tr
           Df Deviance    AIC    LRT   Pr(Chi)
    <none>       1.595 17.730
    tr      1   24.244 38.379 22.649 1.944e-06

(and in your example you can replace 'drop1' by 'anova').

> Is there a way to overcome this problem? or this is an expected behaviour ?

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
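To see the contrast between the two tests side by side, the poster's data can be run through both. This is a sketch using an explicit null-model comparison with anova(), which is one way of obtaining the likelihood ratio test; the object names fit, fit0, wald_p and lrt_p are chosen here for illustration.

```r
# Data from the original question
s  <- c(2, 4, 4, 5, 0, 0, 0, 0)
f  <- c(31, 28, 28, 28, 32, 37, 34, 35)
tr <- gl(2, 4)
sf <- cbind(s, f)  # numbers of successes and failures

fit  <- glm(sf ~ tr, family = binomial)  # model with the treatment effect
fit0 <- glm(sf ~ 1,  family = binomial)  # null model without it

# Wald test, as reported by summary(): unreliable when one level has no successes
wald_p <- coef(summary(fit))["tr2", "Pr(>|z|)"]

# Likelihood ratio test comparing the two nested models
lrt   <- anova(fit0, fit, test = "Chisq")
lrt_p <- lrt[2, "Pr(>Chi)"]

c(wald = wald_p, lrt = lrt_p)
```

When a level has no successes, the coefficient estimate drifts towards minus infinity and its standard error becomes very large, so the Wald z statistic is close to zero even though the deviance drops sharply. R may also warn that fitted probabilities numerically 0 or 1 occurred.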
Re: [R] Calculating monthly var-cov matrix on non-overlapping rolling window basis
Perhaps something like this:

    lapply(split(data1, format(index(data1), "%m")), cov)

On 27/02/2008, Megh Dal [EMAIL PROTECTED] wrote:
> Let us create a 'zoo' object:
>
>     library(zoo)
>     date.data = seq(as.Date("01/01/01", format = "%m/%d/%y"),
>                     as.Date("06/25/02", format = "%m/%d/%y"), by = 1)
>     len = length(date.data)
>     data1 = zoo(matrix(rnorm(2*len), nrow = len), date.data)
>     head(data1)
>
> Now I want to create a 3-dimensional array (say, named var.cov) where var.cov[,,i] gives the variance-covariance matrix for the i-th month of data1. That is, I want to calculate the monthly variance-covariance matrix on a non-overlapping rolling window basis. Any suggestions?

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
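A sketch of the same idea in base R, going one step further to the 3-dimensional array that was asked for. It works on a plain matrix x with a separate date vector rather than on the zoo object, and it groups by "%Y-%m" instead of "%m", because with data spanning 2001 and 2002 the month number alone would pool January 2001 with January 2002. The names x, month_id and covs are made up for this example.

```r
set.seed(123)
# Daily dates covering the same period as in the question
dates <- seq(as.Date("2001-01-01"), as.Date("2002-06-25"), by = 1)
len   <- length(dates)
x     <- matrix(rnorm(2 * len), nrow = len)

# One label per calendar month, including the year
month_id <- format(dates, "%Y-%m")

# Covariance matrix of the rows belonging to each month
covs <- lapply(split(seq_len(len), month_id),
               function(idx) cov(x[idx, , drop = FALSE]))

# Stack the list of 2 x 2 matrices into a 2 x 2 x (number of months) array
var.cov <- simplify2array(covs)
dim(var.cov)
```

var.cov[, , i] is then the covariance matrix of the i-th month, and the third dimension is named by month, so var.cov[, , "2001-03"] also works.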
Re: [R] how to convert a table to adjacency matrix used in social network analysis?
On 2/27/2008 3:13 AM, Samuel wrote:
> Hi Guys,
> Do you any one know how to convert a long format table to an adjacency matrix used in sna? The long table looks like
>
> p1  p2  counts
> a   b   100
> a   c   200
> a   d   100
> b   c   80
> b   d   90
> b   e   100
> c   d   100
> c   e   40
> d   e   60
>
> and I want to convert it to an adjacency matrix which can be used in sna? Any methods will be appreciated!

The graph package has some nice tools for this.

    mydf <- data.frame(p1 = c('a','a','a','b','b','b','c','c','d'),
                       p2 = c('b','c','d','c','d','e','d','e','e'),
                       counts = c(100,200,100,80,90,100,100,40,60))

    library(graph)

    myadjM <- ftM2adjM(as.matrix(mydf[,1:2]), W = mydf$counts)

    myadjM
        a   b   c   d   e
    a   0 100 200 100   0
    b   0   0  80  90 100
    c   0   0   0 100  40
    d   0   0   0   0  60
    e   0   0   0   0   0

> btw, besides sna package, is there any better package can be used in social network analysis, specially good at plotting?

For plotting I would look into the Rgraphviz package. Here is a simple diagram of the network:

    library(Rgraphviz)

    mygraph <- ftM2graphNEL(as.matrix(mydf[,1:2]), W = mydf$counts)

    plot(mygraph)

I'm not sure how to incorporate the weights for each edge into the diagram, but maybe that is explained in the documentation for the sna and Rgraphviz packages.

> Thanks in advance!
> Regards,

--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
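If installing the graph package is not an option, the same matrix can be built in base R by indexing a zero matrix with the node names. This is a sketch of an alternative, not the method used above; the names nodes and adj are chosen here for illustration.

```r
mydf <- data.frame(p1 = c('a','a','a','b','b','b','c','c','d'),
                   p2 = c('b','c','d','c','d','e','d','e','e'),
                   counts = c(100,200,100,80,90,100,100,40,60),
                   stringsAsFactors = FALSE)

# All node labels that appear in either column
nodes <- sort(unique(c(mydf$p1, mydf$p2)))

# Square matrix of zeros with rows and columns named by node
adj <- matrix(0, nrow = length(nodes), ncol = length(nodes),
              dimnames = list(nodes, nodes))

# A two-column character matrix indexes cells by (row name, column name)
adj[cbind(mydf$p1, mydf$p2)] <- mydf$counts

adj
```

This gives the directed (upper-triangular) version shown above. For an undirected network, adj + t(adj) gives a symmetric matrix, provided each pair appears only once in the long table.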
Re: [R] Sweave produces gibberish instead of apostrophe in pdf
Hi Gavin,

It worked perfectly. Thank you so much.

cheers,
Paul

Gavin Simpson wrote:
> I presume that R is running in a UTF-8 locale on your Debian box (or some other locale that is different from the one pdflatex is working in); the fancy quotes used in some print methods in R aren't available in all locale/font encodings, and these get interpreted as the gibberish you are seeing.
>
> Stick this in your preamble and see if it works (you might need to install a TexLive package from your usual Debian repository to get this LaTeX package installed):
>
>     \usepackage[utf8x]{inputenc}
>
> It did for me on my Fedora box when I first came across this issue.
>
> HTH
>
> G

--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone: +31302535773
Fax: +31302531145
http://intamap.geo.uu.nl/~paul
Re: [R] Sweave produces gibberish instead of apostrophe in pdf
These are not 'apostrophe's: it's (that is an apostrophe) misleading to call them that. They are single quotes, and there are two sorts if you look carefully.

Probably you are using a UTF-8 locale and not telling LaTeX (or us) so. There are several ways to do so, depending on the age of your LaTeX setup. \usepackage[utf8]{inputenc} is one. Alternatively, you can tell R not to use UTF-8 quotes with options(useFancyQuotes=FALSE).

Please note the 'at a minimum' information the posting guide asked you for -- it included the locale.

On Wed, 27 Feb 2008, Paul Hiemstra wrote:

Dear All, I try to use Sweave to make a document. But when I use the Sweave() command on it and build a pdf with pdflatex (3.141592-1.40.3) my apostrophes are replaced by some gibberish (an 'a' with a hat on it, a capital A with a arc pointing upwards on it and a capital Y with two points on it). If I manually replace the apostrophes using the keyboard, I get a different looking apostrophe and the output is correct. I'm using Debian Linux (Lenny) with TexLive, Kile and R 2.6.1. This is a sample, the problem is in the lm output in the Signif. codes line:

    \documentclass[a4paper,10pt]{article}
    \title{Spam}
    \author{F. Bar}
    \begin{document}
    <<reg>>=
    n <- 50
    x <- seq(1, n)
    a.true <- 3
    b.true <- 1.5
    y.true <- a.true + b.true * x
    s.true <- 17.3
    y <- y.true + s.true * rnorm(n)
    out1 <- lm(y ~ x)
    summary(out1)
    @
    \end{document}

And the resulting .tex file:

    \documentclass[a4paper,10pt]{article}
    \title{Spam}
    \author{F. Bar}
    \usepackage{Sweave}
    \begin{document}
    \begin{Schunk}
    \begin{Sinput}
    n <- 50
    x <- seq(1, n)
    a.true <- 3
    b.true <- 1.5
    y.true <- a.true + b.true * x
    s.true <- 17.3
    y <- y.true + s.true * rnorm(n)
    out1 <- lm(y ~ x)
    summary(out1)
    \end{Sinput}
    \begin{Soutput}
    Call:
    lm(formula = y ~ x)

    Residuals:
         Min       1Q   Median       3Q      Max
    -31.9565  -9.4745  -0.1708   7.3759  44.6538

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) -0.07386    4.64712  -0.016    0.987
    x            1.57405    0.15860   9.924 3.25e-13 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 16.18 on 48 degrees of freedom
    Multiple R-Squared: 0.6723, Adjusted R-squared: 0.6655
    F-statistic: 98.49 on 1 and 48 DF, p-value: 3.245e-13
    \end{Soutput}
    \end{Schunk}
    \end{document}

cheers and thanks for any help, Paul

-- Drs. Paul Hiemstra Department of Physical Geography Faculty of Geosciences University of Utrecht Heidelberglaan 2 P.O. Box 80.115 3508 TC Utrecht Phone: +31302535773 Fax: +31302531145 http://intamap.geo.uu.nl/~paul

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
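The second suggestion above (switching off the UTF-8 quotes on the R side) can be tried in isolation. This is only a small sketch using base R's sQuote(); in a Sweave document the options() call would presumably go in an early chunk, which is an assumption about the poster's setup rather than something stated in the thread.

```r
# Turn off "fancy" (directional, UTF-8) quotes and keep the old setting
old <- options(useFancyQuotes = FALSE)

# sQuote() now wraps its argument in plain ASCII single quotes
plain <- sQuote("***")
print(plain)

# Restore whatever the session was using before
options(old)
```

With the option off, the quoted stars contain only ASCII characters, so pdflatex needs no inputenc declaration to typeset them.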
Re: [R] Multiple linear regression with for loop
I'm not sure if this is what you want, but if you have a matrix as the response, you can use the matrix ~ term form. Example:

    x <- 1:10
    y <- rep(rnorm(10, x, 0.5), 10)
    dim(y) <- c(10, 10)
    y <- as.matrix(y)
    coef(lm(y ~ x))

Bart

Markus "Mühlbacher" wrote:

Hi everyone! I have an array containing the following fields for over a hundred compounds: cpd, activity, fixterm, energy1, energy2, energy3, ... I want to run a multiple linear regression on all entries of an array. Therefore I tried to do this with a for loop. (Maybe there is a direct way of calculating it using apply, but I don't know that either.) Actually I tried the following code:

    ...
    attach(data)
    for(i in 1:length(cpd)) {
      fitted.model <- lm(activity ~ fixterm + i)
      coef(fitted.model)
    }
    ...

Unfortunately this loop doesn't give the intended correlation coefficients of each regression. If I insert a line print(i) into the loop, the desired values for i are printed correctly. Only the coefficient outputs are missing. Probably the solution is very near, but I just can't see it. Many thanks in advance, Markus
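Two things seem to go wrong in the quoted loop: inside a for loop nothing is auto-printed (so coef() needs an explicit print() or its result must be stored), and i is a number, not a column of the data. The sketch below is one possible reading of the question, assuming the goal is one regression of activity on fixterm plus each energy column in turn; the data frame is simulated and only the column names come from the message.

```r
set.seed(1)
# Simulated stand-in for the poster's data (values are made up)
dat <- data.frame(activity = rnorm(20), fixterm = rnorm(20),
                  energy1 = rnorm(20), energy2 = rnorm(20),
                  energy3 = rnorm(20))

# Names of the predictors to swap in, one model per column
energy_cols <- grep("^energy", names(dat), value = TRUE)

# Build each formula from strings and keep the coefficients in a list
fits <- lapply(energy_cols, function(v) {
  f <- reformulate(c("fixterm", v), response = "activity")
  coef(lm(f, data = dat))
})
names(fits) <- energy_cols

# Explicit print: results are shown even when run inside a loop or script
print(fits)
```

Storing the results in a named list also avoids attach(), so the models always refer to columns of dat.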
Re: [R] glm binomial with no successes
Thank you very much for your reply. Then I understand that it would not be correct to use the test in summary for testing the significance of the different levels of a factor in relation to the first level, including when there are more than 2 levels, as in my real case; at least for binomial regressions. So here is a more close-to-real example, with a 3-level factor:

    s <- c(rpois(8, 4), rep(0, 4))
    f <- rpois(12, 30)
    tr <- gl(3, 4)
    sf <- cbind(s, f)
    drop1(glm(sf ~ tr, family=binomial), test="Chisq")  # significant
    summary(glm(sf ~ tr, family=binomial))  # the 3rd level is not significant from the 1st

So I understand that I need to split the data and perform the two tests separately:

    drop1(glm(sf ~ tr, family=binomial, subset=(tr %in% c(1, 2))), test="Chisq")  # ns as expected
    drop1(glm(sf ~ tr, family=binomial, subset=(tr %in% c(1, 3))), test="Chisq")  # significant, as expected

Is this the correct approach? Many thanks, Juli

On Wed, Feb 27, 2008 at 12:13 PM, Prof Brian Ripley [EMAIL PROTECTED] wrote:

On Wed, 27 Feb 2008, juli pausas wrote:

Dear all, I have a question on glm, family binomial. I do not see significant differences between the levels of a factor (treatment) if all data for a level is 0; and replacing a 0 with a 1 (in fact reducing the difference), then I detect the significant difference that I expected.

This is because you are using the wrong test, one with negligible power. See MASS4 pp.197-8 -- you need to use the LRT, as in

    drop1(glm(sf ~ tr, family=binomial), test="Chisq")

    Single term deletions

    Model:
    sf ~ tr
           Df Deviance    AIC    LRT   Pr(Chi)
    <none>       1.595 17.730
    tr      1   24.244 38.379 22.649 1.944e-06

(and in your example you can replace 'drop1' by 'anova').

Is there a way to overcome this problem? Or is this expected behaviour? Here is an example:

    s <- c(2,4,4,5,0,0,0,0)
    f <- c(31,28,28,28,32,37,34,35)
    tr <- gl(2, 4)
    sf <- cbind(s, f)  # numbers of successes and failures
    summary(glm(sf ~ tr, family=binomial))  # tr ns
    sf[8,1] <- 1
    summary(glm(sf ~ tr, family=binomial))  # tr significant **

Thanks for any suggestion, Juli -- http://www.ceam.es/pausas

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595

-- http://www.ceam.es/pausas
[R] Problems with quantreg and party package in 2.6.2 ver
I have a problem with these packages: quantreg and party. When I try to run them under R 2.6.2 (Win32), the system reports problems reading the Rblas library (it cannot find the dynamic library links). Any suggestions to solve the problem? Thanks, Pablo. -- Pablo Fco. Fernández Alvarez
Re: [R] column name handling and long labels
Somehow, I don't get how the labels of Hmisc work. My expectation was that if I use the following code and then the print method, I would get an output where the headers are replaced by the labels, but I get the normal variable names. How can I get the labels as headers in the printed table instead?

    df <- data.frame(x=seq(1,3), y=seq(4,6))
    df <- upData(df, labels=c(x="X1", y="X2"))
    print(df)

Thanks again, Werner

Hi, I have two loosely related questions which could make my life a bit easier again:

1) Is there a simple way to select a range of columns in a data frame using column names? I am thinking of something like mydf[1,col4:col8]

Try this using the builtin data frame anscombe, which has columns x1 to x4 followed by y1 to y4:

    subset(anscombe, select = x3:y2)

2) I have a data frame with many columns and they all have short variable names, which is good in most cases, but sometimes it would be nice to also have a longer descriptive name / label attached to the variable which could then be used for printing and latex output. Has anybody come up with a convenient way to do that? Right now, I am always using match, or merge in the case of row names.

See ?label in package Hmisc.

Many thanks, Werner
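One workaround that needs no add-on package is to keep the long labels in a named character vector and swap them in only for display. This is a base R sketch, not how Hmisc stores or prints its labels; the names labels and printable are made up for the illustration.

```r
df <- data.frame(x = 1:3, y = 4:6)

# Long labels, keyed by the short column names (illustrative values)
labels <- c(x = "X1", y = "X2")

# Copy for display only: the working data frame keeps its short names
printable <- df
names(printable) <- unname(labels[names(df)])

print(printable)
```

Because the lookup is by name, the labels vector can be in any order relative to the columns.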
[R] Plan to build Package to use GRASS from R
Hi,

Sorry for crossposting, but I think this can be of interest to GRASS and R users. I am planning to write a package to make the use of GRASS from R easier. The idea is to wrap the system call that executes the GRASS command in an R function of the same name, e.g.:

    r.to.vect <- function(..., intern=TRUE, ignore.stderr=FALSE) {
      comm <- paste("r.to.vect", ..., sep=" ")
      print(comm)
      system(comm, intern=intern, ignore.stderr=ignore.stderr)
    }

My questions are:

1) Is this a good way of doing it, or is giving a named list to the function more useful?
2) Is there an easy way to obtain all GRASS commands together with their possible and required parameters?

Any ideas and comments welcome, Rainer

-- Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT) Plant Conservation Unit Department of Botany University of Cape Town Rondebosch 7701 South Africa
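On question 1, named arguments can be turned into GRASS-style key=value pairs mechanically, and a small function factory avoids writing one wrapper per command by hand. The following is a sketch only: grass_cmd(), make_grass_fun() and the dry.run argument are hypothetical names, and the parameter names in the usage line are illustrative rather than checked against a GRASS installation.

```r
# Turn a command name and a named list into "cmd key=value key=value ..."
grass_cmd <- function(cmd, params = list()) {
  args <- if (length(params)) {
    paste0(names(params), "=", unlist(params))
  } else {
    character(0)
  }
  paste(c(cmd, args), collapse = " ")
}

# Factory: returns an R function that runs (or just shows) the command
make_grass_fun <- function(cmd) {
  function(..., intern = TRUE, ignore.stderr = FALSE, dry.run = FALSE) {
    comm <- grass_cmd(cmd, list(...))
    if (dry.run) return(comm)          # inspect the string without calling GRASS
    system(comm, intern = intern, ignore.stderr = ignore.stderr)
  }
}

r.to.vect <- make_grass_fun("r.to.vect")
r.to.vect(input = "elev", output = "elev_vec", dry.run = TRUE)
```

Values containing spaces or shell metacharacters would additionally need quoting, for example with shQuote().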
Re: [R] [R-sig-Geo] Plan to build Package to use GRASS from R
On 27/02/2008, Virgilio Gomez-Rubio [EMAIL PROTECTED] wrote:

Hi, if you are referring to spgrass6, yes. But if I want to execute commands in GRASS, I still have to use system(...)

OK. I just wanted to check... :) Not sure what Roger Bivand will think, but maybe it would be better to add what you develop to that package.

Sounds like a perfect suggestion - otherwise everything could become too fragmented and difficult to use.

Regards, Virgilio

-- Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT) Plant Conservation Unit Department of Botany University of Cape Town Rondebosch 7701 South Africa
Re: [R] Raw histogram plots
If I understand:

    x <- rnorm(1e6)
    out <- tapply(x, ceiling(x), length)
    plot(as.numeric(names(out)), out)

On 27/02/2008, Andre Nathan [EMAIL PROTECTED] wrote:

On Wed, 2008-02-27 at 14:15 +1300, Peter Alspach wrote: If I understand you correctly, you could try a barplot() on the result of table().

Hmm, table() does the counting exactly the way I want, i.e., just counting individual values. Is there a way to extract the counts vs. the values from a table, so that I can pass them as the x and y arguments to plot()? Thanks, Andre

-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O
Re: [R] Raw histogram plots
On Wed, 2008-02-27 at 14:15 +1300, Peter Alspach wrote: If I understand you correctly, you could try a barplot() on the result of table().

Hmm, table() does the counting exactly the way I want, i.e., just counting individual values. Is there a way to extract the counts vs. the values from a table, so that I can pass them as the x and y arguments to plot()? Thanks, Andre
Re: [R] Highlighting different series with colors
Thanks, it works fine except that 7 colors are each used twice (so that one color corresponds to two types). I tried the following, but it makes things worse: the legend disappears and I get only 4 different colors:

    pan <- function(x, y) {
      panel.superpose(x, y, subscripts=coef$country, groups=coef$country, col=1:14)
    }
    xyplot(coef$a~coef$b, group=coef$country, auto.key=T, panel=pan,
           xlim=c(-b_max,b_max), ylim=c(-a_max,a_max), xlab="intercept", ylab="slope")

Any idea? In any case, thanks for the previous answer. Valentin

Henrique Dallazuanna wrote:

One option is to use lattice:

    require(lattice)
    xyplot(x~y, data=your.data, group=type, auto.key=T)

On 25/02/2008, Valentin Bellassen [EMAIL PROTECTED] wrote:

Hello, I have a data frame with 3 vectors $x, $y, and $type. I would like to plot $x~$y with different colors for the corresponding points, one for each level of $type. Would someone know how to do that? Is it then possible to generate a legend automatically? Valentin
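The repeated colors are consistent with lattice's default theme, which cycles through 7 superposition colors. One way around it, sketched here on simulated data, is to pass a longer color vector through par.settings so that the points and the automatic key share the same settings; the data frame coef_df, its 14 made-up country codes and the rainbow() palette are assumptions for the example.

```r
library(lattice)

set.seed(1)
# Simulated stand-in: 14 groups, two points each
coef_df <- data.frame(a = rnorm(28), b = rnorm(28),
                      country = factor(rep(paste0("C", 1:14), each = 2)))

# One color per group level
cols <- rainbow(nlevels(coef_df$country))

# par.settings changes the theme for this plot only, so auto.key picks up
# the same colors and symbols as the panel
p <- xyplot(a ~ b, data = coef_df, groups = country,
            par.settings = list(superpose.symbol = list(col = cols, pch = 16)),
            auto.key = list(columns = 2, space = "right"),
            xlab = "b", ylab = "a")
# print(p) draws the plot; at the interactive prompt typing p is enough
```

No custom panel function is needed, which also keeps the legend that disappeared in the attempt above.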
Re: [R] Running randomForests on large datasets
There are a couple of things you may want to try, if you can load the data into R and still have enough memory to spare:

- Run randomForest() with fewer trees, say 10 to start with.
- Run randomForest() with nodesize set to something larger than the default (5 for classification). This puts a limit on the size of the trees being grown. Try something like 21 and see if that runs, and adjust accordingly.

HTH, Andy

From: Nagu

Hi, I am trying to run randomForests on a dataset of size 50X650 and R pops up a memory allocation error. Are there any better ways to deal with large datasets in R? For example, Splus had something like the bigData library. Thank you, Nagu
[R] Label outliers in boxplot
Sorry if this is a stupid question; I'm a beginner and I didn't find help in manuals, archives, or on the web. I have a matrix z of this type:

                          a         b        c         d
    786-30.s2.gpr 0.2186214 1.3374486 89.37757  9.066358
    786-31.s1.gpr 1.0931070 1.3245885 81.37860 16.203704
    786-31.s2.gpr 0.5529835 1.3374486 86.43261 11.676955
    786-32.s1.gpr 0.6815844 1.3374486 83.96348 14.017490
    786-32.s2.gpr 0.8101852 1.2860082 84.36214 13.438786
    786-33.s1.gpr 0.2443416 1.3374486 85.59671 12.821502
    786-33.s2.gpr 0.2186214 1.3374486 88.55453  9.889403
    786-34.s1.gpr 0.9387860 1.3245885 73.91975 23.816872
    786-34.s2.gpr 0.6172840 1.3374486 79.51389 18.531379
    786-35.s1.gpr 0.5658436 1.3374486 81.52006 16.576646
    786-35.s2.gpr 0.347     1.3374486 78.60082 19.714506

To draw the boxplot I simply use boxplot(as.data.frame(z)). I would like the rownames to appear close to the outliers in the boxplot. Is there a command to do it? I've seen the identify() function but I'm not able to obtain any results. Could someone help me? Thanks in advance, Giulio
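One possible approach, sketched on simulated data: boxplot() invisibly returns the outlier values (out) and the box each belongs to (group), which can be matched back to the row names and written next to the points with text(). The matrix below is made up, with one outlier planted in column a.

```r
set.seed(42)
# Simulated matrix with row names; not the poster's data
z <- matrix(rnorm(40), nrow = 10,
            dimnames = list(paste0("row", 1:10), c("a", "b", "c", "d")))
z["row3", "a"] <- 8   # planted outlier

# Draw the boxplot and keep its statistics
bp <- boxplot(as.data.frame(z))

# bp$out holds the outlier values, bp$group the index of the box they sit in
for (k in seq_along(bp$out)) {
  j <- bp$group[k]
  hit <- rownames(z)[z[, j] == bp$out[k]]   # row(s) with exactly that value
  text(j, bp$out[k], labels = paste(hit, collapse = ","), pos = 4, cex = 0.7)
}
```

The exact comparison is safe here because bp$out is copied from the data; tied values in a column simply get a combined label.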
[R] image analysis package
Dear all, I would like to know whether any package is available for microarray image analysis. -- Regards, Abhilash
[R] score test statistic in logistic regression
Hi, I am looking for a function or syntax to compute the score test in logistic regression for the null hypothesis b1=0 in the model logit(p)=b0+ b1*x +b2*z. The data come from the binomial distribution (n,p). Thanks, ben
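In current versions of R, anova() on two nested glm fits accepts test = "Rao", which gives Rao's score test for the added term; this option is newer than the R release current when the question was posted. A sketch on simulated binomial data (all numbers below are made up):

```r
set.seed(123)
# Simulated grouped binomial data: y successes out of n trials
n <- rep(20, 30)
x <- rnorm(30)
z <- rnorm(30)
p <- plogis(-0.5 + 0.8 * x + 0.3 * z)
y <- rbinom(30, size = n, prob = p)

# Null model (b1 = 0) and full model
fit0 <- glm(cbind(y, n - y) ~ z,     family = binomial)
fit1 <- glm(cbind(y, n - y) ~ x + z, family = binomial)

# Rao score test for adding x to the model that already contains z
score <- anova(fit0, fit1, test = "Rao")
print(score)
```

The score statistic only requires quantities evaluated under the null fit, which is the usual reason for preferring it over the Wald test in summary().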
Re: [R] ggplot2 boxplot confusion
Thanks Thierry. But this leads to a couple more questions, if you don't mind.

1. I tried to extend your example to a grid with the facet_grid command, with the aim of getting a boxplot of VALUE according to two factors, SERIES and ID. However, whatever syntax I use gives me an error. For example:

    ggplot(mydata, aes(y = VALUE, x = factor(1))) + geom_boxplot() + scale_x_discrete() + facet_grid(SERIES ~ ID)
    Error: position_dodge requires the following missing aesthetics: x

I tried x=c(SERIES, ID) etc. etc. but they failed. Yet I know I can get a grid of density plots as follows:

    ggplot(mydata, aes(x = VALUE, y = ..density..)) + geom_density() + facet_grid(ID ~ SERIES)

Yet it doesn't work if I say geom_boxplot. I hope you can help me understand where I've gone wrong.

2. On your point about overlaying box and density plots, I'm not sure I understand. I thought a boxplot is just a particular view of a density function, showing median, interquartile range etc. The vertical scale is the same as the density function's horizontal scale, isn't it? For example, in the dummy dataset above:

    summary(mydata$VALUE)
        Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
    -2.54400 -0.64690  0.07417  0.08289  0.77830  2.75900

and

    ggplot(mydata, aes(x = VALUE, y = ..density..)) + geom_density()

shows a density plot whose features on the x-axis are visually close to the summary features. My intent was to plot the density because the box plot doesn't reveal shape details such as multiple modes, and to augment it with a narrow boxplot to show some density features such as the position of the median, IQR etc. Or perhaps I've completely misunderstood your point (highly likely, I think). Thanks again for your help. Much appreciated.

ONKELINX, Thierry wrote:

Chris,

1. This code will give you the boxplot that you want.

    library(ggplot2)
    series <- c('C2','C4','C8','C10','C15','C20')
    ids <- c('ID1','ID2','ID3')
    mydata <- data.frame(SERIES=rep(series,30), ID=rep(ids,60), VALUE=rnorm(180))
    ggplot(mydata, aes(y = VALUE, x = factor(1))) + geom_boxplot() + scale_x_discrete()

But the real power of ggplot2 is when you want a boxplot for each category:

    ggplot(mydata, aes(y = VALUE, x = SERIES)) + geom_boxplot()

2. Overlaying boxplots and density plots seems a bad idea to me, as both plots are likely to have a different scale.

HTH, Thierry
Re: [R] Raw histogram plots
On Feb 27, 2008, at 8:16 AM, Andre Nathan wrote:

On Wed, 2008-02-27 at 14:15 +1300, Peter Alspach wrote: If I understand you correctly, you could try a barplot() on the result of table().

Hmm, table() does the counting exactly the way I want, i.e., just counting individual values. Is there a way to extract the counts vs. the values from a table, so that I can pass them as the x and y arguments to plot()?

    x <- table(rbinom(20,2,0.5))
    plot(names(x), x)

should do it. You can also try just plot(x). Use prop.table on the table if you want the relative frequencies instead.

Thanks, Andre

Haris Skiadas Department of Mathematics and Computer Science Hanover College
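To make the extraction explicit: the distinct values are the names of the table (stored as character strings) and the counts are its contents. A small sketch with a made-up vector v:

```r
v <- c(1, 1, 2, 5, 5, 5, 9)
tab <- table(v)

# Values come back as strings, so convert them to numbers for the x axis
xs <- as.numeric(names(tab))
# Drop the table attributes to get a plain vector of counts
ys <- as.vector(tab)

# One vertical spike per distinct value
plot(xs, ys, type = "h", xlab = "value", ylab = "count")
```

The numeric conversion matters when the values are unevenly spaced, since it places each spike at its actual position on the axis.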
Re: [R] Raw histogram plots
Andre Nathan wrote:

On Wed, 2008-02-27 at 14:15 +1300, Peter Alspach wrote: If I understand you correctly, you could try a barplot() on the result of table().

Hmm, table() does the counting exactly the way I want, i.e., just counting individual values. Is there a way to extract the counts vs. the values from a table, so that I can pass them as the x and y arguments to plot()? Thanks, Andre

Also take a look at the Hmisc package's spike histogram-related functions such as histSpike and scat1d.

-- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
Re: [R] question
Dear sir, I have a persistent problem trying to use the lmer function. After loading the lme4 library, I submit my input command:

    Model1 <- lmer(IndividualNum ~ BaitType*TrapType + (1 | dTransect/dTrapStation/Expedition/Community), family=poisson(link="log"))

It runs fine, but when I ask it to display my model with summary(Model1), I get this strange error message:

    Error in printMer(object) : no slot of name "status" for this object of class "table"

What am I not doing right? And how can I rectify this problem? Regards, Kwaku
Re: [R] Plot Principal component analysis
Jim Lemon wrote:

SNN wrote:

Hi, I have a matrix of 300,000*115 (SNPs*individuals). I ran the PCA on the covariance matrix, which has a dimension of 115*115. The first 100 individuals are from group A and the remaining 15 individuals are from group B. I need to plot the data in two and three dimensions: with respect to PC1 and PC2, and (in 3D) with respect to PC1, PC2 and PC3. I do not know how to plot the first 100 points, corresponding to group A, in red (for example) and the remaining 15 points in blue, i.e. I want each group in a different color in the same plot. I would appreciate it if someone could help.

Hi Nancy, (if indeed you are a Nancy and that is not a webnym) Say that your groups really are coded A and B, and the group coding variable is called group. You can define a color vector like this:

    colorvector <- ifelse(group=="A", "red", "blue")

Jim

Hi Nancy, in your case you may also use inertia ellipses to represent your groups, in addition to different colors. Here is an example using a microsatellite dataset from adegenet (but valid for SNPs of course):

    library(ade4)
    library(adegenet)
    data(microbov)  # dataset
    # replace missing values
    obj=na.replace(microbov, method="mean")
    # perform your pca, keep 3 axes
    pca1=dudi.pca(obj$tab, scannf=FALSE, nf=3, scale=FALSE)
    # plot the result
    s.class(pca1$li, obj$pop)
    s.class(pca1$li, obj$pop, col=sample(colors(),15))
    # here, replace col by the appropriate vector of colors.

The resulting graphic represents each genotype by a point, and adds ellipses of a different color for each group; each ellipse represents 95 % of the inertia of the corresponding group. The more the ellipses overlap, the less your groups are differentiated on the factorial plane.

Cheers, Thibaut.

-- Thibaut JOMBART CNRS UMR 5558 - Laboratoire de Biométrie et Biologie Evolutive Universite Lyon 1 43 bd du 11 novembre 1918 69622 Villeurbanne Cedex Tél. : 04.72.43.29.35 Fax : 04.72.43.13.88 [EMAIL PROTECTED] http://lbbe.univ-lyon1.fr/-Jombart-Thibaut-.html?lang=en http://pbil.univ-lyon1.fr/software/adegenet/
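Jim's color vector can be tried end to end with base R alone. The sketch below uses prcomp() on simulated data in place of the poster's covariance-based PCA, so the matrix geno and its size (115 individuals by 50 made-up markers) are assumptions; only the 100/15 split between groups A and B comes from the message.

```r
set.seed(7)
# Simulated genotype-like matrix: rows are individuals, columns are markers
geno <- matrix(rnorm(115 * 50), nrow = 115)
group <- rep(c("A", "B"), times = c(100, 15))

# PCA with individuals as observations; scores are in pca$x
pca <- prcomp(geno)

# One color per individual, following the group coding
colorvector <- ifelse(group == "A", "red", "blue")

plot(pca$x[, 1], pca$x[, 2], col = colorvector, pch = 16,
     xlab = "PC1", ylab = "PC2")
legend("topright", legend = c("A", "B"), col = c("red", "blue"), pch = 16)
```

The same colorvector can be handed to a 3D plotting function from an add-on package (scatterplot3d is one option) together with the first three columns of pca$x.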
Re: [R] ggplot2 boxplot confusion
Chris,

1. This will make more sense.

    ggplot(mydata, aes(y = VALUE, x = SERIES)) + geom_boxplot() + facet_grid(. ~ ID)

2. Now I think I understand what you want. I'm afraid that won't be easy, because you're trying to mix continuous variables with categorical ones on the same scale. A density plot has two continuous scales: VALUE and its density. The boxplot has one continuous scale (VALUE) and the other is categorical. Maybe Hadley knows a solution for your problem.

Thierry

ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 [EMAIL PROTECTED] www.inbo.be

Do not put your faith in what statistics say until you have carefully considered what they do not say. ~William W. Watt

A statistical analysis, properly conducted, is a delicate dissection of uncertainties, a surgery of suppositions. ~M.J.Moroney

-----Original message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On behalf of Chris Friedl
Sent: Wednesday 27 February 2008 15:08
To: r-help@r-project.org
Subject: Re: [R] ggplot2 boxplot confusion

Thanks Thierry. But this leads to a couple more questions if you don't mind.

1. I tried to extend your example to a grid by the facet_grid command with the aim of getting a boxplot of VALUE according to two factors SERIES and ID. However whatever syntax I use give me an error. For example:

    ggplot(mydata, aes(y = VALUE, x = factor(1))) + geom_boxplot() + scale_x_discrete() + facet_grid(SERIES ~ ID)
    Error: position_dodge requires the following missing aesthetics: x

I tried x=c(SERIES, ID) etc etc but they failed. Yet I know I can get a grid of density plots with qplot as follows:

    ggplot(mydata, aes(x = VALUE, y = ..density..)) + geom_density() + facet_grid(ID ~ SERIES)

Yet it doesn't work if I say geom_boxplot. I hope you can help me understand where I've gone wrong.

2. On your point about overlaying box and density plots, I'm not sure I understand. I thought a boxplot is just a particular view of a density function, showing median, interquartile range etc. The vertical scale is the same as the density function's horizontal scale, isn't it? For example in the dummy dataset above:

    summary(mydata$VALUE)
        Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
    -2.54400 -0.64690  0.07417  0.08289  0.77830  2.75900

and

    ggplot(mydata, aes(x = VALUE, y = ..density..)) + geom_density()

shows a density plot that shows features on the x-axis that are visually close to the summary features. My intent was to plot density because the box plot doesn't reveal shape details such as multiple modes, and to augment with a narrow boxplot to show some density features such as the position of the median, IQR etc. Or perhaps I've completely misunderstood your point (highly likely I think). Thanks again for your help. Much appreciated.

ONKELINX, Thierry wrote:

Chris,

1. This code will give you the boxplot that you want.

    library(ggplot2)
    series <- c('C2','C4','C8','C10','C15','C20')
    ids <- c('ID1','ID2','ID3')
    mydata <- data.frame(SERIES=rep(series,30), ID=rep(ids,60), VALUE=rnorm(180))
    ggplot(mydata, aes(y = VALUE, x = factor(1))) + geom_boxplot() + scale_x_discrete()

But the real power of ggplot2 is when you want a boxplot for each category:

    ggplot(mydata, aes(y = VALUE, x = SERIES)) + geom_boxplot()

2. Overlaying boxplots and density plots seems a bad idea to me as both plots are likely to have a different scale.

HTH, Thierry
Re: [R] ggplot2 boxplot confusion
Now I think I understand what you want. I'm afraid that won't be easy because you're trying to mix continuous variables with categorical ones on the same scale. A density plot has two continuous scales: VALUE and its density. The boxplot has a continuous scale (VALUE) and the other is categorical. Maybe Hadley knows a solution for your problem.

Well, one idea is:

    ggplot(diamonds, aes(x = price)) +
      geom_density(aes(min = -..density.., adjust = 0.5), fill="grey50", colour=NA) +
      facet_grid(. ~ cut) + coord_flip()

which looks like it would naturally fit with a boxplot overlaid on top of it. However, that is currently not possible, because the boxplot is parameterised so that it is always horizontal, while the density is vertical. In the above example I have flipped the coordinate system, but that flips both the density plot and the boxplot.

Hadley

-- http://had.co.nz/
[R] png and pdf : point size, font size, etc.
Hi, I am new to this mailing list. I didn't find the answer in the archives, but I guess there are people out there who know the solution.

In an automatic script, I want to produce simple plots. I prefer pdf, because of scalability and standardization. But when I have too many datapoints (say 300,000), viewing and printing are very, very slow. So I decided to switch to a bitmapped format above 5,000 datapoints, because most of them are in one cloud. As a result, the png is nice and fast to handle, and in Linux it is very easy to convert it to pdf with ImageMagick. So far so good.

The problem is that the font size, the size of the plotted points and everything else in the plot get much smaller when using the png format. Why is this and how can I circumvent this issue? These are the two devices I am opening:

    1) pdf("name_out.pdf", 11, 3.5)
    2) bitmap("name_out.png", height = 3.5, width = 11)

Both should give the same result, shouldn't they?

Thanks in advance, Stephan

-- Dr. med. Stephan Ripke Stat.2 (Neurology) Max-Planck-Institute of Psychiatry AG Statistical Genetics Kraepelinstr. 10 80804 Munich - Germany Tel. +49 (0)89 30622-422 or -384 Fax. +49 (0)89 30622-610 Email: [EMAIL PROTECTED]
[R] Cross Validation
Hello, How can I do a cross validation in R? Thank You!
Re: [R] png and pdf : point size, font size, etc.
On 27 Feb 2008, Stephan Ripke wrote:
> The problem is that the font size, the size of the plotted points and everything else in the plot get much smaller when using the png format. Why is this and how can I circumvent this issue? These are the two devices I am opening:
>     1) pdf("name_out.pdf", 11, 3.5)
>     2) bitmap("name_out.png", height = 3.5, width = 11)

Why not the 'png' device? You can try different picture sizes and font sizes. For png, the size of the figure is in pixels (rather than inches as for pdf), and the fonts are in 'points'. The documentation of png says that one point is approximately one pixel. However, that assumes 72 dpi (screen resolution). For printing (e.g., inserting into Word), you would want at least 600 dpi, or even 1200 dpi, so you need to set a font size proportional to the total image size.

I usually just make a pdf file and then convert it to png at the desired resolution with 'convert' from ImageMagick:

    convert -density 600x600 a.pdf a.png

Michael
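The pixel arithmetic Michael describes can also be left to R. The sketch below assumes a version of R whose `png()` accepts the `units` and `res` arguments (this is an assumption about the reader's installation, not something stated in the thread); the 300 dpi target and the temporary file names are likewise placeholders.

```r
width_in  <- 11
height_in <- 3.5
res_dpi   <- 300                      # assumed target resolution

# Vector version: size is given in inches
pdf_file <- tempfile(fileext = ".pdf")
pdf(pdf_file, width = width_in, height = height_in)
plot(rnorm(5000), rnorm(5000), pch = ".")
dev.off()

# Bitmap version: same size in inches, with an explicit resolution, so
# text and points keep the same physical size as in the PDF.
if (capabilities("png")) {
  png_file <- tempfile(fileext = ".png")
  png(png_file, width = width_in, height = height_in,
      units = "in", res = res_dpi)
  plot(rnorm(5000), rnorm(5000), pch = ".")
  dev.off()
}

width_px <- width_in * res_dpi        # resulting pixel width of the PNG
```

With these settings the bitmap is 3300 pixels wide, which is what makes an 11 inch figure at 300 dpi; leaving `res` at the 72 dpi default while enlarging the pixel size is what makes fonts and points look small.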
Re: [R] png and pdf : point size, font size, etc.
Since bitmap() uses postscript, you may as well use PDF and convert to PNG (and bitmap in R-devel uses this as an option). I've no idea about the size differences, and we don't have a reproducible example. Again, R-devel offers many more options.

On Wed, 27 Feb 2008, Stephan Ripke wrote:
> The problem is that the font size, the size of the plotted points and everything else in the plot get much smaller when using the png format. Why is this and how can I circumvent this issue? These are the two devices I am opening:
>     1) pdf("name_out.pdf", 11, 3.5)
>     2) bitmap("name_out.png", height = 3.5, width = 11)

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] Raw histogram plots
On Wed, 2008-02-27 at 08:48 -0500, Charilaos Skiadas wrote:
>     x <- table(rbinom(20, 2, 0.5))
>     plot(names(x), x)
> should do it. You can also try just plot(x). Use prop.table on the table if you want the relative frequencies instead.

Yes, names is what I needed :) Thanks for the prop.table hint. I looked everywhere but none of my searches hinted at table/prop.table. Your help has been invaluable to me. Thanks again, Andre
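A short sketch of the two calls mentioned above, using only base R. The seed and the axis labels are illustrative additions.

```r
set.seed(42)
x   <- table(rbinom(20, 2, 0.5))     # counts of each outcome
rel <- prop.table(x)                 # relative frequencies, summing to 1

plot(x, ylab = "count")              # vertical lines at each observed value
plot(rel, ylab = "proportion")       # same picture on the proportion scale
```

`plot()` on a one-dimensional table draws one spike per value, which is the "raw histogram" asked about in this thread.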
[R] multiple plots per page using hist and pdf
Hello, I am puzzled by the behavior of hist() when generating multiple plots per page on the pdf device. In the following example two pdf files are generated. The first results in 4 plots on one pdf page as expected. However, the second, which swaps one of the plot() calls for hist(), results in a 4 page pdf with one plot per page. How might I get the histogram with 3 other scatter plots onto a single pdf page?

platform powerpc-apple-darwin8.10.1
version.string R version 2.6.1 (2007-11-26)

Thanks! Ben

    ###BEGIN
    data(iris)
    orig.par = par(no.readonly = TRUE)
    pdf(file = "just_plots.pdf")
    par(mfrow=c(2,2))
    plot(iris$Sepal.Length, iris$Sepal.Width, main = "Plot 1")
    plot(iris$Petal.Length, iris$Petal.Width, main = "Plot 2")
    plot(iris$Sepal.Length, iris$Petal.Length, main = "Plot 3")
    plot(iris$Sepal.Width, iris$Petal.Width, main = "Plot 4")
    dev.off()
    pdf(file = "hist_and_plots.pdf")
    hist(iris$Sepal.Length, main = "Plot 1")
    plot(iris$Petal.Length, iris$Petal.Width, main = "Plot 2")
    plot(iris$Sepal.Length, iris$Petal.Length, main = "Plot 3")
    plot(iris$Sepal.Width, iris$Petal.Width, main = "Plot 4")
    dev.off()
    par(orig.par)
    ###END

Ben Tupper [EMAIL PROTECTED]
[R] Call for abstracts: Innovative Tools in Data Analysis (ERCIM08)
Dear useRs, we are organizing the following session

Topic: Innovative Tools in Data Analysis
Organizers: Achim Zeileis and Bettina Gruen

at the First Workshop of the ERCIM Working Group on Computing Statistics, June 19-21, 2008, Neuchatel, Switzerland. URL: http://www.dcs.bbk.ac.uk/ercim08

To improve the quality of statistical data analysis the provision of innovative tools which make new techniques readily available is extremely important. In the session 'Innovative Tools for Data Analysis' we are looking for presentations of tools which support any area of data analysis and address techniques ranging from classical methods and their extensions to machine learning. Please consider giving a presentation on a flexible tool you have implemented in R in our session. Submit your abstract via the web page indicating our session name in the text field and let us also know informally.

Deadline for early registration: March 3, 2008
Deadline for submission of abstracts: April 30, 2008

Kind regards, Bettina and Achim

--
Bettina Grün
Department für Statistik und Mathematik
Wirtschaftsuniversität Wien
Augasse 2-6, A-1090 Wien, Österreich
Tel: (+43 1) 31336 5032 Fax: (+43 1) 31336 734
Re: [R] History of R
Kathy, You might find some relevant reading in volume 13 of the Journal of Statistical Software: http://www.jstatsoft.org/v13 Some of the papers have a bit of discussion on why R has become more widely used than lisp-stat. K Wright

On Fri, Feb 15, 2008 at 1:53 PM, Kathy Gerber [EMAIL PROTECTED] wrote:
> Earlier today I sent a question to Frank Harrell, as the R developer with whom I am most familiar. He suggested that I also put my questions to the list for additional responses. Next month I'll be giving a talk on R as an example of high quality open source software. I think there is much to learn from R as a high quality extensible product that (at least as far as I can tell) has never been spun or hyped like so many open source fads. The question that intrigues me the most is why R as an open source project is so incredibly successful while other projects, for example Octave, don't enjoy that level of success. I have some ideas of course, but I would really like to know your thoughts when you look at R from such a vantage point. Thanks.
> Kathy Gerber, University of Virginia, ITC - Research Computing Support
Re: [R] glm binomial with no successes
[Following up for my own personal education so a bit OT!]

Naively, I would have thought that package multcomp would be of use here. So I tried, for my own comprehension and education, to answer the OP's question using multcomp. Here's what I got:

    ## make this reproducible (I hope)
    set.seed(1234)
    s <- c(rpois(8, 4), rep(0, 4))
    f <- rpois(12, 30)
    tr <- gl(3, 4)
    sf <- cbind(s, f)
    ## fit the glm
    mod <- glm(sf ~ tr, family = binomial)
    summary(mod) ## tr2 and tr3 not different from reference level tr1
    anova(mod, test = "Chisq") ## tr is signif
    ## multiple comparison of levels of tr
    require(multcomp)
    mod.glht <- glht(mod, linfct = mcp(tr = "Tukey"))
    mod.glht
    summary(mod.glht)

If I interpret this correctly, both summary(mod) and summary(mod.glht) suggest that there are no significant differences between the 3 levels of tr, but that tr, as a whole, is better than the null model (as shown by anova(mod))? Is my interpretation correct, for this specific example, or am I abusing multcomp and statistics in this case?

Thanks for your time and indulgence of a more statistically-related than R-related question. All the best, G

On Wed, 2008-02-27 at 12:51 +0100, juli pausas wrote:
> Thank you very much for your reply. Then I understand that it would not be correct to use the test in summary for testing the significance of the different levels of a factor in relation to the first level, including when there are more than 2 levels, as in my real case; at least for binomial regressions.
> So here is a more close-to-real example, with a 3-level factor:
>     s <- c(rpois(8, 4), rep(0, 4))
>     f <- rpois(12, 30)
>     tr <- gl(3, 4)
>     sf <- cbind(s, f)
>     drop1(glm(sf ~ tr, family=binomial), test="Chisq") # significant
>     summary(glm(sf ~ tr, family=binomial)) # the 3rd level is not significant from the 1st
> So I understand that I need to split the data and perform the two tests separately:
>     drop1(glm(sf ~ tr, family=binomial, subset=(tr %in% c(1, 2))), test="Chisq") # ns as expected
>     drop1(glm(sf ~ tr, family=binomial, subset=(tr %in% c(1, 3))), test="Chisq") # significant, as expected
> Is this the correct approach? Many thanks, Juli
>
> On Wed, Feb 27, 2008 at 12:13 PM, Prof Brian Ripley [EMAIL PROTECTED] wrote:
>> On Wed, 27 Feb 2008, juli pausas wrote:
>>> Dear all, I have a question on glm, family binomial. I do not see significant differences between the levels of a factor (treatment) if all data for a level is 0; and replacing a 0 with a 1 (in fact reducing the difference), then I detect the significant difference that I expected.
>> This is because you are using the wrong test, one with negligible power. See MASS4 pp.197-8 -- you need to use the LRT, as in
>>     drop1(glm(sf ~ tr, family=binomial), test="Chisq")
>>     Single term deletions
>>     Model: sf ~ tr
>>            Df Deviance    AIC    LRT   Pr(Chi)
>>     <none>       1.595 17.730
>>     tr      1   24.244 38.379 22.649 1.944e-06
>> (and in your example you can replace 'drop1' by 'anova').
>>> Is there a way to overcome this problem? Or is this expected behaviour? Here is an example:
>>>     s <- c(2,4,4,5,0,0,0,0)
>>>     f <- c(31,28,28,28,32,37,34,35)
>>>     tr <- gl(2, 4)
>>>     sf <- cbind(s,f) # numbers of successes and failures
>>>     summary(glm(sf ~ tr, family=binomial)) # tr ns
>>>     sf[8,1] <- 1
>>>     summary(glm(sf ~ tr, family=binomial)) # tr significant **
>>> Thanks for any suggestion, Juli -- http://www.ceam.es/pausas
>> -- Brian D.
Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595

--
Dr. Gavin Simpson [t] +44 (0)20 7679 0522
ECRC, UCL Geography, [f] +44 (0)20 7679 0565
Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
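The contrast between the two tests in this thread can be reproduced with the two-level data from the original post. The sketch below uses base R only; computing the likelihood-ratio statistic by hand from the fitted object's deviances is an illustrative shortcut chosen here, equivalent in this one-term model to the drop1() call quoted above.

```r
s  <- c(2, 4, 4, 5, 0, 0, 0, 0)
f  <- c(31, 28, 28, 28, 32, 37, 34, 35)
tr <- gl(2, 4)
sf <- cbind(s, f)                    # successes and failures

mod <- glm(sf ~ tr, family = binomial)

# Wald test from summary(): unreliable here because the estimate for tr2
# runs off towards -Inf and its standard error becomes huge.
wald_p <- coef(summary(mod))["tr2", "Pr(>|z|)"]

# Likelihood-ratio test, computed by hand from the two deviances.
lrt_stat <- mod$null.deviance - mod$deviance
lrt_df   <- mod$df.null - mod$df.residual
lrt_p    <- pchisq(lrt_stat, df = lrt_df, lower.tail = FALSE)
```

The deviance difference is the LRT value 22.649 shown in the drop1() output, while the Wald p-value for `tr2` stays close to 1, which is the "negligible power" the reply refers to.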
[R] selecting consecutive records over a threshold
Dear all, I am having trouble working out how I might do the following and would appreciate any thoughts. I am working with data concerning precipitation. The data are in 2 columns in a data frame called storm in the following format: HourCount - 1,2,3,4,5,6,7,8,...48 Amt - 0,0,0.3,3,4,8,10,15,12,6,4,3,0.2,0.2... There are 48 hours worth of data. I am trying to extract a storm. My storm is defined as a threshold - when the amount is greater than 2 for 2 hours the storm starts, and when the amount is less than 1 for two hours the storm ends. I can extract data above thresholds but it is obviously important to be able to extract consecutive records to capture the whole storm. Can anybody help? Thanks Jamie Ledingham PhD Researcher University of Newcastle Upon Tyne
Re: [R] Cross Validation
http://www.burns-stat.com/pages/Tutor/bootstrap_resampling.html may be of some use to you.

Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and A Guide for the Unwilling S User)

Carla Rebelo wrote:
> Hello, How can I do a cross validation in R? Thank You!
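For a concrete starting point, here is a minimal sketch of k-fold cross-validation in base R. Everything specific in it is an assumption made for illustration: the built-in `mtcars` data, the linear model `mpg ~ wt + hp`, five folds, and mean squared error as the score. It is not taken from the tutorial linked above.

```r
set.seed(1)
k   <- 5
dat <- mtcars                                       # built-in example data
folds <- sample(rep(1:k, length.out = nrow(dat)))   # random fold labels

cv_mse <- numeric(k)
for (i in 1:k) {
  train <- dat[folds != i, ]
  test  <- dat[folds == i, ]
  fit   <- lm(mpg ~ wt + hp, data = train)          # fit on k - 1 folds
  pred  <- predict(fit, newdata = test)             # predict the held-out fold
  cv_mse[i] <- mean((test$mpg - pred)^2)
}
mean(cv_mse)                                        # cross-validated MSE
```

Each observation is held out exactly once, so the averaged error estimates how the model performs on data it was not fitted to.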
Re: [R] Plot Principal component analysis
Thanks for your help. Can R plot the data in 3 dimensions, with different colors for each group? For example, I would like to have the plot with respect to PC1, PC2 and PC3. Thanks,

SNN wrote:
> Hi, I have a matrix of 300,000*115 (snps*individuals). I ran the PCA on the covariance matrix, which has a dimension of 115*115. The first 100 individuals are from group A and the remaining 15 individuals are from group B. I need to plot the data in two and three dimensions (in 2D with respect to PC1 and PC2, and in 3D with respect to PC1, PC2 and PC3). I do not know how to plot the first 100 points, corresponding to group A, in red (for example) and the remaining 15 points in blue, i.e. I want each group in a different color in the same plot. I would appreciate it if someone could help. Thanks,

-- View this message in context: http://www.nabble.com/Plot-Principal-component-analysis-tp15700123p15717669.html
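The colouring part of the question can be sketched with base graphics. The data below are simulated (115 individuals by 50 markers, with group B shifted so that it separates), and `prcomp()` is applied to that matrix directly rather than to a precomputed 115*115 covariance matrix, so both are assumptions for illustration. A true rotating 3-D view would need an add-on package such as scatterplot3d or rgl; `pairs()` is used here as a base-R substitute showing all pairwise views of PC1 to PC3.

```r
set.seed(1)
# Simulated stand-in: 115 individuals (rows) by 50 markers (columns)
geno <- matrix(rnorm(115 * 50), nrow = 115)
geno[101:115, ] <- geno[101:115, ] + 1      # shift group B so it separates

group <- factor(rep(c("A", "B"), times = c(100, 15)))
cols  <- c(A = "red", B = "blue")[as.character(group)]

pc     <- prcomp(geno)
scores <- pc$x[, 1:3]                        # PC1, PC2, PC3 for each individual

plot(scores[, 1], scores[, 2], col = cols, pch = 19,
     xlab = "PC1", ylab = "PC2")
legend("topright", legend = levels(group), col = c("red", "blue"), pch = 19)

# All pairwise views of the first three components
pairs(scores, col = cols, pch = 19)
```

The key step is building one colour per point in the same order as the rows, which is what `cols` does.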
Re: [R] Running randomForests on large datasets
Thank you Andy. It is throwing a memory allocation error for me for numerous combinations of ntree and nodesize values. I tried memory.limit() and memory.size to use the maximum memory, but the error was consistent.

One thing I noticed was that I previously had a tough time even just loading the dataset. I then used the Rcmdr library to load the same data, and it was faster than loading with the R console and it didn't throw any of the memory errors that it used to throw now and then. I thought this might be a fluke with Rcmdr, so I opened it a few more times, and every time Rcmdr loaded the large dataset without any allocation errors. I also tried opening a few other programs on the desktop and repeated the process; it loaded just fine. Any ideas on how Rcmdr is loading the file as opposed to the R console (I am using read.table())? Anyway, I thought I'd share this observation with the others. Thank you Andy for your ideas. I'll keep tinkering with the parameters.

Thank you, Nagu

On Wed, Feb 27, 2008 at 5:24 AM, Liaw, Andy [EMAIL PROTECTED] wrote:
> There are a couple of things you may want to try, if you can load the data into R and still have enough to spare:
> - Run randomForest() with fewer trees, say 10 to start with.
> - Run randomForest() with nodesize set to something larger than the default (5 for classification). This puts a limit on the size of the trees being grown. Try something like 21 and see if that runs, and adjust accordingly.
> HTH, Andy
>
> From: Nagu
>> Hi, I am trying to run randomForests on a dataset of size 50X650 and R pops up a memory allocation error. Are there any better ways to deal with large datasets in R? For example, Splus had something like the bigData library. Thank you, Nagu
[R] installing package
Dear all, I have prepared a package to install in R. I followed all the steps explained in "Writing R Extensions". I also read the section on add-on packages in "R Installation and Administration". These documents refer to the R CMD INSTALL command, but I don't know how to use this command. My specific question is: where should I type this command? The help page says

    Use R CMD INSTALL --help for more usage information

When I type this, I get

    R CMD INSTALL --help
    Error: unexpected symbol in "R CMD"

Is there something I am doing wrong? Thanks in advance, Zahra Mntazeri
Re: [R] multiple plots per page using hist and pdf
On Feb 27, 2008, at 11:45 AM, Gavin Simpson wrote:
> On Wed, 2008-02-27 at 11:31 -0500, Ben Tupper wrote:
>> Hello, I am puzzled by the behavior of hist() when generating multiple plots per page on the pdf device. In the following example two pdf files are generated. The first results in 4 plots on one pdf page as expected. However, the second, which swaps one of the plot() calls for hist(), results in a 4 page pdf with one plot per page. How might I get the histogram with 3 other scatter plots onto a single pdf page?
> Look a bit more closely and you'll see what is wrong ;-) In the second example, you forgot the par(mfrow=c(2,2)) bit, so of course there was no split plotting region.

Hi, Oh, for goodness sake! Why are troubles for newbies so prominently displayed under the nose? But it is oh so good to get that sorted out.

> Note that the par(mfrow=c(2,2)) bit in the first version (with all plot() calls) pertains to the pdf device you just opened; it doesn't persist, as you closed that device after plotting with dev.off(). In the second example you need to change the par again to what you require. And as a result, your orig.par and par(orig.par) are irrelevant here, as you didn't change any device that was open when you reset the parameters using par(orig.par).

OK, so as long as I am working within one device the parameters in par() are sticky. Got it! Thanks! Ben

> This is how your second example should have been called:
>     pdf(file = "hist_and_plots.pdf")
>     ## set up the new plotting device (pdf)
>     par(mfrow = c(2,2))
>     ## draw the plot
>     hist(iris$Sepal.Length, main = "Plot 1")
>     plot(iris$Petal.Length, iris$Petal.Width, main = "Plot 2")
>     plot(iris$Sepal.Length, iris$Petal.Length, main = "Plot 3")
>     plot(iris$Sepal.Width, iris$Petal.Width, main = "Plot 4")
>     ## close the device to do the drawing
>     dev.off()
> HTH G
>
>> platform powerpc-apple-darwin8.10.1
>> version.string R version 2.6.1 (2007-11-26)
>> Thanks! Ben

Ben Tupper [EMAIL PROTECTED]
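The point that par() settings belong to a device can be checked directly. This is a small sketch writing to temporary files (the file names are placeholders); it records the layout on a first device, closes it, and shows that a newly opened device starts again from the default.

```r
f1 <- tempfile(fileext = ".pdf")
f2 <- tempfile(fileext = ".pdf")

pdf(f1)
par(mfrow = c(2, 2))
first_layout <- par("mfrow")     # 2 x 2 on the first device
dev.off()

pdf(f2)
second_layout <- par("mfrow")    # a fresh device starts again at 1 x 1
par(mfrow = c(2, 2))             # so the layout has to be set again
hist(iris$Sepal.Length, main = "Plot 1")
plot(iris$Petal.Length, iris$Petal.Width, main = "Plot 2")
plot(iris$Sepal.Length, iris$Petal.Length, main = "Plot 3")
plot(iris$Sepal.Width, iris$Petal.Width, main = "Plot 4")
dev.off()
```

The second file then contains a single page with the histogram and the three scatter plots.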
Re: [R] column name handling and long labels
Werner Wernersen wrote:
> Somehow, I don't get how the labels of Hmisc work. My expectation was that if I use the following code and then the print method, I would get an output where the headers are replaced by the labels, but I get the normal variable names. How can I get the labels as headers instead in the printed table?
>     df <- data.frame(x=seq(1,3), y=seq(4,6))
>     df <- upData(df, labels=c(x="X1", y="X2"))
>     print(df)
> Thanks again, Werner

The labels are generally too long for printing. They are instead used for axis labels in plotting and for various statistical tables (e.g. those generated by summary.formula). Above all they are used for fully annotating data frames (see the contents and describe functions). Frank

>>> Hi, I have two loosely related questions which could make my life a bit easier again:
>>> 1) Is there a simple way to select a range of columns in a data frame using column names? I am thinking of something like mydf[1,col4:col8]
>> Try this using the builtin data frame anscombe, which has columns x1 to x4 followed by y1 to y4:
>>     subset(anscombe, select = x3:y2)
>>> 2) I have a data frame with many columns and they all have short variable names, which is good in most cases, but sometimes it would be nice to also have a longer descriptive name / label attached to the variable which could then be used for printing and latex output. Has anybody come up with a convenient way to do that? Right now, I always use match, or merge in the case of row names.
>> See ?label in package Hmisc.
>>> Many thanks, Werner

--
Frank E Harrell Jr, Professor and Chair, School of Medicine, Department of Biostatistics, Vanderbilt University
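Both questions in this thread can be tried without any add-on package. The sketch below is a base-R illustration, not Hmisc's own mechanism being called: it stores a "label" attribute on each column by hand (the same attribute name Hmisc uses) and prints a relabelled copy, leaving the short names in place for computation. The labels "X1" and "X2" come from Werner's example.

```r
# Select a run of columns by name (anscombe has x1..x4 then y1..y4)
sel <- subset(anscombe, select = x3:y2)

# The same with plain indexing, via the positions of the two end names
idx  <- match(c("x3", "y2"), names(anscombe))
sel2 <- anscombe[, idx[1]:idx[2]]

# A long label kept as an attribute next to the short name
df <- data.frame(x = 1:3, y = 4:6)
attr(df$x, "label") <- "X1"
attr(df$y, "label") <- "X2"

# Print with the labels as headers without changing df itself
labs <- vapply(df, function(col) attr(col, "label"), character(1))
print(setNames(df, labs))
```

Keeping the labels as attributes means code can continue to use `df$x` and `df$y`, and the longer names only appear at print time.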
Re: [R] jpeg in batch mode
Thanks for the quick reply, and sorry that I didn't see the documentation that could help me. I actually didn't see anything on the jpeg() help pages, or didn't recognize what was there as relating to my problem. I will look further in the archives if this plus Henrik's reply is not enough. Best, Elizabeth

Prof Brian Ripley wrote:
> Well, this is documented and there are solutions even on the jpeg() help page and many more in the archives. Basically
> 1) Use an Xvfb X server
> 2) Use bitmap()
> 3) Use an alternative such as GDD or Cairo (but for me those do a poor job on symbol fonts).
> The good news is that R 2.7.0 will have a better solution, producing JPEGs without using X11.
>
>> R-2.6.0, GNU/Linux. Thanks, Elizabeth
>> My caveat from above about the jpeg working as long as I'm signed on: one time I got the mysterious error:
>>     jpeg("~/batch5Effect/ProbPbsets_summaryHeatmaps%03d.jpeg", height=1200, width=800)
>>     Error in jpeg("~/batch5Effect/ProbPbsets_summaryHeatmaps%03d.jpeg", height = 1200, : X11 fatal IO error: please save work and shut down R
>> Even though it had just correctly done this same basic command a few lines back. I redid it (without changing anything that I was aware of) and it worked fine. So I don't think it's related.
> That's an X11 problem, not an R problem.
Re: [R] selecting consecutive records over a threshold
Convert the data frame to a zoo object and note that:

    diff(-rollmax(-z, 2) > 2) > 0
    diff(rollmax(z, 2) < 1) > 0

have 1 at the start and end of the storm period respectively, so that the cumsum of their difference has ones for the storm period. In the last line we extract that portion.

    # input
    DF <- data.frame(HourCount = 1:14, Amt = c(0, 0, 0.3, 3, 4, 8, 10, 15, 12, 6, 4, 3, 0.2, 0.2))
    library(zoo)
    z <- with(DF, zoo(Amt, HourCount))
    r <- cumsum((diff(-rollmax(-z, 2) > 2) > 0) - (diff(rollmax(z, 2) < 1) > 0))
    window(z, time(r[r > 0]))

(You may have to use the align="right" argument in both rollmax occurrences depending on how you want to define a storm period.) See ?rollmax and the three vignettes on the zoo package for more info.

On Wed, Feb 27, 2008 at 12:00 PM, Jamie Ledingham [EMAIL PROTECTED] wrote:
> Dear all, I am having trouble working out how I might do the following and would appreciate any thoughts. I am working with data concerning precipitation. The data are in 2 columns in a data frame called storm in the following format: HourCount - 1,2,3,4,5,6,7,8,...48 Amt - 0,0,0.3,3,4,8,10,15,12,6,4,3,0.2,0.2... There are 48 hours worth of data. I am trying to extract a storm. My storm is defined as a threshold - when the amount is greater than 2 for 2 hours the storm starts, and when the amount is less than 1 for two hours the storm ends. I can extract data above thresholds but it is obviously important to be able to extract consecutive records to capture the whole storm. Can anybody help? Thanks
> Jamie Ledingham, PhD Researcher, University of Newcastle Upon Tyne
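As an alternative that needs no add-on package, the same storm definition can be sketched in base R. This is a different technique from the zoo approach above, and the helper names `two_in_a_row` and `extract_storm` are invented for illustration. It assumes one reading of the definition: the storm starts at the first of two consecutive hours above 2 and ends at the hour before the first of two consecutive hours below 1, and only the first storm in the record is returned.

```r
storm <- data.frame(
  HourCount = 1:14,
  Amt = c(0, 0, 0.3, 3, 4, 8, 10, 15, 12, 6, 4, 3, 0.2, 0.2)
)

# TRUE at i when the condition holds at both hour i and hour i + 1
two_in_a_row <- function(x) x & c(x[-1], FALSE)

extract_storm <- function(d, start_above = 2, end_below = 1) {
  starts <- which(two_in_a_row(d$Amt > start_above))
  if (length(starts) == 0) return(d[0, ])        # no storm found
  first <- starts[1]
  dry <- which(two_in_a_row(d$Amt < end_below))  # starts of 2-hour dry spells
  dry <- dry[dry > first]
  last <- if (length(dry) > 0) dry[1] - 1 else nrow(d)
  d[first:last, ]
}

event <- extract_storm(storm)
```

On the sample data this keeps hours 4 to 12; a record with several storms would need the function applied repeatedly to the remaining rows.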
Re: [R] installing package
Yes. R CMD INSTALL is run from the operating system's command line, not from the R prompt. If you're building this package on Windows, this is done at a DOS prompt in your R source directory.

-----Original Message-----
From: [EMAIL PROTECTED] on behalf of [EMAIL PROTECTED]
Sent: Wed 2/27/2008 1:02 PM
To: r-help@r-project.org
Subject: [R] installing package

> Dear all, I have prepared a package to install in R. I followed all the steps explained in "Writing R Extensions". I also read the section on add-on packages in "R Installation and Administration". These documents refer to the R CMD INSTALL command, but I don't know how to use this command. My specific question is: where should I type this command? The help page says
>     Use R CMD INSTALL --help for more usage information
> When I type this, I get
>     R CMD INSTALL --help
>     Error: unexpected symbol in "R CMD"
> Is there something I am doing wrong? Thanks in advance, Zahra Mntazeri
Re: [R] Running randomForests on large datasets
Also, use the non-formula interface to the function:

  # saves some space
  randomForest(x, y)

rather than the formula interface:

  # avoid:
  randomForest(y ~ ., data = something)

This second method saves a terms object that is very sparse and takes up a lot of space.

Max

On Wed, Feb 27, 2008 at 12:31 PM, Nagu [EMAIL PROTECTED] wrote:
> Thank you Andy.
>
> It is throwing a memory allocation error for me for numerous combinations of ntree and nodesize values. I tried memory.limit() and memory.size() to use the maximum memory, but the error was consistent. One thing I noticed, though, was that I previously had a tough time even just loading the dataset. I then used the Rcmdr library to load the same data; it was faster than loading with the R console, and it didn't throw the memory errors it used to throw now and then. I thought that maybe this was a fluke with Rcmdr, so I opened it a few more times, and every time Rcmdr loaded the large dataset without any allocation errors. I also tried opening a few other programs on the desktop and repeated the process; it loaded just fine.
>
> Any ideas on how Rcmdr loads the file as opposed to the R console (I am using read.table())?
>
> Anyway, I thought I'd share this observation with the others. Thank you Andy for your ideas. I'll keep tinkering with the parameters.
>
> Thank you,
> Nagu
>
> On Wed, Feb 27, 2008 at 5:24 AM, Liaw, Andy [EMAIL PROTECTED] wrote:
>> There are a couple of things you may want to try, if you can load the data into R and still have enough memory to spare:
>>
>> - Run randomForest() with fewer trees, say 10 to start with.
>> - Run randomForest() with nodesize set to something larger than the default (5 for classification). This puts a limit on the size of the trees being grown. Try something like 21 and see if that runs, and adjust accordingly.
>>
>> HTH,
>> Andy
>>
>> From: Nagu
>>> Hi,
>>>
>>> I am trying to run randomForest on a dataset of size 50X650 and R pops up a memory allocation error. Are there any better ways to deal with large datasets in R? For example, Splus had something like the bigData library.
>>>
>>> Thank you,
>>> Nagu

--
Max
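The suggestions in this thread (non-formula interface, fewer trees, larger nodesize) can be combined as in the sketch below. It uses the built-in iris data purely as a small stand-in for the poster's large dataset, and it guards the call because the randomForest package may not be installed; the values 10 and 21 are the starting points suggested above, not tuned settings.

```r
# Build the predictor data frame and the response vector once, up front
x <- iris[, setdiff(names(iris), "Species")]
y <- iris$Species

# The randomForest package may not be installed, so guard the call
if (requireNamespace("randomForest", quietly = TRUE)) {
  # Non-formula interface, with few trees and larger terminal nodes
  fit <- randomForest::randomForest(x, y, ntree = 10, nodesize = 21)
  print(fit)
}
```

Passing x and y directly avoids storing the terms object that the formula interface keeps alongside the fit.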
Re: [R] Sweave produces gibberish instead of apostrophe in pdf
On Wed, 27 Feb 2008, Gavin Simpson wrote:

> On Wed, 2008-02-27 at 11:35 +0100, Paul Hiemstra wrote:
>> Dear All,
>>
>> I am trying to use Sweave to make a document. But when I run the Sweave() command on it and build a pdf with pdflatex (3.141592-1.40.3), my apostrophes are replaced by gibberish (an 'a' with a hat on it, a capital A with an arc pointing upwards on it and a capital Y with two dots on it). If I manually replace the apostrophes using the keyboard, I get a different-looking apostrophe and the output is correct. I'm using Debian Linux (Lenny) with TeX Live, Kile and R 2.6.1.
>
> I presume that R is running in a UTF-8 locale on your Debian box (or some other locale that is different from the one pdflatex is working in);

His email seems to be in UTF-8, but as he failed to follow the posting guide I decided not to respond with unnecessary guesswork.

> the fancy quotes used in some print methods in R aren't available in all locale/font encodings, and these get interpreted as the gibberish you are seeing.
>
> Stick this in your preamble and see if it works (you might need to install a TeX Live package from your usual Debian repository to get this [LaTeX] package installed):
>
> \usepackage[utf8x]{inputenc}
>
> It did for me on my Fedora box when I first came across this issue.

The standard 'spell' is \usepackage[utf8]{inputenc}: that is what R itself uses when making a package manual if the encoding is UTF-8. That should come with any reasonably recent LaTeX (dates in the LaTeX world are imaginary, but I think it is from the '2003' release).

> HTH
>
> G
>
>> This is a sample; the problem is in the lm output, in the "Signif. codes" line:
>>
>> \documentclass[a4paper,10pt]{article}
>> \title{Spam}
>> \author{F. Bar}
>> \begin{document}
>> <<reg>>=
>> n <- 50
>> x <- seq(1, n)
>> a.true <- 3
>> b.true <- 1.5
>> y.true <- a.true + b.true * x
>> s.true <- 17.3
>> y <- y.true + s.true * rnorm(n)
>> out1 <- lm(y ~ x)
>> summary(out1)
>> @
>> \end{document}
>>
>> And the resulting .tex file:
>>
>> \documentclass[a4paper,10pt]{article}
>> \title{Spam}
>> \author{F. Bar}
>> \usepackage{Sweave}
>> \begin{document}
>> \begin{Schunk}
>> \begin{Sinput}
>> > n <- 50
>> > x <- seq(1, n)
>> > a.true <- 3
>> > b.true <- 1.5
>> > y.true <- a.true + b.true * x
>> > s.true <- 17.3
>> > y <- y.true + s.true * rnorm(n)
>> > out1 <- lm(y ~ x)
>> > summary(out1)
>> \end{Sinput}
>> \begin{Soutput}
>> Call:
>> lm(formula = y ~ x)
>>
>> Residuals:
>>      Min       1Q   Median       3Q      Max
>> -31.9565  -9.4745  -0.1708   7.3759  44.6538
>>
>> Coefficients:
>>             Estimate Std. Error t value Pr(>|t|)
>> (Intercept) -0.07386    4.64712  -0.016    0.987
>> x            1.57405    0.15860   9.924 3.25e-13 ***
>> ---
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>
>> Residual standard error: 16.18 on 48 degrees of freedom
>> Multiple R-Squared: 0.6723, Adjusted R-squared: 0.6655
>> F-statistic: 98.49 on 1 and 48 DF, p-value: 3.245e-13
>> \end{Soutput}
>> \end{Schunk}
>> \end{document}
>>
>> cheers and thanks for any help,
>> Paul
>
> --
> Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
> ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
> Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
> Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
> UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] Sweave produces gibberish instead of apostrophe in pdf
On Wed, 2008-02-27 at 20:33 +, Prof Brian Ripley wrote:
> On Wed, 27 Feb 2008, Gavin Simpson wrote:
>> On Wed, 2008-02-27 at 11:35 +0100, Paul Hiemstra wrote:

<snip />

>> Stick this in your preamble and see if it works (you might need to install a TeX Live package from your usual Debian repository to get this [LaTeX] package installed):
>>
>> \usepackage[utf8x]{inputenc}
>>
>> It did for me on my Fedora box when I first came across this issue.
>
> The standard 'spell' is \usepackage[utf8]{inputenc}: that is what R itself uses when making a package manual if the encoding is UTF-8. That should come with any reasonably recent LaTeX (dates in the LaTeX world are imaginary, but I think it is from the '2003' release).

Thank you for the correction/clarification, Prof. Ripley.

If you'll permit some further 'guesswork' on my part: IIRC, I had a problem with \usepackage[utf8]{inputenc} on an early Fedora Core, but \usepackage[utf8x]{inputenc} worked, and I am a creature of habit at times. Hence \usepackage[utf8x]{inputenc} was in the Rnw file I looked at to check I'd got this 'correct'.

Interestingly, the CTAN page for the unicode package:

http://www.ctan.org/tex-archive/macros/latex/contrib/unicode/

contains the instruction \usepackage[utf8x]{inputenc}

Having checked on my Fedora 8 machines, both incantations produce the desired result.

All the best,

Gavin

--
Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
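An alternative to changing the LaTeX input encoding is to stop R from emitting the directional quotes in the first place. The following is a minimal sketch, assuming a version of R that supports the useFancyQuotes option; it uses the built-in cars data rather than the poster's simulated data, and the same options() call could be placed in an early Sweave chunk.

```r
# Ask sQuote()/dQuote() to use plain ASCII quotes, remembering the old setting
old <- options(useFancyQuotes = FALSE)

q <- sQuote("***")
print(q)

# The "Signif. codes" legend of a model summary should then be plain ASCII
fit <- lm(dist ~ speed, data = cars)
out <- capture.output(print(summary(fit)))
legend_line <- grep("Signif. codes", out, value = TRUE)
print(legend_line)

options(old)  # restore the previous setting
```

This sidesteps the encoding mismatch rather than fixing it, so any other non-ASCII text in the document would still need one of the inputenc lines discussed above.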
[R] Error in cor.default(x1, x2) : missing observations in cov/cor
Hello,

I'm trying to do cor(x1, x2) and I get the following error:

  Error in cor.default(x1, x2) : missing observations in cov/cor

A few things:

1. I've used cor() many times and have never encountered this error.
2. length(x1) = length(x2)
3. is.numeric(x1) = is.numeric(x2) = TRUE
4. which(is.na(x1)) = which(is.na(x2)) = integer(0) {the same goes for is.nan()}
5. I also tried cor(x1, x2, use = "all.obs") and got the same error.

What can be going wrong?

--
View this message in context: http://www.nabble.com/Error-in-cor.default%28x1%2C-x2%29-%3A-missing-observations-in-cov-cor-tp15723848p15723848.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Reading a file created with Fortran
Ben Bolker [EMAIL PROTECTED] wrote:

> Dennis Fisher <fisher at plessthan.com> writes:
>
>> I am trying to read a file written by Fortran. Several lines of the file are pasted below:
>
> Perhaps read.fwf is what you want? (fwf stands for fixed width format.) You would have to work out the field widths, but that would seem to be pretty straightforward.

A couple of points.

First, since you know the format statement, perhaps you control the Fortran program. If so, it might be nicer to introduce whitespace between the data items. That would serve two purposes: making read.table() work on the data set, and making it easier for humans to check the data file.

Second, you could look at read.fortran(), a function that takes a lightly modified Fortran format specification as an argument. That seems even better for your purposes than read.fwf.

--
Mike Prager, NOAA, Beaufort, NC
* Opinions expressed are personal and not represented otherwise.
* Any use of tradenames does not constitute a NOAA endorsement.
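To make the difference between the two suggestions concrete, here is a small sketch on a made-up two-line file with no separators between fields. The data and the format strings are illustrative only, since the original poster's file layout is not shown in this message.

```r
# Write a small fixed-width file with no separators between fields
ff <- tempfile()
cat("123456", "987654", sep = "\n", file = ff)

# read.fwf(): give the width of each field
a <- read.fwf(ff, widths = c(2, 2, 2))

# read.fortran(): give a Fortran-style format instead.
# F2.1 = 2 columns with 1 implied decimal place, F2.0 = 2 columns,
# I2 = 2-column integer
b <- read.fortran(ff, c("F2.1", "F2.0", "I2"))

print(a)
print(b)
unlink(ff)
```

With read.fortran() the implied decimal places are handled for you, whereas with read.fwf() any such scaling has to be applied afterwards.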
Re: [R] Error in cor.default(x1, x2) : missing observations in cov/cor
Ken Spriggs wrote:
> Hello,
>
> I'm trying to do cor(x1, x2) and I get the following error:
>
>   Error in cor.default(x1, x2) : missing observations in cov/cor
>
> A few things:
>
> 1. I've used cor() many times and have never encountered this error.
> 2. length(x1) = length(x2)
> 3. is.numeric(x1) = is.numeric(x2) = TRUE
> 4. which(is.na(x1)) = which(is.na(x2)) = integer(0) {the same goes for is.nan()}
> 5. I also tried cor(x1, x2, use = "all.obs") and got the same error.
>
> What can be going wrong?

Er, this is strange. As far as I can see, cor is not normally generic, and

  > getAnywhere(cor.default)
  no object named ‘cor.default’ was found

so something is not normal. What do you get from a traceback() after the error, which is your cor.default, and what is the class of x1 and x2? Do you perchance have some special packages loaded?

--
Peter Dalgaard
Øster Farimagsgade 5, Entr.B
Dept. of Biostatistics, PO Box 2099, 1014 Cph. K
University of Copenhagen, Denmark
Ph: (+45) 35327918
([EMAIL PROTECTED]) FAX: (+45) 35327907
Re: [R] Error in cor.default(x1, x2) : missing observations in cov/cor
What happens when you type

  cor

at the R prompt? Perhaps your call to cor is not reaching the cor function in the stats package?

Ken Spriggs wrote:
> Hello,
>
> I'm trying to do cor(x1, x2) and I get the following error:
>
>   Error in cor.default(x1, x2) : missing observations in cov/cor
>
> A few things:
>
> 1. I've used cor() many times and have never encountered this error.
> 2. length(x1) = length(x2)
> 3. is.numeric(x1) = is.numeric(x2) = TRUE
> 4. which(is.na(x1)) = which(is.na(x2)) = integer(0) {the same goes for is.nan()}
> 5. I also tried cor(x1, x2, use = "all.obs") and got the same error.
>
> What can be going wrong?
[R] plot y1 and y2 on one graph
Dear all,

I have code like this:

  x <- 1:10
  y1 <- x + runif(10) * 2
  y2 <- seq(0, 50, length.out = 10) + rnorm(10) * 10
  par(mfrow = c(1, 2))
  plot(y1 ~ x)
  plot(y2 ~ x)

Now I would like to plot y1 and y2 on the same graph, each with its own scale (y1 on the left side and y2 on the right side).

Any help is welcome.

Kind regards
Miltinho
Brazil
Re: [R] numeric format
Rolf Turner [EMAIL PROTECTED] wrote:

> I have often wanted to suppress these row numbers and for that purpose wrote the following version of print.data.frame() [...]
> The ``srn'' argument means ``suppress row numbers''; [...]
> I once suggested to an R Core person that my version of print.data.frame() be adopted as the system version, but was politely declined.

Rolf,

Clearly, and appropriately, R development is not a democratic process. Still, if a vote were held, I would support your version. I have also needed to suppress row names from time to time.

--
Mike Prager, NOAA, Beaufort, NC
* Opinions expressed are personal and not represented otherwise.
* Any use of tradenames does not constitute a NOAA endorsement.
Re: [R] Label outliers in boxplot
> Simply, when I make the boxplot I use boxplot(as.data.frame(z)). I would like the rownames to appear close to the outliers in the boxplot. Is there a command to do it? I've seen the identify() function but I'm not able to obtain any results.

Your data frame came through scrambled for me. The posting guide has a hint on making it easier for readers to help you:

"When providing examples, it is best to give an R command that constructs the data ... For more complicated data structures, dump("x", file=stdout()) will print an expression that will recreate the object x."

That aside, there's some help for you in '?boxplot'. See the values in the returned object; the outliers are identified already. You can call them out separately (http://tolstoy.newcastle.edu.au/R/help/05/09/12735.html) or get them with the identify function (http://tolstoy.newcastle.edu.au/R/e2/help/07/01/8598.html). Note that the last "Identify" command in the latter link should be a lowercase "identify". I got those by searching the mail archives (originally).

-
David Hewitt
Virginia Institute of Marine Science
http://www.vims.edu/fish/students/dhewitt/

--
View this message in context: http://www.nabble.com/Label-outliers-in-boxplot-tp15712733p15725405.html
Sent from the R help mailing list archive at Nabble.com.
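As a sketch of the first route (using the object returned by boxplot()), the code below labels each outlier with its row name. The data are made up because the original data frame did not survive posting, and matching outliers back to rows by value is an assumption that works only when the outlying values are unique within a column.

```r
set.seed(1)
# Made-up data: 20 ordinary values plus one extreme value per column
z <- data.frame(a = c(rnorm(20), 8), b = c(rnorm(20), -9))
rownames(z) <- paste("obs", seq_len(nrow(z)), sep = "")

pdf(tempfile(fileext = ".pdf"))  # any graphics device would do
b <- boxplot(z)

# b$out holds the outlier values; b$group holds the box (column) of each one
labs <- mapply(function(value, g) rownames(z)[which(z[[g]] == value)[1]],
               b$out, b$group)

# Write each row name just to the right of its outlier
text(b$group, b$out, labels = labs, pos = 4, cex = 0.8)
dev.off()
```

For interactive use, identify() does the same job by letting you click on the points to be labelled.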
Re: [R] plot y1 and y2 on one graph
This should do what you want:

  x <- 1:10
  y1 <- x + runif(10) * 2
  y2 <- seq(0, 50, length.out = 10) + rnorm(10) * 10
  plot(y1 ~ x, bty = 'c')
  par(new = TRUE)  # plot on the same graph
  plot(y2 ~ x, col = 'red', axes = FALSE, bty = 'c', xlab = '', ylab = '')
  axis(4, col.axis = 'red', col = 'red')
  mtext("y2", 4, col = 'red', line = -2)

On Wed, Feb 27, 2008 at 5:05 PM, milton ruser [EMAIL PROTECTED] wrote:
> Dear all,
>
> I have code like this:
>
>   x <- 1:10
>   y1 <- x + runif(10) * 2
>   y2 <- seq(0, 50, length.out = 10) + rnorm(10) * 10
>   par(mfrow = c(1, 2))
>   plot(y1 ~ x)
>   plot(y2 ~ x)
>
> Now I would like to plot y1 and y2 on the same graph, each with its own scale (y1 on the left side and y2 on the right side).
>
> Any help is welcome.
>
> Kind regards
> Miltinho
> Brazil

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve? Tell me what you want to do, not how you want to do it.
[R] multi-level hierarchical logistic regression with sampling weight
Hi,

I would like to run a multi-level hierarchical logistic regression model with sampling weights. Is this possible with R?

Thanks a lot,
Qian Guo
Re: [R] Error in cor.default(x1, x2) : missing observations in cov/cor
I get the following:

  > class(x1)
  [1] "numeric"
  > class(x2)
  [1] "numeric"

and:

  > cor(x1, x2)
  Error in cor.default(x1, x2) : missing observations in cov/cor
  > traceback()
  2: cor.default(x1, x2)
  1: cor(x1, x2)

Peter Dalgaard wrote:
> Ken Spriggs wrote:
>> Hello,
>>
>> I'm trying to do cor(x1, x2) and I get the following error:
>>
>>   Error in cor.default(x1, x2) : missing observations in cov/cor
>>
>> A few things:
>>
>> 1. I've used cor() many times and have never encountered this error.
>> 2. length(x1) = length(x2)
>> 3. is.numeric(x1) = is.numeric(x2) = TRUE
>> 4. which(is.na(x1)) = which(is.na(x2)) = integer(0) {the same goes for is.nan()}
>> 5. I also tried cor(x1, x2, use = "all.obs") and got the same error.
>>
>> What can be going wrong?
>
> Er, this is strange. As far as I can see, cor is not normally generic, and
>
>   > getAnywhere(cor.default)
>   no object named ‘cor.default’ was found
>
> so something is not normal. What do you get from a traceback() after the error, which is your cor.default, and what is the class of x1 and x2? Do you perchance have some special packages loaded?
>
> --
> Peter Dalgaard
> Øster Farimagsgade 5, Entr.B
> Dept. of Biostatistics, PO Box 2099, 1014 Cph. K
> University of Copenhagen, Denmark
> Ph: (+45) 35327918
> ([EMAIL PROTECTED]) FAX: (+45) 35327907

--
View this message in context: http://www.nabble.com/Error-in-cor.default%28x1%2C-x2%29-%3A-missing-observations-in-cov-cor-tp15723848p15724665.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Error in cor.default(x1, x2) : missing observations in cov/cor
I get the following:

  > cor
  function (x, y = NULL, use = "all.obs", method = c("pearson", "kendall", "spearman"))
  {
      UseMethod("cor")
  }

Erik Iverson wrote:
> What happens when you type
>
>   cor
>
> at the R prompt? Perhaps your call to cor is not reaching the cor function in the stats package?
>
> Ken Spriggs wrote:
>> Hello,
>>
>> I'm trying to do cor(x1, x2) and I get the following error:
>>
>>   Error in cor.default(x1, x2) : missing observations in cov/cor
>>
>> A few things:
>>
>> 1. I've used cor() many times and have never encountered this error.
>> 2. length(x1) = length(x2)
>> 3. is.numeric(x1) = is.numeric(x2) = TRUE
>> 4. which(is.na(x1)) = which(is.na(x2)) = integer(0) {the same goes for is.nan()}
>> 5. I also tried cor(x1, x2, use = "all.obs") and got the same error.
>>
>> What can be going wrong?

--
View this message in context: http://www.nabble.com/Error-in-cor.default%28x1%2C-x2%29-%3A-missing-observations-in-cov-cor-tp15723848p15724597.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] ggplot2 boxplot confusion
Thierry,

1.

  ggplot(mydata, aes(y = VALUE, x = SERIES)) + geom_boxplot() + facet_grid(. ~ ID)

creates a grid with three ID columns (ID1, ID2, ID3) and six SERIES columns within each ID column, with two boxplots in each ID column: (C10, C2), (C15, C4), (C20, C8).

I was aiming for a grid with ID columns and SERIES rows. However, if I try something like this:

  ggplot(mydata, aes(y = VALUE, x = factor(1))) + geom_boxplot() + facet_grid(SERIES ~ ID)

I get an error:

  Error: position_dodge requires the following missing aesthetics: x

Yet this works fine for a single boxplot (as you showed previously) if I remove the facet_grid() command. Any ideas?

Or perhaps my only recourse is to build this up programmatically, such as (pseudo-ish code):

  for id, ser in ids, series:
      dat <- subset(mydata, (id %in% ids & ser %in% series))
      boxplot(dat) in grid position id, ser

Can I specify the plotting grid cell by cell with ggplot, or do I need to look at lattice graphics? (I'd like to stick with ggplot if I can.)

ONKELINX, Thierry wrote:
> Chris,
>
> 1. This will make more sense.
>
>   ggplot(mydata, aes(y = VALUE, x = SERIES)) + geom_boxplot() + facet_grid(. ~ ID)
>
> 2. Now I think I understand what you want. I'm afraid that won't be easy, because you're trying to mix continuous variables with categorical ones on the same scale. A density plot has two continuous scales: VALUE and its density. The boxplot has a continuous scale (VALUE) and the other is categorical. Maybe Hadley knows a solution for your problem.
>
> Thierry

--
View this message in context: http://www.nabble.com/ggplot2-boxplot-confusion-tp15706116p15725522.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Custom LaTeX tables
Thanks for the answers! I will play around a bit, and maybe I really need to write a custom wrapper then.

Best,
Werner

--- Gabor Grothendieck [EMAIL PROTECTED] wrote:

> On Tue, Feb 26, 2008 at 10:58 AM, Werner Wernersen [EMAIL PROTECTED] wrote:
>> Hello,
>>
>> I am very happy that I have Sweave and R to write my papers. But I still have to do some tables by hand, since I have not found out how to customize the LaTeX tables produced by R further (I mainly use xtable()). For instance, I have a table which needs an extra row every few rows as a group header, and sometimes I want some extra horizontal lines in the table and also a multicolumn heading.
>>
>> How do you guys cope with such cases? Do you set the table by hand in the end, or have you found a neat way to deal with this?
>>
>> Many thanks and regards,
>> Werner
>
> A few options are:
>
> - Hmisc latex() supports multicolumn headings and group headings, although the large number of arguments may be daunting.
>
> - xtable (and Hmisc's latex too) supports a style of combining LaTeX fragments with the xtable. The add.to.row= argument of print.xtable is the one to notice. E.g., using the builtin BOD data frame, this adds a 2nd row of headings and a group heading:
>
>   print(xtable(BOD, align = "r|r|r|"), include.rownames = FALSE,
>         add.to.row = list(pos = list(0, 3),
>           command = c("\\multicolumn{1}{|c|}{(days)} & \\multicolumn{1}{|c|}{(mg/l)} \\\\ ",
>                       "\\hline \\multicolumn{2}{|l|}{Special Values} \\\\ \\hline ")))
>
> - Given the freedom from restrictions, I find it's often just best to do it manually in LaTeX or, if you have many tables with the same format in a report, to write a report-specific table layout wrapper that emits the LaTeX you need.
Re: [R] Error in cor.default(x1, x2) : missing observations in cov/cor
OK, that is not the definition of cor in the stats package. Some add-on package you are loading might be overwriting it. What happens if you do

  stats::cor(x1, x2)

?

Ken Spriggs wrote:
> I get the following:
>
>   > cor
>   function (x, y = NULL, use = "all.obs", method = c("pearson", "kendall", "spearman"))
>   {
>       UseMethod("cor")
>   }
>
> Erik Iverson wrote:
>> What happens when you type
>>
>>   cor
>>
>> at the R prompt? Perhaps your call to cor is not reaching the cor function in the stats package?
>>
>> Ken Spriggs wrote:
>>> Hello,
>>>
>>> I'm trying to do cor(x1, x2) and I get the following error:
>>>
>>>   Error in cor.default(x1, x2) : missing observations in cov/cor
>>>
>>> A few things:
>>>
>>> 1. I've used cor() many times and have never encountered this error.
>>> 2. length(x1) = length(x2)
>>> 3. is.numeric(x1) = is.numeric(x2) = TRUE
>>> 4. which(is.na(x1)) = which(is.na(x2)) = integer(0) {the same goes for is.nan()}
>>> 5. I also tried cor(x1, x2, use = "all.obs") and got the same error.
>>>
>>> What can be going wrong?
Re: [R] Error in cor.default(x1, x2) : missing observations in cov/cor
On 28/02/2008, at 11:11 AM, Ken Spriggs wrote:

> I get the following:
>
>   > class(x1)
>   [1] "numeric"
>   > class(x2)
>   [1] "numeric"
>
> and:
>
>   > cor(x1, x2)
>   Error in cor.default(x1, x2) : missing observations in cov/cor
>   > traceback()
>   2: cor.default(x1, x2)
>   1: cor(x1, x2)

``Clearly'' you must be using a non-standard cor. As Peter Dalgaard pointed out, cor() is not generic and there is no such function as cor.default() in ``standard R''.

Suggestions:

* Try find("cor") to see which package's cor you are actually using.

* Detach that package from the search list (or remove cor from that position on the search list); then you'll get the ``standard R'' version of cor() and all will be well.

cheers,

Rolf Turner
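The diagnosis reached in this thread can be reproduced in a few lines. In the sketch below, the masking cor and cor.default are hypothetical stand-ins defined in the workspace; in the poster's session they presumably came from an add-on package that was never identified.

```r
set.seed(1)
x1 <- rnorm(20)
x2 <- x1 + rnorm(20)

# Hypothetical stand-ins for whatever replaced cor() in the poster's session
cor <- function(x, y = NULL, ...) UseMethod("cor")
cor.default <- function(x, y = NULL, ...) stop("missing observations in cov/cor")

# find() lists every place on the search path that defines "cor"
print(find("cor"))

# Calling the stats version explicitly bypasses the masking definition
r <- stats::cor(x1, x2)
print(r)

# Removing the masking objects restores the usual behaviour of a bare cor()
rm(cor, cor.default)
same <- identical(cor, stats::cor)
print(same)
```

When the masking copy lives in an attached package rather than the workspace, detach() on that package plays the role that rm() plays here.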
Re: [R] ggplot2 boxplot confusion
Hi Hadley,

First off, thanks for ggplot2 and everything that bringing it to life and sustaining it entails.

I noticed the coord flip problem during my ggplot investigations. Is this something I can override by getting into the code?

On the coord flipping problem, I was thinking of one of the following:

- grab the density data explicitly, swap x and y, and then plot it as a scatter plot with a box plot overlaid;
- just draw the density plots with vertical lines at the median, IQR, etc.;
- draw the density plots and fake a boxplot by drawing bars explicitly.

I'm hoping you can at least advise which, if any, of these routes is likely to be a dead end.

regards
Chris

hadley wrote:
>> Now I think I understand what you want. I'm afraid that won't be easy, because you're trying to mix continuous variables with categorical ones on the same scale. A density plot has two continuous scales: VALUE and its density. The boxplot has a continuous scale (VALUE) and the other is categorical. Maybe Hadley knows a solution for your problem.
>
> Well, one idea is:
>
>   ggplot(diamonds, aes(x = price)) +
>     geom_density(aes(min = -..density.., adjust = 0.5), fill = "grey50", colour = NA) +
>     facet_grid(. ~ cut) + coord_flip()
>
> which looks like it would naturally fit with a boxplot overlaid on top of it. However, it's currently not possible because the boxplot is parameterised so that it is always horizontal, while the density is vertical. In the above example I have flipped the coordinate system, but that flips both the density plot and the boxplot.
>
> Hadley
>
> --
> http://had.co.nz/

--
View this message in context: http://www.nabble.com/ggplot2-boxplot-confusion-tp15706116p15725753.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] multi-level hierarchical logistic regression with sampling weight
On 28/02/2008, at 11:28 AM, GUO, Qian wrote:

> Hi I would like to run a multi-level hierarchical logistic regression model with sampling weight? Is this possible with R?

Yes. In R, all things are *possible*. :-)

cheers,
Rolf Turner
Re: [R] multiple plots per page using hist and pdf
I think you need to set par(mfrow=c(2,2)) again, after opening the second pdf device and before plotting the second set of graphs.

--- Ben Tupper [EMAIL PROTECTED] wrote:

> Hello,
>
> I am puzzled by the behavior of hist() when generating multiple plots per page on the pdf device. In the following example two pdf files are generated. The first results in 4 plots on one pdf page as expected. However, the second, which swaps one of the plot() calls for hist(), results in a 4 page pdf with one plot per page.
>
> How might I get the histogram with 3 other scatter plots onto a single pdf page?
>
> platform powerpc-apple-darwin8.10.1
> version.string R version 2.6.1 (2007-11-26)
>
> Thanks!
> Ben
>
> ###BEGIN
> data(iris)
> orig.par = par(no.readonly = TRUE)
>
> pdf(file = "just_plots.pdf")
> par(mfrow=c(2,2))
> plot(iris$Sepal.Length, iris$Sepal.Width, main = "Plot 1")
> plot(iris$Petal.Length, iris$Petal.Width, main = "Plot 2")
> plot(iris$Sepal.Length, iris$Petal.Length, main = "Plot 3")
> plot(iris$Sepal.Width, iris$Petal.Width, main = "Plot 4")
> dev.off()
>
> pdf(file = "hist_and_plots.pdf")
> hist(iris$Sepal.Length, main = "Plot 1")
> plot(iris$Petal.Length, iris$Petal.Width, main = "Plot 2")
> plot(iris$Sepal.Length, iris$Petal.Length, main = "Plot 3")
> plot(iris$Sepal.Width, iris$Petal.Width, main = "Plot 4")
> dev.off()
>
> par(orig.par)
> ###END
>
> Ben Tupper [EMAIL PROTECTED]
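Editor's sketch of the fix suggested in this reply. Graphical parameters such as mfrow belong to the device that is open when par() is called, so a freshly opened pdf() device starts with the default of one plot per page. The output path under tempdir() is an assumption made so the example does not write into the working directory.

```r
# Write to a temporary location (assumed here; any writable path works).
out <- file.path(tempdir(), "hist_and_plots.pdf")

pdf(file = out)
par(mfrow = c(2, 2))   # must be set AFTER pdf(): it applies to this new device only
hist(iris$Sepal.Length, main = "Plot 1")
plot(iris$Petal.Length, iris$Petal.Width, main = "Plot 2")
plot(iris$Sepal.Length, iris$Petal.Length, main = "Plot 3")
plot(iris$Sepal.Width, iris$Petal.Width, main = "Plot 4")
dev.off()              # closing the device also discards its par() settings
```

With this ordering the histogram and the three scatter plots should share a single page; hist() itself is not the cause of the extra pages.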
[R] problem with creation of eSet
Hi,

I am having trouble creating an eSet and would appreciate any help with the following problem. I am trying to create an eSet using the following code:

  pd <- read.table(file = "pdata.txt", header = TRUE, row.names = 1)
  colnames(pd) <- c("type", "tumor", "time", "id")
  pdN <- list(type = "Cellline/xenograft",
              tumor = "primary,secondary,cellline",
              time = "0hr,1hr,2hr,4hr",
              id = "1,2,3,4,5,6,7,8,9")

  # Initialize exprSet object
  pD <- new("phenoData", pData = pd, varLabels = pdN)

  # This is my eSet!!!
  metastasis.eset <- new("exprSet", exprs = as.matrix(geneExpr.log), phenoData = pD)

I get the following error:

  The phenoData class is deprecated, use AnnotatedDataFrame (with ExpressionSet) instead

Can someone suggest how to use the new AnnotatedDataFrame class to create an eSet?

Thanks
Manisha
Re: [R] Error in cor.default(x1, x2) : missing observations in cov/cor
Does the code below solve your problem? If you have NAs in the same rows, you have to pass "complete.obs" or "pairwise.complete.obs" (which can be abbreviated to "c" or "p") as the use= argument. Otherwise you get the error you described.

  a = c(1,2,3,4,NA,6)
  b = c(2,4,3,5,NA,7)
  which(is.na(a)) == which(is.na(b))

  cor(a, b)                                 # Error
  cor(a, b, use = "all.obs")                # Error
  cor(a, b, use = "complete.obs")           # Does it.
  cor(a, b, use = "pairwise.complete.obs")  # Does it too.

- cuncta stricte discussurus -

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ken Spriggs
Sent: Wednesday, February 27, 2008 4:34 PM
To: r-help@r-project.org
Subject: [R] Error in cor.default(x1, x2) : missing observations in cov/cor

> Hello,
>
> I'm trying to do cor(x1,x2) and I get the following error:
>
>   Error in cor.default(x1, x2) : missing observations in cov/cor
>
> A few things:
> 1. I've used cor() many times and have never encountered this error.
> 2. length(x1) = length(x2)
> 3. is.numeric(x1) = is.numeric(x2) = TRUE
> 4. which(is.na(x1)) = which(is.na(x2)) = integer(0) {the same goes for is.nan()}
> 5. I also try cor(x1,x2, use = "all.obs") and get the same error.
>
> What can be going wrong?
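Editor's sketch of the behaviour described in this reply, written so that it runs without stopping. It uses the same toy vectors a and b; wrapping the "all.obs" call in tryCatch() is an addition made here so that the error message can be inspected instead of aborting the script.

```r
a <- c(1, 2, 3, 4, NA, 6)
b <- c(2, 4, 3, 5, NA, 7)

# How many missing values are there, and are they in the same positions?
print(sum(is.na(a)))
print(identical(which(is.na(a)), which(is.na(b))))

# use = "all.obs" insists on complete data and signals an error otherwise;
# the message is captured as a string rather than stopping the script.
res_all <- tryCatch(cor(a, b, use = "all.obs"),
                    error = function(e) conditionMessage(e))
print(res_all)

# Both of these drop the incomplete pair before computing the correlation.
r_complete <- cor(a, b, use = "complete.obs")
r_pairwise <- cor(a, b, use = "pairwise.complete.obs")
print(c(r_complete, r_pairwise))
```

For two plain vectors the two options give the same value; they only differ when x is a matrix with missing values scattered across different columns.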
Re: [R] Error in cor.default(x1, x2) : missing observations in cov/cor
Sorry, I overlooked the = integer(0) result of which(is.na(x1)) and which(is.na(x2)) in your message. So that's not it.

Cheers.

- cuncta stricte discussurus -

-----Original Message-----
From: Daniel Malter [mailto:[EMAIL PROTECTED]
Sent: Wednesday, February 27, 2008 7:41 PM
To: 'Ken Spriggs'; 'r-help@r-project.org'
Subject: RE: [R] Error in cor.default(x1, x2) : missing observations in cov/cor
Re: [R] ggplot2 boxplot confusion
> I noticed the coord flip problem during my ggplot investigations. Is this something I can override by getting into the code?

The basic problem is that all the geoms/stats in ggplot are based around the assumption that we are interested in Y | X, rather than X | Y. I don't think this is unreasonable, as it simplifies much of the code and makes most common plots easier to specify. However, it is rather restrictive in your case - what you are trying to do makes sense, but the parameterisation of the prebuilt layers in ggplot makes it very difficult.

However, there is one geom that is parameterised in the opposite direction: geom_vline. So your second option (just draw the density plots with vertical lines drawn at the median, IQR etc) is fairly easy in the development version of ggplot:

  q5 <- function(x) unname(quantile(x, c(0.05, 0.25, 0.5, 0.75, 0.95)))
  qplot(carat, data = diamonds, geom = "density") + geom_vline(intercept = q5)

(this won't work in the current version because I've only just allowed geom_vline to use a function to calculate the intercept). I can send you a copy of the development version if you let me know your OS.

Alternatively, you could write your own versions of stat_boxplot and geom_boxplot (or stat_density and geom_area) that work in the opposite direction to usual. This probably isn't too hard if you just take the code and change x's to y's (and vice versa), but it's currently completely undocumented.

Hadley
-- http://had.co.nz/
Re: [R] plot y1 and y2 on one graph
> Now I would like to plot y1 and y2 on the same graph, with its two scales (y1 on left and y2 on right side).

Before you actually do that, you might want to think about whether it's a good idea or not. Are you trying to deliberately mislead or confuse your readers? If so, it's a good idea; otherwise it's probably not.

You might want to read this blast from the past:

K. W. Haemer. Double scales are dangerous. The American Statistician, 2(3):24–24, 1948.

Hadley
-- http://had.co.nz/
Re: [R] plot y1 and y2 on one graph
There is an example in

  library(zoo)
  example(plot.zoo)

On Wed, Feb 27, 2008 at 5:05 PM, milton ruser [EMAIL PROTECTED] wrote:

> Dear all
>
> I have code like
>
>   x <- 1:10
>   y1 <- x + runif(10)*2
>   y2 <- seq(0, 50, length.out = 10) + rnorm(10)*10
>   par(mfrow = c(1,2))
>   plot(y1 ~ x)
>   plot(y2 ~ x)
>
> Now I would like to plot y1 and y2 on the same graph, with its two scales (y1 on left and y2 on right side).
>
> Any help is welcome.
>
> Kind regards
> Miltinho
> Brazil
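Editor's sketch of one way to draw the two-scale plot with base graphics only, as an alternative to the zoo example mentioned above; it is not taken from the thread. The idea is to draw the first series, call par(new = TRUE) so the next plot() overlays rather than replaces it, and then add the second axis on the right with axis(4). The seed and the output path are assumptions added for reproducibility, and Hadley's caution about double scales elsewhere in this thread still applies.

```r
set.seed(42)                       # assumed seed, only for reproducibility
x  <- 1:10
y1 <- x + runif(10) * 2
y2 <- seq(0, 50, length.out = 10) + rnorm(10) * 10

out <- file.path(tempdir(), "two_axes.pdf")   # assumed output location
pdf(out)
par(mar = c(5, 4, 4, 4) + 0.1)     # widen the right margin for the second axis
plot(x, y1, type = "b", ylab = "y1")          # left-hand scale
par(new = TRUE)                    # next high-level plot overlays the current one
plot(x, y2, type = "b", col = "red",
     axes = FALSE, xlab = "", ylab = "")      # same region, its own y range
axis(side = 4, col.axis = "red")   # right-hand scale for y2
mtext("y2", side = 4, line = 3, col = "red")
dev.off()
```

Colouring the right axis to match its series is a small guard against readers comparing the two curves on the wrong scale.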
Re: [R] problem with creation of eSet
Hi Manisha -- this is a Bioconductor package, so ask on the Bioconductor mailing list.

The 'exprSet' class is quite old, and has been replaced by the 'ExpressionSet' class. Update your version of R and Bioconductor (http://bioconductor.org/download). Then evaluate the commands

  library(Biobase)
  openVignette()

and read the first vignette, 'Biobase - An introduction to Biobase and ExpressionSets'.

Hope that helps,
Martin

Manisha Brahmachary [EMAIL PROTECTED] writes:

> I am having troubles with creating an eSet and would appreciate any help on the following problem. [...] I get the following error: The phenoData class is deprecated, use AnnotatedDataFrame (with ExpressionSet) instead

--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: Arnold Building M2 B169
Phone: (206) 667-2793
Re: [R] plot y1 and y2 on one graph
Thanks to all, I got what I needed.

miltinho
Brazil.

On 2/27/08, Gabor Grothendieck [EMAIL PROTECTED] wrote:

> There is an example in library(zoo) example(plot.zoo)
[R] problem of subscript value from vector and list
Hi, everyone

I have run into a problem when subscripting a vector and a list by calculated indices. Here are the vectors I generated:

  lon <- rep(0, 886); lat <- rep(0, 691)
  for (i in 1:886) { lon[i] <- 112 + 0.05*(i-1) }
  for (i in 691:1) { lat[i] <- -44.5 + 0.05*(691-i) }

For a given location of lon (xp) and lat (yp), I would like to calculate their positions, and use those to get the value for the location from a list.

  xp <- c(112, 112.05); yp <- c(-10, -10.10)
  x <- rep(0, length(xp)); y <- rep(0, length(yp))
  for (i in 1:length(xp)) { x[i] <- (xp[i] - 112)/0.05 + 1 }
  for (j in 1:length(yp)) { y[j] <- (-10 - yp[j])/0.05 + 1 }

So the values of x and y should indicate the positions of xp and yp in the vectors lon and lat, and they appear to be the right numbers. But when I try to retrieve the value of lon using the position, it returns the wrong value. Here's the result I got:

  x
  [1] 1 2
  lon[x]
  [1] 112 112
  lon[c(1,2)]
  [1] 112.00 112.05
  y
  [1] 1 3
  lat[y]
  [1] -10.00 -10.05
  lat[c(1,3)]
  [1] -10.0 -10.1

I have no idea why it returns the wrong value when I subscript by x, while it works perfectly fine when I subscript with the value of x directly. Is there any special rule for subscripts? My version of R is 2.5.1.

Thanks a lot

Jingru Dai
School of Mathematical Sciences
Rm 454, Building 28
Monash University, 3800 Victoria, Australia
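Editor's note with a sketch; this explanation is not from the thread itself. The likely cause is floating-point rounding: 112.05 and 0.05 cannot be represented exactly in binary, so (112.05 - 112)/0.05 + 1 comes out a hair below 2. print() rounds it to 2 for display, but R truncates non-integer subscripts towards zero, so lon[x] picks element 1. Rounding the computed positions before indexing avoids this. The vectorised construction of lon and lat below is a rewrite of the loops in the question.

```r
# Same grids as in the question, built without loops.
lon <- 112 + 0.05 * (0:885)
lat <- -44.5 + 0.05 * (691 - 1:691)

xp <- c(112, 112.05)
yp <- c(-10, -10.10)

x <- (xp - 112) / 0.05 + 1
y <- (-10 - yp) / 0.05 + 1

# Default printing hides the discrepancy; more digits reveal it.
print(x)
print(x, digits = 17)
print(x == c(1, 2))       # exact comparison shows which entries are off

# Non-integer subscripts are truncated, so round explicitly first.
ix <- as.integer(round(x))
iy <- as.integer(round(y))
print(lon[ix])
print(lat[iy])
```

A more defensive alternative, when the target values may not sit exactly on the grid, is which.min(abs(lon - xp[i])) for each point.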
Re: [R] multi-level hierarchical logistic regression with sampling weight
Rolf Turner [EMAIL PROTECTED] wrote in news:[EMAIL PROTECTED]:

> On 28/02/2008, at 11:28 AM, GUO, Qian wrote:
>
>> Hi I would like to run a multi-level hierarchical logistic regression model with sampling weight? Is this possible with R?
>
> Yes. In R, all things are *possible*. :-)

Is this the right place for fortune() nominations?

-- David Winsemius
Re: [R] Calculating monthly var-cov matrix on non-overlapping rooling window basis
Thanks Henrique for this mail. It works fine. However I need one more modification. When the number of columns is 1, I get an error:

  library(zoo)
  date.data = seq(as.Date("01/01/01", format = "%m/%d/%y"), as.Date("06/25/02", format = "%m/%d/%y"), by = 1)
  len = length(date.data)
  data1 = zoo(matrix(rnorm(len), nrow = len), date.data)
  head(data1)

  2001-01-01 -1.5128990
  2001-01-02 -0.2939971
  2001-01-03  1.6387866
  2001-01-04 -0.8107857
  2001-01-05  0.7966224
  2001-01-06  0.6007594

  lapply(split(data1, format(index(data1), "%m")), cov)
  Error in FUN(X[[1L]], ...) : supply both 'x' and 'y' or a matrix-like 'x'

I also tried an 'ifelse' condition:

  lapply(split(data1, format(index(data1), "%m")), ifelse(dim(data1)[1] > 1, cov, var))

but I still get an error. What should I do?

Henrique Dallazuanna [EMAIL PROTECTED] wrote:

> Perhaps something like this:
>
>   lapply(split(data1, format(index(data1), "%m")), cov)
>
> On 27/02/2008, Megh Dal wrote:
>
>> Let us create a 'zoo' object:
>>
>>   library(zoo)
>>   date.data = seq(as.Date("01/01/01", format = "%m/%d/%y"), as.Date("06/25/02", format = "%m/%d/%y"), by = 1)
>>   len = length(date.data)
>>   data1 = zoo(matrix(rnorm(2*len), nrow = len), date.data)
>>   head(data1)
>>
>> Now I want to create a 3-dimensional array (suppose its name is var.cov) where var.cov[,,i] gives the variance-covariance matrix for the i-th month of data1. That is, I want to calculate the monthly variance-covariance matrix on a non-overlapping rolling window basis. Any suggestion?
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
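Editor's sketch, not from the thread, of one way to get the requested 3-dimensional array for any number of columns. Two assumptions are made so that it runs without the zoo package: the data are held as a plain matrix plus a separate Date vector, and the helper name monthly_cov is hypothetical. The error message in the question suggests cov() was handed a bare vector; keeping each monthly block as a matrix with drop = FALSE sidesteps that. Grouping by "%Y-%m" rather than "%m" also keeps January 2001 and January 2002 in separate windows.

```r
# Hypothetical helper: one covariance matrix per calendar month, stacked in an array.
monthly_cov <- function(X, dates) {
  X <- as.matrix(X)                                  # a vector becomes an n x 1 matrix
  groups <- split(seq_along(dates), format(dates, "%Y-%m"))
  covs <- lapply(groups, function(i) cov(X[i, , drop = FALSE]))
  p <- ncol(X)
  array(unlist(covs), dim = c(p, p, length(covs)),
        dimnames = list(NULL, NULL, names(covs)))
}

set.seed(1)                                          # assumed seed for reproducibility
dates <- seq(as.Date("2001-01-01"), as.Date("2002-06-25"), by = "day")
len   <- length(dates)

two_col <- matrix(rnorm(2 * len), nrow = len)
one_col <- rnorm(len)

var.cov2 <- monthly_cov(two_col, dates)              # 2 x 2 x (number of months)
var.cov1 <- monthly_cov(one_col, dates)              # 1 x 1 x (number of months)

print(dim(var.cov2))
print(var.cov2[, , "2001-01"])
```

With a zoo object the same helper could presumably be called as monthly_cov(coredata(data1), index(data1)), though that call is untested here.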
Re: [R] multi-level hierarchical logistic regression with sampling weight
On Feb 27, 2008, at 11:04 PM, David Winsemius wrote:

> Rolf Turner [EMAIL PROTECTED] wrote in news:[EMAIL PROTECTED]:
>
>> On 28/02/2008, at 11:28 AM, GUO, Qian wrote:
>>
>>> Hi I would like to run a multi-level hierarchical logistic regression model with sampling weight? Is this possible with R?
>>
>> Yes. In R, all things are *possible*. :-)
>
> Is this the right place for fortune() nominations?

Seconded, though this is very close:

  fortune("no if")

  Evelyn Hall: I would like to know how (if) I can extract some of the information from the summary of my nlme.
  Simon Blomberg: This is R. There is no if. Only how.
     -- Evelyn Hall and Simon 'Yoda' Blomberg
        R-help (April 2005)

I just love that one.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] ggplot2 boxplot confusion
hadley wrote:

>> I noticed the coord flip problem during my ggplot investigations. Is this something I can override by getting into the code?
>
> However, there is one geom that is parameterised in the opposite direction: geom_vline. So your second option (just draw the density plots with vertical lines drawn at the median, IQR etc) is fairly easy in the development version of ggplot:
>
>   q5 <- function(x) unname(quantile(x, c(0.05, 0.25, 0.5, 0.75, 0.95)))
>   qplot(carat, data = diamonds, geom = "density") + geom_vline(intercept = q5)
>
> (this won't work in the current version because I've only just allowed geom_vline to use a function to calculate the intercept). I can send you a copy of the development version if you let me know your OS.

I discovered geom_vline today. This idea works for me as a boxplot alternative. As per the example in the ggplot2 online docs, I can plot multiple density plots with median lines or fivenums coloured by factor, e.g.

  series <- c('C2','C4','C8','C10','C15','C20')
  ids <- c('ID1','ID2','ID3')
  mydata <- data.frame(SERIES = rep(series, 30), ID = rep(ids, 60), VALUE = rnorm(180))
  df <- data.frame(series = levels(mydata$SERIES),
                   intercept = tapply(mydata$VALUE, list(mydata$SERIES), median))
  p <- ggplot(mydata, aes(x = VALUE)) + geom_density(aes(color = SERIES))
  p + geom_vline(data = df, aes(color = factor(series)))

However, I can't seem to get this to work if I add a facet_grid() layer to get a grid of density plots against two factors.

  p + geom_vline(data = df, aes(color = factor(series))) + facet_grid(SERIES ~ ID)

Each panel holds a single density plot in this case, but each density plot has all the median vlines. I suspect I've set up the vline data frame incorrectly but I can't see where with my noob glasses on.

And, yes, I'd be happy to try the dev code. I'm running on Windows XP. Should I report problems with the dev code in this forum? Please advise.

As for writing my own stat_boxplot etc, I wouldn't mind looking at the code to see what rolling my own would take. It might help me grasp all the countless functions in R. But where is the source? I seem to recall a thread of yours about the merits or otherwise of putting ggplot code on R-forge etc.
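Editor's sketch, written against the present-day ggplot2 interface rather than the 2008 development version discussed in the thread, so treat it as an assumption about how the same idea looks today. In current ggplot2 a layer whose data lacks the faceting variables is repeated in every panel, which matches the symptom described; the reference-line data frame therefore needs columns named exactly like the facet variables (SERIES and ID, not series). The data below are a fully crossed toy set, not Chris's, and the plotting part is skipped when ggplot2 is not installed.

```r
set.seed(1)                                   # assumed seed for reproducibility
# Toy data: every SERIES x ID combination has 20 observations.
mydata <- expand.grid(SERIES = c("C2", "C4", "C8"),
                      ID     = c("ID1", "ID2", "ID3"),
                      rep    = 1:20)
mydata$VALUE <- rnorm(nrow(mydata))

# One median per panel, keeping the facet variables as columns.
med <- aggregate(VALUE ~ SERIES + ID, data = mydata, FUN = median)
print(med)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(mydata, aes(x = VALUE)) +
    geom_density(aes(colour = SERIES)) +
    # xintercept is the current aesthetic name for vertical reference lines
    geom_vline(data = med, aes(xintercept = VALUE, colour = SERIES)) +
    facet_grid(SERIES ~ ID)
  # print(p) would draw one density and one median line per panel
}
```

The same pattern extends to five-number summaries by letting the aggregation return several rows per SERIES and ID combination.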
[R] Very Simple Regression Question
I've just been running my first ever regression and I'm using R for an lmer. I've created two models, a null one (part ~ 1 + (1 | id) + (1 | word)) and another with a predictor I want to test (part ~ 1 + conf + (1 | id) + (1 | word)). I've compared them both and the model with a predictor is significantly better, but I can't see from the data which direction the prediction goes (i.e. are conf and part inversely correlated or not). Could somebody tell me where I can find this information? Here's my data, thanks in advance:

Null Model:

  Generalized linear mixed model fit using Laplace
  Formula: part ~ 1 + (1 | id) + (1 | word)
  Data: align
  Family: binomial(logit link)
     AIC   BIC  logLik deviance
   537.9   551  -265.9    531.9
  Random effects:
   Groups Name        Variance  Std.Dev.
   word   (Intercept) 2.309122  1.51958
   id     (Intercept) 0.055252  0.23506
  number of obs: 601, groups: word, 32; id, 20

  Estimated scale (compare to 1 ) 0.8671394

  Fixed effects:
              Estimate Std. Error z value Pr(>|z|)
  (Intercept)  -1.8920     0.3072  -6.158 7.37e-10 ***
  ---
  Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Model with predictors:

  Generalized linear mixed model fit using Laplace
  Formula: part ~ 1 + conf + (1 | id) + (1 | word)
  Data: align
  Family: binomial(logit link)
     AIC   BIC  logLik deviance
   490.1 507.7  -241.1    482.1
  Random effects:
   Groups Name        Variance  Std.Dev.
   word   (Intercept) 2.977549  1.72556
   id     (Intercept) 0.097771  0.31268
  number of obs: 601, groups: word, 32; id, 20

  Estimated scale (compare to 1 ) 0.9142627

  Fixed effects:
              Estimate Std. Error z value Pr(>|z|)
  (Intercept)  -3.1001     0.4014  -7.724 1.13e-14 ***
  conf          1.7941     0.2734   6.563 5.29e-11 ***
  ---
  Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

  Correlation of Fixed Effects:
       (Intr)
  conf -0.505

ANOVA:

  Data: align
  Models:
  align.nul:  part ~ 1 + (1 | id) + (1 | word)
  align.conf: part ~ 1 + conf + (1 | id) + (1 | word)
             Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
  align.nul   3 537.85 551.05 -265.93
  align.conf  4 490.14 507.73 -241.07 49.713      1  1.780e-12 ***
  ---
  Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
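Editor's sketch, not a reply from the thread: the direction of the effect can be read from the sign of the conf estimate in the "Fixed effects" table. It is positive (1.7941), so a higher conf goes with higher log-odds of part, i.e. the two are not inversely related. The lines below only convert the two printed estimates to the odds and probability scales. They assume that conf is either numeric or a two-level factor coded 0/1 (the post does not say), and the probabilities refer to a typical id and word, with both random effects at zero.

```r
# Fixed-effect estimates copied from the summary printed in the post.
b0 <- -3.1001    # (Intercept)
b1 <-  1.7941    # conf

# Multiplicative change in the odds of part for a one-unit increase in conf.
odds_ratio <- exp(b1)

# Predicted probabilities on the response scale (random effects set to zero).
p_conf0 <- plogis(b0)         # conf = 0
p_conf1 <- plogis(b0 + b1)    # conf = 1

print(round(c(odds_ratio = odds_ratio, p_conf0 = p_conf0, p_conf1 = p_conf1), 3))
```

An odds ratio above 1 and p_conf1 above p_conf0 both say the same thing as the positive coefficient.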
[R] How to read HUGE data sets?
Dear R-list,

Does somebody know how I can read a HUGE data set using R? It is a hapmap data set (txt format) which is around 4GB. After reading it, I need to delete some specific rows and columns. I'm running R 2.6.2 patched on XP SP2, using a 2.4 GHz Core 2-Duo processor and 4GB RAM. Any suggestion would be appreciated.

Thanks in advance,
Jorge
Re: [R] How to read HUGE data sets?
Depending on how many rows you will delete, and if you know in advance which ones they are, one approach is to use the skip argument of read.table. If you only need a fraction of the total number of rows this will save a lot of RAM.

Mark

Mark W. Kimpel MD
Neuroinformatics, Dept. of Psychiatry
Indiana University School of Medicine
Re: [R] How to read HUGE data sets?
I may be mistaken, but I believe R does all its work in memory. If that is so, you would really only have 2 options:

1. Get a lot of memory
2. Figure out a way to do the desired operation on parts of the data at a time.

-Roy M.

On Feb 27, 2008, at 9:03 PM, Jorge Iván Vélez wrote:

> Does somebody know how can I read a HUGE data set using R? It is a hapmap data set (txt format) which is around 4GB. After read it, I need to delete some specific rows and columns.

**
The contents of this message do not reflect any position of the U.S. Government or NOAA.
**
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS Environmental Research Division
Southwest Fisheries Science Center
1352 Lighthouse Avenue
Pacific Grove, CA 93950-2097
e-mail: [EMAIL PROTECTED] (Note new e-mail address)
voice: (831)-648-9029
fax: (831)-648-8440
www: http://www.pfeg.noaa.gov/

Old age and treachery will overcome youth and skill.
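Editor's sketch of option 2, working on parts of the data at a time, using base R only. Everything about the file is an assumption: a tiny temporary file stands in for the 4 GB HapMap text file, and the column names and the filter (keep chrom == 1, drop the last column) are invented for illustration. The two ideas are that read.table() on an already open connection continues where the previous call stopped, so nrows gives chunked reading, and that "NULL" in colClasses skips a column without ever storing it.

```r
# Hypothetical small file standing in for the real data.
path <- tempfile(fileext = ".txt")
demo <- data.frame(snp   = paste0("rs", 1:10),
                   chrom = rep(1:2, 5),
                   g1    = rep(c("AA", "AG"), 5),
                   g2    = rep(c("GG", "AG"), 5))
write.table(demo, path, quote = FALSE, row.names = FALSE)

# "NULL" drops the 4th column at read time, so it never occupies memory.
col_classes <- c("character", "integer", "character", "NULL")
chunk_size  <- 4            # tiny here; something like 1e5 for a real file

con    <- file(path, open = "r")
header <- scan(con, what = "", nlines = 1, quiet = TRUE)   # consume the header line
kept   <- list()
repeat {
  chunk <- tryCatch(
    read.table(con, nrows = chunk_size, colClasses = col_classes,
               col.names = header, stringsAsFactors = FALSE),
    error = function(e) NULL)          # treat a failure at end of file as "done"
  if (is.null(chunk) || nrow(chunk) == 0) break
  kept[[length(kept) + 1]] <- chunk[chunk$chrom == 1, ]   # keep only wanted rows
}
close(con)

result <- do.call(rbind, kept)
print(result)
```

Only the filtered rows of each chunk are retained, so peak memory is governed by chunk_size and by how much survives the filter rather than by the file size. The skip argument mentioned earlier in the thread works the same way for jumping straight to a known block of lines.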
Re: [R] 2D kolmogorov
Googling "MUAC", mentioned in that message, will take you to the following site with information on the algorithm and apparently access to the code.

http://www.acooke.org/jara/muac/index.html

-Christos

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Máté Maus
Sent: Thursday, February 28, 2008 12:04 AM
To: [EMAIL PROTECTED]
Subject: [R] 2D kolmogorov

> Dear Dr. Rich Grenyer,
>
> I am a PhD student at the Department of Immunology, ELTE University, Budapest. I am trying to find a robust approach to compare the 2D distribution of signaling molecules in cell cytoplasm. I read your message at https://stat.ethz.ch/pipermail/r-help/2004-June/052973.html about your implementation of 2D Kolmogorov statistics. I tried the links in your message, but I suppose they are no longer alive. I was wondering if you could give me some advice on how to use 2D Kolmogorov statistics for confocal microscopic images.
>
> Yours faithfully,
> Mate Maus
> PhD student
Re: [R] How to read HUGE data sets?
On Wed, 27-Feb-2008 at 09:13PM -0800, Roy Mendelssohn wrote:

| I may be mistaken, but I believe R does all it work in memory. If
| that is so, you would really only have 2 options:
|
| 1. Get a lot of memory

But with a 32bit operating system, 4G is all the memory that can be addressed (including the operating system). So your chances of getting all the data into R seem very slim.

|
| 2. Figure out a way to do the desired operation on parts of the data
| at a time.

That might involve using a database which you can query from R, or you might be able to use a Perl script to select what you require. I have heard of people using Perl with Windows.

Someone once asked me to plot some SAS output which was several hundred Mb. In that case, a simple Perl script cut it down to 3 Mb. You might be lucky too.

Good luck.

--
Patrick Connolly

Great minds discuss ideas
Middle minds discuss events
Small minds discuss people
  ..... Anon
[R] Kaiser-Meyer-Olkin
I am a beginner when it comes to using R, though fortunately I already know something about statistics. I think factor analysis should be used sparingly, but I occasionally use it. It doesn't seem to me that factanal() provides Kaiser's Measure of Sampling Adequacy, which should be computed for factor problems based on a small number of subjects, though perhaps it is elsewhere. Does anyone know? (Better yet, is there a complete list of procedures that can be performed by all available packages?) I have coded MSA in C++, so I could add it if it is not yet available. In that case I suppose I should find out how to submit it.

Robert Tim Kopp
http://analytic.tripod.com/
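Editor's sketch, not a reply from the thread: the Kaiser-Meyer-Olkin measure is short enough to compute directly in R from a correlation matrix. The helper name kmo_msa is hypothetical and mtcars is only a convenient built-in data set. The overall measure is the sum of squared off-diagonal correlations divided by that sum plus the sum of squared off-diagonal partial correlations, where the partial correlations come from the inverse of the correlation matrix.

```r
# Hypothetical helper: overall and per-variable KMO / MSA for numeric data x.
kmo_msa <- function(x) {
  R <- cor(x)                                   # correlation matrix
  Q <- solve(R)                                 # its inverse
  P <- -Q / sqrt(outer(diag(Q), diag(Q)))       # partial (anti-image) correlations
  diag(R) <- 0                                  # only off-diagonal terms enter
  diag(P) <- 0
  r2 <- R^2
  p2 <- P^2
  list(overall      = sum(r2) / (sum(r2) + sum(p2)),
       per_variable = colSums(r2) / (colSums(r2) + colSums(p2)))
}

# Example on a few columns of a built-in data set.
vars <- mtcars[, c("mpg", "disp", "hp", "wt", "qsec")]
res  <- kmo_msa(vars)
print(res$overall)
print(round(res$per_variable, 3))
```

The function needs an invertible correlation matrix, so it will fail with solve() errors when variables are collinear or when there are fewer complete cases than variables.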