Re: [R] Alternate to for-loop
Stefan Evert wrote: A couple of remarks on vQ's naive benchmark: f.rep = function(n, m) replicate(n, rnorm(m)) I suppose you meant f.rep = function(n, m) replicate(n, mean(rnorm(m))) which doesn't make a substantial speed difference, though. indeed, thanks; i've already posted a correction, and as you say, it doesn't make much difference for these particular benchmark values. f.pat = function(n, m) colMeans(array(rnorm(n*m), c(n, m))) system.time(f.pat(1000, 1000)) system.time(f.rep(1000, 1000)) makes me believe that there is no significant difference in efficiency between the 'professionally-looking' replicate-based solution and the 'as fast as possible' pat's solution. True, I get the same timing results on my machine. But then you should also point out that the original for-loop: f.for = function(n, m) { res - numeric(n); for (i in 1:n) res[i] - mean(rnorm(m)); res } is exactly as fast as replicate(). So apart from looking more professional, there isn't any difference between an explicit loop and replicate(). indeed, and pat seems correct in blaming for loops for the inefficiency of replicate in cases where log10(n/m) 2. Perhaps loops in R aren't always as slow (compared to matrix operations) as one seemed to think. depends how and where you use them. in the problem discussed here, they do slow down the code for some class of inputs and do not speedup for the others, compared to the array version of pat. I ran into a similar issue with a simple benchmark the other day, where a plain loop in Lua was faster than vectorised code in R ... hmm, would you be saying that r's vectorised performance is overhyped? or is it just that non-vectorised code in r is slow? I have to say, though, that like Patrick I assumed the goal was to obtain a large number of replicates for relatively small sets of random numbers, in which case the matrix solution is indeed faster (though not as much as I would have thought): system.time(f.for(10, 100)) user system elapsed 4.212 0.025 4.273 system.time(f.rep(10, 100)) user system elapsed 4.109 0.028 4.172 system.time(f.pat(10, 100)) user system elapsed 1.580 0.134 1.739 sure; it's even more pronounced when n = 10^6 and m=10. vQ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] stopping stepAIC (MASS package)
Dear R-listers, I am building a prediction model starting with many, many variables. I use the 'stepAIC' procedure in the MASS package, and the model building process takes several hours to complete. At the very end, I am occasionally confronted with warnings like this one: Warning messages: 1: In fitter(X, Y, strata = Strata, offset = offset, weights = weights, : Loglik converged before variable 4 ; beta may be infinite. In these instances, I take it that I have stretched my data too thin, and I discard the final model. Therefore, I would save valuable time if the procedure would stop automatically when it encounters any anomaly that causes it to warn me. Is this possible? And if so, how? Thank you in advance for any assistance. Peter Jepsen, MD sessionInfo() R version 2.8.1 (2008-12-22) i386-pc-mingw32 locale: LC_COLLATE=Danish_Denmark.1252;LC_CTYPE=Danish_Denmark.1252;LC_MONETARY= Danish_Denmark.1252;LC_NUMERIC=C;LC_TIME=Danish_Denmark.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] MASS_7.2-45 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] problem with dates in zoo package
i have the following code - assimilating the maximum annual discharge each year ffrom a daily discharge record from year 1989-2005. m - read.table(D:/documents/5 stations/01014000.csv, sep =,) z - zoo(m[,4],as.Date(as.character(m[,3]), %m/%d/%Y)) x - aggregate(z, floor(as.numeric(as.yearmon(time(z, max) #code suggested by Gabor Grothendieck which gives me the maximum discharge each year and the year itself.. what i am trying now is to produce the date when the maximum discharge was observed in the pattern -mm-dd, like: 1988 11/9/1988 18600 1989 5/8/1989 49000 ... 2005 4/26/2005111000 thank you all in advance... -- View this message in context: http://www.nabble.com/problem-with-dates-in-zoo-package-tp22053103p22053103.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Efficient matrix computations
Hi, I am looking for two ways to speed up my computations: 1. Is there a function that efficiently computes the 'sandwich product' of three matrices, say, ZPZ' 2. Is there a function that efficiently computes the determinant of a positive definite symmetric matrix? Thanks, S.A. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] difficulty in starting time sequence from 00:00:00
Dear R-experts, Need your help. Dear R- Experts, Seek your help. I created a time sequence using: x[i] -chron(dates, tt, format=c(dates=y-m-d, tt=h:m:s)) first element in the list is displayed as: (09-01-01 00:00:00) Further elements are: (09-01-01 00:01:00) (09-01-01 00:02:00) (09-01-01 00:03:00) ... and so on. I want to compare each value with the timestamp values from database. I cannot simply compare above values with the timestamp values, as the above values are not in date format. Therefore, I used: format.Date(x[1],%y-%m-%d %H:%M:%S), I expect following value: 09-01-01 00:00:00 HOWEVER, the value displayed as: 09-01-01 01:00:00 Unnecessary addition of 1 hour. I want to create time sequence starting from 09-01-01 00:00:00 !!! Can any one please help? Thank you. -- View this message in context: http://www.nabble.com/difficulty-in-starting-time-sequence-from-00%3A00%3A00-tp22053632p22053632.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to add direction of time to plot.circular()
Michael Kubovy wrote: Dear r-helpers, I want to show that time is flowing CCW in the following: require(circular) len - 8 labl - as.character(c(0, 1, 1, 1, 0, 0, 1, 0)) r - circular(2*pi* (rep(c(1, 3, 6), each = 200)/len + rnorm(600, 0, 0.025))) r.dens - density(r, bw = 25, adjust = 4, kernel = 'vonmises') plot(r, shrink = 2.5, axes = FALSE, ticks = FALSE, pch = 1, col = 'lightblue', stack = TRUE, bins = 12 * len) axis.circular(at = circular(seq(0, (len - 1) * 2 * pi/len, length.out = len)), label = labl) lines(r.dens, col = 2) I had imagined a directed arc with a smaller radius than the black circle, running from 0 to 315 deg. I also thought that adding a short horizontal line at its beginning might be helpful. I would appreciate advice on how best to do this or anything else that would provide the required information. Hi Michael, You might find that clock24.plot (plotrix) will do what you want, although that is in the conventional clockwise direction. You can reverse this by using radial.plot and recreating the clock grid in the opposite orientation. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Create package with Fortran 90 and C code
On Tue, 17 Feb 2009, Nathan S. Watson-Haigh wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 I'm trying to add some Fortran 90 code to an existing package. When I compile and load the file manually like: SHELL R CMD SHLIB file.f90 R dyn.load(file.so) I can use the .Fortran() fine. However, when I try to build, install and load the library I seem to be missing something. I do a: SHELL R CMD build dir SHELL R CMD INSTALL pkg_version.tar.gz Things seem to progress smoothly. However, in R when I try to load the package I get an error like: Error in dyn.load(file, DLLpath = DLLpath, ...) : unable to load shared library '/cs/home/cslsi/wat410/R/ia64-unknown-linux-gnu-library/2.7/pkg/libs/pkg.so': /cs/home/cslsi/wat410/R/ia64-unknown-linux-gnu-library/2.7/pkg/libs/pkg.so: undefined symbol: _ZTVN10__cxxabiv117__class_type_infoE Error: package/namespace load failed for 'pkg' Can anyone suggest what I might do to solve this? See the posting guide! This is a question about compiled code, hence for the R-devel list. You are apparently using an obsolete R, and it rather looks as if you are using C++ with Fortran 90, something that is not supported (since in general it does not work, and you need to tell us the compilers you are using). So please post a much more complete description on R-devel, and perhaps make the failing package available for potential helpers to look at. Cheers, Nathan - Dr. Nathan S. Watson-Haigh OCE Post Doctoral Fellow CSIRO Livestock Industries Queensland Bioscience Precinct St Lucia, QLD 4067 Australia Tel: +61 (0)7 3214 2922 Fax: +61 (0)7 3214 2900 Web: http://www.csiro.au/people/Nathan.Watson-Haigh.html -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Subset Regression Package
Dear all , Is there any subset regression (subset selection regression) package in R other than leaps? Thanks and regards Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Efficient matrix computations
it would be a bit more helpful if we knew more info regarding these matrices, for instance is P diagonal, etc. In any case, you could have a look at crossprod() # and tcorssprod() and, for the determinant maybe prod(eigen(mat, symmetric = TRUE, only.values = FALSE)$values) # or prod(diag(chol(mat)))^2 are a bit faster than det(), but I haven't tested it. I hope it helps. Best, Dimitris Shimrit Abraham wrote: Hi, I am looking for two ways to speed up my computations: 1. Is there a function that efficiently computes the 'sandwich product' of three matrices, say, ZPZ' 2. Is there a function that efficiently computes the determinant of a positive definite symmetric matrix? Thanks, S.A. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Ingore this message. I have already solved the problem..
Ingore this message. I have already solved the problem.. Regards, Suresh Suresh_FSFM wrote: Dear R-experts, Need your help. Dear R- Experts, Seek your help. I created a time sequence using: x[i] -chron(dates, tt, format=c(dates=y-m-d, tt=h:m:s)) first element in the list is displayed as: (09-01-01 00:00:00) Further elements are: (09-01-01 00:01:00) (09-01-01 00:02:00) (09-01-01 00:03:00) ... and so on. I want to compare each value with the timestamp values from database. I cannot simply compare above values with the timestamp values, as the above values are not in date format. Therefore, I used: format.Date(x[1],%y-%m-%d %H:%M:%S), I expect following value: 09-01-01 00:00:00 HOWEVER, the value displayed as: 09-01-01 01:00:00 Unnecessary addition of 1 hour. I want to create time sequence starting from 09-01-01 00:00:00 !!! Can any one please help? Thank you. -- View this message in context: http://www.nabble.com/difficulty-in-starting-time-sequence-from-00%3A00%3A00-tp22053632p22054075.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Efficient matrix computations
sorry, in my previous e-mail it should be tcrossprod() # and prod(eigen(mat, symmetric = TRUE, only.values = TRUE)$values) Best, Dimitris Shimrit Abraham wrote: Hi, I am looking for two ways to speed up my computations: 1. Is there a function that efficiently computes the 'sandwich product' of three matrices, say, ZPZ' 2. Is there a function that efficiently computes the determinant of a positive definite symmetric matrix? Thanks, S.A. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Alternate to for-loop
I ran into a similar issue with a simple benchmark the other day, where a plain loop in Lua was faster than vectorised code in R ... hmm, would you be saying that r's vectorised performance is overhyped? or is it just that non-vectorised code in r is slow? What I meant, I guess, was (apart from a little bit of trolling) that I'd had misconceptions about the speed differences between loops and vectorised code. In particular, I had entertained the naive belief that vectorised solutions are always highly efficient (I wonder if I'm the only one who was naive enough to think this ..), so I was very much surprised to find a loop in an interpreted language like Lua to be faster than vectorised R code. My silly little benchmark translated the Lua code sum = 0 for i=1,N do sum = sum + i end into vectorised R sum(as.numeric(1:N)) The performance results were as follows: for loop in R: 0.75 Mops/s (200 ops in 2.66 s) vectorised R: 29.75 Mops/s (5000 ops in 1.68 s) Lua:51.54 Mops/s (1 ops in 1.94 s) Perl: 8.26 Mops/s (1000 ops in 1.21 s) Note that Lua is an interpreted language (compiled to byte code); with the just in time compiler I get more than 230 Mops/s. I suspect that this has to do with cache trashing, since the vectorised code in R operates on large vectors that have to be read from / written to RAM, while the Lua loop presumably runs entirely from the L1 cache. (Before you ask, I split the vectorised R code into a loop that processes 1 million numbers at a time; I tried different ways of coding the benchmark and picked the fastest solution.) Perhaps loops in R aren't always as slow (compared to matrix operations) as one seemed to think. depends how and where you use them. in the problem discussed here, they do slow down the code for some class of inputs and do not speedup for the others, compared to the array version of pat. My mistake was to think that vectorisation will always give a substantial performance boost and that for-loops should be avoided whenever possible. But it's really just the inner loops that need to be vectorised: iterating over the outer margins of a matrix doesn't add much overhead, especially if the vectorised solution would have to operate on huge matrices. Guess that's a bad habit from my old Matlab days (back in the early 90s) ... Best, Stefan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Efficient matrix computations
Thanks for your suggestions. I'll try to implement what you suggested. Perhaps the following information can help you to think of alternative ways to speed up computations: I am coding the Kalman Filter in R because I have certain requirements that are not provided by the packages regarding State Space models (dlm, sspir, dse, are there any more? ) . Since I don't see a way to avoid a for loop, I am trying to make the matrix computations as efficent as possible by using the properties that some of these matrices possess, hence my questions. The Kalman Filter requires sandwich matrix products, say ZPZ', where P is a covariance matrix, hence positive definite and symmetric (Z is not necessarily a square matrix). To compute the likelihood, I also need an efficient way to calculate the determinant of a covariance matrix, which is again positive definite and symmetric. Please let me know if this information gives you some new ideas on how to solve my problem. Thanks, S.A. On Tue, Feb 17, 2009 at 11:09 AM, Dimitris Rizopoulos d.rizopou...@erasmusmc.nl wrote: sorry, in my previous e-mail it should be tcrossprod() # and prod(eigen(mat, symmetric = TRUE, only.values = TRUE)$values) Best, Dimitris Shimrit Abraham wrote: Hi, I am looking for two ways to speed up my computations: 1. Is there a function that efficiently computes the 'sandwich product' of three matrices, say, ZPZ' 2. Is there a function that efficiently computes the determinant of a positive definite symmetric matrix? Thanks, S.A. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] scatterplot and correlation for weird data format
William Simpson wrote: I have data in a format like this: namessexsex viewnum rating rt ahl4f m f 56 -1082246 ahl4f m f 74 85 1444 ahl4f m f 52 151 1595 ahl4f m f 85 1 1447 ahl4f m f 53 46 1716 ahl4f m f 37 145 1276 ahl4f m f 50 98 1465 ahl4f m f 51 -26 1322 ahl4f m f 38 -97 1790 ahl4f m f 14 -158865 ... ahl4f m p 43 -1361669 ahl4f m p 10 -59 808 ahl4f m p 67 -1111279 ahl4f m p 85 -86 994 ahl4f m p 100 134 1337 ahl4f m p 76 56 665 ahl4f m p 51 -49 594 ahl4f m p 33 -118505 ahl4f m p 49 -1561283 ... and so on for many subjects (name) I would like to do a scatterplot of the rating given by each subject (with identifier name) for the frontal (view==f) and profile (view==p) views of each face (each face has an identifier num). I'd like to find the correlation as well. For each subject, since there are 100 faces, there will be 100 points on the scatterplot. I would just lump all the subjects' data together for the plot and correlation I think (unless somebody tells me I should do each subject separately). I'm stumped on how to do this. Thanks very much for any help! Hi Bill, The first thing that comes to mind is a variation on count.overplot, a function that displays the number of overplotted points for a given tolerance rather than a blur of separate symbols. The problem would be separating the various categories of experimental stimuli in your case. You could use, say, F and P as suffixes for the counts to indicate orientation, color to indicate sex of face, male/female symbol for sex of respondent, and so on. The problem is that you end up with a difficult to interpret plot, as each entry (of which there will still be many) must be decoded by the viewer. If you think this is worth pursuing, email me and I will try to outline a way to do it. Another, perhaps simpler way is to define a summary score for each subject for each class of face and plot that. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Alternate to for-loop
Stefan Evert wrote: hmm, would you be saying that r's vectorised performance is overhyped? or is it just that non-vectorised code in r is slow? What I meant, I guess, was (apart from a little bit of trolling) that I'd had misconceptions about the speed differences between loops and vectorised code. In particular, I had entertained the naive belief that vectorised solutions are always highly efficient (I wonder if I'm the only one who was naive enough to think this ..), so I was very much surprised to find a loop in an interpreted language like Lua to be faster than vectorised R code. My silly little benchmark translated the Lua code sum = 0 for i=1,N do sum = sum + i end into vectorised R sum(as.numeric(1:N)) what you're benchmarking here is a lump of vector allocation and the actual summing. assuming that you already have generated and stored the values to be summed, the runtime differs a bit: system.time(sum(as.numeric(1:10^8))) system.time({ x = 1:10^8 }) system.time({ y = as.numeric(x) }) system.time(sum(y)) though again, it's a factor below one order of magnitude. vQ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem with dates in zoo package
See ?which.max : library(zoo) z - read.zoo(myfile.dat, sep = ,, format = %m/%d/%Y) f - function(x) time(x)[which.max(x)] ix - tapply(z, floor(as.yearmon(time(z))), f) z[ix] On Tue, Feb 17, 2009 at 3:58 AM, CJ Rubio cjru...@kongju.ac.kr wrote: i have the following code - assimilating the maximum annual discharge each year ffrom a daily discharge record from year 1989-2005. m - read.table(D:/documents/5 stations/01014000.csv, sep =,) z - zoo(m[,4],as.Date(as.character(m[,3]), %m/%d/%Y)) x - aggregate(z, floor(as.numeric(as.yearmon(time(z, max) #code suggested by Gabor Grothendieck which gives me the maximum discharge each year and the year itself.. what i am trying now is to produce the date when the maximum discharge was observed in the pattern -mm-dd, like: 1988 11/9/1988 18600 1989 5/8/1989 49000 ... 2005 4/26/2005111000 thank you all in advance... -- View this message in context: http://www.nabble.com/problem-with-dates-in-zoo-package-tp22053103p22053103.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Combination
Hello, I have a sequence of numbers: seq(1:50) and I would like to have all the possible combinations with this numbers without repeating any combination: 11, 12, 13, ... ,22,23,24,... How can I do it? Best, Dani -- Daniel Valverde Saubí Grup de Biologia Molecular de Llevats Facultat de Veterinària de la Universitat Autònoma de Barcelona Edifici V, Campus UAB 08193 Cerdanyola del Vallès- SPAIN Centro de Investigación Biomédica en Red en Bioingeniería, Biomateriales y Nanomedicina (CIBER-BBN) Grup d'Aplicacions Biomèdiques de la RMN Facultat de Biociències Universitat Autònoma de Barcelona Edifici Cs, Campus UAB 08193 Cerdanyola del Vallès- SPAIN +34 93 5814126 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Combination
Try this: apply(expand.grid(1:50, 1:50), 1, paste, collapse = '') On Tue, Feb 17, 2009 at 8:37 AM, Dani Valverde daniel.valve...@uab.catwrote: Hello, I have a sequence of numbers: seq(1:50) and I would like to have all the possible combinations with this numbers without repeating any combination: 11, 12, 13, ... ,22,23,24,... How can I do it? Best, Dani -- Daniel Valverde Saubí Grup de Biologia Molecular de Llevats Facultat de Veterinària de la Universitat Autònoma de Barcelona Edifici V, Campus UAB 08193 Cerdanyola del Vallès- SPAIN Centro de Investigación Biomédica en Red en Bioingeniería, Biomateriales y Nanomedicina (CIBER-BBN) Grup d'Aplicacions Biomèdiques de la RMN Facultat de Biociències Universitat Autònoma de Barcelona Edifici Cs, Campus UAB 08193 Cerdanyola del Vallès- SPAIN +34 93 5814126 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Combination
Hi Dani, see ?combn . combn(1:50,2) gives you all combinations as matrix. you can do sth like apply(combn(1:50,2),2, paste, sep=, collapse=) to get concatenated results. hth. Dani Valverde schrieb: Hello, I have a sequence of numbers: seq(1:50) and I would like to have all the possible combinations with this numbers without repeating any combination: 11, 12, 13, ... ,22,23,24,... How can I do it? Best, Dani -- Eik Vettorazzi Institut für Medizinische Biometrie und Epidemiologie Universitätsklinikum Hamburg-Eppendorf Martinistr. 52 20246 Hamburg T ++49/40/42803-8243 F ++49/40/42803-7790 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem with calculating the residuals in an AR(p)-model
Hi, I'm trying to calculate the residuals in a one-sided AR(p) model: sum (phi_j (X_t-j - EX)) = epsilon My X is an ARMA(1,1)-model, which can be represented as an AR(infty) model. I calculate the order of the model with AIC and the parameters with the Yule-Walker method. For the residuals, I tried: for (k in (p+1):n){ for (j in 1:p){ eps[k] - sum(phi[j] * (X[k-j] - mean(X))) } ceps[k] - eps[k] - sum(eps)/(n-p+1) #centering the residuals } The residuals in my ARMA(1,1)-models are t-distributed with 6 degrees of freedom. Unfortunately my code results in residuals, which are all around zero. So ecdf(eps) or ecdf(ceps) are very different from a t(6)-distribution. Where's the bug in my code? Regards, Martin -- View this message in context: http://www.nabble.com/Problem-with-calculating-the-residuals-in-an-AR%28p%29-model-tp22055764p22055764.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using eval in multinom argument
Thanks for your help, I managed to get it working like this: fml-as.formula(paste(y ~, paste(PCnames, collapse=+))) y = class.ind(grp[outgroup]) z1=multinom(formula = fml, data=data.frame(scores)) Daniel Crouch **QUOTE: Forget eval(parse(text = )) See ?as.formula ?update.formula and try out the example() s there. HTH, Chuck On Mon, 16 Feb 2009, Crouch, Daniel wrote: Hi, I am having difficulty entering a 'programmable' argument into the multinom function from the nnet package. Interactively, I can get the function to work fine by calling it this way: z1=multinom(formula = class.ind(grp[-outgroup])~ (PC1 + PC2 + PC3), data=data.frame(scores)) However I need to be able to change the number of variables I am looking for in 'scores' and so am trying to call it this way... z1=multinom(formula = class.ind(grp[-outgroup])~ eval(parse(text=PCnames)), data=data.frame(scores)) ...where, for example, PCnames = c(PC1, +, PC2, +, PC3) This gives no error messages, but only the last variable (in this case PC3) gets considered in the model. z1 looks like this: (Intercept) eval(parse(text = PCnames)) 23.530352 -116.87140 3 -1.30861313.59134 43.662172 -57.52198 5 -1.216041 -242.38827 6 -9.377894 -367.71614 7 -3.145738 -286.19766 Rather than this: (Intercept) PC1PC2 PC3 2 288.97131 889.281 3776.5837 -2105.751 3 -712.53519 2775.663 8490.5724 8602.834 4 229.17772 4234.950 329.6995 -2182.238 585.54585 -3036.657 3968.2517 -3450.070 6 -676.55377 -9545.785 2422.5340 -7183.686 7 -631.91921 10997.432 -3310.2905 -5348.513 Any help would be much appreciated. Thanks, Daniel Crouch Daniel Crouch Research Student Department of Medical Molecular Genetics King's College London 8th Floor, Tower Wing Guy's Hospital London SE1 9RT United Kingdom __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 Daniel Crouch Research Student Department of Medical Molecular Genetics King's College London 8th Floor, Tower Wing Guy's Hospital London SE1 9RT United Kingdom __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Comparison of age categories using contrasts
Hi Dylan, Am I trying to use contrast.Design() for something that it was not intended for? ... I think Prof. Harrell's main point had to do with how interactions are handled. You can also get the kind of contrasts that Patrick was interested in via multcomp. If we do this using your artificial data set we see that the contrasts are the same as those got by fitting the model using contr.sdif, but a warning is generated about interactions in the model c. [see example code] Part of Prof. Harrell's system is that in generating contrasts via contr.Design an appropriate value is automatically chosen for the interacting variable (in this case the median value of x) so that a reasonable default set of contrasts is calculated. This can of course be changed. Coming to your question [?] about how to generate the kind of contrasts that Patrick wanted using contrast.Design. Well, it is not that straightforward, though I may have missed something in the documentation to the function. In the past I have cobbled them together using the kind of hack given below: ## Exampe code x - rnorm(100) y1 - x[1:25] * 2 + rnorm(25, mean=1) y2 - x[26:50] * 2.6 + rnorm(25, mean=1.5) y3 - x[51:75] * 2.9 + rnorm(25, mean=5) y4 - x[76:100] * 3.5 + rnorm(25, mean=5.5) d - data.frame(x=x, y=c(y1,y2,y3,y4), f=factor(rep(letters[1:4], each=25))) # now try with contrast.Design(): library(multcomp) dd - datadist(d) options(datadist='dd') l - ols(y ~ x * f, data=d) ## model fitted using successive difference contrasts library(MASS) T.lm - lm(y ~ x * f, contrasts=list(f=contr.sdif), data=d) summary(T.lm) ## show the warning: model fitted using ols() and treatment contrasts summary(glht(l, linfct=mcp(f=Seq))) ## custom successive difference contrasts using contrast.Design TFin - {} for (i in 1:(length(levels(d$f))-1)) { tcont - contrast(l, a=list(f=levels(d$f)[i]), b=list(f=levels(d$f)[i+1])) TFin - rbind(TFin, tcont) rownames(TFin)[i] - paste(levels(d$f)[i], levels(d$f)[i+1], sep=-) } TFin[,1:9] Regards, Mark. Dylan Beaudette-2 wrote: On Mon, Feb 16, 2009 at 5:28 PM, Patrick Giraudoux patrick.giraud...@univ-fcomte.fr wrote: Greg Snow a écrit : One approach is to create your own contrasts matrix: mycmat - diag(8) mycmat[ row(mycmat) == col(mycmat) + 1 ] - -1 mycmati - solve(mycmat) contrasts(agefactor) - mycmati[,-1] Now when you use agefactor, the intercept will be the first age group and the slopes will be the differences between the pairs of groups (make sure that the order of the levels of agefactor is correct). The difference between this method and the contr.sdif function in MASS is how the intercept will end up being interpreted (and the dimnames). Hope this helps, Actually, exactly what I needed including the reference to contr.sdif in MASS I did not spot before (although I am a faithful reader of the yellow book... but so many things still escape to me). Again thanks a lot. Patrick Glad you were able to solve your problem. Frank replied earlier with the suggestion to use contrast.Design() to perform the tests after the initial model has been fit. Perhaps I am a little denser than normal, but I cannot figure out how to apply contrast.Design() to a similar model with several levels of a factor. Example data: # need these library(lattice) library(Design) # replicate an important experimental dataset set.seed(10101010) x - rnorm(100) y1 - x[1:25] * 2 + rnorm(25, mean=1) y2 - x[26:50] * 2.6 + rnorm(25, mean=1.5) y3 - x[51:75] * 2.9 + rnorm(25, mean=5) y4 - x[76:100] * 3.5 + rnorm(25, mean=5.5) d - data.frame(x=x, y=c(y1,y2,y3,y4), f=factor(rep(letters[1:4], each=25))) # plot xyplot(y ~ x, groups=f, data=d, auto.key=list(columns=4, title='Beard Type', lines=TRUE, points=FALSE, cex=0.75), type=c('p','r'), ylab='Number of Pirates', xlab='Distance from Land') # standard comparison to base level of f summary(lm(y ~ x * f, data=d)) # compare to level 4 of f summary(lm(y ~ x * C(f, base=4), data=d)) # now try with contrast.Design(): dd - datadist(d) options(datadist='dd') l - ols(y ~ x * f, data=d) # probably the wrong approach, as the results do not look too familiar: contrast(l, list(f=levels(d$f))) x f Contrast S.E. Lower Upper t Pr(|t|) -0.3449623 a 0.3966682 0.1961059 0.007184856 0.7861515 2.02 0.0460 -0.3449623 b 0.5587395 0.1889173 0.183533422 0.9339456 2.96 0.0039 -0.3449623 c 4.1511677 0.1889170 3.775962254 4.5263730 21.97 0. -0.3449623 d 4.3510407 0.120 3.975904726 4.7261766 23.04 0. This is adjusting the output to a given value of 'x'... Am I trying to use contrast.Design() for something that it was not intended for? Any tips Frank? Cheers, Dylan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and
Re: [R] Comparison of age categories using contrasts
On 2/16/2009 10:18 PM, Dylan Beaudette wrote: On Mon, Feb 16, 2009 at 5:28 PM, Patrick Giraudoux patrick.giraud...@univ-fcomte.fr wrote: Greg Snow a écrit : One approach is to create your own contrasts matrix: mycmat - diag(8) mycmat[ row(mycmat) == col(mycmat) + 1 ] - -1 mycmati - solve(mycmat) contrasts(agefactor) - mycmati[,-1] Now when you use agefactor, the intercept will be the first age group and the slopes will be the differences between the pairs of groups (make sure that the order of the levels of agefactor is correct). The difference between this method and the contr.sdif function in MASS is how the intercept will end up being interpreted (and the dimnames). Hope this helps, Actually, exactly what I needed including the reference to contr.sdif in MASS I did not spot before (although I am a faithful reader of the yellow book... but so many things still escape to me). Again thanks a lot. Patrick Glad you were able to solve your problem. Frank replied earlier with the suggestion to use contrast.Design() to perform the tests after the initial model has been fit. Perhaps I am a little denser than normal, but I cannot figure out how to apply contrast.Design() to a similar model with several levels of a factor. Example data: # need these library(lattice) library(Design) # replicate an important experimental dataset set.seed(10101010) x - rnorm(100) y1 - x[1:25] * 2 + rnorm(25, mean=1) y2 - x[26:50] * 2.6 + rnorm(25, mean=1.5) y3 - x[51:75] * 2.9 + rnorm(25, mean=5) y4 - x[76:100] * 3.5 + rnorm(25, mean=5.5) d - data.frame(x=x, y=c(y1,y2,y3,y4), f=factor(rep(letters[1:4], each=25))) # plot xyplot(y ~ x, groups=f, data=d, auto.key=list(columns=4, title='Beard Type', lines=TRUE, points=FALSE, cex=0.75), type=c('p','r'), ylab='Number of Pirates', xlab='Distance from Land') # standard comparison to base level of f summary(lm(y ~ x * f, data=d)) # compare to level 4 of f summary(lm(y ~ x * C(f, base=4), data=d)) # now try with contrast.Design(): dd - datadist(d) options(datadist='dd') l - ols(y ~ x * f, data=d) # probably the wrong approach, as the results do not look too familiar: contrast(l, list(f=levels(d$f))) x f Contrast S.E. Lower Upper t Pr(|t|) -0.3449623 a 0.3966682 0.1961059 0.007184856 0.7861515 2.02 0.0460 -0.3449623 b 0.5587395 0.1889173 0.183533422 0.9339456 2.96 0.0039 -0.3449623 c 4.1511677 0.1889170 3.775962254 4.5263730 21.97 0. -0.3449623 d 4.3510407 0.120 3.975904726 4.7261766 23.04 0. This is adjusting the output to a given value of 'x'... Am I trying to use contrast.Design() for something that it was not intended for? Any tips Frank? Does this help? # Compare with summary(lm(y ~ x * f, data=d)) contrast(l, a=list(f=levels(d$f)[2:4], x=0), b=list(f=levels(d$f)[1], x=0)) f Contrast S.E. Lower Upper t Pr(|t|) b 0.3673455 0.2724247 -0.1737135 0.9084046 1.35 0.1808 c 4.1310015 0.2714011 3.5919754 4.6700275 15.22 0. d 4.4308653 0.2731223 3.8884207 4.9733098 16.22 0. Error d.f.= 92 # Compare with summary(lm(y ~ x * C(f, base=4), data=d)) contrast(l, a=list(f=levels(d$f)[1:3], x=0), b=list(f=levels(d$f)[4], x=0)) f Contrast S.E. Lower Upper t Pr(|t|) a -4.4308653 0.2731223 -4.9733098 -3.8884207 -16.22 0. b -4.0635198 0.2782704 -4.6161888 -3.5108507 -14.60 0. c -0.2998638 0.2772684 -0.8505427 0.2508151 -1.08 0.2823 Error d.f.= 92 Cheers, Dylan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Chuck Cleland, Ph.D. NDRI, Inc. (www.ndri.org) 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 512-0171 (M, W, F) fax: (917) 438-0894 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] assuming AR(1) residuals in OLS
Thank you for the lightening replies. I tested various corStruct objects (?corClasses) using the nlme package and all work flawlessly. My best regards to all... Constantine Tsardounis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Combination
Eik Vettorazzi wrote: Hi Dani, see ?combn . combn(1:50,2) gives you all combinations as matrix. you can do sth like apply(combn(1:50,2),2, paste, sep=, collapse=) note that you could simplify the above by combn(50, 2, paste, collapse = ) Best, Dimitris to get concatenated results. hth. Dani Valverde schrieb: Hello, I have a sequence of numbers: seq(1:50) and I would like to have all the possible combinations with this numbers without repeating any combination: 11, 12, 13, ... ,22,23,24,... How can I do it? Best, Dani -- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Comparison of age categories using contrasts
Hi Dylan, Chuck, contrast(l, a=list(f=levels(d$f)[1:3], x=0), b=list(f=levels(d$f)[4], x=0)) There is a subtlety here that needs to be emphasized. Setting the interacting variable (x) to zero is reasonable in this case, because the mean value of rnorm(n) is zero. However, in the real world of real data it is somewhat unusual for zero to be a reasonable value for an interacting variable. In fact, it is this setting-to-zero that causes multcomp's bells to ring with a word of warning. ## glht(l, linfct=mcp(f=Seq))$linfct Regards, Mark. Chuck Cleland wrote: On 2/16/2009 10:18 PM, Dylan Beaudette wrote: On Mon, Feb 16, 2009 at 5:28 PM, Patrick Giraudoux patrick.giraud...@univ-fcomte.fr wrote: Greg Snow a écrit : One approach is to create your own contrasts matrix: mycmat - diag(8) mycmat[ row(mycmat) == col(mycmat) + 1 ] - -1 mycmati - solve(mycmat) contrasts(agefactor) - mycmati[,-1] Now when you use agefactor, the intercept will be the first age group and the slopes will be the differences between the pairs of groups (make sure that the order of the levels of agefactor is correct). The difference between this method and the contr.sdif function in MASS is how the intercept will end up being interpreted (and the dimnames). Hope this helps, Actually, exactly what I needed including the reference to contr.sdif in MASS I did not spot before (although I am a faithful reader of the yellow book... but so many things still escape to me). Again thanks a lot. Patrick Glad you were able to solve your problem. Frank replied earlier with the suggestion to use contrast.Design() to perform the tests after the initial model has been fit. Perhaps I am a little denser than normal, but I cannot figure out how to apply contrast.Design() to a similar model with several levels of a factor. Example data: # need these library(lattice) library(Design) # replicate an important experimental dataset set.seed(10101010) x - rnorm(100) y1 - x[1:25] * 2 + rnorm(25, mean=1) y2 - x[26:50] * 2.6 + rnorm(25, mean=1.5) y3 - x[51:75] * 2.9 + rnorm(25, mean=5) y4 - x[76:100] * 3.5 + rnorm(25, mean=5.5) d - data.frame(x=x, y=c(y1,y2,y3,y4), f=factor(rep(letters[1:4], each=25))) # plot xyplot(y ~ x, groups=f, data=d, auto.key=list(columns=4, title='Beard Type', lines=TRUE, points=FALSE, cex=0.75), type=c('p','r'), ylab='Number of Pirates', xlab='Distance from Land') # standard comparison to base level of f summary(lm(y ~ x * f, data=d)) # compare to level 4 of f summary(lm(y ~ x * C(f, base=4), data=d)) # now try with contrast.Design(): dd - datadist(d) options(datadist='dd') l - ols(y ~ x * f, data=d) # probably the wrong approach, as the results do not look too familiar: contrast(l, list(f=levels(d$f))) x f Contrast S.E. Lower Upper t Pr(|t|) -0.3449623 a 0.3966682 0.1961059 0.007184856 0.7861515 2.02 0.0460 -0.3449623 b 0.5587395 0.1889173 0.183533422 0.9339456 2.96 0.0039 -0.3449623 c 4.1511677 0.1889170 3.775962254 4.5263730 21.97 0. -0.3449623 d 4.3510407 0.120 3.975904726 4.7261766 23.04 0. This is adjusting the output to a given value of 'x'... Am I trying to use contrast.Design() for something that it was not intended for? Any tips Frank? Does this help? # Compare with summary(lm(y ~ x * f, data=d)) contrast(l, a=list(f=levels(d$f)[2:4], x=0), b=list(f=levels(d$f)[1], x=0)) f Contrast S.E. Lower Upper t Pr(|t|) b 0.3673455 0.2724247 -0.1737135 0.9084046 1.35 0.1808 c 4.1310015 0.2714011 3.5919754 4.6700275 15.22 0. d 4.4308653 0.2731223 3.8884207 4.9733098 16.22 0. Error d.f.= 92 # Compare with summary(lm(y ~ x * C(f, base=4), data=d)) contrast(l, a=list(f=levels(d$f)[1:3], x=0), b=list(f=levels(d$f)[4], x=0)) f Contrast S.E. Lower Upper t Pr(|t|) a -4.4308653 0.2731223 -4.9733098 -3.8884207 -16.22 0. b -4.0635198 0.2782704 -4.6161888 -3.5108507 -14.60 0. c -0.2998638 0.2772684 -0.8505427 0.2508151 -1.08 0.2823 Error d.f.= 92 Cheers, Dylan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Chuck Cleland, Ph.D. NDRI, Inc. (www.ndri.org) 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 512-0171 (M, W, F) fax: (917) 438-0894 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context:
[R] Chromatogram deconvolution and peak matching
Hi, I'm trying to match peaks between chromatographic runs. I'm able to match peaks when they are chromatographed with the same method, but not when there are different methods are used and spectra comes in to play. While searching I found the ALS package which should be usefull for my application, but I couldn't figure it out. I made some dummy chroms with R, which mimic my actual datasets, to play with, but after looking at the manuals of ALS, I'm affraid I can't get the job done. Can someone put me on the right way? Here is my code to generate the dummy chroms, which also plots the 2 chroms and the spectra of the 3 peaks: #2D chromatogram generation par(mfrow=c(3,1)) time - seq(0,20,by=0.05) f - function(x,rt) dnorm((x-rt),mean=0,sd=rt/35) c1 - f(time,6.1) c2 - f(time,5.6) c3 - f(time,15) plot(c1+c2+c3~time,type=l,main=chrom1) #spectrum generation spectra - function(x,a,b,c,d,e) a + b*(x-e) + c*((x-e)^2) + d*((x-e)^3) x - 220:300 s1 - spectra(x,(-194.2),2.386,(-0.009617),(1.275e-05),0) s2 - spectra(x,(-1.054e02),1.3,(-5.239e-03),(6.927e-06),-20) s3 - spectra(x,(-194.2),2.386,(-0.009617),(1.275e-05),20) chrom1.tot - data.frame(time,outer(c1,s1,*)+outer(c2,s2,*)+outer(c2,s2,*)) names(chrom.tot)[-1] - x #generation of chromatogram 2 c1 - f(time,2.1) c2 - f(time,4) c3 - f(time,8) plot(c1+c2+c3~time,type=l,main=chrom2) chrom2.tot - data.frame(time,outer(c1,s1,*)+outer(c2,s2,*)+outer(c2,s2,*)) names(chrom.tot)[-1] - x plot(s1~x,type=l,main=spectra) lines(s2~x,col=2) lines(s3~x,col=3) Thanks for your time Kind Regards Bart -- View this message in context: http://www.nabble.com/Chromatogram-deconvolution-and-peak-matching-tp22057592p22057592.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] stopping stepAIC (MASS package)
Peter Jepsen PJ at DCE.AU.DK writes: [snip] ... I would save valuable time if the procedure would stop automatically when it encounters any anomaly that causes it to warn me. Is this possible? And if so, how? Maybe options(warn=2) will do what you want. Ben Bolker __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] assuming AR(1) residuals in OLS
Thank you Gabor Grothendieck for your message. I would surely like to say, that if someone wants to assume AR(1) residuals, running the regression y ~ x, could run gls(y~x, correlation = corAR1(0, ~1)) Constantine Tsardounis http://www.costis.name __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] joining one-to-many
Hello list, I am wondering if a joining one-to-many can be done a little bit easier. I tried merge function but I was not able to do it, so I end up using for and if. Suppose you have a table with locations, each location repeated several times, and some attributes at that location. The second table has the same locations, but only once with a different set of attributes. I would like to add the second set of attributes to the first table. Example: set.seed - 123 loc - c(rep(L1, 3), rep(L2, 5), rep(L3, 2)) val1 - round(rnorm(10),2) val2 - c(a, b, c, a, b, d, f, e, b, e) t1 - data.frame(loc, val1, val2) t2 - data.frame(loc=c(L1,L2,L3), val3 = c(m, n, p), val4 = c(25, 67, 48)) # join one-to-many n - nrow(t1) m - nrow(t2) t1$val3 - rep(1, n) t1$val4 - rep(1, n) for (i in 1:n) { for (j in 1:m){ if (t1$loc[i]==t2$loc[j]) { t1$val3[i] - as.character(t2$val3[j]) t1$val4[i] - t2$val4[j] } } } Desired result: t1 loc val1 val2 val3 val4 1 L1 -0.41am 25 2 L1 -0.69bm 25 3 L1 0.36cm 25 4 L2 1.11an 67 5 L2 0.15bn 67 6 L2 -0.80dn 67 7 L2 -0.08fn 67 8 L2 -1.01en 67 9 L3 -1.01bp 48 10 L3 -2.50ep 48 This code works OK but it is slow if the data frames are actually bigger than my little example. I hope somebody knows of a better way of doing these type of things. Thanks, Monica _ 22009 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] joining one-to-many
On Tue, Feb 17, 2009 at 8:33 AM, Monica Pisica pisican...@hotmail.com wrote: Hello list, I am wondering if a joining one-to-many can be done a little bit easier. I tried merge function but I was not able to do it, so I end up using for and if. Suppose you have a table with locations, each location repeated several times, and some attributes at that location. The second table has the same locations, but only once with a different set of attributes. I would like to add the second set of attributes to the first table. Example: set.seed - 123 loc - c(rep(L1, 3), rep(L2, 5), rep(L3, 2)) val1 - round(rnorm(10),2) val2 - c(a, b, c, a, b, d, f, e, b, e) t1 - data.frame(loc, val1, val2) t2 - data.frame(loc=c(L1,L2,L3), val3 = c(m, n, p), val4 = c(25, 67, 48)) # join one-to-many n - nrow(t1) m - nrow(t2) t1$val3 - rep(1, n) t1$val4 - rep(1, n) for (i in 1:n) { for (j in 1:m){ if (t1$loc[i]==t2$loc[j]) { t1$val3[i] - as.character(t2$val3[j]) t1$val4[i] - t2$val4[j] } } } Desired result: t1 loc val1 val2 val3 val4 1 L1 -0.41am 25 2 L1 -0.69bm 25 3 L1 0.36cm 25 4 L2 1.11an 67 5 L2 0.15bn 67 6 L2 -0.80dn 67 7 L2 -0.08fn 67 8 L2 -1.01en 67 9 L3 -1.01bp 48 10 L3 -2.50ep 48 This code works OK but it is slow if the data frames are actually bigger than my little example. I hope somebody knows of a better way of doing these type of things. merge(t1, t2) Hadley -- http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] joining one-to-many
on 02/17/2009 08:33 AM Monica Pisica wrote: Hello list, I am wondering if a joining one-to-many can be done a little bit easier. I tried merge function but I was not able to do it, so I end up using for and if. Suppose you have a table with locations, each location repeated several times, and some attributes at that location. The second table has the same locations, but only once with a different set of attributes. I would like to add the second set of attributes to the first table. Example: set.seed - 123 This needs to be set.seed(123) See ?set.seed :-) loc - c(rep(L1, 3), rep(L2, 5), rep(L3, 2)) val1 - round(rnorm(10),2) val2 - c(a, b, c, a, b, d, f, e, b, e) t1 - data.frame(loc, val1, val2) t2 - data.frame(loc=c(L1,L2,L3), val3 = c(m, n, p), val4 = c(25, 67, 48)) # join one-to-many n - nrow(t1) m - nrow(t2) t1$val3 - rep(1, n) t1$val4 - rep(1, n) for (i in 1:n) { for (j in 1:m){ if (t1$loc[i]==t2$loc[j]) { t1$val3[i] - as.character(t2$val3[j]) t1$val4[i] - t2$val4[j] } } } Desired result: t1 loc val1 val2 val3 val4 1 L1 -0.41am 25 2 L1 -0.69bm 25 3 L1 0.36cm 25 4 L2 1.11an 67 5 L2 0.15bn 67 6 L2 -0.80dn 67 7 L2 -0.08fn 67 8 L2 -1.01en 67 9 L3 -1.01bp 48 10 L3 -2.50ep 48 This code works OK but it is slow if the data frames are actually bigger than my little example. I hope somebody knows of a better way of doing these type of things. merge(t1, t2, by = loc) loc val1 val2 val3 val4 1 L1 -0.32am 25 2 L1 -1.50bm 25 3 L1 -0.31cm 25 4 L2 1.42an 67 5 L2 0.32bn 67 6 L2 -0.12dn 67 7 L2 0.33fn 67 8 L2 -1.74en 67 9 L3 0.88bp 48 10 L3 1.88ep 48 system.time(merge(t1, t2, by = loc)) user system elapsed 0.004 0.000 0.019 HTH, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] frequency table for multiple variables
Hi r-help! Consider the following data-frame: var1 var2 var3 1 314 2 223 3 223 4 44 NA 5 435 6 223 7 343 How can I get R to convert this into the following? Value 1 2 3 4 5 var1 0 3 2 2 0 var2 1 3 1 2 0 var3 0 0 4 1 1 TIA, -- Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net Q. What is that strange attachment in this mail? A. My digital signature, see www.gnupg.org for info on how you could use it to ensure that this mail is from me and has not been altered on the way to you. signature.asc Description: Digital signature __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] joining one-to-many
Try merge(t1, t2) On Tue, Feb 17, 2009 at 9:33 AM, Monica Pisica pisican...@hotmail.com wrote: Hello list, I am wondering if a joining one-to-many can be done a little bit easier. I tried merge function but I was not able to do it, so I end up using for and if. Suppose you have a table with locations, each location repeated several times, and some attributes at that location. The second table has the same locations, but only once with a different set of attributes. I would like to add the second set of attributes to the first table. Example: set.seed - 123 loc - c(rep(L1, 3), rep(L2, 5), rep(L3, 2)) val1 - round(rnorm(10),2) val2 - c(a, b, c, a, b, d, f, e, b, e) t1 - data.frame(loc, val1, val2) t2 - data.frame(loc=c(L1,L2,L3), val3 = c(m, n, p), val4 = c(25, 67, 48)) # join one-to-many n - nrow(t1) m - nrow(t2) t1$val3 - rep(1, n) t1$val4 - rep(1, n) for (i in 1:n) { for (j in 1:m){ if (t1$loc[i]==t2$loc[j]) { t1$val3[i] - as.character(t2$val3[j]) t1$val4[i] - t2$val4[j] } } } Desired result: t1 loc val1 val2 val3 val4 1 L1 -0.41am 25 2 L1 -0.69bm 25 3 L1 0.36cm 25 4 L2 1.11an 67 5 L2 0.15bn 67 6 L2 -0.80dn 67 7 L2 -0.08fn 67 8 L2 -1.01en 67 9 L3 -1.01bp 48 10 L3 -2.50ep 48 This code works OK but it is slow if the data frames are actually bigger than my little example. I hope somebody knows of a better way of doing these type of things. Thanks, Monica _ 22009 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] spss-file problem with foreign 0.8-32
Hi, after updating to foreign version 0.8-32, I experienced the following error when I tried to load a SPSS file: Fehler in inherits(x, factor) : objekt cp nicht gefunden Zusätzlich: Warning message: In read.spss(***l.sav, use.value.labels = TRUE, to.data.frame = TRUE) : ***.sav: File-indicated character representation code (1252) looks like a Windows codepage It worked without problems with earlier versions. Thanks for any clues, best, Harry -- View this message in context: http://www.nabble.com/spss-file-problem-with-foreign-0.8-32-tp22059259p22059259.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Uninstall question
I need to uninstall R 2.7.1 from my Mac. What is the best way to uninstall it? Simply delete the R icon in the Applications folder? Or is it more involved? TIA, Anjan -- = anjan purkayastha, phd bioinformatics analyst whitehead institute for biomedical research nine cambridge center cambridge, ma 02142 purkayas [at] wi [dot] mit [dot] edu 703.740.6939 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] joining one-to-many
Hi Monica, merge(t1, t2) works on your example. So why don't you use merge? HTH, Thierry ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -Oorspronkelijk bericht- Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Namens Monica Pisica Verzonden: dinsdag 17 februari 2009 15:33 Aan: R help project Onderwerp: [R] joining one-to-many Hello list, I am wondering if a joining one-to-many can be done a little bit easier. I tried merge function but I was not able to do it, so I end up using for and if. Suppose you have a table with locations, each location repeated several times, and some attributes at that location. The second table has the same locations, but only once with a different set of attributes. I would like to add the second set of attributes to the first table. Example: set.seed - 123 loc - c(rep(L1, 3), rep(L2, 5), rep(L3, 2)) val1 - round(rnorm(10),2) val2 - c(a, b, c, a, b, d, f, e, b, e) t1 - data.frame(loc, val1, val2) t2 - data.frame(loc=c(L1,L2,L3), val3 = c(m, n, p), val4 = c(25, 67, 48)) # join one-to-many n - nrow(t1) m - nrow(t2) t1$val3 - rep(1, n) t1$val4 - rep(1, n) for (i in 1:n) { for (j in 1:m){ if (t1$loc[i]==t2$loc[j]) { t1$val3[i] - as.character(t2$val3[j]) t1$val4[i] - t2$val4[j] } } } Desired result: t1 loc val1 val2 val3 val4 1 L1 -0.41am 25 2 L1 -0.69bm 25 3 L1 0.36cm 25 4 L2 1.11an 67 5 L2 0.15bn 67 6 L2 -0.80dn 67 7 L2 -0.08fn 67 8 L2 -1.01en 67 9 L3 -1.01bp 48 10 L3 -2.50ep 48 This code works OK but it is slow if the data frames are actually bigger than my little example. I hope somebody knows of a better way of doing these type of things. Thanks, Monica _ 22009 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is door een geldig ondertekend document. The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Uninstall question
This is on the Mac FAQ: http://cran.cnr.berkeley.edu/bin/macosx/RMacOSX-FAQ.html#How-can-R-for-Mac-OS-X-be-uninstalled_003f HTH, --sundar On Tue, Feb 17, 2009 at 7:17 AM, ANJAN PURKAYASTHA anjan.purkayas...@gmail.com wrote: I need to uninstall R 2.7.1 from my Mac. What is the best way to uninstall it? Simply delete the R icon in the Applications folder? Or is it more involved? TIA, Anjan -- = anjan purkayastha, phd bioinformatics analyst whitehead institute for biomedical research nine cambridge center cambridge, ma 02142 purkayas [at] wi [dot] mit [dot] edu 703.740.6939 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] joining one-to-many
Ok, I feel properly ashamed. I suppose my real data is a little bit different than my toy data (although i don't know how) because i did try the merge function as simple as merge(t1, t2) and did not work. Maybe a reset of my session will solve my problems and more coffee my confusion. Again, thanks for your help, Monica Date: Tue, 17 Feb 2009 10:09:17 -0500 Subject: Re: [R] joining one-to-many From: ggrothendi...@gmail.com To: pisican...@hotmail.com CC: r-help@r-project.org Try merge(t1, t2) On Tue, Feb 17, 2009 at 9:33 AM, Monica Pisica wrote: Hello list, I am wondering if a joining one-to-many can be done a little bit easier. I tried merge function but I was not able to do it, so I end up using for and if. Suppose you have a table with locations, each location repeated several times, and some attributes at that location. The second table has the same locations, but only once with a different set of attributes. I would like to add the second set of attributes to the first table. Example: set.seed - 123 loc - c(rep(L1, 3), rep(L2, 5), rep(L3, 2)) val1 - round(rnorm(10),2) val2 - c(a, b, c, a, b, d, f, e, b, e) t1 - data.frame(loc, val1, val2) t2 - data.frame(loc=c(L1,L2,L3), val3 = c(m, n, p), val4 = c(25, 67, 48)) # join one-to-many n - nrow(t1) m - nrow(t2) t1$val3 - rep(1, n) t1$val4 - rep(1, n) for (i in 1:n) { for (j in 1:m){ if (t1$loc[i]==t2$loc[j]) { t1$val3[i] - as.character(t2$val3[j]) t1$val4[i] - t2$val4[j] } } } Desired result: t1 loc val1 val2 val3 val4 1 L1 -0.41 a m 25 2 L1 -0.69 b m 25 3 L1 0.36 c m 25 4 L2 1.11 a n 67 5 L2 0.15 b n 67 6 L2 -0.80 d n 67 7 L2 -0.08 f n 67 8 L2 -1.01 e n 67 9 L3 -1.01 b p 48 10 L3 -2.50 e p 48 This code works OK but it is slow if the data frames are actually bigger than my little example. I hope somebody knows of a better way of doing these type of things. Thanks, Monica _ 22009 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Survival-Analysis: How to get numerical values from survfit (and not just a plot)?
Hi! I came across R just a few days ago since I was looking for a toolbox for cox-regression. I´ve read Cox Proportional-Hazards Regression for Survival Data Appendix to An R and S-PLUS Companion to Applied Regression from John Fox. As described therein plotting survival-functions works well (plot(survfit(model))). But I´d like to do some manipulation with the survival-functions before plotting them e.g. dividing one survival-function by another. survfit() only returns an object of the following structure n events median 0.9LCL 0.9UCL 55.000 55.000 1.033 0.696 1.637 Can you tell me how I can calculate a survival- or baseline-function out of these values and how I extract the values from the object? I´m sure the calculation is done by the corresponding plot-routine, but I couldn´t find that one either. Regards Bernhard __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Survival-Analysis: How to get numerical values from survfit (and not just a plot)?
Hi, See ?survfit.object if fit is the object you get using survfit, fit$surv will give you the survival probability. Best, arthur Bernhard Reinhardt wrote: Hi! I came across R just a few days ago since I was looking for a toolbox for cox-regression. I´ve read Cox Proportional-Hazards Regression for Survival Data Appendix to An R and S-PLUS Companion to Applied Regression from John Fox. As described therein plotting survival-functions works well (plot(survfit(model))). But I´d like to do some manipulation with the survival-functions before plotting them e.g. dividing one survival-function by another. survfit() only returns an object of the following structure n events median 0.9LCL 0.9UCL 55.000 55.000 1.033 0.696 1.637 Can you tell me how I can calculate a survival- or baseline-function out of these values and how I extract the values from the object? I´m sure the calculation is done by the corresponding plot-routine, but I couldn´t find that one either. Regards Bernhard __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] spss-file problem with foreign 0.8-32
Harry Haupt wrote: Hi, after updating to foreign version 0.8-32, I experienced the following error when I tried to load a SPSS file: Fehler in inherits(x, factor) : objekt cp nicht gefunden Zusätzlich: Warning message: In read.spss(***l.sav, use.value.labels = TRUE, to.data.frame = TRUE) : ***.sav: File-indicated character representation code (1252) looks like a Windows codepage It worked without problems with earlier versions. Thanks for any clues, best, Harry Yes, something in the logic appears to have gotten garbled. It's in this part of read,spss: if (is.character(reencode)) { cp - reencode reencode - TRUE } else if (codepage = 500 || codepage = 2000) { attr(rval, codepage) - NULL reencode - FALSE } else if (m - match(cp, knownCP, 0L)) cp - names(knownCP)[m] if you get to the 2nd else if then cp is unset. Possible the attempted match should be of codepage? But it still looks wrong: Why restrict to codepages between 500 and 2000 when knownCP contains several values above 1??? A workaround is to set reencode=ascii (or whatever is relevant). -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R scripts and parameters
A couple of weeks ago I asked how it is possible to run an R script (not a function) passing some parameters. Someone suggested the function commandArgs(). I read the on-line help and found no clarifying example. Therefore I do not know how to use it appropriately. I noticed this function returns the pathname of the R executable which is not what I need. I meant to ask if it is possible to open an R session and launch a script passing parameters that the script can retrieve and use itself. Just like in C I can run a program and call it with some arguments Example_Prog A B C The program Example_Prog can acess its own arguments through the data structures argc an argv. How can I launch an R script simulating the above mechanism ? Shall I use source (script-name) ? Where are the arguments to be passed, as part of the source call ? Is the function commandArgs to be places as one of the first code lines of the script in order to access its own arguments ? Is there any commandArgs usage example ? Thank you very much in advance. Maura tutti i telefonini TIM! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] frequency table for multiple variables
on 02/17/2009 09:06 AM Hans Ekbrand wrote: Hi r-help! Consider the following data-frame: var1 var2 var3 1 314 2 223 3 223 4 44 NA 5 435 6 223 7 343 How can I get R to convert this into the following? Value 1 2 3 4 5 var1 0 3 2 2 0 var2 1 3 1 2 0 var3 0 0 4 1 1 t(sapply(DF, function(x) table(factor(x, levels = 1:5 1 2 3 4 5 var1 0 3 2 2 0 var2 1 3 1 2 0 var3 0 0 4 1 1 The key is to turn each column into a factor with explicitly defined common levels for tabulation. This enables the table result to have a consistent format across each column, allowing for a matrix to be created, rather than a list. HTH, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R crash after fGarch update
Prof Ripley: Many thanks - it did indeed say it cannot find fGarch after I tried your advice - but a completely clean re-install did the trick. John On Tue, Feb 17, 2009 at 1:09 AM, Prof Brian Ripley rip...@stats.ox.ac.ukwrote: Start R with --vanilla, or rename youe saved workspace (.RData). Then library(fGarch) load(.RData) # or whatever you renamed it to. This will either work or (more ikely) tell you it cannot find fGarch or a package it depends on). On Mon, 16 Feb 2009, John Kerpel wrote: Hi folks! After updating my packages my R seems to have completely crashed as will not start up - even after I installed 2.8.1 from 2.8.0. You haven't told us your OS: I am guesing Windows. I get the following: Fatal error: unable to restore saved data in .Rdata Error in loadNamespeace(name): there is no package called fGarch But I do have a package called fGarch. After I hit ok, it crashes and exits. I cannot use any functionality at all. What do I do? John [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] word wrap using Hershey fonts
Hi, is it possible to wrap a text using Hershey fonts? \n does not work! Thanks in advance, Martina __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Survival-Analysis: How to get numerical values from survfit (and not just a plot)?
I seriously doubt that a survfit object could only contain that information. I suspect that you are erroneously thinking that what print.survfit offers is the entire story. What does str(survfit(formula, data=dataframe) ) show you? data(aml) aml.mdl - survfit(Surv(time, status) ~ x, data=aml) # this is with survfit.Design since I load Hmisc/Design by default now. str(aml.mdl) List of 18 $ n: int 23 $ time : num [1:20] 9 13 18 23 28 31 34 45 48 161 ... $ n.risk : num [1:20] 11 10 8 7 6 5 4 3 2 1 ... $ n.event : num [1:20] 1 1 1 1 0 1 1 0 1 0 ... $ surv : num [1:20] 0.909 0.818 0.716 0.614 0.614 ... $ type : chr right $ ntimes.strata: Named int [1:2] 10 10 ..- attr(*, names)= chr [1:2] 1 2 $ strata : Named num [1:2] 10 10 ..- attr(*, names)= chr [1:2] Maintained Nonmaintained $ strata.all : Named int [1:2] 11 12 ..- attr(*, names)= chr [1:2] Maintained Nonmaintained $ std.err : num [1:20] 0.0953 0.1421 0.1951 0.2487 0.2487 ... $ upper: num [1:20] 0.987 0.951 0.899 0.835 0.835 ... $ lower: num [1:20] 0.508 0.447 0.35 0.266 0.266 ... $ conf.type: chr log-log $ conf.int : num 0.95 $ maxtime : num 161 $ units: chr Day $ time.label : chr time $ call : language survfit(formula = Surv(time, status) ~ x, data = aml) - attr(*, class)= chr survfit I also don't think survfit returns a Cox model. On Feb 17, 2009, at 10:37 AM, Bernhard Reinhardt wrote: Hi! I came across R just a few days ago since I was looking for a toolbox for cox-regression. I´ve read Cox Proportional-Hazards Regression for Survival Data Appendix to An R and S-PLUS Companion to Applied Regression from John Fox. As described therein plotting survival-functions works well (plot(survfit(model))). But I´d like to do some manipulation with the survival-functions before plotting them e.g. dividing one survival-function by another. survfit() only returns an object of the following structure n events median 0.9LCL 0.9UCL 55.000 55.000 1.033 0.696 1.637 Can you tell me how I can calculate a survival- or baseline-function out of these values and how I extract the values from the object? I ´m sure the calculation is done by the corresponding plot-routine, but I couldn´t find that one either. Regards Bernhard __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R scripts and parameters
Try this: A - 1 B - 2 C - 3 source(myfile.R) Now the code in myfile can access A, B and C. On Tue, Feb 17, 2009 at 10:55 AM, mau...@alice.it wrote: A couple of weeks ago I asked how it is possible to run an R script (not a function) passing some parameters. Someone suggested the function commandArgs(). I read the on-line help and found no clarifying example. Therefore I do not know how to use it appropriately. I noticed this function returns the pathname of the R executable which is not what I need. I meant to ask if it is possible to open an R session and launch a script passing parameters that the script can retrieve and use itself. Just like in C I can run a program and call it with some arguments Example_Prog A B C The program Example_Prog can acess its own arguments through the data structures argc an argv. How can I launch an R script simulating the above mechanism ? Shall I use source (script-name) ? Where are the arguments to be passed, as part of the source call ? Is the function commandArgs to be places as one of the first code lines of the script in order to access its own arguments ? Is there any commandArgs usage example ? Thank you very much in advance. Maura tutti i telefonini TIM! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] spss-file problem with foreign 0.8-32
Peter Dalgaard wrote: Yes, something in the logic appears to have gotten garbled. It's in this part of read,spss: if (is.character(reencode)) { cp - reencode reencode - TRUE } else if (codepage = 500 || codepage = 2000) { attr(rval, codepage) - NULL reencode - FALSE } else if (m - match(cp, knownCP, 0L)) cp - names(knownCP)[m] if you get to the 2nd else if then cp is unset. Possible the attempted match should be of codepage? But it still looks wrong: Why restrict to codepages between 500 and 2000 when knownCP contains several values above 1??? A workaround is to set reencode=ascii (or whatever is relevant). -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Dear Peter, thanks, reencode=ascii fixed it (and leaves just the warning message, which seems to have no effect). Best, Harry - --- Centre for Statistics Bielefeld University, Germany -- View this message in context: http://www.nabble.com/spss-file-problem-with-foreign-0.8-32-tp22059259p22060774.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R scripts and parameters
On 2/17/2009 10:55 AM, mau...@alice.it wrote: A couple of weeks ago I asked how it is possible to run an R script (not a function) passing some parameters. Someone suggested the function commandArgs(). I read the on-line help and found no clarifying example. Therefore I do not know how to use it appropriately. I noticed this function returns the pathname of the R executable which is not what I need. I meant to ask if it is possible to open an R session and launch a script passing parameters that the script can retrieve and use itself. Just like in C I can run a program and call it with some arguments Example_Prog A B C The program Example_Prog can acess its own arguments through the data structures argc an argv. How can I launch an R script simulating the above mechanism ? Shall I use source (script-name) ? Where are the arguments to be passed, as part of the source call ? Is the function commandArgs to be places as one of the first code lines of the script in order to access its own arguments ? Is there any commandArgs usage example ? Gabor gave you a solution from within R. If you want to run a script from the command line, then use commandArgs(TRUE). For example, put this into the file test.R: commandArgs(TRUE) (The TRUE says you only want to see the trailing arguments, not everything else on the command line.) Then from the command line, do Rscript test.R A B C and you'll see the output [1] A B C Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] ARIMA models
is there some sort of R function which can advise me of the best ARIMA(p,q,r) model to use based on the Schwarz criterion e.g for e.g p=0-5, q =0, r=0-5 or for example p+r 5??? or is this something I will have to write my own code for? Thanks Emma -- View this message in context: http://www.nabble.com/ARIMA-models-tp22059382p22059382.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to connect R and WinBUGS/OpenBUGS/LinBUGS in Linux in Feb. 2009
Hi all, I've managed to get JAGS working on my Ubuntu Hardy Linux with a 32-bit computer and AMD processors using R 2.8.1. JAGS is great. I've read that JAGS is the fastest, but that hasn't been my experience. At any rate, I have more experience with WinBUGS under Windows and would like a version of that working as well. It seems like I've read a lot on the subject and tried a lot, but haven't managed to get BUGS to work yet. The most success I've had is to install WinBUGS or OpenBUGS using this method: http://www.math.aau.dk/~slb/kurser/bayes-08/install.html What you also need to know is that you need to open Wine and add a drive. Although Z is recommended, I haven't been able to specify it, but have gotten a D drive to work, using: wine D:/opt/OpenBUGS/winbugs.exe Using this method, OpenBUGS opens. Now, to be able to open it with R. I've read all sorts of discussions about BRugs (which is no longer on CRAN, but old versions can still be found), rbugs, and R2WinBUGS (which I'm used to using on Windows with WinBUGS). Some people say R2WinBUGS cannot run OpenBUGS on Linux, some claim they've done it (I think). It seems the same thing with everything else. I've tried making the linbugs and cbugs file recommended elsewhere online. It's all very confusing. Can someone show a method that works currently, along with some sample code? I'm also new to Linux, and confused by path conventions. For example, in rbugs, it shows an example of a path such as /var/scratch/jyan/wine-20040408/wine, and I don't see how to modify this. I have no /var/scratch to begin with, and think Wine is installed in /home/me/.wine...(I don't have Linux in front of me right now). Please help. Thanks. -- View this message in context: http://www.nabble.com/How-to-connect-R-and-WinBUGS-OpenBUGS-LinBUGS-in-Linux-in-Feb.-2009-tp22058716p22058716.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Survival-Analysis: How to get numerical values from survfit (and not just a plot)?
Hi Bernhard, I'm wondering what you will expect to get in dividing two proportional survival curves from a fitted cox model. Anyway, you can provide a newdata object to the survfit function containing any combination of cofactors you are interested in and then use summary, eg: fit - coxph( Surv(futime,fustat)~resid.ds+rx+ecog.ps,data=ovarian) summary(survfit( fit,newdata=data.frame(rx=1, ecog.ps=2, resid.ds=0))) hth. Bernhard Reinhardt schrieb: Hi! I came across R just a few days ago since I was looking for a toolbox for cox-regression. I´ve read Cox Proportional-Hazards Regression for Survival Data Appendix to An R and S-PLUS Companion to Applied Regression from John Fox. As described therein plotting survival-functions works well (plot(survfit(model))). But I´d like to do some manipulation with the survival-functions before plotting them e.g. dividing one survival-function by another. survfit() only returns an object of the following structure n events median 0.9LCL 0.9UCL 55.000 55.000 1.033 0.696 1.637 Can you tell me how I can calculate a survival- or baseline-function out of these values and how I extract the values from the object? I´m sure the calculation is done by the corresponding plot-routine, but I couldn´t find that one either. Regards Bernhard __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Eik Vettorazzi Institut für Medizinische Biometrie und Epidemiologie Universitätsklinikum Hamburg-Eppendorf Martinistr. 52 20246 Hamburg T ++49/40/42803-8243 F ++49/40/42803-7790 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Comparison of age categories using contrasts
Hi Dylan, Chuck, Mark Difford wrote: Coming to your question [?] about how to generate the kind of contrasts that Patrick wanted using contrast.Design. Well, it is not that straightforward, though I may have missed something in the documentation to the function. In the past I have cobbled them together using the kind of hack given below: Here is a much simpler route, and a correction to my earlier posting (helped by a little off-list prompting from Prof. Harrell). All that is required to get successive difference (i.e. sdif) contrasts using contrast.Design is the following: contrast(l, a=list(f=c(a,b,c)), b=list(f=c(b,c,d))) ## Run a full example set.seed(10101010) x - rnorm(100) y1 - x[1:25] * 2 + rnorm(25, mean=1) y2 - x[26:50] * 2.6 + rnorm(25, mean=1.5) y3 - x[51:75] * 2.9 + rnorm(25, mean=5) y4 - x[76:100] * 3.5 + rnorm(25, mean=5.5) d - data.frame(x=x, y=c(y1,y2,y3,y4), f=factor(rep(letters[1:4], each=25))) dd - datadist(d) options(datadist='dd') l - ols(y ~ x * f, data=d) ## compare with multcomp by setting x=0 summary(glht(l, linfct=mcp(f=Seq))) contrast(l, b=list(x=0, f=c(a,b,c)), a=list(x=0, f=c(b,c,d))) Regards, and apologies for any confusion, Mark. Mark Difford wrote: Hi Dylan, Am I trying to use contrast.Design() for something that it was not intended for? ... I think Prof. Harrell's main point had to do with how interactions are handled. You can also get the kind of contrasts that Patrick was interested in via multcomp. If we do this using your artificial data set we see that the contrasts are the same as those got by fitting the model using contr.sdif, but a warning is generated about interactions in the model c. [see example code] Part of Prof. Harrell's system is that in generating contrasts via contr.Design an appropriate value is automatically chosen for the interacting variable (in this case the median value of x) so that a reasonable default set of contrasts is calculated. This can of course be changed. Coming to your question [?] about how to generate the kind of contrasts that Patrick wanted using contrast.Design. Well, it is not that straightforward, though I may have missed something in the documentation to the function. In the past I have cobbled them together using the kind of hack given below: ## Exampe code x - rnorm(100) y1 - x[1:25] * 2 + rnorm(25, mean=1) y2 - x[26:50] * 2.6 + rnorm(25, mean=1.5) y3 - x[51:75] * 2.9 + rnorm(25, mean=5) y4 - x[76:100] * 3.5 + rnorm(25, mean=5.5) d - data.frame(x=x, y=c(y1,y2,y3,y4), f=factor(rep(letters[1:4], each=25))) # now try with contrast.Design(): library(multcomp) dd - datadist(d) options(datadist='dd') l - ols(y ~ x * f, data=d) ## model fitted using successive difference contrasts library(MASS) T.lm - lm(y ~ x * f, contrasts=list(f=contr.sdif), data=d) summary(T.lm) ## show the warning: model fitted using ols() and treatment contrasts summary(glht(l, linfct=mcp(f=Seq))) ## custom successive difference contrasts using contrast.Design TFin - {} for (i in 1:(length(levels(d$f))-1)) { tcont - contrast(l, a=list(f=levels(d$f)[i]), b=list(f=levels(d$f)[i+1])) TFin - rbind(TFin, tcont) rownames(TFin)[i] - paste(levels(d$f)[i], levels(d$f)[i+1], sep=-) } TFin[,1:9] Regards, Mark. Dylan Beaudette-2 wrote: On Mon, Feb 16, 2009 at 5:28 PM, Patrick Giraudoux patrick.giraud...@univ-fcomte.fr wrote: Greg Snow a écrit : One approach is to create your own contrasts matrix: mycmat - diag(8) mycmat[ row(mycmat) == col(mycmat) + 1 ] - -1 mycmati - solve(mycmat) contrasts(agefactor) - mycmati[,-1] Now when you use agefactor, the intercept will be the first age group and the slopes will be the differences between the pairs of groups (make sure that the order of the levels of agefactor is correct). The difference between this method and the contr.sdif function in MASS is how the intercept will end up being interpreted (and the dimnames). Hope this helps, Actually, exactly what I needed including the reference to contr.sdif in MASS I did not spot before (although I am a faithful reader of the yellow book... but so many things still escape to me). Again thanks a lot. Patrick Glad you were able to solve your problem. Frank replied earlier with the suggestion to use contrast.Design() to perform the tests after the initial model has been fit. Perhaps I am a little denser than normal, but I cannot figure out how to apply contrast.Design() to a similar model with several levels of a factor. Example data: # need these library(lattice) library(Design) # replicate an important experimental dataset set.seed(10101010) x - rnorm(100) y1 - x[1:25] * 2 + rnorm(25, mean=1) y2 - x[26:50] * 2.6 + rnorm(25, mean=1.5) y3 - x[51:75] * 2.9 + rnorm(25, mean=5) y4 - x[76:100] * 3.5 + rnorm(25, mean=5.5) d - data.frame(x=x, y=c(y1,y2,y3,y4), f=factor(rep(letters[1:4], each=25))) # plot
Re: [R] Chromatogram deconvolution and peak matching
Hoi Bart, I think you're right that ALS should be applicable to this problem. Unfortunately in writing I see that there is a bug when the spectra are NOT constrained to nonnegative values (the package has been used to my knowledge only in fitting multiway mass spectra thus far, where this constraint is always applied); I'll have a patched version soon. I'd be interested in hearing about where the manual could be improved, too. #2D chromatogram generation time - seq(0,20,by=0.05) f - function(x,rt) dnorm((x-rt),mean=0,sd=rt/35) C1 - matrix(c( f(time,6.1), f(time,5.6), f(time,15)), nrow=401,ncol=3) C2 - matrix(c( f(time,2.1), f(time,4), f(time,8)), nrow=401,ncol=3) #spectrum generation spectra - function(x,a,b,c,d,e) a + b*(x-e) + c*((x-e)^2) + d*((x-e)^3) x - 220:300 s1 - spectra(x,(-194.2),2.386,(-0.009617),(1.275e-05),0) s2 - spectra(x,(-1.054e02),1.3,(-5.239e-03),(6.927e-06),-20) s3 - spectra(x,(-194.2),2.386,(-0.009617),(1.275e-05),20) spec - matrix(c(s1,s2,s3), nrow=81,ncol=3) ## data matrix 1 chrom1 - C1 %*% t(spec) ## data matrix 2 chrom2 - C2 %*% t(spec) require(ALS) ## you want to recover 2 (401 x 3) matrices C1 and C2 and a ## (82 x 3) matrix E such that ## chrom1 = C1 E^T, chrom2 = C2 E^T ## some starting guess for elution profiles ## these need to be pretty good Cstart1 - matrix(c( f(time,4), f(time,6), f(time,12)), nrow=401,ncol=3) Cstart2 - matrix(c( f(time,3), f(time,5), f(time,10)), nrow=401,ncol=3) xx - als( CList = list(Cstart1, Cstart2), PsiList = list(chrom1, chrom2), S=matrix(1, nrow=81, ncol=3), x=time, x2=x, uniC=TRUE, normS=TRUE, nonnegS=TRUE) par(mfrow=c(3,1)) matplot(time, xx$CList[[1]], col=2, type=l, main=elution profiles chrom1, lty=1, ylab=amplitude) matplot(time, C1, col=1, add=TRUE, type=l, lty=1) legend(10,2,legend=c(real, estimated), col=1:2, lty=1) matplot(time, xx$CList[[2]], col=2, type=l, main=elution profiles chrom2, lty=1, ylab=amplitude) matplot(time, C2, col=1, add=TRUE, type=l, lty=1) legend(10,6,legend=c(real, estimated), col=1:2, lty=1) matplot(x, xx$S, col=2, type=l, main=spectra, lty=1, ylab=amplitude) matplot(x, spec, col=1, add=TRUE, type=l, lty=1) legend(270,.95,legend=c(real, estimated), col=1:2, lty=1) On Tue, 17 Feb 2009, bartjoosen wrote: Hi, I'm trying to match peaks between chromatographic runs. I'm able to match peaks when they are chromatographed with the same method, but not when there are different methods are used and spectra comes in to play. While searching I found the ALS package which should be usefull for my application, but I couldn't figure it out. I made some dummy chroms with R, which mimic my actual datasets, to play with, but after looking at the manuals of ALS, I'm affraid I can't get the job done. Can someone put me on the right way? Here is my code to generate the dummy chroms, which also plots the 2 chroms and the spectra of the 3 peaks: #2D chromatogram generation par(mfrow=c(3,1)) time - seq(0,20,by=0.05) f - function(x,rt) dnorm((x-rt),mean=0,sd=rt/35) c1 - f(time,6.1) c2 - f(time,5.6) c3 - f(time,15) plot(c1+c2+c3~time,type=l,main=chrom1) #spectrum generation spectra - function(x,a,b,c,d,e) a + b*(x-e) + c*((x-e)^2) + d*((x-e)^3) x - 220:300 s1 - spectra(x,(-194.2),2.386,(-0.009617),(1.275e-05),0) s2 - spectra(x,(-1.054e02),1.3,(-5.239e-03),(6.927e-06),-20) s3 - spectra(x,(-194.2),2.386,(-0.009617),(1.275e-05),20) chrom1.tot - data.frame(time,outer(c1,s1,*)+outer(c2,s2,*)+outer(c2,s2,*)) names(chrom.tot)[-1] - x #generation of chromatogram 2 c1 - f(time,2.1) c2 - f(time,4) c3 - f(time,8) plot(c1+c2+c3~time,type=l,main=chrom2) chrom2.tot - data.frame(time,outer(c1,s1,*)+outer(c2,s2,*)+outer(c2,s2,*)) names(chrom.tot)[-1] - x plot(s1~x,type=l,main=spectra) lines(s2~x,col=2) lines(s3~x,col=3) Thanks for your time Kind Regards Bart -- View this message in context: http://www.nabble.com/Chromatogram-deconvolution-and-peak-matching-tp22057592p22057592.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] creating a map
I'm trying to create a fairly basic map using R. What i want to get is the map of the country with circles representing a count of students in each state. What I've done so far is as following - map(state) symbols(data1$count,circles=log(data1$count)*3,fg=col,bg=col,add=T,inches=F) this gives me the map of the country, but one that's not populated by my counts. Does anyone know what I'm doing wrong? Also, if anyone can recommend a good reference for creating maps in R, I'd really appreciate that. thank you [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ARIMA models
see auto.arima in the forecast package. On Tue, Feb 17, 2009 at 10:20 AM, emj83 stp08...@shef.ac.uk wrote: is there some sort of R function which can advise me of the best ARIMA(p,q,r) model to use based on the Schwarz criterion e.g for e.g p=0-5, q =0, r=0-5 or for example p+r 5??? or is this something I will have to write my own code for? Thanks Emma -- View this message in context: http://www.nabble.com/ARIMA-models-tp22059382p22059382.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to connect R and WinBUGS/OpenBUGS/LinBUGS in Linux in Feb. 2009
Paul Heinrich Dietrich wrote: Hi all, I've managed to get JAGS working on my Ubuntu Hardy Linux with a 32-bit computer and AMD processors using R 2.8.1. JAGS is great. I've read that JAGS is the fastest, but that hasn't been my experience. At any rate, I have more experience with WinBUGS under Windows and would like a version of that working as well. It seems like I've read a lot on the subject and tried a lot, but haven't managed to get BUGS to work yet. The most success I've had is to install WinBUGS or OpenBUGS using this method: http://www.math.aau.dk/~slb/kurser/bayes-08/install.html What you also need to know is that you need to open Wine and add a drive. Although Z is recommended, I haven't been able to specify it, but have gotten a D drive to work, using: wine D:/opt/OpenBUGS/winbugs.exe Using this method, OpenBUGS opens. Now, to be able to open it with R. I've read all sorts of discussions about BRugs (which is no longer on CRAN, but old versions can still be found), rbugs, and R2WinBUGS (which I'm used to using on Windows with WinBUGS). Some people say R2WinBUGS cannot run OpenBUGS on Linux, some claim they've done it (I think). It seems the same thing with everything else. I've tried making the linbugs and cbugs file recommended elsewhere online. It's all very confusing. For short: It is quite unlikely that BRugs / OpenBUGS (which is called LinBUGS under Linux) works natively under your Linux (although it might work under very specific settings). BRugs is available for Windows users from the CRAN extras repsository maintained by Brian Ripley. We moved it in order to meet GPL compliance issues. Hence a standard recommendation is to use R2WinBUGS under native R under Linux with WinBUGS running under wine. R2WinBUGS can use wine to do so. See the help page ?bugs once you have loaded R2WinBUGS. Best wishes, Uwe Ligges Can someone show a method that works currently, along with some sample code? I'm also new to Linux, and confused by path conventions. For example, in rbugs, it shows an example of a path such as /var/scratch/jyan/wine-20040408/wine, and I don't see how to modify this. I have no /var/scratch to begin with, and think Wine is installed in /home/me/.wine...(I don't have Linux in front of me right now). Please help. Thanks. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Percentiles/Quantiles with Weighting
Hi All, I am looking at applications of percentiles to time sequenced data. I had just been using the quantile function to get percentiles over various periods, but am more interested in if there is an accepted (and/or R-implemented) method to apply weighting to the data so as to weigh recent data more heavily. I wrote the following function, but it seems quite inefficient, and not really very flexible in its applications - so if anyone has any suggestions on how to look at quantiles/percentiles within R while also using a weighting schema, I would be very interested. Note - this function supposes the data in X is time-sequenced, with the most recent (and thus heaviest weighted) data at the end of the vector WtPercentile - function(X=rnorm(100), pctile=seq(.1,1,.1)) { Xprime - NA for(i in 1:length(X)) { Xprime - c(Xprime, rep(X[i], times=i)) } print(Percentiles:) print(quantile(X, pctile)) print(Weighted:) print(Xprime) print(Weighted Percentiles:) print(quantile(Xprime, pctile, na.rm=TRUE)) } WtPercentile(1:10) WtPercentile(rnorm(10)) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] creating a map
Two places that have worked examples leap to mind: --- Sarkar's online accompaniment to his book: http://lmdvr.r-forge.r-project.org/figures/figures.html Thumbing through the hard copy I see Figure 6.5 might of interest. --- Addicted to R's graphics gallery: http://addictedtor.free.fr/graphiques/search.php?q=mapengine=RGG -- David Winsemius On Feb 17, 2009, at 11:53 AM, Alina Sheyman wrote: I'm trying to create a fairly basic map using R. What i want to get is the map of the country with circles representing a count of students in each state. What I've done so far is as following - map(state) symbols (data1$count,circles=log(data1$count)*3,fg=col,bg=col,add=T,inches=F) this gives me the map of the country, but one that's not populated by my counts. Does anyone know what I'm doing wrong? Also, if anyone can recommend a good reference for creating maps in R, I'd really appreciate that. thank you [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] merging files with different structures
Hello list, Thanks in advance for any help. I have many (approx 20) files that I have merged. For example d1-read.csv(AlleleReport.csv) d2-read.csv(AlleleReport.csv) m1 - merge(d1, d2, by = c(IND, intersect(colnames(d1), colnames(d2))), all = TRUE) m2 - merge(m1, d3, by = c(IND, intersect(colnames(m1), colnames(d3))), all = TRUE) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] spss-file problem with foreign 0.8-32
It is a issue specific to 0.8-32 and some files (most likely those with some (not all) Windows codepages declared). We are trying to collect together some examples, and will update foreign accordingly later in the week. On Tue, 17 Feb 2009, Harry Haupt wrote: Hi, after updating to foreign version 0.8-32, I experienced the following error when I tried to load a SPSS file: Fehler in inherits(x, factor) : objekt cp nicht gefunden Zusätzlich: Warning message: In read.spss(***l.sav, use.value.labels = TRUE, to.data.frame = TRUE) : ***.sav: File-indicated character representation code (1252) looks like a Windows codepage It worked without problems with earlier versions. Thanks for any clues, best, Harry -- View this message in context: http://www.nabble.com/spss-file-problem-with-foreign-0.8-32-tp22059259p22059259.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] merging files with different structures
Hello list, I am sorry for the previous half post. I accidentily hit send. Thanks again in advance for any help. I have many (approx 20) files that I have merged. Each data set contains rows for individuals and data in 2 - 5 columns (depending upon which data set). The individuals in each data set are not necessarily the same and are all duplicated (different data in columns) across sheets I am trying to merge. I have used the merge function For example d1-read.csv(AlleleReport.csv) d2-read.csv(AlleleReport.csv) m1 - merge(d1, d2, by = c(IND, intersect(colnames(d1), colnames(d2))), all = TRUE) m2 - merge(m1, d3, by = c(IND, intersect(colnames(m1), colnames(d3))), all = TRUE) My problem is that when the data is merged it looks something like Ind L1 L1.1L2 L2.1L3 L3.1 a 12 13 NA NA NA NA a NA NA 22 43 34 45 b 14 1545 64 NA NA b NANANA NA 99 84 Is there a way that I can merge the rows for each individual? Cheers Grant [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Percentiles/Quantiles with Weighting
I do know that Harrell's Quantile function in the Hmisc package will allow quantile estimates from models. Whether it is general enough to extend to time series, I have no experience and cannot say. -- David Winsemius On Feb 17, 2009, at 11:57 AM, Brigid Mooney wrote: Hi All, I am looking at applications of percentiles to time sequenced data. I had just been using the quantile function to get percentiles over various periods, but am more interested in if there is an accepted (and/or R-implemented) method to apply weighting to the data so as to weigh recent data more heavily. I wrote the following function, but it seems quite inefficient, and not really very flexible in its applications - so if anyone has any suggestions on how to look at quantiles/percentiles within R while also using a weighting schema, I would be very interested. Note - this function supposes the data in X is time-sequenced, with the most recent (and thus heaviest weighted) data at the end of the vector WtPercentile - function(X=rnorm(100), pctile=seq(.1,1,.1)) { Xprime - NA for(i in 1:length(X)) { Xprime - c(Xprime, rep(X[i], times=i)) } print(Percentiles:) print(quantile(X, pctile)) print(Weighted:) print(Xprime) print(Weighted Percentiles:) print(quantile(Xprime, pctile, na.rm=TRUE)) } WtPercentile(1:10) WtPercentile(rnorm(10)) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] creating a map
You need to give the symbols function the locations where you want the centers of the circles to be. Some datesets with map information also have centers of the states that you can use, for the USA, there is the state.center dataset that may work for you, or the maptools package function get.Pcent will compute a center for polygons (there are probably other similar functions in other packages). For adding circles to a map of the USA, you may want to look at the state.vbm data in the TeachingDemos package (works with maptools package), but you will need to computer the centers of the polygons, they don't match state.center or others. Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Alina Sheyman Sent: Tuesday, February 17, 2009 9:53 AM To: r-help@r-project.org Subject: [R] creating a map I'm trying to create a fairly basic map using R. What i want to get is the map of the country with circles representing a count of students in each state. What I've done so far is as following - map(state) symbols(data1$count,circles=log(data1$count)*3,fg=col,bg=col,add=T,inch es=F) this gives me the map of the country, but one that's not populated by my counts. Does anyone know what I'm doing wrong? Also, if anyone can recommend a good reference for creating maps in R, I'd really appreciate that. thank you [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Overdispersion with binomial distribution
Jessica L Hite/hitejl/O/VCU hitejl at vcu.edu writes: I am attempting to run a glm with a binomial model to analyze proportion data. I have been following Crawley's book closely and am wondering if there is an accepted standard for how much is too much overdispersion? (e.g. change in AIC has an accepted standard of 2). In principle, in the null case (i.e. data are really binomial) the deviance is chi-squared distributed with the df equal to the residual df. For example: example(glm) deviance(glm.D93) ## 5.13 summary(glm.D93)$dispersion ## 1 (by definition) dfr - df.residual(glm.D93) deviance(glm.D93)/dfr ## 1.28 d2 - sum(residuals(glm.D93,pearson)^2) ## 5.17 (disp2 - d2/dfr) ## 1.293 gg2 - update(glm.D93,family=quasipoisson) summary(gg2)$dispersion ## 1.293, same as above pchisq(d2,df=dfr,lower.tail=FALSE) all.equal(coef(glm.D93),coef(gg2)) ## TRUE se1 - coef(summary(glm.D93))[,Std. Error] se2 - coef(summary(gg2))[,Std. Error] se2/se1 # (Intercept)outcome2outcome3 treatment2 treatment3 # 1.1372341.1372341.1372341.1372341.137234 sqrt(disp2) # [1] 1.137234 My code and output are below, given the example in the book, these data are WAY overdispersed .do I mention this and go on or does this signal the need to try a different model? If so, any suggestions on the type of distribution (gamma or negative binomial ?)? Way overdispersed may indicate model lack of fit. Have you examined residuals/data for outliers etc.? quasibinomial should be fine, or you can try beta-binomial (see the aod package) ... attach(Clutch2) y-cbind(Total,Size-Total) glm1-glm(y~Pred,binomial) summary(glm1) Call: glm(formula = y ~ Pred, family = binomial) Deviance Residuals: Min 1Q Median 3Q Max -9.1022 -2.7899 -0.4781 2.6058 8.4852 Coefficients: Estimate Std. Error z value Pr(|z|) (Intercept) 1.350950.06612 20.433 2e-16 *** PredF -0.348110.11719 -2.970 0.00297 ** PredSN -3.291560.10691 -30.788 2e-16 *** PredW -1.464510.12787 -11.453 2e-16 *** PredWF -0.564120.13178 -4.281 1.86e-05 *** --- the output for residual deviance and df does not change even when I use quasibinomial, is this ok? # That's as expected. library(MASS) you don't really need MASS for quasibinomial. glm2-glm(y~Pred,quasibinomial) summary(glm2) Call: glm(formula = y ~ Pred, family = quasibinomial) Deviance Residuals: Min 1Q Median 3Q Max -9.1022 -2.7899 -0.4781 2.6058 8.4852 Coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 1.3510 0.2398 5.633 1.52e-07 *** PredF-0.3481 0.4251 -0.819 0.41471 PredSN -3.2916 0.3878 -8.488 1.56e-13 *** PredW-1.4645 0.4638 -3.157 0.00208 ** PredWF -0.5641 0.4780 -1.180 0.24063 --- Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 (Dispersion parameter for quasibinomial family taken to be 13.15786) Null deviance: 2815.5 on 108 degrees of freedom Residual deviance: 1323.5 on 104 degrees of freedom (3 observations deleted due to missingness) AIC: NA Number of Fisher Scoring iterations: 5 anova(glm2,test=F) Analysis of Deviance Table Model: quasibinomial, link: logit Response: y Terms added sequentially (first to last) Df Deviance Resid. Df Resid. Dev F Pr(F) NULL108 2815.5 Pred 4 1492.0 104 1323.5 28.349 6.28e-16 *** --- Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 model1-update(glm2,~.-Pred) anova(glm2,model1,test=F) Analysis of Deviance Table Model 1: y ~ Pred Model 2: y ~ 1 Resid. Df Resid. Dev Df Deviance F Pr(F) 1 104 1323.5 2 108 2815.5 -4 -1492.0 28.349 6.28e-16 *** --- Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 coef(glm2) (Intercept) PredF PredSN PredW PredWF 1.3509550 -0.3481096 -3.2915601 -1.4645097 -0.5641223 Thanks Jessica hitejl at vcu.edu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] creating a map
Thanks Greg, do you know where i can find the sate.center dataset that you mention? On Tue, Feb 17, 2009 at 12:28 PM, Greg Snow greg.s...@imail.org wrote: You need to give the symbols function the locations where you want the centers of the circles to be. Some datesets with map information also have centers of the states that you can use, for the USA, there is the state.center dataset that may work for you, or the maptools package function get.Pcent will compute a center for polygons (there are probably other similar functions in other packages). For adding circles to a map of the USA, you may want to look at the state.vbm data in the TeachingDemos package (works with maptools package), but you will need to computer the centers of the polygons, they don't match state.center or others. Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Alina Sheyman Sent: Tuesday, February 17, 2009 9:53 AM To: r-help@r-project.org Subject: [R] creating a map I'm trying to create a fairly basic map using R. What i want to get is the map of the country with circles representing a count of students in each state. What I've done so far is as following - map(state) symbols(data1$count,circles=log(data1$count)*3,fg=col,bg=col,add=T,inch es=F) this gives me the map of the country, but one that's not populated by my counts. Does anyone know what I'm doing wrong? Also, if anyone can recommend a good reference for creating maps in R, I'd really appreciate that. thank you [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error with make with R-devel
Hi, I am getting an error compiling the R-devel on a suse architecture 64-bit architecture. The cp attribute is sending 'trusted.lov' and an error. This is a sample of the output: make[3]: Entering directory `/lustre/people/schaffer/R-devel/src/library/base' building package 'base' make[4]: Entering directory `/lustre/people/schaffer/R-devel/src/library/base' all.R is unchanged cmp: EOF on ../../../library/base/R/base cp: getting attribute `trusted.lov' of `all.R': Operation not permitted make[4]: *** [mkR] Error 1 make[4]: Leaving directory `/lustre/people/schaffer/R-devel/src/library/base' make[3]: *** [all] Error 2 Is there someone who can modify the Makefile so that I am able to compile R-devel? Lana Schaffer Biostatistics/Informatics The Scripps Research Institute DNA Array Core Facility La Jolla, CA 92037 (858) 784-2263 (858) 784-2994 schaf...@scripps.edu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] printing out the summary for lm into a txt file
Hi All, I am trying to run several linear regressions and print out the summay and the anova reslts on the top of each other for each model. Below is a sample progarm that did not work. is it possible to print the anova below the summary of lm in one file? thanks for your help ## data-read.table(data.txt, header=T, sep='\t') for (i in 1:100){ y-data[,i] lm.model-lm(y~data$x1+data$x2+data$x3+data) sink(results.txt, append=T) s-summary(lm.Model) print(s) sink() an-anova(lm.Model) print(an) sink() } -- View this message in context: http://www.nabble.com/printing-out-the-summary-for-lm-into-a-txt-file-tp22062643p22062643.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to control the overall shape of a figure?
Thanks very much, exactly what I need. Oliver On Feb 16, 10:36 pm, Marc Schwartz marc_schwa...@comcast.net wrote: on 02/16/2009 07:51 PM Oliver wrote: hi, I am a R beginner. One thing I notice is that when do graphing is, if I want to draw two figures in a row such as this: par(mfrow(1, 2)) plot(...) plot(...) Each figure inside will be rectangle instead of the familiar square shape. Though you can drag the edge the window to resize it. I would have prefer this can be done automatically ... also when I do the pdf export for example. Is this possible? The thing is if you do par (mfrow(2,2)), then you got all 4 figures in perfect square shape ... but one row two figures are pretty common to me. thanks for help Oliver The overall shape of the plot region is controlled by par(pty), which is set to m by default, for 'maximal' plotting region. Set this to s to create a square plot region. For example: par(mfrow = c(1, 2), pty = s) plot(1) plot(1) In order to do this with a PDF file, ensure that the overall plot dimensions are in the proper relative aspect ratio. For example: pdf(file = SquarePlots.pdf, height = 4, width = 8) par(mfrow = c(1, 2), pty= s) plot(1) plot(1) dev.off() This will create 2 square plots, side-by-side, within an overall plot size of 4x8. You can further adjust the plot margins and other characteristics as may be required. See ?par and ?Devices for more information regarding these customizations and options to set size arguments for specific devices. HTH, Marc Schwartz __ r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] multiple levels of nesting with trellis plots
Hi guys, I have a tricky problem that I'd appreciate your help with. I have two categorical variables, say varA and varB and an associated frequency Freq for combinations of the levels of varA and varB. This was created with a table() call. I'd now like to make panel plots of the frequency. I can do a panel plot by varA: barchart(data$Freq ~ data$varB | data$varA) How do I achieve two levels of nesting for say a third categorical variable varC? That is, how do I specify 2 'given's in the the panel plot, and preferably arranged the panels vertically within the 'inner' panel/ thanks very much, Richie [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Overdispersion with binomial distribution
On Tue, 17 Feb 2009, Ben Bolker wrote: Jessica L Hite/hitejl/O/VCU hitejl at vcu.edu writes: I am attempting to run a glm with a binomial model to analyze proportion data. I have been following Crawley's book closely and am wondering if there is an accepted standard for how much is too much overdispersion? (e.g. change in AIC has an accepted standard of 2). In principle, in the null case (i.e. data are really binomial) the deviance is chi-squared distributed with the df equal to the residual df. *Approximately*, provided the expected counts are not near or below one. See MASS §7.5 for an analysis of the size of the approximation errors (which can be large and in both directions). Given that I once had a consulting job where the over-dispersion was causing something close ot panic and was entirely illusory, the lack of the 'approximately' can have quite serious consequences. For example: example(glm) deviance(glm.D93) ## 5.13 summary(glm.D93)$dispersion ## 1 (by definition) dfr - df.residual(glm.D93) deviance(glm.D93)/dfr ## 1.28 d2 - sum(residuals(glm.D93,pearson)^2) ## 5.17 (disp2 - d2/dfr) ## 1.293 gg2 - update(glm.D93,family=quasipoisson) summary(gg2)$dispersion ## 1.293, same as above pchisq(d2,df=dfr,lower.tail=FALSE) all.equal(coef(glm.D93),coef(gg2)) ## TRUE se1 - coef(summary(glm.D93))[,Std. Error] se2 - coef(summary(gg2))[,Std. Error] se2/se1 # (Intercept)outcome2outcome3 treatment2 treatment3 # 1.1372341.1372341.1372341.1372341.137234 sqrt(disp2) # [1] 1.137234 My code and output are below, given the example in the book, these data are WAY overdispersed .do I mention this and go on or does this signal the need to try a different model? If so, any suggestions on the type of distribution (gamma or negative binomial ?)? Way overdispersed may indicate model lack of fit. Have you examined residuals/data for outliers etc.? quasibinomial should be fine, or you can try beta-binomial (see the aod package) ... attach(Clutch2) y-cbind(Total,Size-Total) glm1-glm(y~Pred,binomial) summary(glm1) Call: glm(formula = y ~ Pred, family = binomial) Deviance Residuals: Min 1Q Median 3Q Max -9.1022 -2.7899 -0.4781 2.6058 8.4852 Coefficients: Estimate Std. Error z value Pr(|z|) (Intercept) 1.350950.06612 20.433 2e-16 *** PredF -0.348110.11719 -2.970 0.00297 ** PredSN -3.291560.10691 -30.788 2e-16 *** PredW -1.464510.12787 -11.453 2e-16 *** PredWF -0.564120.13178 -4.281 1.86e-05 *** --- the output for residual deviance and df does not change even when I use quasibinomial, is this ok? # That's as expected. library(MASS) you don't really need MASS for quasibinomial. glm2-glm(y~Pred,quasibinomial) summary(glm2) Call: glm(formula = y ~ Pred, family = quasibinomial) Deviance Residuals: Min 1Q Median 3Q Max -9.1022 -2.7899 -0.4781 2.6058 8.4852 Coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 1.3510 0.2398 5.633 1.52e-07 *** PredF-0.3481 0.4251 -0.819 0.41471 PredSN -3.2916 0.3878 -8.488 1.56e-13 *** PredW-1.4645 0.4638 -3.157 0.00208 ** PredWF -0.5641 0.4780 -1.180 0.24063 --- Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 (Dispersion parameter for quasibinomial family taken to be 13.15786) Null deviance: 2815.5 on 108 degrees of freedom Residual deviance: 1323.5 on 104 degrees of freedom (3 observations deleted due to missingness) AIC: NA Number of Fisher Scoring iterations: 5 anova(glm2,test=F) Analysis of Deviance Table Model: quasibinomial, link: logit Response: y Terms added sequentially (first to last) Df Deviance Resid. Df Resid. Dev F Pr(F) NULL108 2815.5 Pred 4 1492.0 104 1323.5 28.349 6.28e-16 *** --- Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 model1-update(glm2,~.-Pred) anova(glm2,model1,test=F) Analysis of Deviance Table Model 1: y ~ Pred Model 2: y ~ 1 Resid. Df Resid. Dev Df Deviance F Pr(F) 1 104 1323.5 2 108 2815.5 -4 -1492.0 28.349 6.28e-16 *** --- Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 coef(glm2) (Intercept) PredF PredSN PredW PredWF 1.3509550 -0.3481096 -3.2915601 -1.4645097 -0.5641223 Thanks Jessica hitejl at vcu.edu -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
Re: [R] printing out the summary for lm into a txt file
did not work. Might that mean errors? Care to share? Running that through my wetware R interpreter, I think I am seeing you ask for creation of models with variable outcomes from a the first 100 columns of data. And then you are specifying a formula that includes a weird mixture of 3 vectors, data[,x1],x2 x3,... plus all of the data.frame, data? Have you tested this loop without the sink's to see if it produces anything meaningful? Once you test the process, then issue the sink(filename, append=TRUE) once, before the loop, do your analyses in the loop, and then close with sink() after the loop. -- David Winsemius On Feb 17, 2009, at 12:52 PM, kayj wrote: Hi All, I am trying to run several linear regressions and print out the summay and the anova reslts on the top of each other for each model. Below is a sample progarm that did not work. is it possible to print the anova below the summary of lm in one file? thanks for your help ## data-read.table(data.txt, header=T, sep='\t') for (i in 1:100){ y-data[,i] lm.model-lm(y~data$x1+data$x2+data$x3+data) sink(results.txt, append=T) s-summary(lm.Model) print(s) sink() an-anova(lm.Model) print(an) sink() } -- View this message in context: http://www.nabble.com/printing-out-the-summary-for-lm-into-a-txt-file-tp22062643p22062643.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Overdispersion with binomial distribution
Thanks for the clarification. I actually had MASS open to that page while I was composing my reply but forgot to mention it (trying to do too many things at once) ... Ben Bolker Prof Brian Ripley wrote: On Tue, 17 Feb 2009, Ben Bolker wrote: Jessica L Hite/hitejl/O/VCU hitejl at vcu.edu writes: I am attempting to run a glm with a binomial model to analyze proportion data. I have been following Crawley's book closely and am wondering if there is an accepted standard for how much is too much overdispersion? (e.g. change in AIC has an accepted standard of 2). In principle, in the null case (i.e. data are really binomial) the deviance is chi-squared distributed with the df equal to the residual df. *Approximately*, provided the expected counts are not near or below one. See MASS §7.5 for an analysis of the size of the approximation errors (which can be large and in both directions). Given that I once had a consulting job where the over-dispersion was causing something close ot panic and was entirely illusory, the lack of the 'approximately' can have quite serious consequences. For example: example(glm) deviance(glm.D93) ## 5.13 summary(glm.D93)$dispersion ## 1 (by definition) dfr - df.residual(glm.D93) deviance(glm.D93)/dfr ## 1.28 d2 - sum(residuals(glm.D93,pearson)^2) ## 5.17 (disp2 - d2/dfr) ## 1.293 gg2 - update(glm.D93,family=quasipoisson) summary(gg2)$dispersion ## 1.293, same as above pchisq(d2,df=dfr,lower.tail=FALSE) all.equal(coef(glm.D93),coef(gg2)) ## TRUE se1 - coef(summary(glm.D93))[,Std. Error] se2 - coef(summary(gg2))[,Std. Error] se2/se1 # (Intercept)outcome2outcome3 treatment2 treatment3 # 1.1372341.1372341.1372341.1372341.137234 sqrt(disp2) # [1] 1.137234 My code and output are below, given the example in the book, these data are WAY overdispersed .do I mention this and go on or does this signal the need to try a different model? If so, any suggestions on the type of distribution (gamma or negative binomial ?)? Way overdispersed may indicate model lack of fit. Have you examined residuals/data for outliers etc.? quasibinomial should be fine, or you can try beta-binomial (see the aod package) ... attach(Clutch2) y-cbind(Total,Size-Total) glm1-glm(y~Pred,binomial) summary(glm1) Call: glm(formula = y ~ Pred, family = binomial) Deviance Residuals: Min 1Q Median 3Q Max -9.1022 -2.7899 -0.4781 2.6058 8.4852 Coefficients: Estimate Std. Error z value Pr(|z|) (Intercept) 1.350950.06612 20.433 2e-16 *** PredF -0.348110.11719 -2.970 0.00297 ** PredSN -3.291560.10691 -30.788 2e-16 *** PredW -1.464510.12787 -11.453 2e-16 *** PredWF -0.564120.13178 -4.281 1.86e-05 *** --- the output for residual deviance and df does not change even when I use quasibinomial, is this ok? # That's as expected. library(MASS) you don't really need MASS for quasibinomial. glm2-glm(y~Pred,quasibinomial) summary(glm2) Call: glm(formula = y ~ Pred, family = quasibinomial) Deviance Residuals: Min 1Q Median 3Q Max -9.1022 -2.7899 -0.4781 2.6058 8.4852 Coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 1.3510 0.2398 5.633 1.52e-07 *** PredF-0.3481 0.4251 -0.819 0.41471 PredSN -3.2916 0.3878 -8.488 1.56e-13 *** PredW-1.4645 0.4638 -3.157 0.00208 ** PredWF -0.5641 0.4780 -1.180 0.24063 --- Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 (Dispersion parameter for quasibinomial family taken to be 13.15786) Null deviance: 2815.5 on 108 degrees of freedom Residual deviance: 1323.5 on 104 degrees of freedom (3 observations deleted due to missingness) AIC: NA Number of Fisher Scoring iterations: 5 anova(glm2,test=F) Analysis of Deviance Table Model: quasibinomial, link: logit Response: y Terms added sequentially (first to last) Df Deviance Resid. Df Resid. Dev F Pr(F) NULL108 2815.5 Pred 4 1492.0 104 1323.5 28.349 6.28e-16 *** --- Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 model1-update(glm2,~.-Pred) anova(glm2,model1,test=F) Analysis of Deviance Table Model 1: y ~ Pred Model 2: y ~ 1 Resid. Df Resid. Dev Df Deviance F Pr(F) 1 104 1323.5 2 108 2815.5 -4 -1492.0 28.349 6.28e-16 *** --- Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 coef(glm2) (Intercept) PredF PredSN PredW PredWF 1.3509550 -0.3481096 -3.2915601 -1.4645097 -0.5641223 Thanks Jessica hitejl at vcu.edu -- Ben Bolker Associate professor, Biology Dep't, Univ. of Florida bol...@ufl.edu / www.zoology.ufl.edu/bolker GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc
Re: [R] creating a map
It should be in the datasets package that is automatically loaded with R (at least my copy), try ?state and you should see the help for it and others. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 From: Alina Sheyman [mailto:alina...@gmail.com] Sent: Tuesday, February 17, 2009 11:02 AM To: Greg Snow Cc: r-help@r-project.org Subject: Re: [R] creating a map Thanks Greg, do you know where i can find the sate.center dataset that you mention? On Tue, Feb 17, 2009 at 12:28 PM, Greg Snow greg.s...@imail.orgmailto:greg.s...@imail.org wrote: You need to give the symbols function the locations where you want the centers of the circles to be. Some datesets with map information also have centers of the states that you can use, for the USA, there is the state.center dataset that may work for you, or the maptools package function get.Pcent will compute a center for polygons (there are probably other similar functions in other packages). For adding circles to a map of the USA, you may want to look at the state.vbm data in the TeachingDemos package (works with maptools package), but you will need to computer the centers of the polygons, they don't match state.center or others. Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.orgmailto:greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.orgmailto:r-help-boun...@r-project.org [mailto:r-help-boun...@r-mailto:r-help-boun...@r- project.orghttp://project.org] On Behalf Of Alina Sheyman Sent: Tuesday, February 17, 2009 9:53 AM To: r-help@r-project.orgmailto:r-help@r-project.org Subject: [R] creating a map I'm trying to create a fairly basic map using R. What i want to get is the map of the country with circles representing a count of students in each state. What I've done so far is as following - map(state) symbols(data1$count,circles=log(data1$count)*3,fg=col,bg=col,add=T,inch es=F) this gives me the map of the country, but one that's not populated by my counts. Does anyone know what I'm doing wrong? Also, if anyone can recommend a good reference for creating maps in R, I'd really appreciate that. thank you [[alternative HTML version deleted]] __ R-help@r-project.orgmailto:R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] printing out the summary for lm into a txt file
The sink() command stops the sinking, so you send the lm output to the file, then stop the sinking before printing out the anova result. So the simplest thing to try is to put the first sink (with the filename and append=T) before you start the loop, remove all calls to sink within the loop, then do a single sink() after the loop ends. Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of kayj Sent: Tuesday, February 17, 2009 10:52 AM To: r-help@r-project.org Subject: [R] printing out the summary for lm into a txt file Hi All, I am trying to run several linear regressions and print out the summay and the anova reslts on the top of each other for each model. Below is a sample progarm that did not work. is it possible to print the anova below the summary of lm in one file? thanks for your help ## data-read.table(data.txt, header=T, sep='\t') for (i in 1:100){ y-data[,i] lm.model-lm(y~data$x1+data$x2+data$x3+data) sink(results.txt, append=T) s-summary(lm.Model) print(s) sink() an-anova(lm.Model) print(an) sink() } -- View this message in context: http://www.nabble.com/printing-out-the- summary-for-lm-into-a-txt-file-tp22062643p22062643.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Percentiles/Quantiles with Weighting
Thanks for pointing me to the quantreg package as a resource. I was hoping to ask be able to address one quick follow-up question... I get slightly different variants between using the rq funciton with formula = mydata ~ 1 as I would if I ran the same data using the quantile function. Example: mydata - (1:10)^2/2 pctile - seq(.59, .99, .1) quantile(mydata, pctile) 59%69%79%89%99% 20.015 26.075 32.935 40.595 49.145 rq(mydata~1, tau=pctile) Call: rq(formula = mydata ~ 1, tau = pctile) Coefficients: tau= 0.59 tau= 0.69 tau= 0.79 tau= 0.89 tau= 0.99 (Intercept)18 24.532 40.550 Degrees of freedom: 10 total; 9 residual Is it correct to assume this is due to the different accepted methods of calculating quantiles? If so, do you know where I would be able to see the algorithms used in these functions? I'm not finding it in the documentation for function rq, and am new enough to R that I don't know where those references would generally be. On Tue, Feb 17, 2009 at 12:29 PM, roger koenker rkoen...@uiuc.edu wrote: http://www.nabble.com/weighted-quantiles-to19864562.html#a19865869 gives one possibility... url:www.econ.uiuc.edu/~rogerRoger Koenker emailrkoen...@uiuc.eduDepartment of Economics vox: 217-333-4558University of Illinois fax: 217-244-6678Champaign, IL 61820 On Feb 17, 2009, at 10:57 AM, Brigid Mooney wrote: Hi All, I am looking at applications of percentiles to time sequenced data. I had just been using the quantile function to get percentiles over various periods, but am more interested in if there is an accepted (and/or R-implemented) method to apply weighting to the data so as to weigh recent data more heavily. I wrote the following function, but it seems quite inefficient, and not really very flexible in its applications - so if anyone has any suggestions on how to look at quantiles/percentiles within R while also using a weighting schema, I would be very interested. Note - this function supposes the data in X is time-sequenced, with the most recent (and thus heaviest weighted) data at the end of the vector WtPercentile - function(X=rnorm(100), pctile=seq(.1,1,.1)) { Xprime - NA for(i in 1:length(X)) { Xprime - c(Xprime, rep(X[i], times=i)) } print(Percentiles:) print(quantile(X, pctile)) print(Weighted:) print(Xprime) print(Weighted Percentiles:) print(quantile(Xprime, pctile, na.rm=TRUE)) } WtPercentile(1:10) WtPercentile(rnorm(10)) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Percentiles/Quantiles with Weighting
url:www.econ.uiuc.edu/~rogerRoger Koenker emailrkoen...@uiuc.eduDepartment of Economics vox: 217-333-4558University of Illinois fax: 217-244-6678Champaign, IL 61820 On Feb 17, 2009, at 1:58 PM, Brigid Mooney wrote: Thanks for pointing me to the quantreg package as a resource. I was hoping to ask be able to address one quick follow-up question... I get slightly different variants between using the rq funciton with formula = mydata ~ 1 as I would if I ran the same data using the quantile function. Example: mydata - (1:10)^2/2 pctile - seq(.59, .99, .1) quantile(mydata, pctile) 59%69%79%89%99% 20.015 26.075 32.935 40.595 49.145 rq(mydata~1, tau=pctile) Call: rq(formula = mydata ~ 1, tau = pctile) Coefficients: tau= 0.59 tau= 0.69 tau= 0.79 tau= 0.89 tau= 0.99 (Intercept)18 24.532 40.550 Degrees of freedom: 10 total; 9 residual Is it correct to assume this is due to the different accepted methods of calculating quantiles? If so, do you know where I would be able to see the algorithms used in these functions? I'm not finding it in the documentation for function rq, and am new enough to R that I don't know where those references would generally be. Yes, quantile() in base R documents 9 varieties of quantiles, 2 more than William Empson's famous 7 Types of Ambiguity. In quantreg the function rq() finds a solution to an underlying optimization problem and doesn't ask any further into the nature of the ambiguity -- it does often produce a warning indicating that there may be more than one solution. The default base R quantile is interpolated, while the default rq() with method = br using the simplex algorithm finds an order statistic, typically. If you prefer something more like interpolation, you can try rq() with method = fn which is using an interior point algorithm and when there are multiple solutions it tends to produce something more like the centroid of the solution set. I hope that this helps. On Tue, Feb 17, 2009 at 12:29 PM, roger koenker rkoen...@uiuc.edu wrote: http://www.nabble.com/weighted-quantiles-to19864562.html#a19865869 gives one possibility... url:www.econ.uiuc.edu/~rogerRoger Koenker emailrkoen...@uiuc.eduDepartment of Economics vox: 217-333-4558University of Illinois fax: 217-244-6678Champaign, IL 61820 On Feb 17, 2009, at 10:57 AM, Brigid Mooney wrote: Hi All, I am looking at applications of percentiles to time sequenced data. I had just been using the quantile function to get percentiles over various periods, but am more interested in if there is an accepted (and/or R-implemented) method to apply weighting to the data so as to weigh recent data more heavily. I wrote the following function, but it seems quite inefficient, and not really very flexible in its applications - so if anyone has any suggestions on how to look at quantiles/percentiles within R while also using a weighting schema, I would be very interested. Note - this function supposes the data in X is time-sequenced, with the most recent (and thus heaviest weighted) data at the end of the vector WtPercentile - function(X=rnorm(100), pctile=seq(.1,1,.1)) { Xprime - NA for(i in 1:length(X)) { Xprime - c(Xprime, rep(X[i], times=i)) } print(Percentiles:) print(quantile(X, pctile)) print(Weighted:) print(Xprime) print(Weighted Percentiles:) print(quantile(Xprime, pctile, na.rm=TRUE)) } WtPercentile(1:10) WtPercentile(rnorm(10)) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] printing out the summary for lm into a txt file
Thanks a lot for your help, it worked. Greg Snow-2 wrote: The sink() command stops the sinking, so you send the lm output to the file, then stop the sinking before printing out the anova result. So the simplest thing to try is to put the first sink (with the filename and append=T) before you start the loop, remove all calls to sink within the loop, then do a single sink() after the loop ends. Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of kayj Sent: Tuesday, February 17, 2009 10:52 AM To: r-help@r-project.org Subject: [R] printing out the summary for lm into a txt file Hi All, I am trying to run several linear regressions and print out the summay and the anova reslts on the top of each other for each model. Below is a sample progarm that did not work. is it possible to print the anova below the summary of lm in one file? thanks for your help ## data-read.table(data.txt, header=T, sep='\t') for (i in 1:100){ y-data[,i] lm.model-lm(y~data$x1+data$x2+data$x3+data) sink(results.txt, append=T) s-summary(lm.Model) print(s) sink() an-anova(lm.Model) print(an) sink() } -- View this message in context: http://www.nabble.com/printing-out-the- summary-for-lm-into-a-txt-file-tp22062643p22062643.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/printing-out-the-summary-for-lm-into-a-txt-file-tp22062643p22065292.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] printing out the summary for lm into a txt file
this is the error message that I was getting In sink() ... : no sink to remove i got it to work . thanks for the help David Winsemius wrote: did not work. Might that mean errors? Care to share? Running that through my wetware R interpreter, I think I am seeing you ask for creation of models with variable outcomes from a the first 100 columns of data. And then you are specifying a formula that includes a weird mixture of 3 vectors, data[,x1],x2 x3,... plus all of the data.frame, data? Have you tested this loop without the sink's to see if it produces anything meaningful? Once you test the process, then issue the sink(filename, append=TRUE) once, before the loop, do your analyses in the loop, and then close with sink() after the loop. -- David Winsemius On Feb 17, 2009, at 12:52 PM, kayj wrote: Hi All, I am trying to run several linear regressions and print out the summay and the anova reslts on the top of each other for each model. Below is a sample progarm that did not work. is it possible to print the anova below the summary of lm in one file? thanks for your help ## data-read.table(data.txt, header=T, sep='\t') for (i in 1:100){ y-data[,i] lm.model-lm(y~data$x1+data$x2+data$x3+data) sink(results.txt, append=T) s-summary(lm.Model) print(s) sink() an-anova(lm.Model) print(an) sink() } -- View this message in context: http://www.nabble.com/printing-out-the-summary-for-lm-into-a-txt-file-tp22062643p22062643.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/printing-out-the-summary-for-lm-into-a-txt-file-tp22062643p22065365.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Subset Regression Package
Alex Roy wrote: Dear all , Is there any subset regression (subset selection regression) package in R other than leaps? RSiteSearch({subset regression}) doesn't turn up much other than special-purpose tools for ARMA models etc.. What does leaps not do that you need it to do? Ben Bolker -- View this message in context: http://www.nabble.com/Subset-Regression-Package-tp22054068p22066217.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Whitening Time Series
Hi Bob - your suggesting worked out great... Many thanks! Also, thanks everyone for the other suggestions! Bob McCall wrote: Look in the package forecast for the function Arima. It will do what you want. It's different than arima function in the stats package. Bob Pele wrote: Hi R users, I am doing cross correlation analysis on 2 time series (call them y-series and x-series) where I need the use the model developed on the x-series to prewhiten the yseries.. Can someone point me to a function/filter in R that would allow me to do that? Thanks in advance for any help! -- View this message in context: http://www.nabble.com/Whitening-Time-Series-tp22041765p22066246.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Cross classified or Multiple membership or Hierarchical (3 level ) logistic models using Umacs
Dear R users, I would like to fit cross classified or multiple membership logistic models or a 3 level hierarchical logistic model using the Umacs package. Can anyone advise me on how to proceed or better point me to examples of how its done. Regards, -- Luwis Diya, Leuven Biostatistics and Statistical Bioinformatics Centre (L-BioStat), Kapucijnenvoer 35 blok d - bus 7001, 3000 Leuven, Belgium Tel: +32 16 336886 or +32 16 336892 Fax: +32 16 337015 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] using sapply to apply function to some columns of a dataframe
Hello: I would like to sum every x columns of a dataframe for each row. For instance, if x is 10, then for dataframe df, this function will sum the first ten elements together and then the next ten: sapply(list(colnames(df)[1:10], colnames(df)[11:20]),function(x)apply( df[,x], 1, sum)) If the number of columns is quite large (1000's), then manually entering the list above is not practical. Any suggestions? I would also like to do a variant of the above, where I sum every nth element. Thanks, Michael -- Michael I. Westphal, PhD Africa Region Water Resources (AFTWR) South Asia Region Sustainable Development (SASSD) World Development Report 2010: Development in a Changing Climate www.worldbank.org/wdr2010 Room J6-007 (mail stop: J6-603) Tel: 202.473.1217 The World Bank 1818 H St NW, Washington DC 20433, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Subset Regression Package
Alex Roy alexroy2008 at gmail.com writes: Dear all , Is there any subset regression (subset selection regression) package in R other than leaps? Lars and Lasso are other 'subset selection' methods, see the corresponding packages 'lars' and 'lasso2' and its description in The Elements of Statistical Learning. Also, 'dr', Methods for dimension reduction for regression, or 'relaimpo', Relative importance of regressors in linear models, can be considered. Thanks and regards Alex __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] cumsum vs. sum
I recently traced a bug of mine to the fact that cumsum(s)[length(s)] is not always exactly equal to sum(s). For example, x-1/(12:14) sum(x) - cumsum(x)[3] = 2.8e-17 Floating-point addition is of course not exact, and in particular is not associative, so there are various possible reasons for this. Perhaps sum uses clever summing tricks to get more accurate results? In some quick experiments, it does seem to get more accurate results than cumsum. It might be worth documenting. -s __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using sapply to apply function to some columns of a dataframe
On 17/02/2009 4:42 PM, mwestp...@worldbank.org wrote: Hello: I would like to sum every x columns of a dataframe for each row. For instance, if x is 10, then for dataframe df, this function will sum the first ten elements together and then the next ten: sapply(list(colnames(df)[1:10], colnames(df)[11:20]),function(x)apply( df[,x], 1, sum)) If the number of columns is quite large (1000's), then manually entering the list above is not practical. Any suggestions? I would also like to do a variant of the above, where I sum every nth element. I think the easiest way to do this is to convert the dataframe into an array with 3 indices, and sum over one of them. For example: rows - 20 cols - 120 df - matrix(1:(rows*cols), rows, cols) # in your case, df - as.matrix( df ) arr - array( df, c(rows, 10, cols/10)) sums - apply( arr, c(1,3), sum) Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] matrix output
Hi friends, I have questions about printing a pretty big size matrix. As you could see from below, the matrix wasn't showed in R at full size (11X11), but it was cut partly into three smaller matrices (11X4,11X4,11X3). I'm wondering if there is a way to show the whole matrix with dimension 11X11, do you know how to make it? If R really couldn't fit the full big matrix at once, what about output the FULL matrix to a .pdf document? I have been wondering for a long time if we could output something other than graphic, like data frame or text, to a .pdf file. Thanks in advance for your patience and answer. SY Sldur Slonset Waso Sboutn Sldur1 -0.6744252 -0.08427312 -0.2871798 Slonset327 1.000 0.14257353 0.1981339 Waso 327 327.000 1. 0.5104723 Sboutn 327 327.000 327. 1.000 MidSleep 327 327.000 327. 327.000 SleepEff 327 327.000 327. 327.000 NumTrials 22 22.000 22. 22.000 MeanTrials 22 22.000 22. 22.000 MedianTrials22 22.000 22. 22.000 NS3all5tage296 296.000 296. 296.000 HGsoc118 325 325.000 325. 325.000 MidSleepSleepEff NumTrials MeanTrials Sldur -0.14512244 0.6721889 0.26482213 0.26256352 Slonset0.73991223 -0.5613362 -0.09429701 -0.02540937 Waso 0.08729977 -0.6836098 0.18577075 0.26369283 Sboutn 0.06169839 -0.4895902 0.10897798 0.07058159 MidSleep 1. -0.1498673 -0.17786561 0.13043478 SleepEff 327. 1.000 -0.01637493 -0.20835686 NumTrials 22. 22.000 1. -0.16768424 MeanTrials22. 22.000 46. 1. MedianTrials 22. 22.000 46. 46. NS3all5tage 296. 296.000 44. 44. HGsoc118 325. 325.000 44. 44. MedianTrials NS3all5tage HGsoc118 Sldur 0.1857708 0.07849234 -0.03751973 Slonset 0.2919255 -0.14858206 0.05323562 Waso0.3856578 -0.08148054 0.09341454 Sboutn 0.2557877 -0.05049218 0.06847118 MidSleep0.4172784 -0.09344423 0.05287818 SleepEff -0.3088650 0.12099185 -0.08453191 NumTrials -0.1108233 -0.06779422 -0.13164200 MeanTrials 0.9203207 -0.07625088 0.10584919 MedianTrials1.000 -0.10894996 0.13615222 NS3all5tage44.000 1. -0.10554137 HGsoc118 44.000 632. 1. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] matrix output
On 17/02/2009 5:31 PM, phoebe kong wrote: Hi friends, I have questions about printing a pretty big size matrix. As you could see from below, the matrix wasn't showed in R at full size (11X11), but it was cut partly into three smaller matrices (11X4,11X4,11X3). I'm wondering if there is a way to show the whole matrix with dimension 11X11, do you know how to make it? If you increase the output width, e.g. options(width=1) won't wrap (but it'll probably be too wide to be useful). You can also reduced the number of decimal places, e.g. SYrounded - round(SY, 3) If R really couldn't fit the full big matrix at once, what about output the FULL matrix to a .pdf document? I have been wondering for a long time if we could output something other than graphic, like data frame or text, to a .pdf file. The main way to produce nice text in a PDF document through R is to use Sweave and LaTeX. Too much explanation needed for a simple example here. Duncan Murdoch Thanks in advance for your patience and answer. SY Sldur Slonset Waso Sboutn Sldur1 -0.6744252 -0.08427312 -0.2871798 Slonset327 1.000 0.14257353 0.1981339 Waso 327 327.000 1. 0.5104723 Sboutn 327 327.000 327. 1.000 MidSleep 327 327.000 327. 327.000 SleepEff 327 327.000 327. 327.000 NumTrials 22 22.000 22. 22.000 MeanTrials 22 22.000 22. 22.000 MedianTrials22 22.000 22. 22.000 NS3all5tage296 296.000 296. 296.000 HGsoc118 325 325.000 325. 325.000 MidSleepSleepEff NumTrials MeanTrials Sldur -0.14512244 0.6721889 0.26482213 0.26256352 Slonset0.73991223 -0.5613362 -0.09429701 -0.02540937 Waso 0.08729977 -0.6836098 0.18577075 0.26369283 Sboutn 0.06169839 -0.4895902 0.10897798 0.07058159 MidSleep 1. -0.1498673 -0.17786561 0.13043478 SleepEff 327. 1.000 -0.01637493 -0.20835686 NumTrials 22. 22.000 1. -0.16768424 MeanTrials22. 22.000 46. 1. MedianTrials 22. 22.000 46. 46. NS3all5tage 296. 296.000 44. 44. HGsoc118 325. 325.000 44. 44. MedianTrials NS3all5tage HGsoc118 Sldur 0.1857708 0.07849234 -0.03751973 Slonset 0.2919255 -0.14858206 0.05323562 Waso0.3856578 -0.08148054 0.09341454 Sboutn 0.2557877 -0.05049218 0.06847118 MidSleep0.4172784 -0.09344423 0.05287818 SleepEff -0.3088650 0.12099185 -0.08453191 NumTrials -0.1108233 -0.06779422 -0.13164200 MeanTrials 0.9203207 -0.07625088 0.10584919 MedianTrials1.000 -0.10894996 0.13615222 NS3all5tage44.000 1. -0.10554137 HGsoc118 44.000 632. 1. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Python and R
Hello all, I am just wondering if any of you are doing most of your scripting with Python instead of R's programming language and then calling the relevant R functions as needed? And if so, what is your experience with this and what sort of software/library do you use in combination with Python to be able to access R's functionality. Is there much of a performance hit either way? (as both are interpreted languages) Thanks, Esmail __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Percentiles/Quantiles with Weighting
Here is one kind of weighted quantile function. The basic idea is very simple: wquantile - function( v, w, p ) { v - v[order(v)] w - w[order(v)] v [ which.max( cumsum(w) / sum(w) = p ) ] } With some more error-checking and general clean-up, it looks like this: # Simple weighted quantile # # v A numeric vector of observations # w A numeric vector of positive weights # p The probability 0=p=1 # # Nothing fancy: no interpolation etc. # Basic idea wquantile - function( v, w, p ) { v - v[order(v)] w - w[order(v)] v [ which.max( cumsum(w) / sum(w) = p ) ] } # Simple weighted quantile # # v A numeric vector of observations # w A numeric vector of positive weights # p The probability 0=p=1 # # Nothing fancy: no interpolation etc. wquantile - function(v,w=rep(1,length(v)),p=.5) { if (!is.numeric(v) || !is.numeric(w) || length(v) != length(w)) stop(Values and weights must be equal-length numeric vectors) if ( !is.numeric(p) || any( p0 | p1 ) ) stop(Quantiles must be 0=p=1) ranking - order(v) sumw - cumsum(w[ranking]) if ( is.na(w[1]) || w[1]0 ) stop(Weights must be non-negative numbers) plist - sumw/sumw[length(sumw)] sapply(p, function(p) v [ ranking [ which.max( plist = p ) ] ]) } I would appreciate any comments people have on this -- whether correctness, efficiency, style, -s On Tue, Feb 17, 2009 at 11:57 AM, Brigid Mooney bkmoo...@gmail.com wrote: Hi All, I am looking at applications of percentiles to time sequenced data. I had just been using the quantile function to get percentiles over various periods, but am more interested in if there is an accepted (and/or R-implemented) method to apply weighting to the data so as to weigh recent data more heavily. I wrote the following function, but it seems quite inefficient, and not really very flexible in its applications - so if anyone has any suggestions on how to look at quantiles/percentiles within R while also using a weighting schema, I would be very interested. Note - this function supposes the data in X is time-sequenced, with the most recent (and thus heaviest weighted) data at the end of the vector WtPercentile - function(X=rnorm(100), pctile=seq(.1,1,.1)) { Xprime - NA for(i in 1:length(X)) { Xprime - c(Xprime, rep(X[i], times=i)) } print(Percentiles:) print(quantile(X, pctile)) print(Weighted:) print(Xprime) print(Weighted Percentiles:) print(quantile(Xprime, pctile, na.rm=TRUE)) } WtPercentile(1:10) WtPercentile(rnorm(10)) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] frequency table for multiple variables
On Tue, Feb 17, 2009 at 10:00:40AM -0600, Marc Schwartz wrote: on 02/17/2009 09:06 AM Hans Ekbrand wrote: Hi r-help! Consider the following data-frame: var1 var2 var3 1 314 2 223 3 223 4 44 NA 5 435 6 223 7 343 How can I get R to convert this into the following? Value 1 2 3 4 5 var1 0 3 2 2 0 var2 1 3 1 2 0 var3 0 0 4 1 1 t(sapply(DF, function(x) table(factor(x, levels = 1:5 1 2 3 4 5 var1 0 3 2 2 0 var2 1 3 1 2 0 var3 0 0 4 1 1 The key is to turn each column into a factor with explicitly defined common levels for tabulation. This enables the table result to have a consistent format across each column, allowing for a matrix to be created, rather than a list. Thanks alot, Marc. Neat and efficient, just what I wanted. BTW, before I saw that you actually included code, I tried on my own, and wrote this: my.count - function(data.frame, levels) { result.df - data.frame(matrix(nrow=length(data.frame),ncol=levels)) for (i in 1:length(data.frame)) { result.df[i,] - table(factor(data.frame[[i]], levels = c(1:levels))) } result.df } which produces the same result. I take this to be a an instructive example of unnecessary use of for-loops in R. -- Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net Q. What is that strange attachment in this mail? A. My digital signature, see www.gnupg.org for info on how you could use it to ensure that this mail is from me and has not been altered on the way to you. signature.asc Description: Digital signature __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Python and R
Esmail Bonakdarian wrote: I am just wondering if any of you are doing most of your scripting with Python instead of R's programming language and then calling the relevant R functions as needed? No, but if I wanted to do such a thing, I'd look at Sage: http://sagemath.org/ It'll give you access to a lot more than just R. Is there much of a performance hit either way? (as both are interpreted languages) Are you just asking, or do you have a particular execution time goal, which if exceeded would prevent doing this? I ask because I suspect it's the former, and fast enough is fast enough. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Python and R
2009/2/17 Esmail Bonakdarian esmail...@gmail.com: Hello all, I am just wondering if any of you are doing most of your scripting with Python instead of R's programming language and then calling the relevant R functions as needed? I tend to use R in its native form for data analysis and modelling, and python for all my other programming needs (gui stuff with PyQt4, web stuff, text processing etc etc). And if so, what is your experience with this and what sort of software/library do you use in combination with Python to be able to access R's functionality. When I need to use the two together, it's easiest with 'rpy'. This lets you call R functions from python, so you can do: from rpy import r r.hist(z) to get a histogram of the values in a python list 'z'. There are some complications converting structured data types between the two but they can be overcome, and apparently are handled better with the next generation Rpy2 (which I've not got into yet). Google for rpy for info. Is there much of a performance hit either way? (as both are interpreted languages) Not sure what you mean here. Do you mean is: R sum(x) faster than Python sum(x) and how much worse is: Python from rpy import r Python r.sum(x) ? Knuth's remark on premature optimization applies, as ever Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Python and R
Hello! On Tue, Feb 17, 2009 at 5:58 PM, Warren Young war...@etr-usa.com wrote: Esmail Bonakdarian wrote: I am just wondering if any of you are doing most of your scripting with Python instead of R's programming language and then calling the relevant R functions as needed? No, but if I wanted to do such a thing, I'd look at Sage: http://sagemath.org/ ah .. thanks for the pointer, I had not heard of Sage, I was just starting to look at SciPy. It'll give you access to a lot more than just R. Is there much of a performance hit either way? (as both are interpreted languages) Are you just asking, or do you have a particular execution time goal, which if exceeded would prevent doing this? I ask because I suspect it's the former, and fast enough is fast enough. I put together a large'ish R program last year, but I think I would be happier if I could code it in say Python - but I would rather not do that at the expense of execution time. Thanks again for telling me about Sage. Esmail __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Saving data to a MS Access Database
I have the following dataframe: ad - data.frame(dates, av, sn$SectorName) colnames(ad) - c(Date, Value, Tag) which has data (rows 10 to 20, for example) as follows: DateValue Tag 10 2008-01-16-0.20875ConDisc 11 2008-01-17-0.21125ConDisc 12 2008-01-18-0.08875ConDisc 13 2008-01-21-0.10875ConDisc 14 2008-01-220.02125 ConDisc 15 2008-01-230.09500 ConDisc 16 2008-01-240.02500 ConDisc 17 2008-01-250.03500 ConDisc 18 2008-01-280.06625 ConDisc 19 2008-01-290.20500 ConDisc 20 2008-01-300.11625 ConDisc I want to import this data in into an existing MS Access database which has the following properties: Name: tblAD ID: Autonumber, primary key Date: Date Value: Double Tag: String I am trying this by using the following: result - sqlUpdate(channel, ad, tblAD) But I get the following error: result - sqlUpdate(channel, ad, tblAD) Error in sqlUpdate(channel, ad, tblAD) : cannot update 'tblAD' without unique column Can anyone suggest where I am going wrong? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] ylim in plot.ts?
Hello, I tried to use ylim=c(x,y) in a plot.ts(). This was ignored. How can I achieve it to create such graphics? Ciao, Oliver Bandel __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] :How to insert a column in a data frame
Hi dear list, I wonder if somebody can help me with this. I have a text file with 300 rows and around 30 columns and I need to insert a column that has the number 1 in every row. This new column should be placed between columns 6 and 7. As an example: I would want to insert a column (consiting just of the number 1 in every row, as in column 6) between columns 6 and 7. 847 847 0 0 2 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T 847 847 0 0 2 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T 847 847 0 0 2 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T 847 847 0 0 2 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T 847 847 0 0 2 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T 847 847 0 0 2 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T So the new table would look like this: 847 847 0 0 2 1 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T 847 847 0 0 2 1 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T 847 847 0 0 2 1 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T 847 847 0 0 2 1 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T 847 847 0 0 2 1 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T 847 847 0 0 2 1 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T You can also look at this problem as duplicating column 6. Thank you for your help! Laura R Murillo Columbia University New York __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Python and R
On Tue, Feb 17, 2009 at 6:05 PM, Barry Rowlingson b.rowling...@lancaster.ac.uk wrote: 2009/2/17 Esmail Bonakdarian esmail...@gmail.com: When I need to use the two together, it's easiest with 'rpy'. This lets you call R functions from python, so you can do: from rpy import r r.hist(z) wow .. that is pretty straight forward, I'll have to check out rpy for sure. to get a histogram of the values in a python list 'z'. There are some complications converting structured data types between the two but they can be overcome, and apparently are handled better with the next generation Rpy2 (which I've not got into yet). Google for rpy for info. Will do! Is there much of a performance hit either way? (as both are interpreted languages) Not sure what you mean here. Do you mean is: R sum(x) faster than Python sum(x) and how much worse is: Python from rpy import r Python r.sum(x) Well, I have a program written in R which already takes quite a while to run. I was just wondering if I were to rewrite most of the logic in Python - the main thing I use in R are its regression facilities - if it would speed things up. I suspect not since both of them are interpreted, and the bulk of the time is taken up by R's regression calls. Esmail __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Normal cdf modified function
I wonder if an R package would have a function that calculates the following. Let Y be a normal multivariate function. For example, let Y have 4 dimensions. I want to calculate P(Y1 Z1, Y2 Z2, Y3 Z3, Y4 Z4). There are R functions to do the calculation if all the inequalities are of the type (the cdf). But is there an R function where the two types of inequalities ( and ) can be mixed? (The user would have to specify the set of indexes with inequalities of the type ) Thanks for any suggestions. FS __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.