Re: [R] How to test if a slope is different than 1?
Doesn't the p-value from using offset work for you, if you really need a p-value? The confint method is a quick and easy way to see whether the slope is significantly different from 1 (see Rolf's response), but it does not provide an exact p-value. I guess you could compute confidence intervals at different confidence levels until you find the level at which one of the limits is close enough to 1, but that seems like way too much work. You could also compute the p-value by taking the slope minus 1, dividing by the standard error, and plugging that into the pt function with the correct degrees of freedom. You could even write a function to do that for you, but it is still more work than adding the offset to the formula.

On Tue, Apr 24, 2012 at 8:17 AM, Mark Na mtb...@gmail.com wrote:

Hi Greg, thanks for your reply. Do you know if there is a way to use the confint function to get a p-value on this test? Thanks, Mark

On Mon, Apr 23, 2012 at 3:10 PM, Greg Snow 538...@gmail.com wrote:

One option is to subtract the continuous variable from y before doing the regression (this works with any regression package/function). The probably better way in R is to use the 'offset' function:

formula = I(log(data$AB.obs + 1, 10) - log(data$SIZE, 10)) ~ log(data$SIZE, 10) + data$Y
formula = log(data$AB.obs + 1) ~ offset( log(data$SIZE, 10) ) + log(data$SIZE, 10) + data$Y

Or you can use a function like 'confint' to find the confidence interval for the slope and see if 1 is in the interval.

On Mon, Apr 23, 2012 at 12:11 PM, Mark Na mtb...@gmail.com wrote:

Dear R-helpers, I would like to test if the slope corresponding to a continuous variable in my model (summary below) is different from one. I would appreciate any ideas for how I could do this in R, after having specified and run this model. Many thanks, Mark Na

Call:
lm(formula = log(data$AB.obs + 1, 10) ~ log(data$SIZE, 10) + data$Y)

Residuals:
     Min       1Q   Median       3Q      Max
-0.94368 -0.13870  0.04398  0.17825  0.63365

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
(Intercept)        -1.18282    0.09120 -12.970   <2e-16 ***
log(data$SIZE, 10)  0.56009    0.02564  21.846   <2e-16 ***
data$Y2008          0.16825    0.04366   3.854 0.000151 ***
data$Y2009          0.20310    0.04707   4.315 2.38e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2793 on 228 degrees of freedom
Multiple R-squared: 0.6768, Adjusted R-squared: 0.6726
F-statistic: 159.2 on 3 and 228 DF, p-value: < 2.2e-16

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
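The arithmetic described above (slope minus 1, divided by the standard error, fed to pt) and the offset trick give the same test. A minimal sketch on simulated data (the variable names and data are invented for illustration, not taken from the poster's model):

```r
set.seed(1)
x <- runif(50, 1, 10)
y <- 1.2 * x + rnorm(50)                 # true slope is 1.2, not 1

## manual test of H0: slope = 1
fit  <- lm(y ~ x)
tval <- (coef(fit)["x"] - 1) / sqrt(vcov(fit)["x", "x"])
p.manual <- 2 * pt(-abs(tval), df = df.residual(fit))

## offset approach: the fitted coefficient on x is now (slope - 1),
## so its printed p-value tests H0: slope = 1 directly
fit2 <- lm(y ~ offset(x) + x)
p.offset <- summary(fit2)$coefficients["x", "Pr(>|t|)"]

all.equal(unname(p.manual), p.offset)    # the two p-values agree
```

The offset subtracts x from the response before fitting, so the remaining coefficient on x estimates the excess over a slope of exactly 1.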
Re: [R] Accessing a list
I believe that fortune(312) applies here. As my current version of fortunes does not show this, I am guessing that it is in the development version, so here is what fortune(312) will eventually print (unless something changes or I got something wrong):

The problem here is that the $ notation is a magical shortcut and like any other magic if used incorrectly is likely to do the programmatic equivalent of turning yourself into a toad. -- Greg Snow (in response to a user who wanted to access a column whose name is stored in y via x$y rather than x[[y]]) R-help (February 2012)

On Tue, Apr 24, 2012 at 9:42 PM, Jim Silverton jim.silver...@gmail.com wrote:

Hi, I have the following problem: I want to access a list whose elements are imp1, imp2, imp3, etc. I tried using the paste command in a for loop (see the last for loop below), but I keep calling it df, when df should be imp1 (for the first run). Any ideas on how I can access the elements of the list? Isaac

require(Amelia)
data.use <- read.csv("multiplecarol.CSV", header = TRUE)
names(data.use) <- c("year", "dischargex1", "y", "pressurex2", "windx3")
ts <- c( c(1:12), c(1:12), c(1:12), c(1:12), c(1:12), c(1:12), c(1:12), c(1:6) )
length(ts)
data.use <- cbind(ts, data.use)
# a.out2 <- amelia(data.use, m = 1000, idvars = "year")
n.times <- 100
a.out.time <- amelia(data.use, m = n.times, ts = "ts", idvars = "year", polytime = 2)
constant.col <- dischargex1.col <- pressurex2.col <- windx3.col <- rep(0, n.times)
for (i in 1:n.times) {
  x <- c("imp", i)
  df <- paste(x, collapse = "")
  data1 <- a.out.time[[1]]$df
  attach(data1)
  y <- as.numeric(y)
  dischargex1 <- as.numeric(dischargex1)
  pressurex2 <- as.numeric(pressurex2)
  windx3 <- as.numeric(windx3)
  multi.regress <- lm(y ~ dischargex1 + pressurex2 + windx3)
  constant.col[i] <- as.numeric(multi.regress[[1]][1])
  dischargex1.col[i] <- as.numeric(multi.regress[[1]][2])
  pressurex2.col[i] <- as.numeric(multi.regress[[1]][3])
  windx3.col[i] <- as.numeric(multi.regress[[1]][4])
}

-- Thanks, Jim.

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
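To make the fortune concrete: when the element name is stored in a variable, use [[ ]] with that stored string; $ looks for an element literally named after the variable. A small sketch (the list and its names are made up to mirror the question):

```r
mylist <- list(imp1 = 1:3, imp2 = 4:6, imp3 = 7:9)

for (i in 1:3) {
  nm <- paste0("imp", i)
  df <- mylist[[nm]]     # fetches the element whose name is stored in nm
  print(mean(df))
}

mylist$nm                # NULL: $ looks for an element literally called "nm"
identical(mylist[["imp2"]], mylist$imp2)  # TRUE: [[ ]] with the string works
```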
Re: [R] Scatter plot / LOESS, or LOWESS for more than one parameter
Assuming that you want event as the x-axis (horizontal), you can do something like (untested without reproducible data):

par(mfrow=c(2,1))
scatter.smooth( event, pH1 )
scatter.smooth( event, pH2 )

or

plot( event, pH1, ylim=range(pH1,pH2), col='blue')
points( event, pH2, col='green' )
lines( loess.smooth(event,pH1), col='blue')
lines( loess.smooth(event,pH2), col='green')

Only do the second one if pH1 and pH2 are measured on the same scale, so that the comparison and any crossings are meaningful, or if there is enough separation (but not too much) that the two series do not overlap while still showing enough detail.

On Mon, Apr 23, 2012 at 10:40 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote:

The scatter plot is easy:

plot(pH1 ~ pH2, data = OBJ)

When you say a loess for each -- how do you break them up? Are there repeat values for pH1? If so, this might be hard to do in base graphics, but ggplot2 would make it easy:

library(ggplot2)
ggplot(OBJ, aes(x = pH1, y = pH2)) + geom_point() + stat_smooth() + facet_wrap(~factor(pH1))

or something similar. Michael

On Mon, Apr 23, 2012 at 11:26 PM, David Doyle kydaviddo...@gmail.com wrote:

Hi folks. If I have the following in my data

event  pH1  pH2
1      4.0  6.0
2      4.3  5.9
3      4.1  6.1
4      4.0  5.9

and so on, for about 400 events. Is there a way I can get R to plot event vs. pH1 and event vs. pH2, and then do a loess or lowess line for each? Thanks in advance, David

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
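A runnable version of the second suggestion above, on simulated pH series (the data and their scales are invented for illustration):

```r
set.seed(42)
event <- 1:400
pH1 <- 4 + 0.2 * sin(event / 50) + rnorm(400, sd = 0.05)
pH2 <- 6 + 0.2 * cos(event / 50) + rnorm(400, sd = 0.05)

plot(event, pH1, ylim = range(pH1, pH2), col = "blue")
points(event, pH2, col = "green")
lines(loess.smooth(event, pH1), col = "blue", lwd = 2)
lines(loess.smooth(event, pH2), col = "green", lwd = 2)

## loess.smooth() just returns the coordinates that lines() draws
sm <- loess.smooth(event, pH1)
str(sm)   # a list with components x and y
```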
Re: [R] need advice on using excel to check data for import into R
This is really a job for a database, and Excel is not a database (even though many think it is). I have some clients whom I have convinced to create an Access database rather than use Excel (it is still an MS product, so it can't be that scary, right?). They were often a little reluctant at first, because they would be using a new tool and actually had to think about the design of the database up front, but once they got to serious data entry they were very grateful that I had directed them to Access over Excel. Databases have tools to validate data on entry, so there will be fewer cases where you need to ask clients for corrections (and it will be easier for them to fix any problems that do sneak through).

On Sun, Apr 22, 2012 at 12:34 PM, Markus Weisner r...@themarkus.com wrote:

I have created an S4 object type for conducting fire department data analysis. The object includes a validity check that ensures certain fields are present and that duplicate records don't exist for certain combinations of columns (e.g. no duplicate incident number / incident date / unit ID, which ensures that the data does not show the same fire engine responding twice on the same call). I am finding that I spend a lot of time taking client data, converting it to my S4 object, and then sending it back to the client to correct data validity issues. I am trying to figure out a clever way to have Excel (typically the program used by my clients) check client data prior to them submitting it to me. I have been working with somebody on trying to develop an Excel toolbar add-in, with limited success.

My question is whether anybody can think of clever alternatives for clients to validate their data … for example, is there an R Excel plug-in (that would be easily installed by a client) where I might be able to write some lines of R to check the data and output messages … or maybe some sort of server where they could upload their data and I could have some lines of R code that would check the data and send back potential error messages? I realize this is a fairly open-ended question … just looking for some general ideas and directions to go. Getting a little frustrated with spending most of my work time dealing with data cleaning issues … guessing this is a problem shared by many of us who use R! Thanks, Markus

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Selecting columns whose names contain mutated except when they also contain non or un
Here is a method that uses a negative lookbehind:

tmp <- c('mutation', 'nonmutated', 'unmutated', 'verymutated', 'other')
grep("(?<!un)(?<!non)muta", tmp, perl=TRUE)
[1] 1 4

It looks for "muta" that is not immediately preceded by "un" or "non" (but it would match "unusually mutated", since there the "un" is not immediately before the "muta"). Hope this helps,

On Mon, Apr 23, 2012 at 10:10 AM, Paul Miller pjmiller...@yahoo.com wrote:

Hello All, I started out a while ago trying to select columns in a dataframe whose names contain some variation of the word mutant, using code like:

names(KRASyn)[grep("muta", names(KRASyn))]

The idea then would be to add together the various columns using code like:

KRASyn$Mutant_comb <- rowSums(KRASyn[grep("muta", names(KRASyn))])

What I discovered, though, is that this selects columns like nonmutated and unmutated as well as columns like mutated, mutation, and mutational. So I'd like to know how to select columns that have some variation of the word mutant without the non or the un. I've been looking around for an example of how to do that but haven't found anything yet. Can anyone show me how to select the columns I need? Thanks, Paul

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
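Putting the lookbehind to work on a small mock dataframe (the column names are invented to mirror the question, not the poster's actual data):

```r
KRASyn <- data.frame(mutated = 1:3, mutation = 4:6,
                     nonmutated = 7:9, unmutated = 10:12, other = 13:15)

## (?<!un)(?<!non) = "not immediately preceded by un, nor by non"
keep <- grep("(?<!un)(?<!non)muta", names(KRASyn), perl = TRUE)
names(KRASyn)[keep]                 # "mutated" "mutation"

KRASyn$Mutant_comb <- rowSums(KRASyn[keep])   # sums of the two kept columns
```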
Re: [R] How to test if a slope is different than 1?
One option is to subtract the continuous variable from y before doing the regression (this works with any regression package/function). The probably better way in R is to use the 'offset' function:

formula = I(log(data$AB.obs + 1, 10) - log(data$SIZE, 10)) ~ log(data$SIZE, 10) + data$Y
formula = log(data$AB.obs + 1) ~ offset( log(data$SIZE, 10) ) + log(data$SIZE, 10) + data$Y

Or you can use a function like 'confint' to find the confidence interval for the slope and see if 1 is in the interval.

On Mon, Apr 23, 2012 at 12:11 PM, Mark Na mtb...@gmail.com wrote:

Dear R-helpers, I would like to test if the slope corresponding to a continuous variable in my model (summary below) is different from one. I would appreciate any ideas for how I could do this in R, after having specified and run this model. Many thanks, Mark Na

Call:
lm(formula = log(data$AB.obs + 1, 10) ~ log(data$SIZE, 10) + data$Y)

Residuals:
     Min       1Q   Median       3Q      Max
-0.94368 -0.13870  0.04398  0.17825  0.63365

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
(Intercept)        -1.18282    0.09120 -12.970   <2e-16 ***
log(data$SIZE, 10)  0.56009    0.02564  21.846   <2e-16 ***
data$Y2008          0.16825    0.04366   3.854 0.000151 ***
data$Y2009          0.20310    0.04707   4.315 2.38e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2793 on 228 degrees of freedom
Multiple R-squared: 0.6768, Adjusted R-squared: 0.6726
F-statistic: 159.2 on 3 and 228 DF, p-value: < 2.2e-16

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] xyplot - ordering factors in graph
R works on the idea that factor level ordering is a property of the data rather than a property of the graph. So if you have the factor levels ordered properly in the data, then the graph will take care of itself. To order the levels, see functions like factor, relevel, and reorder.

On Sat, Apr 21, 2012 at 1:23 AM, pip philsiv...@hotmail.com wrote:

Hello - newbie. I have created a lattice graph and want to know how to sort one of the elements, which is a factor. The factor numbers in the graph are, e.g., 10 32 21 2 22 4, etc. Regards

-- View this message in context: http://r.789695.n4.nabble.com/xyplot-ordering-factors-in-graph-tp4576013p4576013.html Sent from the R help mailing list archive at Nabble.com.

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
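A sketch of reordering numeric-looking factor levels before plotting (the level values are taken from the example in the question; any lattice plot of f2 will then use this ordering):

```r
f <- factor(c("10", "32", "21", "2", "22", "4"))
levels(f)    # sorted as character strings: "10" "2" "21" "22" "32" "4"

## put the levels in numeric order instead
f2 <- factor(f, levels = as.character(sort(as.numeric(levels(f)))))
levels(f2)   # "2" "4" "10" "21" "22" "32"
```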
Re: [R] Print warning messages and save them automatically in a file
Would using the 'sink' function with type='message' and split=TRUE do what you want?

On Thu, Apr 19, 2012 at 2:00 AM, Alexander juschitz_alexan...@yahoo.de wrote:

Hello, I am working under R 2.11.0 on Windows and I would like to ask whether you know a way to save all warning messages produced by the R function warning to a file, while keeping the functionality of the base function warning. For example, if I use external code, I don't want to replace all lines containing warning(...) by a self-written function. I want to execute it normally, and every time the external code makes a call to warning, I want the warning message printed out in the console AND written to a file. My first solution is to redefine the function warning in the global environment, such as:

warning <- function(...){
  write(..., "Warning.log", append=TRUE)
  base::warning(...)  # unfortunately the warning now always appears to happen in the
                      # function warning of the .GlobalEnv and no longer indicates
                      # where the error happens :-(
}

This solution isn't very clean. I would like to try to redefine warning.expression in options instead. In that case, I don't understand how the passing of arguments works. I would like to do something like:

options(warning.expression=quote({
  write(..., "Warning.log", append=TRUE)
  ?
}))

I put the ? because I don't know how I should call the function warning without being recursive, and how I can pass arguments. Thank you, Alexander

-- View this message in context: http://r.789695.n4.nabble.com/Print-warning-messages-and-save-them-automatically-in-a-file-tp4570163p4570163.html Sent from the R help mailing list archive at Nabble.com.

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
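A minimal sketch of diverting the message stream (which carries warnings) to a file with sink. One caveat, as far as I am aware: split=TRUE is only supported for regular output, not for the message connection, so in this sketch the messages go to the file rather than being duplicated to the console; the file name is made up via tempfile:

```r
logfile <- tempfile(fileext = ".log")
con <- file(logfile, open = "wt")

sink(con, type = "message")        # divert warnings/messages to the file
message("this goes to the log")    # warning() output would be captured too
sink(type = "message")             # restore the normal message stream
close(con)

readLines(logfile)                 # "this goes to the log"
```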
Re: [R] Matrix multiplication by multple constants
And another way is to remember properties of matrix multiplication:

y %*% diag(x)

On Fri, Apr 20, 2012 at 8:35 AM, David Winsemius dwinsem...@comcast.net wrote:

On Apr 20, 2012, at 4:57 AM, Dimitris Rizopoulos wrote:

try this:

x <- 1:3
y <- matrix(1:12, ncol = 3, nrow = 4)
y * rep(x, each = nrow(y))

Another way, with a function specifically designed for that purpose:

sweep(y, 2, x, "*")

-- David.

I hope it helps. Best, Dimitris

On 4/20/2012 10:51 AM, Vincy Pyne wrote:

Dear R helpers, suppose

x <- c(1:3)
y <- matrix(1:12, ncol = 3, nrow = 4)

y
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12

I wish to multiply the 1st column of y by the first element of x (i.e. 1), the 2nd column of y by the 2nd element of x (i.e. 2), and so on. Thus the resultant matrix should be like

z
     [,1] [,2] [,3]
[1,]    1   10   27
[2,]    2   12   30
[3,]    3   14   33
[4,]    4   16   36

When I tried simple multiplication like x*y, y gets multiplied column-wise:

x*z
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    4   12   20
[3,]    9   21   33
[4,]   16   32   48

Kindly guide. Regards, Vincy

-- Dimitris Rizopoulos, Assistant Professor, Department of Biostatistics, Erasmus University Medical Center. Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands. Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 Web: http://www.erasmusmc.nl/biostatistiek/

David Winsemius, MD, West Hartford, CT

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
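The three suggestions from this thread give the same answer; a quick sketch checking them against each other on the data from the question:

```r
x <- 1:3
y <- matrix(1:12, ncol = 3, nrow = 4)

z1 <- y %*% diag(x)               # matrix algebra: scale column j by x[j]
z2 <- sweep(y, 2, x, "*")         # sweep x across the columns
z3 <- y * rep(x, each = nrow(y))  # recycle x down each column explicitly

stopifnot(all(z1 == z2), all(z2 == z3))
z1[, 2]                           # 10 12 14 16: the 2nd column scaled by 2
```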
Re: [R] Ternaryplot as an inset graph
The triplot function in the TeachingDemos package uses base graphics; the subplot function (also in TeachingDemos) is another way to place a plot within a plot (and triplot and subplot do work together). If you want to stick to grid graphics, then you can use viewports in grid to insert one plot into another.

On Fri, Apr 20, 2012 at 10:05 AM, Ben Bolker bbol...@gmail.com wrote:

Young, Jennifer A Jennifer.Young at dfo-mpo.gc.ca writes:

I am trying to add a ternary plot as a corner inset graph to a larger main ternary plot. I have successfully used add.scatter in the past for different kinds of plots, but it doesn't seem to work for this particular function. It overlays the old plot rather than plotting as an inset. Here is a simple version of what I'm trying. Note that if I change the inset plot to be an ordinary scatter, for instance, it works as expected.

library(ade4)
library(vcd)
tdat <- data.frame(x=runif(20), y=rlnorm(20), z=rlnorm(20))
insetPlot <- function(data){
  ternaryplot(data)
}
ternaryplot(tdat)
add.scatter(insetPlot(tdat), posi="topleft", ratio=.2)

I think the problem is that add.scatter assumes you're using base graphics, while ternaryplot() uses grid graphics. Mixing and matching grid+base graphs is a little bit tricky. You might try it with triax.plot() from the plotrix package, which I believe does ternary plots in base graphics ... Ben Bolker

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Pierce's criterion
Determining what is an outlier is complicated regardless of the tools used (this is a philosophical issue rather than an R issue). You need to make some assumptions and definitions, based on the science that produces the data rather than the data itself, before even approaching the question of outliers. What is an outlier for a normal distribution may be reasonable for a gamma distribution and completely expected for a Cauchy distribution. See the 'outliers' dataset in the TeachingDemos package, and more importantly the examples in its help page, for a demonstration of the perils of automatic outlier deletion.

On Wed, Apr 18, 2012 at 4:11 PM, Ryan Murphy rmurp...@u.rochester.edu wrote:

Hello all, I would like to rigorously test whether observations in my dataset are outliers. I guess all the main tests in R (Grubbs) impose the assumption of normality. My data is surely not normal, so I would like to use something else. As far as I can tell from Wikipedia, Peirce's criterion is just that. The data I am interested in testing is: 1) continuous on the unit interval, 2) discrete, 3) ordinal on 0-6. If you need more specifics, (1) refers to the Gini index of inequality, (2) refers to measures of the number of assassinations, strikes, etc. in a country, (3) refers to ranking data on how politically free a country is. Does R do this test? Thanks a lot, and PS: I, unlike many economists, prefer R over Stata. R > Stata! Sincerely, Ryan Murphy

-- Ryan Murphy, 2012 B.A. Economics and Mathematics, 339-223-4181

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] call object from character?
Almost always when people ask this question (it and its answer are FAQ 7.21), it is because they want to do things the wrong way (they just don't know there is a better way). The better way is to put the variables that you want to access in this way into a list; then you can easily access the objects in the list by name (or position, or all of them, etc.):

mylist <- list(a=12)
call_A <- 'a'
mylist[[call_A]]

Adding more objects to the list is easier than creating new data objects in the global environment, and if you want to do something with all those objects (copy, delete, rename, etc.) then you have one list to work with rather than a bunch of separate objects. If you want to do the same operation on all the objects (the common follow-up question), then with them in a list you can use lapply, sapply, or vapply, which is much simpler than looping and getting.

On Wed, Apr 18, 2012 at 8:25 PM, chuck.01 charliethebrow...@gmail.com wrote:

Let's say I have an object (I hope my terminology is correct) a:

a <- 12
a
[1] 12

And a has been assigned the number 12, or whatever. And let's say I have a character call_A:

call_A <- "a"
call_A
[1] "a"

What is the function F that allows this to happen:

F( call_A )
[1] 12

-- View this message in context: http://r.789695.n4.nabble.com/call-object-from-character-tp4569686p4569686.html Sent from the R help mailing list archive at Nabble.com.

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
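The function the poster asked for is get (FAQ 7.21), but as the reply argues, a list scales better. A small sketch contrasting the two (the second list element is invented for illustration):

```r
a <- 12
call_A <- "a"
get(call_A)        # 12 -- the "function F" from the question

## the list alternative recommended above
vals <- list(a = 12, b = 99)
vals[[call_A]]     # 12
sapply(vals, sqrt) # same operation applied to every element at once
```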
Re: [R] Effeciently sum 3d table
Look at the Reduce function.

On Mon, Apr 16, 2012 at 8:28 AM, David A Vavra dava...@verizon.net wrote:

I have a large number of 3d tables that I wish to sum. Is there an efficient way to do this? Or perhaps a function I can call? I tried using do.call(sum, listoftables), but that returns a single value. So far, it seems only a loop will do the job. TIA, DAV

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Effeciently sum 3d table
Here is a simple example:

mylist <- replicate(4, matrix(rnorm(12), ncol=3), simplify=FALSE)
A <- Reduce( `+`, mylist )
B <- mylist[[1]] + mylist[[2]] + mylist[[3]] + mylist[[4]]
all.equal(A, B)
[1] TRUE

Basically, Reduce first applies the function (`+` in this case) to the first two elements of mylist, then applies it to that result and the 3rd element, then to that result and the 4th element (and would continue on if mylist had more than 4 elements). It is basically a way to create functions like sum from functions like `+` that only work on 2 objects at a time. Another way to see what it is doing is to run something like:

Reduce( function(a,b){ cat("I am adding", a, "and", b, "\n"); a+b }, 1:10 )

The Reduce function will probably not be any faster than a really well written loop, but will probably be faster (both to write the command and to run) than a poorly designed naive loop.

On Mon, Apr 16, 2012 at 12:52 PM, David A Vavra dava...@verizon.net wrote:

Thanks Greg, I think this may be what I'm after, but the documentation for it isn't particularly clear. I hate it when someone documents a piece of code by saying it works kind of like some other code (running elsewhere, of course), making the tacit assumption that everybody will immediately know what that means and implies. I'm sure I'll understand it once I know what it is trying to say. :) There's an item in the examples which may be exactly what I'm after. DAV

-Original Message- From: Greg Snow [mailto:538...@gmail.com] Sent: Monday, April 16, 2012 11:54 AM To: David A Vavra Cc: r-help@r-project.org Subject: Re: [R] Effeciently sum 3d table

Look at the Reduce function.

On Mon, Apr 16, 2012 at 8:28 AM, David A Vavra dava...@verizon.net wrote:

I have a large number of 3d tables that I wish to sum. Is there an efficient way to do this? Or perhaps a function I can call? I tried using do.call(sum, listoftables), but that returns a single value. So far, it seems only a loop will do the job. TIA, DAV

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
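Since the original question was about 3d tables rather than matrices, here is the same idea with arrays (the dimensions and data are made up): Reduce keeps the shape, while do.call(sum, ...) collapses everything to one number.

```r
set.seed(7)
tabs <- replicate(5, array(rpois(24, 3), dim = c(2, 3, 4)), simplify = FALSE)

total <- Reduce(`+`, tabs)   # elementwise sum of all five arrays
dim(total)                   # 2 3 4 -- still a 3d table

## the grand totals agree, but do.call(sum, ...) loses the structure
sum(total) == do.call(sum, tabs)   # TRUE
```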
Re: [R] Gradients in bar charts XXXX
Here is one approach:

tmp <- rbinom(10, 100, 0.78)
mp <- barplot(tmp, space=0, ylim=c(0,100))
tmpfun <- colorRamp( c('green','yellow',rep('red',8)) )
mat <- 1 - row(matrix( nrow=100, ncol=10 ))/100
tmp2 <- tmpfun(mat)
mat2 <- as.raster( matrix( rgb(tmp2, maxColorValue=255), ncol=10) )
for(i in 1:10) mat2[ mat[,i] >= tmp[i]/100, i ] <- NA
rasterImage(mat2, mp[1] - (mp[2]-mp[1])/2, 0, mp[10] + (mp[2]-mp[1])/2, 100, interpolate=FALSE)
barplot(tmp, col=NA, add=TRUE, space=0)

You can tweak it to your desire. It might look a little better if each bar were drawn independently with interpolate=TRUE (this would also be needed if you had space between the bars).

On Mon, Apr 9, 2012 at 12:40 PM, Jason Rodriguez jason.rodrig...@dca.ga.gov wrote:

Hello, I have a graphics-related question: I was wondering if anyone knows of a way to create a bar chart that is colored with a three-part gradient that changes at fixed y-values. Each bar needs to fade green-to-yellow at Y=.10 and yellow-to-red at Y=.20. Is there an option in a package somewhere that offers an easy way to do this? Attached is a chart I macgyvered together in Excel using a combination of a simple bar chart, a fit line, and some drawing tools. I want to avoid doing it this way in the future by finding a way to replicate it in R. Any ideas? Thanks, Jason Michael Rodriguez, Data Analyst, State Housing Trust Fund for the Homeless, Georgia Department of Community Affairs. Email: jason.rodrig...@dca.ga.gov

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Curve fitting, probably splines
This sounds like a case where logsplines may be what you want. See the 'oldlogspline' function in the 'logspline' package.

On Thu, Apr 12, 2012 at 7:45 AM, Michael Haenlein haenl...@escpeurope.eu wrote:

Dear all, this is probably more related to statistics than to [R], but I hope someone can give me an idea how to solve it nevertheless. Assume I have a variable y that is a function of x: y = f(x). I know the average value of y for different intervals of x. For example, I know that in the interval [0;x1] the average y is y1, in the interval [x1;x2] the average y is y2, and so forth. I would like to find a line of minimum curvature such that the average values of y in each interval correspond to y1, y2, ... My idea was to use (cubic) splines. But the problem I have seems somewhat different from what is usually done with splines. As far as I understand it, splines help to find a curve that passes through a set of given points. But I don't have any points; I only have average values of y per interval. If you have any suggestions on how to solve this, I'd love to hear them. Thanks very much in advance, Michael

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to compute a vector of min values ?
Peter showed how to get the minimums from a list or data frame using sapply; here is a way to copy your 1440 vectors into a single list (doing this, and keeping your data in a list instead of separate vectors, will make your life easier in general):

my.list <- lapply( 1:1440, function(x) get( sprintf("v%i", x) ) )

You can then name the elements of the list, if you want, with something like:

names(my.list) <- sprintf("v%i", 1:1440)

Then if all the vectors are of the same length you can convert this into a data frame with:

df <- as.data.frame(my.list)

But this is not needed as most of the work can be done with it as a list (and if they are different lengths then a list is how it should stay). Either way you can now use sapply on the list/data frame to get all the minimums. To anticipate a possible future question, if you next want the minimum of each position across vectors then you can use the pmin function:

do.call( pmin, my.list )

On Fri, Apr 6, 2012 at 12:29 AM, peter dalgaard pda...@gmail.com wrote: On Apr 6, 2012, at 00:25 , ikuzar wrote: Hi, I'd like to know how to get a vector of min values from many vectors without making a loop. For example:

v1 = c(1, 2, 3)
v2 = c(2, 3, 4)
v3 = c(3, 4, 5)
df = data.frame(v1, v2, v3)
df
  v1 v2 v3
1  1  2  3
2  2  3  4
3  3  4  5
min_vect = min(df)
min_vect
[1] 1

I'd like to get min_vect = (1, 2, 3), where 1 is the min of v1, 2 is the min of v2 and 3 is the min of v3. The example above is very easy but, in reality, I have v1, v2, ... v1440.

sapply(df, min) (possibly sapply(df, min, na.rm=TRUE) )

Thanks for your help, ikuzar -- View this message in context: http://r.789695.n4.nabble.com/how-to-compute-a-vector-of-min-values-tp4536224p4536224.html Sent from the R help mailing list archive at Nabble.com.
-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
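Putting Greg's pieces together, a minimal sketch with three toy vectors standing in for the 1440 in the question:

```r
# Toy vectors standing in for v1 .. v1440 from the question
v1 <- c(1, 2, 3); v2 <- c(2, 3, 4); v3 <- c(3, 4, 5)

# Gather the numbered vectors into one list by building their names
my.list <- lapply(1:3, function(i) get(sprintf("v%i", i)))
names(my.list) <- sprintf("v%i", 1:3)

# One minimum per vector
min_vect <- sapply(my.list, min)

# Minimum at each position across the vectors
pos_min <- do.call(pmin, my.list)
```

Here min_vect is c(v1=1, v2=2, v3=3) (one value per vector) and pos_min is c(1, 2, 3) (the element-wise minimum).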
Re: [R] Bayesian 95% Credible interval
The emp.hpd function in the TeachingDemos package will do this (it assumes the result is a single interval: either the posterior is unimodal, or it is multimodal but the valleys between modes do not drop low enough to split the interval). I am sure there are similar functions in other packages as well.

On Fri, Apr 6, 2012 at 12:39 PM, Gyanendra Pokharel gyanendra.pokha...@gmail.com wrote: Hi all, I have data from the posterior distribution of some parameter. I want to find the 95% credible interval. I think t.test(data) gives only a confidence interval, and I did not find a function for the Bayesian credible interval. Could someone suggest one? Thanks
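If an equal-tailed (rather than highest-posterior-density) interval is acceptable, base R's quantile function is enough; a sketch with simulated draws standing in for the poster's posterior sample:

```r
set.seed(42)
# Stand-in for a sample of posterior draws of the parameter
draws <- rnorm(10000, mean = 2, sd = 0.5)

# Equal-tailed 95% credible interval: the central 95% of the draws
ci <- quantile(draws, probs = c(0.025, 0.975))
```

For skewed or multimodal posteriors the HPD interval (emp.hpd) can be noticeably shorter than this equal-tailed one.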
Re: [R] Histogram classwise
You might want to look at the lattice or ggplot2 packages, both of which can create a graph for each of the classes.

On Tue, Apr 3, 2012 at 6:20 AM, arunkumar akpbond...@gmail.com wrote: Hi, I have data organized class-wise. I want to create a histogram for each class without using a for loop, as that takes a long time. My data looks like this:

  x  class
 27  1
 93  3
 65  5
  1  2
 69  5
  2  1
 92  4
 49  5
 55  4
 46  1
 51  3
100  4

Thanks in Advance, Arun -- View this message in context: http://r.789695.n4.nabble.com/Histogram-classwise-tp4528624p4528624.html Sent from the R help mailing list archive at Nabble.com.
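In lattice this is essentially a one-liner, histogram(~ x | factor(class), data = df). A base-R sketch of the same idea, using split() plus lapply() instead of a for loop (toy data copied from the question):

```r
# Data in the shape shown in the question
df <- data.frame(x = c(27, 93, 65, 1, 69, 2, 92, 49, 55, 46, 51, 100),
                 class = c(1, 3, 5, 2, 5, 1, 4, 5, 4, 1, 3, 4))

# One histogram per class: split() groups the values, lapply() runs hist()
# on each group (plot = FALSE just computes the counts without drawing)
hists <- lapply(split(df$x, df$class),
                function(v) hist(v, breaks = seq(0, 100, 25), plot = FALSE))
```

Dropping plot = FALSE (after par(mfrow=c(2,3)) or similar) draws the five histograms instead.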
Re: [R] identify with mfcol=c(1,2)
I tried your code. First I removed the reference to the global variable data$Line; then it works if I finish identifying by either right clicking (I am on Windows) and choosing stop, or using the stop menu. It does as you say if I press escape or use the stop sign button (both of those stop the whole evaluation rather than just the identifying).

On Tue, Apr 3, 2012 at 8:52 AM, John Sorkin jsor...@grecc.umaryland.edu wrote: I would like to have a figure with two graphs. This is easily accomplished using mfcol:

oldpar <- par(mfcol=c(1,2))
plot(x,y)
plot(z,x)
par(oldpar)

I run into trouble if I try to use identify with the two plots. If, after identifying points on my first graph, I hit the ESC key, or hit stop on the menu bar of my R session, the system stops the identification process but fails to give me my second graph. Is there a way to allow for the identification of points when one is plotting two graphs in a single graph window? My code follows.

plotter <- function(first, second) {
  # Allow for two plots in one graph window.
  oldpar <- par(mfcol=c(1,2))
  # Bland-Altman plot.
  plot((second+first)/2, second-first)
  abline(0,0)
  # Allow for identification of extreme values.
  BAzap <- identify((second+first)/2, second-first, labels = seq_along(data$Line))
  print(BAzap)
  # Plot second as a function of first value.
  plot(first, second, main="Limin vs. Limin", xlab="First (cm^2)", ylab="Second (cm^3)")
  # Add identity line.
  abline(0, 1, lty=2, col="red")
  # Allow for identification of extreme values.
  zap <- identify(first, second, labels = seq_along(data$Line))
  print(zap)
  # Add regression line.
  fit1 <- lm(first~second)
  print(summary(fit1))
  abline(fit1)
  print(summary(fit1)$sigma)
  # Reset par to default values.
  par(oldpar)
}
plotter(first, second)

Thanks, John John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for ...{{dropped:15}}
Re: [R] meaning of sigma from LM, is it the same as RMSE
If you look at the code for summary.lm, the line for the value of sigma is:

ans$sigma <- sqrt(resvar)

and above that we can see that resvar is defined as:

resvar <- rss/rdf

If that is not sufficient, you can find how rss and rdf are computed in the code as well.

On Tue, Apr 3, 2012 at 8:56 AM, John Sorkin jsor...@grecc.umaryland.edu wrote: Is the sigma from an lm, i.e.

fit1 <- lm(y~x)
summary(fit1)
summary(fit1)$sigma

the RMSE (root mean square error)? Thanks, John
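So sigma is the residual standard error, sqrt(RSS / residual df); note that "RMSE" as sometimes defined divides by n rather than by n - p, so the two can differ by that factor. A quick check on simulated data:

```r
set.seed(1)
x <- 1:20
y <- 2 * x + rnorm(20)
fit <- lm(y ~ x)

# sigma as reported by summary.lm
s <- summary(fit)$sigma

# Recomputed directly: sqrt(RSS / residual degrees of freedom)
rss <- sum(residuals(fit)^2)
rdf <- df.residual(fit)          # n - number of coefficients = 18 here
s_by_hand <- sqrt(rss / rdf)
```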
Re: [R] How does predict.loess work?
Run the examples for the loess.demo function in the TeachingDemos package to get a better understanding of what goes into the loess predictions.

On Tue, Apr 3, 2012 at 2:12 PM, Recher She rrrecher@gmail.com wrote: Dear R community, I am trying to understand how the predict function, specifically the predict.loess function, works. I understand that the loess function calculates regression parameters at each data point in 'data'.

lo <- loess(y ~ x, data)
p <- predict(lo, newdata)

I understand that the predict function predicts values for 'newdata' according to the loess regression parameters. How does predict.loess do this in the case that 'newdata' is different from the original data x? How does the interpolation take place? Thank you.
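A small sketch of the two prediction modes loess offers may also help: by default predict.loess interpolates between points of a precomputed fitted surface, while loess.control(surface = "direct") recomputes the local regression at each new x (simulated data for illustration):

```r
set.seed(7)
x <- seq(0, 10, length.out = 50)
y <- sin(x) + rnorm(50, sd = 0.2)
d <- data.frame(x = x, y = y)

# Default: fast interpolation over a precomputed surface
lo <- loess(y ~ x, data = d)
p_interp <- predict(lo, newdata = data.frame(x = c(2.5, 7.5)))

# surface = "direct": the local regression is recomputed at each new point
lo2 <- loess(y ~ x, data = d, control = loess.control(surface = "direct"))
p_direct <- predict(lo2, newdata = data.frame(x = c(2.5, 7.5)))
```

The two usually agree closely; the default trades a little accuracy for speed.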
Re: [R] simulate correlated binary, categorical and continuous variable
How are you calculating the correlations? That may be part of the problem: when you categorize a continuous variable you get a factor whose internal representation is a set of integers, and if you compute a correlation with that variable it will not be the polychoric correlation. Also, do you need your data to have the exact proportions and means that you show below? Or should they represent random samples from those populations, so that the actual proportions and means vary a bit from what is specified? If you are interested in tetrachoric and polychoric correlations, then generating the latent normals and categorizing them seems the most straightforward method. Also, which function (from which package) are you using to generate your normal variables? That may have some effect.

On Sun, Apr 1, 2012 at 7:00 PM, Burak Aydin burak235...@hotmail.com wrote: Hello Greg, Sorry for the confusion. Let's say I have a population with 6 variables that are correlated with each other. I can get you Pearson, tetrachoric or polychoric correlation coefficients. 2 of them are continuous, 2 binary, 2 categorical. Assume the following conditions: Co1 and Co2 are normally distributed continuous random variables, Co1 ~ N(0,1), Co2 ~ N(100,15). Ca1 and Ca2 are categorical variables, Ca1 probabilities = c(.02,.18,.28,.22,.30), Ca2 probs = c(.06,.18,.76). Bi1 and Bi2 are binary, with marginal probabilities Bi1 p = 0.4, Bi2 p = 0.5. And, again, I have the correlations. When I try to simulate this population I fail. If I keep the means and probabilities the same I lose the correct correlations; when I keep the correlations, I lose precision on the means and frequencies/probabilities.
See these links please: http://www.mathworks.com/products/statistics/demos.html?file=/products/demos/shipping/stats/copulademo.html http://stats.stackexchange.com/questions/22856/how-to-generate-correlated-test-data-that-has-bernoulli-categorical-and-contin http://www.springerlink.com/content/011x633m554u843g/ -- View this message in context: http://r.789695.n4.nabble.com/simulate-correlated-binary-categorical-and-continuous-variable-tp4516433p4524863.html Sent from the R help mailing list archive at Nabble.com.
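A minimal base-R sketch of the latent-normal approach Greg describes, using only chol(); the marginals come from the question, but the common 0.5 latent correlation is invented for illustration:

```r
set.seed(123)
n <- 10000

# Latent correlation matrix for three standard normals (0.5 is illustrative)
R <- matrix(0.5, 3, 3); diag(R) <- 1

# Correlated normals via the Cholesky factor: cov(Z) is approximately R
Z <- matrix(rnorm(n * 3), n, 3) %*% chol(R)

# Continuous: rescale to mean 100, sd 15 (Co2 in the question)
Co2 <- 100 + 15 * Z[, 1]

# Binary with marginal p = 0.4 (Bi1): threshold at the normal quantile
Bi1 <- as.integer(Z[, 2] > qnorm(1 - 0.4))

# Categorical with probs (.06, .18, .76) (Ca2): cut at normal quantiles
cuts <- qnorm(cumsum(c(0.06, 0.18)))
Ca2 <- cut(Z[, 3], breaks = c(-Inf, cuts, Inf), labels = FALSE)
```

The categorized variables keep the target marginals; their pairwise tetrachoric/polychoric correlations reflect the latent R rather than the Pearson correlations of the integer codes, which is exactly the distinction Greg raises above.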
Re: [R] simulate correlated binary, categorical and continuous variable
Your explanation below has me more confused than before. Now it is possible that it is just me, but it seems that if others understood it then someone else would have given a better answer by now. Are you restricting your categorical and binary variables to be binned versions of underlying normals? If that is the case, I doubt that there is a more efficient way than binning a normal variable. If not, can you show us more of what you want to produce, along with what you mean by a correlation or covariance with categorical variables (which is meaningless without additional restrictions/assumptions)?

On Fri, Mar 30, 2012 at 3:41 PM, Burak Aydin burak235...@hotmail.com wrote: Hello Greg, Thanks for your time. Let's say I know the Pearson covariance matrix. When I use rmvnorm to simulate 9 variables and then dichotomize/categorize them, I can't retrieve the population covariance matrix. -- View this message in context: http://r.789695.n4.nabble.com/simulate-correlated-binary-categorical-and-continuous-variable-tp4516433p4520464.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] discrepancy between paired t test and glht on lme models
I nominate the following paragraph for the fortunes package: "The basic issue appears to be that glht is not smart enough to deal with degrees of freedom so it uses an asymptotic z-test instead of a t-test. Infinite df, basically, and since 4 is a pretty poor approximation of infinity, you get your discrepancy."

On Thu, Mar 29, 2012 at 1:36 AM, peter dalgaard pda...@gmail.com wrote: On Mar 28, 2012, at 20:23 , Rajasimhan Rajagovindan wrote: Hi folks, I am working with repeated measures data and I ran into issues where the paired t-test results did not match those obtained by employing glht() contrasts on an lme model. While the lme model itself appears to be fine, there seems to be some discrepancy in using glht() on the lme model (unless I am missing something here). I was wondering if someone could help identify the issue. On my actual dataset the differences between glht() and the paired t test are more severe than in the example provided here. You might want to move to the R-sig-ME (mixed effects) mailing list for up-to-date advice. The basic issue appears to be that glht is not smart enough to deal with degrees of freedom so it uses an asymptotic z-test instead of a t-test. Infinite df, basically, and since 4 is a pretty poor approximation of infinity, you get your discrepancy. It's not that surprising, given that lme() itself is pretty poor at figuring out df in some cases. Especially if you have to deal with cross-stratum effects, the calculation of appropriate degrees of freedom is nontrivial. Some recent developments allow the calculation of Kenward-Roger df for lmer() models, but I wouldn't know to what extent this carries over to glht-style testing. I am using glht() for my data since I need to perform pairwise comparisons across multiple levels; any alternate approach to performing post-hoc comparisons on an lme object is also welcome. I have included the code and the results from mocked-up data (one that I found online) here.
require(nlme)
require(multcomp)
dv <- c(1,3,2,2,2,5,3,4,3,5)
subject <- factor(c("s1","s1","s2","s2","s3","s3","s4","s4","s5","s5"))
myfactor <- factor(c("f1","f2","f1","f2","f1","f2","f1","f2","f1","f2"))
mydata <- data.frame(dv, subject, myfactor)
rm(subject, myfactor, dv)
attach(mydata)
# paired t test (H0: f2-f1 = 0)
t.test(mydata[myfactor=='f2',1], mydata[myfactor=='f1',1], paired=TRUE)
# yields: t = 3.1379, df = 4, p-value = 0.03492, mean of the differences = 1.6
# lme (f1 as reference level)
fit.lme <- lme(dv ~ myfactor, random = ~1|subject, method="REML", correlation=corCompSymm(), data=mydata)
summary(fit.lme)
# yields identical results as paired t test
# f2-f1: t = 3.1379, df = 4, p-value = 0.0349
summary(glht(fit.lme, linfct=mcp(myfactor="Tukey")))
# while test statistic is comparable, p value is different
# have noticed cases where the differences between glht() and paired t test are more severe

### sample outputs from the script ###
# things appear ok here and match paired t test results
# summary(fit.lme)
Linear mixed-effects model fit by REML
 Data: mydata
       AIC      BIC    logLik
  36.43722 36.83443 -13.21861
Random effects:
 Formula: ~1 | subject
        (Intercept)  Residual
StdDev:   0.7420274 0.8058504
Correlation Structure: Compound symmetry
 Formula: ~1 | subject
 Parameter estimate(s):
          Rho
-0.0009325763
Fixed effects: dv ~ myfactor
            Value Std.Error DF  t-value p-value
(Intercept)   2.2 0.4898979  4 4.490732  0.0109
myfactorf2    1.6 0.5099022  4 3.137857  0.0349
 Correlation: (Intr)
myfactorf2 -0.52
Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-1.45279696 -0.53193228  0.03481143  0.58490026  1.09867599
Number of Observations: 10
Number of Groups: 5

# result differs from paired t test !
summary(glht(fit.lme, linfct=mcp(myfactor="Tukey")), test=adjusted("none"))
 Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: lme.formula(fixed = dv ~ myfactor, data = mydata, random = ~1 | subject, correlation = corCompSymm(), method = "REML")
Linear Hypotheses: Estimate Std.
              Estimate Std. Error z value Pr(>|z|)
f2 - f1 == 0    1.6000     0.5099   3.138   0.0017 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- none method)

Session info: platform i386-pc-mingw32, arch i386, os mingw32, system i386, mingw32, major 2, minor 13.1, year 2011, month 07, day 08, svn rev 56322, language R, version.string R version 2.13.1 (2011-07-08)
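The df issue is easy to see numerically: the same test statistic (3.1379) gives the two p-values reported in this thread, depending only on the reference distribution.

```r
# Same test statistic as in the output above
stat <- 3.1379

# glht's asymptotic z-test (infinite df)
p_z <- 2 * pnorm(-abs(stat))   # about 0.0017, matching the glht output

# The paired t / lme reference distribution with 4 df
p_t <- 2 * pt(-abs(stat), df = 4)   # about 0.0349, matching t.test and lme
```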
Re: [R] simulate correlated binary, categorical and continuous variable
Partly this depends on what you mean by a covariance between categorical (and binary) variables, and what a covariance between a categorical and a continuous variable is.

On Thu, Mar 29, 2012 at 12:31 PM, Burak Aydin burak235...@hotmail.com wrote: Hi, I'd like to simulate 9 variables: 3 binary, 3 categorical and 3 continuous, with a known covariance matrix. Using mvtnorm and later dichotomizing/categorizing the variables is not efficient. Do you know any package, or how to simulate mixed data? -- View this message in context: http://r.789695.n4.nabble.com/simulate-correlated-binary-categorical-and-continuous-variable-tp4516433p4516433.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] plot points using circles filled half in red and half in blue.
I would use the my.symbols function from the TeachingDemos package (but then I might be a little bit biased); here is a simple example:

library(TeachingDemos)
x <- runif(25)
y <- runif(25)
z <- sample(1:4, 25, TRUE)

ms.halfcirc2 <- function(col, adj=pi/2, ...) {
  theta <- seq(0, 2*pi, length.out=300) + adj
  x <- cos(theta)
  y <- sin(theta)
  if(col==1) {
    polygon(x, y)
  } else if(col==2) {
    polygon(x, y, col='red')
  } else if(col==3) {
    polygon(x, y, col='blue')
  } else {
    polygon(x[1:150], y[1:150], border=NA, col='red')
    polygon(x[151:300], y[151:300], border=NA, col='blue')
    polygon(x, y)
  }
}

my.symbols( x, y, ms.halfcirc2, inches=1/5, add=FALSE, symb.plots=TRUE, col=z)

# spice it up a bit
my.symbols( x, y, ms.halfcirc2, inches=1/5, add=FALSE, symb.plots=TRUE, col=z, adj=runif(25, 0, pi))

Adjust things to fit better what you want.

On Tue, Mar 27, 2012 at 8:49 PM, alan alan.wu2...@gmail.com wrote: I want to plot many points and want to use circles. The filling color depends on variable a: if a=1 then no fill, if a=2 then fill with red, if a=3 then fill with blue, if a=4 then fill half with red and half with blue. Can anyone tell me how to plot the case a=4? Thanks a lot
Re: [R] How to test for the difference of means in population, please help
You should use mixed effects modeling to analyze data of this sort. This is not a topic that is generally covered in introductory classes, so you should consult with a professional statistician on your problem, or educate yourself well beyond the novice level (this takes more than just reading one book; a few classes, or intense study of several books, would be needed to get to this level). Since everything is balanced nicely, you could average over the 4 repeats and use a 2-sample t test (assuming the assumptions hold; for your sample data they would be fine) comparing the 2 sets of 400 means. This will test for a general difference in the overall means, but ignores other information and hypotheses that may be important (which is why the mixed effects model approach is much preferred).

On Tue, Mar 27, 2012 at 1:13 AM, ali_protocol mohammadianalimohammad...@gmail.com wrote: Dear all, I am a novice in statistics. I have 2 experimental conditions. Each condition has ~400 points as its response, and each condition is done in 4 repeats (so I have 2 x 400 x 4 points). I want to compare the means of the two conditions and test whether they are the same or not. Which test should I use?

# populations
c = matrix (sample (1:20, 1600, replace= TRUE), 400, 4)
b = matrix (sample (1:20, 1600, replace= TRUE), 400, 4)
# means of repeats
c.mean = apply (c, 2, mean)
b.mean = apply (b, 2, mean)
# mean of experiment
c.mean.all = mean (c)
b.mean.all = mean (b)

-- View this message in context: http://r.789695.n4.nabble.com/How-to-test-for-the-difference-of-means-in-population-please-help-tp4508089p4508089.html Sent from the R help mailing list archive at Nabble.com.
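A sketch of the simpler averaging approach Greg describes (the mixed-model route is still preferred), built on the poster's own simulated matrices:

```r
set.seed(1)
# Two conditions: 400 responses x 4 repeats each
c_mat <- matrix(sample(1:20, 1600, replace = TRUE), 400, 4)
b_mat <- matrix(sample(1:20, 1600, replace = TRUE), 400, 4)

# Average over the 4 repeats for each of the 400 points,
# then compare the two sets of 400 means with a 2-sample t test
c_means <- rowMeans(c_mat)
b_means <- rowMeans(b_mat)
tt <- t.test(c_means, b_means)
```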
Re: [R] Work -Shift Scheduling - Constraint Linear Programming
Running findFn('linear programming') from the sos package brings up several possibilities that look promising.

On Sun, Mar 25, 2012 at 5:48 AM, agent dunham crossp...@hotmail.com wrote: Dear Community, I have a work-shift scheduling problem I'd like to solve via constraint linear programming, maybe something similar to http://support.sas.com/documentation/cdl/en/orcpug/63349/HTML/default/viewer.htm#orcpug_clp_sect037.htm. Can anybody suggest any package/R examples to solve this? If more details of my little problem are needed, I can provide them. Thanks in advance, u...@host.com -- View this message in context: http://r.789695.n4.nabble.com/Work-Shift-Scheduling-Constraint-Linear-Programming-tp4503037p4503037.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Compare similarit of two vector of not same length
If you are trying to see if both vectors could be random samples from the same population then I would look at a qqplot (see ?qqplot), which will compare them visually (and if they are not the same length then the qqplot function will use interpolation to compare them). For a more formal test you can use the ks.test function (which can also take vectors of different length); just note that a non-significant result does not mean that they are the same, and with big sample sizes this can be significant even when the differences are not practically meaningful. Another option is to do the qqplot along with the vis.test function in the TeachingDemos package; this lets you do a test based on the qqplot, but also gives you a feel for the practical difference.

On Sat, Mar 24, 2012 at 5:44 AM, Alaios ala...@yahoo.com wrote: Dear all, this is not strictly an R question. I have two vectors of different length (the difference is on the order of 10%). I am trying to see if one can still compare these two for similarity. If the vectors were of the same length I would just take the difference of the two and plot a pdf of it. One way I am thinking of is to find the longer one and shorten it in some way to get to the length of the shorter. What are the math formulations for this type of problem, and which of those does R support? Regards, Alex
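Both tools accept vectors of unequal length directly; a sketch with two simulated samples about 10% different in length:

```r
set.seed(2)
x <- rnorm(200)   # length 200
y <- rnorm(180)   # roughly 10% shorter

# qqplot interpolates so unequal lengths are fine; plot.it = FALSE
# returns the matched quantile pairs without drawing anything
qq <- qqplot(x, y, plot.it = FALSE)

# Two-sample Kolmogorov-Smirnov test, also fine with unequal lengths
ks <- ks.test(x, y)
```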
Re: [R] How to compute within-group mean and sd?
In addition to Michael's answers, there are packages that allow you to use SQL syntax on R data objects, so you could probably just use what you are familiar with.

On Sat, Mar 24, 2012 at 9:32 AM, reeyarn reey...@gmail.com wrote: Hi, I want to run something like

SELECT firm_id, count(*), mean(value), sd(value) FROM table GROUP BY firm_id;

But I have to write a for loop like

for ( id in unique(table$firm_id) ) {
  print(paste( id, mean(table[firm_id == id, "value"]) ))
}

Is there any way to do it more easily? Thanks :) Best, Reeyarn Lee
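In base R itself, aggregate covers the GROUP BY part (sqldf is one example of a package that accepts literal SQL against data frames). A toy sketch with invented data:

```r
# Toy data standing in for the poster's table
tab <- data.frame(firm_id = c(1, 1, 2, 2, 2, 3),
                  value   = c(10, 20, 5, 15, 25, 7))

# SELECT firm_id, count(*), mean(value), sd(value) ... GROUP BY firm_id
res <- aggregate(value ~ firm_id, data = tab,
                 FUN = function(v) c(n = length(v), mean = mean(v), sd = sd(v)))
```

Because FUN returns a vector, res$value ends up as a matrix with columns n, mean and sd, one row per firm_id.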
Re: [R] show and produce PDF file with pdf() and dev.off( ) in function
As others have said, you pretty much need to do the plot 2 times, but if it takes more than one command to create the plot you can use the dev.copy function to copy what you have just plotted into another graphics device rather than reissuing all the commands again.

On Sat, Mar 24, 2012 at 9:43 AM, Uwe Ligges lig...@statistik.tu-dortmund.de wrote: On 24.03.2012 13:11, Igor Sosa Mayor wrote: apart from the other answers, be aware that you have to 'print' the graph with

pl <- plot(x)
print(pl)

in case you're using lattice or ggplot2 plots. [Uwe Ligges:] Which is true for lattice functions but not for a base graphics plot().

On Fri, Mar 23, 2012 at 02:40:04PM -0700, casperyc wrote: Hi all, I know how to use pdf() and dev.off() to produce and save a graph. However, when I put them in a function, say

myplot <- function(x=1:20){
  pdf("xplot.pdf")
  plot(x)
  dev.off()
}

the function works. But is there a way to show the graph in R as well as saving it? Thanks. casper - ### PhD candidate in Statistics, School of Mathematics, Statistics and Actuarial Science, University of Kent ### -- View this message in context: http://r.789695.n4.nabble.com/show-and-produce-PDF-file-with-pdf-and-dev-off-in-function-tp4500213p4500213.html Sent from the R help mailing list archive at Nabble.com.
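A sketch of the dev.copy idea applied to the original question: draw on the current device, then copy the result to a PDF (the file name is the poster's):

```r
myplot <- function(x = 1:20, file = "xplot.pdf") {
  plot(x)              # draw on the currently active device
  dev.copy(pdf, file)  # open a pdf device and copy the displayed plot into it
  dev.off()            # close the pdf; the original plot stays displayed
}
myplot()
```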
Re: [R] How to write and analyze data with 3 dimensions
You could put this data into a 3-dimensional array and then use the apply function to apply a function (such as mean) over whichever variables you choose. Or you could put the data into a data frame in long format, where you have your 3 variable indices in 3 columns and the data in a 4th column, then use the tapply function to apply the mean (or another function) to groups based on the indices of choice. If you want to do fancier things, in either case look into the reshape2 and plyr packages for ways of shaping the data and taking the data apart into pieces, applying a function to each piece, then putting it all back together again.

On Tue, Mar 20, 2012 at 11:16 AM, jorge Rogrigues hjm...@gmail.com wrote: Suppose I have data organized in the following way: (P_i, M_j, S_k), where i, j and k are indexes for sets. I would like to analyze the data to get, for example, the following information: what is the average over k for (P_i, M_j), or what is the average over j and k for P_i? My question is what would be the way of doing this in R. Specifically, how should I write the data in a csv file, how do I read the data from the csv file into R, and how do I perform these basic operations? Thank you.
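A sketch of both routes on a small simulated 2 x 3 x 4 array (the index names P, M, S are taken from the question):

```r
set.seed(3)
# 3-dimensional array indexed by P_i, M_j, S_k
a <- array(rnorm(2 * 3 * 4), dim = c(2, 3, 4),
           dimnames = list(P = paste0("P", 1:2),
                           M = paste0("M", 1:3),
                           S = paste0("S", 1:4)))

# Average over k for each (P_i, M_j): keep margins 1 and 2
pm_mean <- apply(a, c(1, 2), mean)

# Average over j and k for each P_i: keep margin 1
p_mean <- apply(a, 1, mean)

# The same data in long format (indices in 3 columns, values in a 4th),
# then tapply gives the same group means
long <- as.data.frame.table(a, responseName = "value")
pm_mean2 <- tapply(long$value, list(long$P, long$M), mean)
```

The long-format data frame is also the natural shape for a csv file: write it with write.csv(long, "data.csv", row.names = FALSE) and read it back with read.csv.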
Re: [R] how to hide code of any function
See the 'petals' function in the TeachingDemos package for one example of hiding source from casual inspection (intermediate-level R users will still easily be able to figure out what the key code is, but will not be able to claim that they stumbled across it by accident). This post gives another possibility: https://stat.ethz.ch/pipermail/r-devel/2011-October/062236.html

On Thu, Mar 15, 2012 at 6:53 AM, mrzung mrzun...@gmail.com wrote: Hi, I'm making a program and it needs to be hidden. It's not for a commercial purpose but an educational one, so I want to hide the code of the function. For example, if I made the following function:

a <- function(x){
  y <- x^2
  print(y)
}

I do not want someone to type a and see the code of the function. Is there anyone who can help me? -- View this message in context: http://r.789695.n4.nabble.com/how-to-hide-code-of-any-function-tp4474822p4474822.html Sent from the R help mailing list archive at Nabble.com.

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
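As a rough illustration of the "hiding from casual inspection" idea (this is a generic closure sketch, not the actual petals() implementation; make_hidden and secret are invented names):

```r
# Keep the real code in a closure environment and expose only a thin wrapper.
# Typing `a` at the console prints just the wrapper body, not x^2 -- but an
# intermediate user can still recover it via environment(a)$secret, matching
# the caveat above.
make_hidden <- function() {
  secret <- function(x) x^2
  function(x) secret(x)
}
a <- make_hidden()
a(4)  # 16
```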
Re: [R] help please. 2 tables, which test?
For this case I would use a permutation test. Start by choosing some statistic that represents your 4 students across the different grades; some possibilities would be the sum of scores across grades and students, or the mean, or the median, or ... Compute the selected statistic for your 4 students and save that value. Now select 4 students at random and compute the same statistic; repeat this a bunch of times (thousands), computing the statistic each time. All those statistics on the random selections represent the distribution of the statistic under the null hypothesis that your 4 students were randomly chosen (vs. chosen based on something that is related to the grades). Now you just compare the statistic for the original 4 students to that distribution (if you need a specific p-value, it is just the proportion of the random statistics that are as or more extreme than your original 4).

On Sat, Mar 10, 2012 at 4:04 AM, aoife doherty aaral.si...@gmail.com wrote: Thank you for the replies. So what my test wants to do is this: I have a big matrix, 30 rows (students in a class) x 50 columns (students' grades for the year). An example of the matrix is as such:

           grade1 grade2 grade3 ... grade50
student 1
student 2  ***
student 3
student 4  ***
student 5  ***
student 6
...
student 30 ***

As you can see, four students (students 2, 4, 5 and 30) have stars beside their names. I have chosen these students based on a particular characteristic that they all share. I then pulled these students out to make a new table:

           grade1 grade2 grade3 ... grade50
student 2
student 4
student 5
student 30

and what I want to see is basically: is there any difference between the grades this particular set of students (i.e. students 2, 4, 5 and 30) got and the class as a whole? So my null hypothesis is that there is no difference between this set of students' grades and what you would expect from the class as a whole.
Aaral

On Sat, Mar 10, 2012 at 12:18 AM, Greg Snow 538...@gmail.com wrote: Just what null hypothesis are you trying to test, or what question are you trying to answer, by comparing 2 matrices of different size? I think you need to figure out what your real question is before worrying about which test might work on it. Trying to get your data to fit a given test, rather than finding the appropriate test or other procedure to answer your question, is like buying a new suit and then having plastic surgery to make you fit the suit, rather than having the tailor modify the suit to fit you. If you can give us more information about what your question is, we have a better chance of actually helping you.

On Fri, Mar 9, 2012 at 9:46 AM, aoife doherty aaral.si...@gmail.com wrote: Thank you. Can the chi-squared test compare two matrices that are not the same size, e.g. if matrix 1 is a 2 x 4 table and matrix 2 is a 3 x 5 matrix?

On Fri, Mar 9, 2012 at 4:37 PM, Greg Snow 538...@gmail.com wrote: The chi-squared test is one option (and seems reasonable to me if it is the proportions/patterns that you want to test). One way to do the test is to combine your 2 matrices into a 3-dimensional array (the abind package may help here) and test using the loglin function.

On Thu, Mar 8, 2012 at 5:46 AM, aaral singh aaral.si...@gmail.com wrote: Hi. Please help if someone can. Problem: I have 2 matrices, e.g.

matrix 1:
      Freq None Some
Heavy    3    2    5
Never    8   13    8
Occas    1    4    4
Regul    9    5    7

matrix 2:
      Freq None Some
Heavy    7    1    3
Never   87   18   84
Occas   12    3    4
Regul    9    1    7

I want to see if matrix 1 is significantly different from matrix 2. I consider using a chi-squared test. Is this appropriate? Could anyone advise? Many thanks. Aaral Singh -- View this message in context: http://r.789695.n4.nabble.com/help-please-2-tables-which-test-tp4456312p4456312.html Sent from the R help mailing list archive at Nabble.com.
-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
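The permutation scheme described above might look like this; the score matrix here is simulated, and the summed score is just one arbitrary choice of statistic:

```r
set.seed(1)
scores <- matrix(rnorm(30 * 50, mean = 60, sd = 10), nrow = 30)  # 30 students x 50 grades
chosen <- c(2, 4, 5, 30)                    # the 4 pre-selected students

stat <- function(rows) sum(scores[rows, ])  # statistic: total score of the 4 students
obs <- stat(chosen)

# null distribution: the same statistic for 4 students drawn at random
perm <- replicate(5000, stat(sample(nrow(scores), 4)))

# two-sided p-value: proportion of random draws at least as extreme as observed
p <- mean(abs(perm - mean(perm)) >= abs(obs - mean(perm)))
```

With the real data you would replace the simulated `scores` matrix by the 30 x 50 grade matrix and keep everything else the same.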
Re: [R] help please. 2 tables, which test?
The chi-squared test is one option (and seems reasonable to me if it is the proportions/patterns that you want to test). One way to do the test is to combine your 2 matrices into a 3-dimensional array (the abind package may help here) and test using the loglin function.

On Thu, Mar 8, 2012 at 5:46 AM, aaral singh aaral.si...@gmail.com wrote: Hi. Please help if someone can. Problem: I have 2 matrices, e.g.

matrix 1:
      Freq None Some
Heavy    3    2    5
Never    8   13    8
Occas    1    4    4
Regul    9    5    7

matrix 2:
      Freq None Some
Heavy    7    1    3
Never   87   18   84
Occas   12    3    4
Regul    9    1    7

I want to see if matrix 1 is significantly different from matrix 2. I consider using a chi-squared test. Is this appropriate? Could anyone advise? Many thanks. Aaral Singh -- View this message in context: http://r.789695.n4.nabble.com/help-please-2-tables-which-test-tp4456312p4456312.html Sent from the R help mailing list archive at Nabble.com.

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
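A sketch of the array-plus-loglin approach, using the two tables from the question (plain array() is used here to keep it base R; abind::abind(m1, m2, along = 3) does the same stacking with less bookkeeping):

```r
# two 4x3 tables (smoking status x exercise) for the two groups
m1 <- matrix(c(3,  2,  5,
               8, 13,  8,
               1,  4,  4,
               9,  5,  7), 4, 3, byrow = TRUE)
m2 <- matrix(c(7,  1,  3,
               87, 18, 84,
               12, 3,  4,
               9,  1,  7), 4, 3, byrow = TRUE)

# stack them into a 4 x 3 x 2 array
a <- array(c(m1, m2), dim = c(4, 3, 2),
           dimnames = list(smoke = c("Heavy", "Never", "Occas", "Regul"),
                           exer  = c("Freq", "None", "Some"),
                           group = c("g1", "g2")))

# model [smoke.exer][group]: the joint smoke/exercise pattern is the same in
# both groups; the LRT against the saturated model tests that hypothesis
fit <- loglin(a, margin = list(c(1, 2), 3), print = FALSE)
p <- pchisq(fit$lrt, fit$df, lower.tail = FALSE)
```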
Re: [R] xyplot without external box
Why do you want to do this? Lattice was not really designed to put up just part of a graph, but rather to create the entire graph with one command. If you want to show a process, putting up part of a graph at a time, it may be better to create the whole graph as a vector graphics file (pdf, postscript, svg, pgf, emf, etc.) and then use an external program to remove the parts that you don't want for a given step.

On Thu, Mar 8, 2012 at 6:02 AM, Mauricio Zambrano-Bigiarini hzambran.newsgro...@gmail.com wrote: Dear list members, Within a loop, I need to create an xyplot with only a legend, without even the default external box drawn by lattice. I already managed to remove the axis labels and tick marks, but I couldn't find in the documentation of xyplot how to remove the external box. I would really appreciate any help with this.

- START ---
library(lattice)
x <- 1:100
cuts <- unique(quantile(as.numeric(x),
                        probs = c(0, 0.25, 0.5, 0.75, 0.9, 0.95, 1),
                        na.rm = TRUE))
gof.levels <- cut(x, cuts)
nlevels <- length(levels(gof.levels))
xyplot(1 ~ 1, groups = gof.levels, type = "n", xlab = "", ylab = "",
       scales = list(draw = FALSE),
       key = list(x = .5, y = .5, corner = c(0.5, 0.5), title = "legend",
                  points = list(pch = 16, col = c(2, 4, 3), cex = 1.5),
                  text = list(levels(gof.levels))))
- END ---

Thanks in advance, Mauricio Zambrano-Bigiarini -- FLOODS Action Water Resources Unit (H01) Institute for Environment and Sustainability (IES) European Commission, Joint Research Centre (JRC) webinfo: http://floods.jrc.ec.europa.eu/ DISCLAIMER: The views expressed are purely those of the writer and may not in any circumstances be regarded as stating an official position of the European Commission. Linux user #454569 -- Ubuntu user #17469 There is only one pretty child in the world, and every mother has it.
(Chinese Proverb) http://c2.com/cgi/wiki?HowToAskQuestionsTheSmartWay

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
Re: [R] How to sort frequency distribution table?
R tends to see the ordering of factor levels as a property of the data rather than a property of the table/graph. So it is generally best to modify the data object (the factor) to represent what you want, rather than look for an option in the table/plot function (this will also be more efficient in the long run). Here is a simple example using the reorder function:

tmp <- factor(sample(letters[1:5], 100, TRUE))
table(tmp)
# tmp
#  a  b  c  d  e
# 20 20 19 18 23
tmp2 <- reorder(tmp, rep(1, length(tmp)), sum)
table(tmp2)
# tmp2
#  d  c  a  b  e
# 18 19 20 20 23
tmp2 <- reorder(tmp, rep(-1, length(tmp)), sum)
table(tmp2)
# tmp2
#  e  a  b  c  d
# 23 20 20 19 18

On Wed, Mar 7, 2012 at 9:46 PM, Manish Gupta mandecent.gu...@gmail.com wrote: Hi, I am working on categorical data with a column for disease name (category). My input data is:

[1] Acute lymphoblastic leukemia (childhood)
[2] Adiponectin levels
[3] Adiponectin levels
[4] Adiponectin levels
[5] Adiponectin levels
[6] Adiponectin levels
[7] Adiposity
[8] Adiposity
[9] Adiposity
[10] Adiposity
[11] Age-related macular degeneration
[12] Age-related macular degeneration
[13] Aging (time to death)
[14] Aging (time to event)
[15] Aging (time to event)
[16] Aging (time to event)
[17] Aging (time to event)
[18] AIDS
[19] AIDS
[20] AIDS
...

When I use the table command, I get:

Acute lymphoblastic leukemia (childhood)  1
Adiponectin levels                        5
Adiposity                                 4
Age-related macular degeneration          2
Aging (time to death)                     1
...

But I need to sort this table by frequency and plot a histogram with the first column as labels (e.g. Adiposity, Age-related macular degeneration as bar names). How can I do it? Regards -- View this message in context: http://r.789695.n4.nabble.com/How-to-sort-frequency-distribution-table-tp4455595p4455595.html Sent from the R help mailing list archive at Nabble.com.
-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
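For the sorted-table-plus-barplot part of the question, sorting the table itself works as well (a bar chart, not strictly a histogram, is the right plot for category frequencies):

```r
set.seed(7)
tmp <- factor(sample(letters[1:5], 100, TRUE))

tab <- sort(table(tmp), decreasing = TRUE)  # frequencies, largest first
barplot(tab, las = 2)                       # bars labeled with the category names
```

The reorder approach above has the advantage that the ordering then carries through to every table and plot made from the factor, while sort(table(...)) only affects this one table.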
Re: [R] help please. 2 tables, which test?
Just what null hypothesis are you trying to test, or what question are you trying to answer, by comparing 2 matrices of different size? I think you need to figure out what your real question is before worrying about which test might work on it. Trying to get your data to fit a given test, rather than finding the appropriate test or other procedure to answer your question, is like buying a new suit and then having plastic surgery to make you fit the suit, rather than having the tailor modify the suit to fit you. If you can give us more information about what your question is, we have a better chance of actually helping you.

On Fri, Mar 9, 2012 at 9:46 AM, aoife doherty aaral.si...@gmail.com wrote: Thank you. Can the chi-squared test compare two matrices that are not the same size, e.g. if matrix 1 is a 2 x 4 table and matrix 2 is a 3 x 5 matrix?

On Fri, Mar 9, 2012 at 4:37 PM, Greg Snow 538...@gmail.com wrote: The chi-squared test is one option (and seems reasonable to me if it is the proportions/patterns that you want to test). One way to do the test is to combine your 2 matrices into a 3-dimensional array (the abind package may help here) and test using the loglin function.

On Thu, Mar 8, 2012 at 5:46 AM, aaral singh aaral.si...@gmail.com wrote: Hi. Please help if someone can. Problem: I have 2 matrices, e.g.

matrix 1:
      Freq None Some
Heavy    3    2    5
Never    8   13    8
Occas    1    4    4
Regul    9    5    7

matrix 2:
      Freq None Some
Heavy    7    1    3
Never   87   18   84
Occas   12    3    4
Regul    9    1    7

I want to see if matrix 1 is significantly different from matrix 2. I consider using a chi-squared test. Is this appropriate? Could anyone advise? Many thanks. Aaral Singh -- View this message in context: http://r.789695.n4.nabble.com/help-please-2-tables-which-test-tp4456312p4456312.html Sent from the R help mailing list archive at Nabble.com.
-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
Re: [R] gsub: replacing double backslashes with single backslash
The issue here is the difference between what is contained in a string and what R displays to you. The string produced with the code

tmp <- "C:\\"

only has 3 characters (as David pointed out), the third of which is a single backslash: the 1st \ escapes the 2nd, and the R string-parsing rules use the combination to put a single backslash in the string. When you print the string (whether you call print directly or indirectly), the print function escapes special characters, including the backslash, so you see \\, which represents a single backslash in the string. If you use the cat function instead of the print function, you will see only a single backslash (and other escape sequences such as \n will also display differently in print vs. cat output). There are other ways to see the exact string (write to a file, use it in certain commands, etc.), but cat is probably the simplest.

On Wed, Mar 7, 2012 at 7:57 AM, David Winsemius dwinsem...@comcast.net wrote: On Mar 7, 2012, at 6:54 AM, Markus Elze wrote: Hello everybody, this might be a trivial question, but I have been unable to find this using Google. I am trying to replace double backslashes with single backslashes using gsub. Actually you don't have double backslashes in the argument you are presenting to gsub. The string entered at the console as "C:\\" only has a single backslash:

nchar("C:\\")
[1] 3

There seems to be some unexpected behaviour with regard to the replacement string "\\". The following example uses the string "C:\\", which should be converted to "C:\":

gsub("\\\\", "\\", "C:\\")
[1] "C:"

But I do not understand that returned value, either. I thought that the 'repl' argument (which I think I have demonstrated is a single backslash) would get put back in the returned value.

gsub("\\\\", "Test", "C:\\")
[1] "C:Test"
gsub("\\\\", "\\\\", "C:\\")
[1] "C:\\"

I thought the parsing rules for 'replacement' were different than the rules for 'patt'. So I'm puzzled, too. Maybe something changed in 2.14?
sub("\\", "\\", "C:\\", fixed=TRUE)
[1] "C:\\"
sub("\\\\", "\\", "C:\\")
[1] "C:"
sub("([\\])", "\\1", "C:\\")
[1] "C:\\"

The NEWS file does say that there is a new regular expression implementation and that the help file for regex should be consulted. And presumably we should study this: http://laurikari.net/tre/documentation/regex-syntax/ In the 'replacement' argument, \\ is used to back-reference a numbered sub-pattern, so perhaps \\ is now getting handled as a null sub-pattern? I don't see that mentioned in the regex help page, but it is a big page. I also didn't see \\ referenced in the TRE documentation, but then again I don't think that \\ in console or source() input is a double backslash. The TRE document says that a \ cannot be the last character of an ERE. I cannot tell whether that rule gets applied to the 'replacement'. I have observed similar behaviour for fixed=TRUE and perl=TRUE. I use R 2.14.1 64-bit on Windows 7. -- David Winsemius, MD West Hartford, CT
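The print-vs-cat distinction Greg describes can be seen directly:

```r
tmp <- "C:\\"       # the parser turns the escaped pair into ONE backslash
nchar(tmp)          # 3 characters: 'C', ':', '\'
print(tmp)          # print escapes it again, so the console shows "C:\\"
cat(tmp, "\n")      # cat shows the actual contents: C:\
```

So no gsub call is needed at all: the string already contains a single backslash, and only its printed representation doubles it.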
Re: [R] How to read this data properly?
Using the readLines function on your 'Dat' string gives the error because it is looking for a file named "2 3 ...", which it is not finding. More likely, what you want is to create a text connection (see ?textConnection) to your string, then use scan or read.table on that connection.

On Sat, Mar 3, 2012 at 8:15 AM, Bogaso Christofer bogaso.christo...@gmail.com wrote: Dear all, I have been given data something like below:

Dat =
2 3 28.3 3.05 8
3 3 22.5 1.55 0
1 1 26.0 2.30 9
3 3 24.8 2.10 0
3 3 26.0 2.60 4
2 3 23.8 2.10 0
3 2 24.7 1.90 0
2 1 23.7 1.95 0
3 3 25.6 2.15 0
3 3 24.3 2.15 0
2 3 25.8 2.65 0
2 3 28.2 3.05 11
4 2 21.0 1.85 0
2 1 26.0 2.30 14
1 1 27.1 2.95 8
2 3 25.2 2.00 1
2 3 29.0 3.00 1
4 3 24.7 2.20 0
2 3 27.4 2.70 5
2 2 23.2 1.95 4

I want to create a matrix out of these data for my further calculations. I have tried readLines() but got an error:

readLines(Dat)
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") : cannot open file '2 3 28.3 3.05 8 ...': No such file or directory

Can somebody help me put the data in some workable format? Thanks and regards,

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
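A minimal sketch of the textConnection approach, with the first few rows of the data pasted into a string:

```r
Dat <- "2 3 28.3 3.05 8
3 3 22.5 1.55 0
1 1 26.0 2.30 9"

tc <- textConnection(Dat)          # treat the string as if it were a file
m <- as.matrix(read.table(tc))     # whitespace-separated fields -> matrix
close(tc)

dim(m)  # 3 rows, 5 columns
```

In recent R versions, read.table(text = Dat) does the connection bookkeeping for you.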
Re: [R] contour for plotting confidence interval on scatter plot of bivariate normal distribution
Look at the ellipse package (and the ellipse function in the package) for a simple way of showing a confidence region for bivariate data on a plot (a 68% confidence region is about 1 SD, if you just want to show 1 SD).

On Sat, Mar 3, 2012 at 7:54 AM, drflxms drfl...@googlemail.com wrote: Dear all, I created a bivariate normal distribution:

set.seed(138813)
n <- 100
x <- rnorm(n); y <- rnorm(n)

and plotted a scatterplot of it:

plot(x, y)

Now I'd like to add the 2D standard deviation. I found a thread regarding plotting arbitrary confidence boundaries from Pascal Hänggi http://www.mail-archive.com/r-help@r-project.org/msg24013.html which cites the even older thread http://tolstoy.newcastle.edu.au/R/help/03b/5384.html As I am unfortunately only a very poor R programmer, the code of Pascal Hänggi is a myth to me, and I am not sure whether I was able to translate the recommendation of Brian Ripley in the later thread (which provides no code) into correct R code. Brian wrote: "You need a 2D density estimate (e.g. kde2d in MASS), then compute the density values at the points and draw the contour of the density which includes 95% of the points (at a level computed from the sorted values via quantile())." [A 95% confidence interval was desired in that thread instead of a standard deviation.] So I tried this:

den <- kde2d(x, y, n = n)
# as I chose n to be the same as when creating the distributions x and y (see
# above), a z-value is assigned to every combination of x and y.
# create a sorted vector of z-values (instead of the matrix stored inside the den object)
den.z <- sort(den$z)
# set the desired confidence border to draw and store it in a variable
confidence.border <- quantile(den.z, probs = 0.6827, na.rm = TRUE)
# draw a line representing confidence.border on the existing scatterplot
par(new = TRUE)
contour(den, levels = confidence.border, col = "red", add = TRUE)

Unfortunately I doubt very much this is correct :( In fact I am sure this is wrong, because the border for probs=0.05 is drawn outside the values. So please help and check. Pascal Hänggi's code seems to work, but I don't understand the magic he does with

pp <- array()
for (i in 1:1000) {
  z.x <- max(which(den$x < x[i]))
  z.y <- max(which(den$y < y[i]))
  pp[i] <- den$z[z.x, z.y]
}

before doing the very same as I did above:

confidencebound <- quantile(pp, 0.05, na.rm = TRUE)
plot(x, y)
contour(den, levels = confidencebound, col = "red", add = TRUE)

My problems: 1.) Setting probs=0.6827 is somehow a dirty trick, which I can only use by simply knowing that this is the percentage of values inside +-1 SD when a distribution is normal. Is there a way of doing this with the native sd function? sd(den.z) is not correct, as den.z is, in contrast to x and y, not normal any more. So ecdf(den.z)(sd(den.z)) results in a percentile of 0.5644 in this example instead of the desired 0.6827. 2.) I would like to have code that works with any desired confidence level. Unfortunately, setting probs to the desired confidence would probably be wrong (?!), as it relates to den.z instead of x and y, which are the underlying distributions I am interested in. To put it short, I want the confidence of x/y and not of den.z. I am really completely stuck. Please help me out of this! Felix

-- Gregory (Greg) L. Snow Ph.D.
538...@gmail.com
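A minimal sketch of the ellipse-package suggestion, using the poster's simulated data (this assumes the ellipse package is installed, e.g. via install.packages("ellipse")):

```r
library(ellipse)

set.seed(138813)
n <- 100
x <- rnorm(n); y <- rnorm(n)

plot(x, y)
# 68% confidence region -- roughly the "1 SD" contour of a bivariate normal
e <- ellipse(cor(x, y), scale = c(sd(x), sd(y)),
             centre = c(mean(x), mean(y)), level = 0.68)
lines(e, col = "red")
```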
Re: [R] Cleaning up messy Excel data
Sometimes we adapt to our environment, sometimes we adapt our environment to us. I like fortune(108). I actually was suggesting that you add a tool to your toolbox, not limit it. In my experience (and I don't expect everyone else's to match), data manipulation that seems easier in Excel than in R is only easier until the client comes back and wants me to redo the whole analysis with one typo fixed. Then rerunning the script in R (or Perl or another tool) is a lot easier than trying to remember where all I clicked, dragged, selected, etc. I do use Excel for some things (though I would be happy to find other tools for that if it were possible to expunge Excel from the earth) and Word (I actually like using R2wd to send tables and graphs to Word that I can then give to clients who just want to be able to copy and paste them into something else); I just think that many of the tasks that many people use Excel for would be better served by a better tool. If someone reading this decides to put some more thought into a project up front, and actually design a database rather than letting it evolve into some monstrosity in Excel, and that decision saves them some later grief, then the world will be a little bit better place.

On Fri, Mar 2, 2012 at 6:04 PM, jim holtman jholt...@gmail.com wrote: Unfortunately they only know how to use Excel and Word. They are not folks who use a computer every day. Many of them run factories or warehouses, and asking them to use something like Access would not happen in my lifetime (I have retired twice already). I don't have any problems with them messing up the data that I send them; they are pretty good about making changes within the context of the spreadsheet. The other issue is that I am working with people in twenty different locations spread across the US, so I might be able to get one of them to use Access (there is one I know that uses it), but that leaves 19 other people I would not be able to communicate with.
The other thing is that I use Excel myself to slice/dice data, since there are things that are easier in Excel than in R (believe it or not). There are a number of tools I keep in my toolkit, and R is probably the most important, but I have not thrown the rest of them away since they still serve a purpose. So if you can come up with a way to get 20 diverse groups, who are not computer literate, to change over in a couple of days from Excel to Access, let me know. BTW, I tried to use Access once and gave it up because it was not as intuitive as some other tools and did not give me any more capability than the ones I was using. So I know I would have a problem convincing others to make the change just so they could communicate with me, while they still had to use Excel for most of their other interfaces. This is the real world, where you have to learn how to adapt to your environment and make the best of it. So you just have to learn that Excel can be your friend (or at least not your enemy) and can serve a very useful purpose in getting your ideas across to other people.

On Fri, Mar 2, 2012 at 6:41 PM, Greg Snow 538...@gmail.com wrote: Try sending your clients a data set (data frame, table, etc.) as an MS Access data table instead. They can still view the data as a table, but will have to go to much more effort to mess up the data; more likely they will do proper edits without messing anything up (mixing characters in with numbers, having more sexes than your biology teacher told you about, adding extra lines at the top or bottom that make reading back into R more difficult, etc.). I have had a few clients that I talked into using MS Access from the start to enter their data. There was often a bit of resistance at first, but once they tried it and went through the process of designing the database up front, they ended up thanking me and believed that the entire data-entry process was easier and quicker than had they used Excel as they originally planned.
Access is still part of MS Office, so they don't need to learn R or in any way break their chains from being prisoners of Bill, but they will be more productive in more ways than just interfacing with you. Access (databases in general) forces you to plan things out and do the correct thing from the start. It is possible to do the right thing in Excel, but Excel does not encourage (let alone force) you to do the right thing; it makes it easy to do the wrong thing.

On Thu, Mar 1, 2012 at 6:15 AM, jim holtman jholt...@gmail.com wrote: But there are some important reasons to use Excel. In my work there are a lot of people that I have to send the equivalent of a data.frame to, who want to look at the data and possibly slice/dice the data differently and then send back to me updates. These folks do not know how to use R, but do have Microsoft Office installed on their computers and know how to use the different products. I have been very successful in conveying what
Re: [R] Shape manipulation
A general solution, if you always want 2 columns and the pattern is always every other column (but the total number of columns could change), would be:

cbind(c(Dat[, c(TRUE, FALSE)]), c(Dat[, c(FALSE, TRUE)]))

On Sat, Mar 3, 2012 at 11:40 AM, David Winsemius dwinsem...@comcast.net wrote: On Mar 3, 2012, at 11:02 AM, Bogaso Christofer wrote: Hi all, let's say I have the following matrix:

Dat <- matrix(1:30, 5, 6)
colnames(Dat) <- rep(c("Name1", "Names2"), 3)
Dat
     Name1 Names2 Name1 Names2 Name1 Names2
[1,]     1      6    11     16    21     26
[2,]     2      7    12     17    22     27
[3,]     3      8    13     18    23     28
[4,]     4      9    14     19    24     29
[5,]     5     10    15     20    25     30

From this matrix, I want to create another matrix with 2 columns for Name1 and Names2. Therefore, my final matrix will have 2 columns and 15 rows. Is there any direct R function to achieve this?

rbind(Dat[, 1:2], Dat[, 3:4], Dat[, 5:6])

Bogaso: it is really long past due for you to learn how to send plain-text messages from your mailer. -- David Winsemius, MD West Hartford, CT

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
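The logical-recycling trick above relies on c(TRUE, FALSE) being recycled along the columns, so every odd column is selected, flattened into one long vector by c(), and stacked beside the even columns. For the 5 x 6 example this gives the same 15 x 2 result as David's rbind answer:

```r
Dat <- matrix(1:30, 5, 6)
colnames(Dat) <- rep(c("Name1", "Names2"), 3)

# odd columns stacked into column 1, even columns into column 2
res <- cbind(Name1  = c(Dat[, c(TRUE, FALSE)]),
             Names2 = c(Dat[, c(FALSE, TRUE)]))

dim(res)  # 15 rows, 2 columns
```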
Re: [R] contour for plotting confidence interval on scatter plot of bivariate normal distribution
The key part of the ellipse function is:

matrix(c(t * scale[1] * cos(a + d/2) + centre[1],
         t * scale[2] * cos(a - d/2) + centre[2]),
       npoints, 2, dimnames = list(NULL, names))

where (if I did not miss anything) the variable 't' is derived from a chi-square distribution and the confidence level, scale[1] and scale[2] are the standard deviations of the 2 variables, d is the eccentricity based on the correlation, and a is just a sequence from 0 to 2*pi. So if you use 't' as 1, instead of deriving it from the confidence level, you get a 1 SD ellipse in the sense that any 1-dimensional slice through the mean point cuts the ellipse at 1 SD from the mean. You could then change t to 2 for the 2 SD curve, etc.

On Sat, Mar 3, 2012 at 12:25 PM, drflxms drfl...@googlemail.com wrote: Thank you very much for your thoughts! Exactly what you mention is what I have been thinking about during the last hours: what is the relation between the den$z distribution and the z distribution? That's why I asked about the ecdf(distribution)(value) percentile earlier today (thank you again for your quick and insightful answer on that!). I used it to compare certain values in both distributions by their percentiles. I really think you are completely right: I urgently need some lessons on bivariate/multivariate normal distributions. (I am a neurologist and unfortunately did not learn too much about statistics at university :-() I'll take your statement as a starter: "Once you go into two dimensions, SD loses all meaning, and adding nonparametric density estimation into the mix doesn't help, so just stop thinking in those terms!" This makes me really think a lot! Is plotting the 0.68 confidence interval in 2D as an equivalent to +-1 SD really nonsense!? By the way: it all started very harmlessly. I was asked to draw an example of the well-known target analogy for accuracy and precision based on real (= simulated) data. (See e.g. http://en.wikipedia.org/wiki/Accuracy_and_precision for a simple hand-made 2D graphic.)
Well, I did, by set.seed(138813); x <- rnorm(n); y <- rnorm(n); plot(x, y). I was asked whether it might be possible to add a histogram with superimposed normal curve to the drawing: no problem. And where is the standard deviation? Well, abline(v=sd(... OK. Then I realized that this is of course only true for one of the distributions (x), and only in one slice of the scatterplot of x and y. The real thing is a 3d density map above the scatterplot. A very nice example of this is demo(bivar) in the rgl package (for a picture see e.g. http://rgl.neoscientists.org/gallery.shtml, right upper corner). Great! But how to correctly draw the standard deviation boundaries for the shots on the target (the scatterplot of x and y)... I'd be grateful for hints on what to read on that matter (book, website etc.) Greetings from Munich, Felix. On 03.03.12 19:22, peter dalgaard wrote: On Mar 3, 2012, at 17:01 , drflxms wrote: # this is the critical block, which I still do not comprehend in detail z <- array() for (i in 1:n){ z.x <- max(which(den$x < x[i])) z.y <- max(which(den$y < y[i])) z[i] <- den$z[z.x, z.y] } As far as I can tell, the point is to get at density values corresponding to the values of (x,y) that you actually have in your sample, as opposed to den$z, which is for an extended grid of all possible (x_i, y_j) combinations. It's unclear to me what happens if you look at quantiles for the entire den$z. I kind of suspect that it is some sort of approximate numerical integration, but maybe not of the right thing. Re SD: Once you go into two dimensions, SD loses all meaning, and adding nonparametric density estimation into the mix doesn't help, so just stop thinking in those terms! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Gregory (Greg) L. Snow Ph.D. 
Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111
Re: [R] contour for plotting confidence interval on scatter plot of bivariate normal distribution
To further explain: if you want contours of a bivariate normal, then you want ellipses. The density of a bivariate normal (with 0 correlation to keep things simple, but the theory extends to correlated cases) is proportional to exp( -1/2 * (x1^2/v1 + x2^2/v2) ), so a contour of the distribution is the set of all points such that x1^2/v1 + x2^2/v2 = c for some constant c (each c gives a different contour), but that is the definition of an ellipse (well, divide both sides by c so that the right side is 1 to get the canonical form). The ellipse function in the ellipse package chooses c from the chi-squared distribution, since if x1 and x2 are normally distributed with mean 0 (or have the mean subtracted), then x1^2/v1 + x2^2/v2 is chi-squared distributed with 2 degrees of freedom. So if you really want to, you can try to approximate the contours in some other way, but any decent approach will just converge to the ellipse. On Sat, Mar 3, 2012 at 1:26 PM, drflxms drfl...@googlemail.com wrote: Wow, David, thank you for these sources, which I just screened. bagplot looks most promising to me. I found it in the package ‘aplpack’ as well as in the R Graph Gallery http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=112 Ellipses are not exactly what I am heading for. I am looking for a 2D equivalent of plotting 1D standard deviation boundaries; in other words, how to plot SD boundaries on a 2D scatter plot. So I started searching for contour/boundaries etc. instead of ellipse, leading me to Pascal Hänggi: http://www.mail-archive.com/r-help@r-project.org/msg24013.html To describe it in an image: I want to cut the density mountain above the scatter plot (see demo(bivar) in the rgl package) in a way so that the part of the mountain that covers 68% of the data on the x-y-plane below it (+-1 SD) is removed. Then I'd like to project the edge that results from the cut onto the x-y-plane below the mountain. This should be the 2D equivalent of 1D SD boundaries. 
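A quick simulation consistent with the explanation above: for uncorrelated bivariate normal data, x1^2/v1 + x2^2/v2 is chi-squared with 2 degrees of freedom, so about 95% of the points fall inside the contour with c = qchisq(0.95, 2). The variances below are arbitrary choices for illustration:

```r
set.seed(1)
n  <- 1e5
v1 <- 4; v2 <- 9                       # arbitrary example variances
x1 <- rnorm(n, sd = sqrt(v1))
x2 <- rnorm(n, sd = sqrt(v2))

q  <- x1^2 / v1 + x2^2 / v2            # ~ chi-squared with 2 df
cc <- qchisq(0.95, df = 2)             # the constant c for the 95% contour
mean(q <= cc)                          # close to 0.95
```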
I think this might be achieved as well by Hänggi's code as by the function of Forester. Unfortunately they result in slightly different boundaries, which shouldn't be the case. And I did not figure out which one is correct, if one is correct at all (!?). Can anyone explain the difference? I compared them with this code:

# parameters:
n <- 100
# generate samples:
set.seed(138813)
x <- rnorm(n); y <- rnorm(n)
a <- list(x=x, y=y) # input for Forester's function which is appended at the very end
# estimate non-parametric density surface via kernel smoothing
library(MASS)
den <- kde2d(x, y, n=n)
z <- array()
for (i in 1:n){
  z.x <- max(which(den$x < x[i]))
  z.y <- max(which(den$y < y[i]))
  z[i] <- den$z[z.x, z.y]
}
# store class/level borders of confidence interval in variables
confidence.border <- quantile(z, probs=0.05, na.rm = TRUE) # 0.05 corresponds to 0.95 in draw.contour
plot(x, y)
draw.contour(a, alpha=0.95)
par(new=TRUE)
contour(den, levels=confidence.border, col = "red", add = TRUE)

###
## drawcontour.R
## Written by J.D. Forester, 17 March 2008
###
## This function draws an approximate density contour based on empirical, bivariate data.
## change testit to FALSE if sourcing the file
testit=TRUE
draw.contour <- function(a, alpha=0.95, plot.dens=FALSE, line.width=2, line.type=1, limits=NULL, density.res=800, spline.smooth=-1, ...){
  ## a is a list or matrix of x and y coordinates (e.g., a=list(x=rnorm(100),y=rnorm(100)))
  ## if a is a list or dataframe, the components must be labeled x and y
  ## if a is a matrix, the first column is assumed to be x, the second y
  ## alpha is the contour level desired
  ## if plot.dens==TRUE, then the joint density of x and y is plotted,
  ## otherwise the contour is added to the current plot.
  ## density.res controls the resolution of the density plot
  ## A key assumption of this function is that very little probability mass lies outside the limits of
  ## the x and y values in a. This is likely reasonable if the number of observations in a is large. 
  require(MASS)
  require(ks)
  if(length(line.width)!=length(alpha)){
    line.width <- rep(line.width[1], length(alpha))
  }
  if(length(line.type)!=length(alpha)){
    line.type <- rep(line.type[1], length(alpha))
  }
  if(is.matrix(a)){
    a=list(x=a[,1], y=a[,2])
  }
  ## generate approximate density values
  if(is.null(limits)){
    limits=c(range(a$x), range(a$y))
  }
  f1 <- kde2d(a$x, a$y, n=density.res, lims=limits)
  ## plot empirical density
  if(plot.dens) image(f1,...)
  if(is.null(dev.list())){
    ## ensure that there is a window in which to draw the contour
    plot(a, type="n", xlab="X", ylab="Y")
  }
  ## estimate critical contour value
  ## assume that density outside of plot is very small
  zdens <- rev(sort(f1$z))
  Czdens <- cumsum(zdens)
  Czdens <- (Czdens/Czdens[length(zdens)])
  for(cont.level in 1:length(alpha)){ ##This
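Forester's function is cut off above. The core of both approaches under discussion (find the density level whose upper set holds a given fraction of the probability mass, then draw that level's contour) can be sketched self-contained as follows; density_contour is my own illustrative name, not code from either source:

```r
library(MASS)  # for kde2d

## Find the kernel-density level enclosing roughly `prob` of the mass,
## then draw that contour: the "cut through the density mountain".
density_contour <- function(x, y, prob = 0.68, n = 200) {
  den  <- kde2d(x, y, n = n)
  dz   <- sort(den$z, decreasing = TRUE)
  dx   <- diff(den$x[1:2]); dy <- diff(den$y[1:2])  # grid cell size
  mass <- cumsum(dz) * dx * dy          # mass above each candidate level
  lev  <- dz[which.max(mass >= prob)]   # highest level enclosing `prob`
  contour(den, levels = lev, add = TRUE, col = "red")
  invisible(lev)
}

set.seed(138813)
x <- rnorm(1000); y <- rnorm(1000)
plot(x, y)
lev <- density_contour(x, y, prob = 0.68)
```

The cell-mass approximation (density times cell area) is the same numerical-integration idea Peter Dalgaard suspected was hiding in the quantile approach, just done explicitly.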
Re: [R] Cleaning up messy Excel data
Try sending your clients a data set (data frame, table, etc.) as an MS Access data table instead. They can still view the data as a table, but will have to go to much more effort to mess up the data; more likely they will do proper edits without messing anything up (mixing characters in with numbers, having more sexes than your biology teacher told you about, adding extra lines at top or bottom that make reading back into R more difficult, etc.). I have had a few clients that I talked into using MS Access from the start to enter their data. There was often a bit of resistance at first, but once they tried it and went through the process of designing the database up front, they ended up thanking me and believed that the entire data entry process was easier and quicker than had they used Excel as they originally planned. Access is still part of MS Office, so they don't need to learn R or in any way break their chains from being prisoners of Bill, but they will be more productive in more ways than just interfacing with you. Access (databases in general) forces you to plan things out and do the correct thing from the start. It is possible to do the right thing in Excel, but Excel does not encourage (let alone force) you to do the right thing; rather, it makes it easy to do the wrong thing. On Thu, Mar 1, 2012 at 6:15 AM, jim holtman jholt...@gmail.com wrote: But there are some important reasons to use Excel. In my work there are a lot of people that I have to send the equivalent of a data.frame to who want to look at the data, possibly slice/dice the data differently, and then send back to me updates. These folks do not know how to use R, but do have Microsoft Office installed on their computers and know how to use the different products. I have been very successful in conveying what I am doing for them by communicating via Excel spreadsheets. It is also an important medium in dealing with some international companies who provide data via Excel and expect responses back via Excel. 
When dealing with data in a tabular form, Excel does provide a way for a majority of the people I work with to understand the data. Yes, there are problems with some of the ways that people use Excel, and yes, I have had to invest time in scrubbing some of the data that I get from them, but if I did not, then I would probably not have a job working for them. I use R exclusively for the analysis that I do, but find it convenient to use Excel as a communication mechanism with the majority of the non-R users that I have to deal with. It is a convenient work-around because I would never get them to invest the time to learn R. So in the real world there is a need for Excel, and we are not going to cause it to go away; we have to learn how to live with it, and from my standpoint, it has definitely benefited me in being able to communicate with my users and continue to provide them with results that they are happy with. They refer to letting me work my magic on the data; all they know is they see the result via Excel, and in the background R is doing the heavy lifting that they do not have to know about. On Wed, Feb 29, 2012 at 4:41 PM, Rolf Turner rolf.tur...@xtra.co.nz wrote: On 01/03/12 04:43, John Kane wrote: (mydata <- as.factor(c(1,2,3, 2, 5, 2))) str(mydata) newdata <- as.character(mydata) newdata[newdata == "2"] <- "0" newdata <- as.numeric(newdata) str(newdata) We really need to keep Excel (and other spreadsheets) out of people's hands. Amen, bro'!!! cheers, Rolf Turner -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. 
-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
Re: [R] convert list to text file
Or: lapply(LIST, cat, file='outtext.txt', append=TRUE) On Thu, Mar 1, 2012 at 6:20 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote: Perhaps something like sink("outtext.txt"); lapply(LIST, print); sink() You could replace print with cat and friends if you wanted more detailed control over the look of the output. Michael On Thu, Mar 1, 2012 at 5:28 AM, t.galesl...@ebh.umcn.nl wrote: Dear R users, Is it possible to write the following list to a text file? List: [[1]] [1] 500 [[2]] [1] 1 [[3]] [,1] [,2] [,3] [,4] [,5] FID 1 2 3 4 5 Var 2 0 2 1 1 I would like the text file to look like this: 500 1 FID 1 2 3 4 5 Var 2 0 2 1 1 Thank you very much in advance for your help! Kind regards, Tessel Galesloot Department of Epidemiology, Biostatistics and HTA (133) Radboud University Nijmegen Medical Centre The Radboud University Nijmegen Medical Centre is listed in the Commercial Register of the Chamber of Commerce under file number 41055629. -- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
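A sketch of both suggestions, with the list from the question; the file names are placeholders. print() keeps the [[k]]/[1] decorations, while cat() plus write.table() gets close to the requested layout (cat flattens a matrix, so the matrix element is written with write.table to preserve its rows):

```r
LIST <- list(500, 1, rbind(FID = 1:5, Var = c(2, 0, 2, 1, 1)))

## sink() + print() route: output looks like the console display
sink("outtext.txt")
invisible(lapply(LIST, print))
sink()

## cat() + write.table() route: closer to the requested
## "500 / 1 / FID ... / Var ..." layout
cat(LIST[[1]], "\n", LIST[[2]], "\n", file = "outtext2.txt")
write.table(LIST[[3]], file = "outtext2.txt", append = TRUE,
            col.names = FALSE, quote = FALSE)
```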
Re: [R] fridays date to date
If you know that your first date is a Friday, then you can use seq with by = "7 days"; then you don't need to post-filter the vector. On Thu, Mar 1, 2012 at 1:40 PM, Ben quant ccqu...@gmail.com wrote: Great, thanks! ben On Thu, Mar 1, 2012 at 1:30 PM, Marc Schwartz marc_schwa...@me.com wrote: On Mar 1, 2012, at 2:02 PM, Ben quant wrote: Hello, How do I get the dates of all Fridays between two dates? thanks, Ben Days <- seq(from = as.Date("2012-03-01"), to = as.Date("2012-07-31"), by = "day") str(Days) Date[1:153], format: "2012-03-01" "2012-03-02" "2012-03-03" "2012-03-04" ... # See ?weekdays Days[weekdays(Days) == "Friday"] [1] "2012-03-02" "2012-03-09" "2012-03-16" "2012-03-23" "2012-03-30" [6] "2012-04-06" "2012-04-13" "2012-04-20" "2012-04-27" "2012-05-04" [11] "2012-05-11" "2012-05-18" "2012-05-25" "2012-06-01" "2012-06-08" [16] "2012-06-15" "2012-06-22" "2012-06-29" "2012-07-06" "2012-07-13" [21] "2012-07-20" "2012-07-27" HTH, Marc Schwartz -- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
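A sketch of Greg's suggestion, using the same date range as above and starting from 2012-03-02, which Marc's output shows is a Friday:

```r
first   <- as.Date("2012-03-02")                 # a known Friday
last    <- as.Date("2012-07-31")
fridays <- seq(from = first, to = last, by = "7 days")

head(fridays)
## locale-independent check: weekday 5 is Friday (0 = Sunday)
all(as.POSIXlt(fridays)$wday == 5)
```

This avoids the locale dependence of comparing weekdays(Days) against the English string "Friday".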
Re: [R] Connecting points on a line with arcs/curves
?xspline On Thu, Mar 1, 2012 at 8:15 AM, hendersi ir...@cam.ac.uk wrote: Hello, I have a spreadsheet of pairs of coordinates and I would like to plot a line along which curves/arcs connect each pair of coordinates. The aim is to visualise the pattern of point connections. Thanks! Ian -- View this message in context: http://r.789695.n4.nabble.com/Connecting-points-on-a-line-with-arcs-curves-tp4435247p4435247.html Sent from the R help mailing list archive at Nabble.com. -- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
Re: [R] problem with sum function
Others explained why it happens, but you might want to look at the zapsmall function for one way to deal with it. On Thu, Mar 1, 2012 at 2:49 PM, Mark A. Albins kamoko...@gmail.com wrote: Hi! I'm running R version 2.13.0 (2011-04-13) Platform: i386-pc-mingw32/i386 (32-bit) When I type in the command: sum(c(-0.2, 0.8, 0.8, -3.2, 1.8)) R returns the value: -5.551115e-17 Why doesn't R return zero in this case? There shouldn't be any rounding error in a simple sum. Thanks, Mark -- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
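The reason is binary floating point: 0.2, 0.8, 1.8 and 3.2 have no exact binary representation, so the stored values differ slightly from the decimals and the representation errors need not cancel in the sum. zapsmall (or a tolerance-based comparison) deals with it:

```r
x <- c(-0.2, 0.8, 0.8, -3.2, 1.8)
sum(x)                         # a tiny non-zero value such as -5.551115e-17
zapsmall(sum(x))               # 0
isTRUE(all.equal(sum(x), 0))   # TRUE: comparison with a tolerance
```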
Re: [R] Computing line= for mtext
I would use the regular text function instead of mtext (remembering to set par(xpd=...)), then use the grconvertX and grconvertY functions to find the location to plot at (possibly adding in the results from strwidth or strheight). On Thu, Mar 1, 2012 at 4:52 PM, Frank Harrell f.harr...@vanderbilt.edu wrote: Rich's pointer deals with lattice/grid graphics. Does anyone have a solution for base graphics? Thanks Frank Richard M. Heiberger wrote: Frank, This can be done directly with a variant of the panel.axis function. See function panel.axis.right in the HH package. This was provided for me by David Winsemius in response to my query on this list in October 2011 https://stat.ethz.ch/pipermail/r-help/2011-October/292806.html The email thread also includes comments by Deepayan Sarkar and Paul Murrell. Rich On Wed, Feb 29, 2012 at 8:48 AM, Frank Harrell f.harrell@... wrote: I want to right-justify a vector of numbers in the right margin of a low-level plot. For this I need to compute the line parameter to give to mtext. Is this the correct scalable calculation? par(mar=c(4,3,1,5)); plot(1:20) s <- 'abcde'; w=strwidth(s, units='inches')/par('cin')[1] mtext(s, side=4, las=1, at=5, adj=1, line=w-.5, cex=1) mtext(s, side=4, las=1, at=7, adj=1, line=2*(w-.5), cex=2) Thanks Frank 
- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Computing-line-for-mtext-tp4431554p4436923.html Sent from the R help mailing list archive at Nabble.com. -- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
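A sketch of the text() + grconvertX() route suggested above (the output file name is a placeholder): convert the right edge of the device from normalized device coordinates to user coordinates, then right-justify with adj = 1, with no line= arithmetic needed.

```r
f <- tempfile(fileext = ".pdf")
pdf(f)
par(mar = c(4, 3, 1, 5), xpd = NA)   # xpd = NA: allow drawing in the margins
plot(1:20)
xr <- grconvertX(1, from = "ndc", to = "user")  # right edge of the device
text(xr, 5, "abcde", adj = 1, cex = 1)          # right-justified in margin
text(xr, 7, "abcde", adj = 1, cex = 2)
dev.off()
```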
Re: [R] testing two data sets
?ks.test ?qqplot Also look at permutation tests and possibly the vis.test function in the TeachingDemos package. Note that with all of these, large samples may give you power to detect meaningless differences, and small samples may not have enough power to detect potentially important differences. On Wed, Feb 22, 2012 at 12:37 AM, Mohammed Ouassou mohammed.ouas...@statkart.no wrote: Hi everyone, I have 2 data sets and I would like to carry out a test to find out if they come from the same distribution. Any suggestions? thanks in advance. M.O -- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
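A small sketch of the two checks named above, with simulated data where the difference is deliberately obvious:

```r
set.seed(42)
a <- rnorm(100)
b <- rnorm(100, mean = 10)            # clearly shifted distribution

ks.test(a, b)$p.value                 # essentially 0: distributions differ
qqplot(a, b); abline(0, 1, lty = 2)   # points far from the identity line
```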
Re: [R] Repeated cross-validation for a lm object
The validate function in the rms package can do cross-validation of ols objects (ols is similar to lm, but stores additional information); the default is bootstrap validation, but you can specify cross-validation instead. On Thu, Feb 16, 2012 at 10:44 AM, samuel-rosa alessandrosam...@yahoo.com.br wrote: Dear R users, I'd like to hear from someone if there is a function to do a repeated k-fold cross-validation for a lm object and get the predicted values for every observation. The situation is as follows: I had a data set composed of 174 observations, from which I randomly sampled a subset of 150 observations. With the subset (n = 150) I fitted the model y = a + bx. The model validation has to be done using a repeated k-fold cross-validation on the complete data set (n = 174). I need to use 10 folds and repeat the cross-validation 100 times. At the end of the procedure, I need to have access to the predicted values for each observation, that is, to the 100 predicted values for each observation. The function lmCV() in the package chemometrics provides the predicted values. However, it works only with multiple linear regression models. I hope there is a way of doing it. Best regards, - Bc.Sc.Agri. Alessandro Samuel-Rosa Postgraduate Program in Soil Science Federal University of Santa Maria Av. Roraima, nº 1000, Bairro Camobi, CEP 97105-970 Santa Maria, Rio Grande do Sul, Brazil -- View this message in context: http://r.789695.n4.nabble.com/Repeated-cross-validation-for-a-lm-object-tp4394833p4394833.html Sent from the R help mailing list archive at Nabble.com. -- Gregory (Greg) L. Snow Ph.D. 
538...@gmail.com
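If you prefer to stay with a plain lm, a repeated k-fold loop is short to write by hand. rep_cv and the simulated data below are illustrative names of my own, not from any package:

```r
## Repeated k-fold cross-validation for a plain lm: returns one
## out-of-fold prediction per observation per repetition.
rep_cv <- function(formula, data, k = 10, reps = 100) {
  n <- nrow(data)
  preds <- matrix(NA_real_, nrow = n, ncol = reps)
  for (r in seq_len(reps)) {
    fold <- sample(rep(seq_len(k), length.out = n))  # balanced random folds
    for (f in seq_len(k)) {
      hold <- fold == f
      fit  <- lm(formula, data = data[!hold, , drop = FALSE])
      preds[hold, r] <- predict(fit, newdata = data[hold, , drop = FALSE])
    }
  }
  preds
}

set.seed(1)
d <- data.frame(x = runif(174))
d$y <- 2 + 3 * d$x + rnorm(174, sd = 0.1)
p <- rep_cv(y ~ x, d, k = 10, reps = 100)
dim(p)   # 174 rows (observations) by 100 columns (repetitions)
```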
Re: [R] help with e+01 number abbreviations
Also look at the zapsmall function. A useful but often overlooked tool. On Thu, Feb 16, 2012 at 2:54 AM, Petr Savicky savi...@cs.cas.cz wrote: On Thu, Feb 16, 2012 at 10:17:09AM +0100, Gian Maria Niccolò Benucci wrote: Dear List, I will appreciate any advice regarding how to convert the following numbers [which I got in return from taxondive()] into numbers without the 6.4836e+01-style scientific notation. Thank you very much in advance, Gian taxa_dive Species Delta Delta* Lambda+ Delta+ S Delta+ Nat1 5.e+00 6.4836e+01 9.5412e+01 6.7753e+02 8.7398e+01 436.99 Nat2 2.e+00 4.0747e+01 1.e+02 0.e+00 1.e+02 200.00 Nat3 3.e+00 4.5381e+01 7.7652e+01 2.8075e+02 8.8152e+01 264.46 Hi. The exponential format was probably used due to some small numbers. For example: tst <- rbind( c( 5.e+00, 6.4836e+01, 9.5412e+01, 6.7753e+02, 8.7398e+01, 436.99), c( 2.e+00, 4.0747e+01, 1.e+02, 0.e+00, 1.e+02, 200.00), c( 3.e+00, 4.5381e+01, 7.7652e+01, 2.8075e+02, 8.8152e+01, 264.46), c( 1e-8, 1e-8, 1e-8, 1e-8, 1e-8, 1 )) tst [,1] [,2] [,3] [,4] [,5] [,6] [1,] 5e+00 6.4836e+01 9.5412e+01 6.7753e+02 8.7398e+01 436.99 [2,] 2e+00 4.0747e+01 1.e+02 0.e+00 1.e+02 200.00 [3,] 3e+00 4.5381e+01 7.7652e+01 2.8075e+02 8.8152e+01 264.46 [4,] 1e-08 1.e-08 1.e-08 1.e-08 1.e-08 1.00 Try rounding the numbers, for example round(tst, digits=4) [,1] [,2] [,3] [,4] [,5] [,6] [1,] 5 64.836 95.412 677.53 87.398 436.99 [2,] 2 40.747 100.000 0.00 100.000 200.00 [3,] 3 45.381 77.652 280.75 88.152 264.46 [4,] 0 0.000 0.000 0.00 0.000 1.00 Alternatively, options(scipen=20) forces fixed-point printing with more digits. options(scipen=20) tst [,1] [,2] [,3] [,4] [,5] [,6] [1,] 5. 64.8360 95.4120 677.5300 87.3980 436.99 [2,] 2. 40.7470 100. 0. 100. 200.00 [3,] 3. 45.3810 77.6520 280.7500 88.1520 264.46 [4,] 0.0001 0.0001 0.0001 0.0001 0.0001 1.00 Hope this helps. Petr Savicky. 
-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
Re: [R] how to test the random factor effect in lme
This post https://stat.ethz.ch/pipermail/r-sig-mixed-models/2009q1/001819.html may help you understand why the standard p-values are in some cases not the right thing to do, and what one alternative is. On Tue, Feb 14, 2012 at 3:36 PM, Xiang Gao xianggao2...@gmail.com wrote: Hi, I am working on a nested one-way ANOVA. I don't know how to implement R code to test the significance of the random factor. My R code so far can only test the fixed factor: anova(lme(PCB~Area,random=~1|Sites, data = PCBdata)) numDF denDF F-value p-value (Intercept) 1 12 1841.7845 <.0001 Area 1 4 4.9846 0.0894 Here is my data and my hand calculation. PCBdata Area Sites PCB 1 A 1 18 2 A 1 16 3 A 1 16 4 A 2 19 5 A 2 20 6 A 2 19 7 A 3 18 8 A 3 18 9 A 3 20 10 B 4 21 11 B 4 20 12 B 4 18 13 B 5 19 14 B 5 20 15 B 5 21 16 B 6 19 17 B 6 23 18 B 6 21 By hand calculation, the result should be: Source SS DF MS Areas 18.00 1 18.00 Sites 14.44 4 3.61 Error 20.67 12 1.72 Total 53.11 17 --- MSareas/MSsites = 4.99 --- matching the R output MSsites/MSE = 2.10 The conclusion is that neither Areas nor Sites makes a difference. -- Xiang Gao, Ph.D. Department of Biology University of North Texas -- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
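One alternative discussed in the linked post is a likelihood ratio test for the random effect: fit the model with and without it (both by REML) and compare. Because the null value of a variance sits on the boundary of its parameter space, the naive chi-squared p-value is conservative and is commonly halved; this halving is an approximation, not an exact test. A sketch with the data from the question:

```r
library(nlme)  # lme/gls; nlme ships with R

PCBdata <- data.frame(
  Area  = rep(c("A", "B"), each = 9),
  Sites = factor(rep(1:6, each = 3)),
  PCB   = c(18, 16, 16, 19, 20, 19, 18, 18, 20,
            21, 20, 18, 19, 20, 21, 19, 23, 21))

m0 <- gls(PCB ~ Area, data = PCBdata, method = "REML")  # no random effect
m1 <- lme(PCB ~ Area, random = ~ 1 | Sites, data = PCBdata,
          method = "REML")

lrt <- anova(m1, m0)            # likelihood ratio test for the Sites variance
p_naive    <- lrt[["p-value"]][2]
p_boundary <- p_naive / 2       # halve: the null variance is on the boundary
```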
Re: [R] hexplom question(s)
Assuming this is the hexplom function from the hexbin package (it is best to be specific in case there are multiple versions of the function you ask about): you can specify lower.panel=function(...){} for (a) and as.matrix=TRUE for (c); for (b) I am not sure what exactly you want to do, but look at the diag.panel.splom function in the lattice package as a possible solution. On Mon, Feb 13, 2012 at 5:16 PM, Debs Majumdar debs_st...@yahoo.com wrote: Hi, I am trying to use the --hexplom-- function to draw a scatterplot matrix. The following works for me: hexplom(~file[,1:4], xbins=15, xlab="") However, I want to make some changes to the graph: a) I only want to print/draw one half of the plot. Is there any way to get rid of the plots in the lower triangular matrix? b) Is there any way I can overwrite the xlabels? c) Not very important, but the variables start from the bottom and go up. E.g. I am plotting 4 variables and I have a 4x4 matrix for the plot. Is there any way I can reverse the diagonals? I.e. I would like to list the variables and axes on 1x1, 2x2, 3x3 and 4x4 rather than the default, where it lists the first variable on 4x1 followed by 3x2, 2x3 and 1x4. Thanks, -Joey -- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
Re: [R] testing for a distribution of probability
All the distribution tests are rule-out tests, i.e. they can tell you if your data does not match a given distribution, but they can never tell you that the data does come from a specific distribution. Note also that the results of any of these tests may not be that useful: for small sample sizes it is more important to rule out a given distribution, but unless there is a huge difference you won't have much power to do this; for large sample sizes it is less important, because using a close distribution will generally give you robust results, but you will have power to detect small, meaningless differences. So often your choice is between a meaningless answer to a meaningful question or a meaningful answer to a meaningless question. What is more important, and a better approach, is to understand the science behind the process that generated the data and use that knowledge to find a distribution that is reasonable (even if not exact), or to use techniques that make fewer assumptions about the distribution if you cannot find something close enough to be reasonable (e.g. bootstrap, permutation, other non-parametric methods, simulations to determine cut-off values). On Tue, Feb 14, 2012 at 4:21 AM, Bianca A Santini b.sant...@sheffield.ac.uk wrote: Hello! I have several variables. Each of them has a different distribution. I was thinking to use a Generalized Linear Model, glm(), but I need to specify the family. Do you know if R has any tests for matching data to a given distribution (I am aware of shapiro.test)? All the best, -- BAS -- Gregory (Greg) L. Snow Ph.D. 
538...@gmail.com
Re: [R] Wildcard for indexing?
Note that you can also do logical comparisons with the results of grepl, like: grepl('^as', a) | grepl('^df', a) For the given example it is probably simplest to do it in the regular expression as shown, but for some more complex cases (or ones involving other variables) the logic on the output may be simpler. On Tue, Feb 14, 2012 at 8:23 AM, Johannes Radinger jradin...@gmx.at wrote: -------- Original Message -------- Date: Tue, 14 Feb 2012 10:18:33 -0500 From: Sarah Goslee sarah.gos...@gmail.com To: Johannes Radinger jradin...@gmx.at CC: R-help@r-project.org Subject: Re: [R] Wildcard for indexing? Hi, You should probably do a bit of reading about regular expressions, but here's one way: On Tue, Feb 14, 2012 at 10:10 AM, Johannes Radinger jradin...@gmx.at wrote: Hi, -------- Original Message -------- Date: Tue, 14 Feb 2012 09:59:39 -0500 From: R. Michael Weylandt michael.weyla...@gmail.com To: Johannes Radinger jradin...@gmx.at CC: R-help@r-project.org Subject: Re: [R] Wildcard for indexing? I think the grep() family (regular expressions) will be the easiest way to do this, though it sounds like you might prefer grepl(), which returns a logical vector: ^[AB] # starts with either an A or a B ^A_ # starts with A_ a <- c("A_A", "A_B", "C_A", "BB", "A_Asd") grepl("^[AB]", a) grepl("^A_", a) Yes, grepl() is what I am looking for. Is there also something like an OR statement, e.g. if I want to select elements that start with as OR df? a <- c("as1", "bb", "as2", "cc", "df", "aa", "dd", "sdf") grepl("^as|^df", a) [1] TRUE FALSE TRUE FALSE TRUE FALSE FALSE FALSE The square brackets match any of those characters, so are good for single characters. For more complex patterns, | is the OR symbol. ^ marks the beginning. Thank you so much Sarah! I tried that | symbol intuitively; there was just a problem with the quotation marks :( Now everything is solved... 
/johannes Sarah -- Sarah Goslee http://www.functionaldiversity.org -- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
Re: [R] Plotting bar graph over a geographical map
If you are willing to use base graphics instead of ggplot2 graphs, then look at the subplot function in the TeachingDemos package. One of the examples there shows adding multiple small bar graphs to a map. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -----Original Message----- From: r-help-boun...@r-project.org On Behalf Of sjlabrie Sent: Tuesday, January 31, 2012 9:53 PM To: r-help@r-project.org Subject: [R] Plotting bar graph over a geographical map Hi, I am looking for a way to plot bars on a map instead of the standard points. I have been using the ggplot2 and maps libraries. The points are added with the function geom_point. I know that there is a function geom_bar but I can't figure out how to use it. Thank you for your help, Simon ### R code library(ggplot2) library(maps) measurements <- read.csv("all_podo.count.csv", header = TRUE) allworld <- map_data("world") pdf("map.pdf") ggplot(measurements, aes(long, lat)) + geom_polygon(data = allworld, aes(x = long, y = lat, group = group), colour = "grey70", fill = "grey70") + geom_point(aes(size = ref)) + opts(axis.title.x = theme_blank(), axis.title.y = theme_blank()) + geom_bar(aes(y = normcount)) dev.off() ###
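A minimal sketch of the subplot() idea; the locations and bar heights here are invented for illustration (requires the maps and TeachingDemos packages):

```r
library(maps)
library(TeachingDemos)

map("state")  # base-graphics map of the US

# hypothetical data: two locations, each with two bar values
locs <- data.frame(long = c(-112, -90), lat = c(40.7, 35),
                   a = c(3, 5), b = c(4, 2))

# draw a small bar graph centered at each location
for (i in seq_len(nrow(locs))) {
  subplot(barplot(c(locs$a[i], locs$b[i]), axes = FALSE),
          x = locs$long[i], y = locs$lat[i], size = c(0.5, 0.5))
}
```

The size argument (in inches) controls how large each embedded bar graph appears on the map.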
Re: [R] percentage from density()
If you use logspline estimation (logspline package) instead of kernel density estimation then this is simple, as there are cumulative area functions for logspline fits. If you need to do this with kernel density estimates then you can find the area over your region for the kernel centered at each data point and average those values together to get the area under the entire density estimate. -----Original Message----- From: r-help-boun...@r-project.org On Behalf Of Duke Sent: Friday, January 27, 2012 3:45 PM To: r-help@r-project.org Subject: [R] percentage from density() Hi folks, I know that the density function will give an estimated density for a given dataset. Now from that I want a percentage estimate for a certain range. For example: y <- density(c(-20, rep(0, 98), 20)) plot(y, xlim = c(-4, 4)) Now I want to know the percentage of the data lying in (-20, 2). Basically it should be the area under the curve from -20 to 2. Anybody know a simple function to do it? Thanks, D.
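The kernel-averaging idea can be written directly for density()'s default Gaussian kernel; this is a sketch of the approach described above, not a canned function:

```r
x <- c(-20, rep(0, 98), 20)
d <- density(x)   # default Gaussian kernel, bandwidth in d$bw

# Each kernel is a normal density centered at a data point with sd = bandwidth,
# so the estimated P(X <= 2) is the average of the kernel CDFs evaluated at 2.
p <- mean(pnorm(2, mean = x, sd = d$bw))
p  # proportion of the estimated density lying below 2
```

For an interval (a, b) use the difference of two such averages: mean(pnorm(b, x, d$bw)) - mean(pnorm(a, x, d$bw)).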
Re: [R] How do I compare 47 GLM models with 1 to 5 interactions and unique combinations?
What variables to consider adding and when to stop adding them depends greatly upon what question(s) you are trying to answer and the science behind your data. Are you trying to create a model to predict your outcome for future predictors? How precise of predictions are needed? Are you trying to understand how certain predictors relate to the response? How they relate after conditioning on other predictors? Will humans be using your equation directly? Or will it be in a black box that the computer generates predictions from but people never need to look at the details? What is the cost (money, time, difficulty, etc.) of collecting the different predictors? Answers to the above questions will be much more valuable in choosing the best model than AIC or other values (though you should still look at the results from analyses for information to combine with the other information). R and its programmers (no matter how great and wonderful they are) cannot answer these for you. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Jhope Sent: Thursday, January 26, 2012 2:26 PM To: r-help@r-project.org Subject: Re: [R] How do I compare 47 GLM models with 1 to 5 interactions and unique combinations? I ask the question about when to stop adding another variable even though it lowers the AIC because each time I add a variable the AIC is lower. How do I know when the model is a good fit? When to stop adding variables, keeping the model simple? Thanks, J -- View this message in context: http://r.789695.n4.nabble.com/How-do-I- compare-47-GLM-models-with-1-to-5-interactions-and-unique-combinations- tp4326407p4331848.html Sent from the R help mailing list archive at Nabble.com. 
Re: [R] Placing a Shaded Box on a Plot
The locator() function can help you find coordinates of interest on an existing plot. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -----Original Message----- From: r-help-boun...@r-project.org On Behalf Of Stephanie Cooke Sent: Friday, January 27, 2012 1:03 AM To: r-help@r-project.org Subject: [R] Placing a Shaded Box on a Plot Hello, I would like to place shaded boxes on different areas of a phylogenetic tree plot. Since I cannot determine the axes on the phylogenetic tree plot, I am not able to place the box over certain areas. Below is example code for the shaded box that I have tried to use; the first four values specify the position. rect(110, 400, 135, 450, col = "grey", border = "transparent") I would greatly appreciate any suggestions on how to place a shaded box to highlight certain areas of a plot.
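For example, a run like the following lets you click the two opposite corners of the box interactively (this must be run in an interactive session; the scatterplot stands in for the tree plot):

```r
plot(1:10)             # stand-in for the phylogenetic tree plot
corners <- locator(2)  # click the lower-left and upper-right corners
rect(corners$x[1], corners$y[1], corners$x[2], corners$y[2],
     col = "grey", border = "transparent")
```

Printing `corners` afterwards also reveals the plot's coordinate system, so later boxes can be placed with explicit numbers.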
Re: [R] How to write the entire session to file?
A different approach is to use the etxtStart function in the TeachingDemos package. You need to run this before you start; it will then save everything (commands and output, and plots if you tell it to) to a file that can be post-processed to give a file showing basic coloring (or, with options in the post-processing, even more coloring). Though it may be better to just run your R session through an editor like ESS/emacs or others. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -----Original Message----- From: r-help-boun...@r-project.org On Behalf Of Ajay Askoolum Sent: Friday, January 27, 2012 12:04 PM To: R General Forum Subject: [R] How to write the entire session to file? savehistory writes all the executed lines from the session. How can I write everything (executed lines and output) from the active session to a file? Using Edit | Select All then Edit | Copy, I can copy everything to the clipboard and write the whole thing to a file manually. If I just used the clipboard, I could paste the whole content into another editor (for documentation). Is there a way to copy the content of the session with the syntax colouring? Thanks.
Re: [R] null distribution of binom.test p values
I believe that what you are seeing is due to the discrete nature of the binomial test. When I run your code below I see that the bar between 0.9 and 1.0 is about twice as tall as the bar between 0.0 and 0.1, but the bar between 0.8 and 0.9 is not there (height 0); if you average the top 2 bars (0.8-0.9 and 0.9-1.0) then the average height is similar to that of the lowest bar. The bar between 0.5 and 0.6 is also 0; if you average that one with the next 2 (0.6-0.7 and 0.7-0.8) then they are also similar to the bars near 0. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -----Original Message----- From: r-help-boun...@r-project.org On Behalf Of Chris Wallace Sent: Thursday, January 26, 2012 5:44 AM To: r-help@r-project.org Subject: [R] null distribution of binom.test p values Dear R-help, I must be missing something very obvious, but I am confused as to why the null distribution of p values generated by binom.test() appears to be non-uniform. The histogram generated below has a trend towards values closer to 1 than 0. I expected it to be flat. hist(sapply(1:1000, function(i, n = 100) binom.test(sum(rnorm(n) > 0), n, p = 0.5, alternative = "two.sided")$p.value)) This trend is more pronounced for small n, and the distribution appears uniform for larger n, say n = 1000. I had expected the distribution to be discrete for small n, but not skewed. Can anyone explain why? Many thanks, Chris.
Re: [R] null distribution of binom.test p values
Yes, that is due to the discreteness of the distribution; consider the following: binom.test(39, 100, .5) Exact binomial test data: 39 and 100 number of successes = 39, number of trials = 100, p-value = 0.0352 alternative hypothesis: true probability of success is not equal to 0.5 95 percent confidence interval: 0.2940104 0.4926855 sample estimates: probability of success 0.39 binom.test(40, 100, .5) Exact binomial test data: 40 and 100 number of successes = 40, number of trials = 100, p-value = 0.05689 alternative hypothesis: true probability of success is not equal to 0.5 95 percent confidence interval: 0.3032948 0.5027908 sample estimates: probability of success 0.4 (you can do the same for 60 and 61). So notice that the probability of getting 39 or anything more extreme is 0.0352, but anything less extreme will result in not rejecting the null hypothesis (because the probability of getting a 40 or a 60 (dbinom(40, 100, .5)) is about 1% each, we see a 2% jump there). So the size/probability of a type I error will generally not be equal to alpha unless n is huge or alpha is chosen to correspond to a jump in the distribution rather than using common round values. I have seen suggestions that instead of the standard test we use a test that rejects the null for values of 39 and more extreme, does not reject the null for 41 and less extreme, and if you see a 40 or a 60 then you generate a uniform random number and reject if it is below a certain value (that value chosen to give an overall probability of type I error of 0.05). This correctly sizes the test, but it becomes less reproducible (and makes clients nervous when they present their data and you pull out a coin, flip it, and tell them whether they have significant results based on your coin flip (or more realistically a die roll)).
I think it is better in this case, if you know your final sample size is going to be 100, to explicitly state that alpha will be 0.0352 (but then you need to justify to reviewers why you are not using the common 0.05). -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -----Original Message----- From: Chris Wallace Sent: Thursday, January 26, 2012 9:36 AM To: Greg Snow Cc: r-help@r-project.org Subject: Re: [R] null distribution of binom.test p values Greg, thanks for the reply. Unfortunately, I remain unconvinced! I ran a longer simulation, 100,000 reps. The size of the test is consistently too small (see below) and the histogram shows increasing bars even within the parts of the histogram with even bar spacing. See https://www-gene.cimr.cam.ac.uk/staff/wallace/hist.png y <- sapply(1:100000, function(i, n = 100) binom.test(sum(rnorm(n) > 0), n, p = 0.5, alternative = "two.sided")$p.value) mean(y < 0.01) # [1] 0.00584 mean(y < 0.05) # [1] 0.03431 mean(y < 0.1) # [1] 0.08646 Can that really be due to the discreteness of the distribution? C.
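The discreteness argument can be checked without simulation: enumerate every possible outcome for n = 100 and weight each exact two-sided p-value by its null probability (a sketch, not part of the original exchange):

```r
n <- 100
x <- 0:n
pvals <- sapply(x, function(k) binom.test(k, n, p = 0.5)$p.value)
probs <- dbinom(x, n, 0.5)

# Exact size of the nominal 0.05-level test: total null probability of
# outcomes whose p-value falls at or below 0.05
sum(probs[pvals <= 0.05])  # about 0.035, not 0.05
```

This matches the simulated mean(y < 0.05) above: the test is conservative because no attainable p-value sits exactly at 0.05.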
Re: [R] What does [[1]] mean?
Have you read ?`[[` ? The short answer is that you can use both [] and [[]] on lists; the [] construct will return a subset of the list (which will be a list) while [[]] will return a single element of the list (which could be a list or a vector or whatever that element may be). Compare: tmp <- list(a = 1, b = letters) tmp[1] $a [1] 1 tmp[1] + 1 Error in tmp[1] + 1 : non-numeric argument to binary operator tmp[[1]] [1] 1 tmp[[1]] + 1 [1] 2 -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -----Original Message----- From: r-help-boun...@r-project.org On Behalf Of Ajay Askoolum Sent: Thursday, January 26, 2012 11:27 AM To: R General Forum Subject: [R] What does [[1]] mean? I know that [] is used for indexing. I know that [[]] is used for reference to a property of a COM object. I cannot find any explanation of what [[1]] does or, more pertinently, where it should be used. Thank you.
Re: [R] function for grouping
I nominate this response for the fortunes package. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -----Original Message----- From: r-help-boun...@r-project.org On Behalf Of David Winsemius Sent: Wednesday, January 25, 2012 10:23 AM To: yan Cc: r-help@r-project.org Subject: Re: [R] function for grouping On Jan 25, 2012, at 12:10 PM, yan wrote: thanks petr, what if I got 200 elements, so I have to write expand.grid(x1=1, x2=1:2, x3=1:3, x4=1:3, x5=1:3, ..., x200=1:3)? Perhaps the same thing that will happen when those monks finish the Towers of Hanoi? 2 * 3^198 [1] 5.902533e+94 -- David Winsemius, MD West Hartford, CT
Re: [R] how to save the R script itself into a rData file?
You could use the savehistory command to save the history of commands that you have written to a file, then read that into a variable using the scan function, then do the save or save.image to save everything. A different approach would be to save transcripts of your session that show the commands run and the output created; one option for doing this is to run R inside of ESS/emacs, another option is the txtStart function in the TeachingDemos package. You could also use the addTaskCallback function to add a task callback that appends each command (well, the successful ones; errors don't trigger the callbacks) to a text vector; this text vector would then be saved in .RData when doing save.image(). -----Original Message----- From: r-help-boun...@r-project.org On Behalf Of Michael Sent: Saturday, January 21, 2012 5:25 PM To: r-help Subject: [R] how to save the R script itself into a rData file? Hi all, As part of my workflow, I do a lot of experiments and save all my results into rData files... i.e. at the end of all my experiments, I do save.image("experiment_name_with_series_number.rData")... However, sometimes even with the rData files, I cannot remember the context where these data files were generated. Of course, I can make the R data file names and the R script file names the same, so that whenever I see a data file, I will be able to track down how the result file was generated. This is fine. But sometimes a bunch of different results rData files were generated simply by varying a parameter in the same R script file. It's kind of messy to save different R script files with different names when only parameters are different, not to say if there are a bunch of parameters that need to be put into file names...
Let's say I changed the parameters x to 0.123, y to -0.456, z to -999.99. Then I have to save the R script file as Experiment_001_x=0.123_y=-0.456_z=-999.99.r and the result file as Experiment_001_x=0.123_y=-0.456_z=-999.99.rData ... This is kind of messy, isn't it? Is there a way to save the whole script file (i.e. the context where the data file is generated) into the rData file? It cannot be the file location and/or file name of the R script file; it needs to be the whole content of the file... to protect against parameter changes... i.e. the same R script file but with different combinations of parameters... How to do that? Any good tricks? Thanks a lot!
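A minimal sketch of the readLines-plus-save idea; the file names and parameter values here are just the poster's example, and `script_text` is a made-up variable name:

```r
# Capture the full text of the driving script and save it with the results
script_text <- readLines("Experiment_001.R")
x <- 0.123; y <- -0.456; z <- -999.99  # the varying parameters
# ... run the analysis here ...
save(x, y, z, script_text, file = "Experiment_001.RData")

# Later, load() restores the results together with the exact script
# that produced them:
#   load("Experiment_001.RData")
#   cat(script_text, sep = "\n")
```

Because the script's text travels inside the .RData file, the parameter values no longer need to be encoded in the file name.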
Re: [R] graph paper look
In addition to the recommendations to use the grid function, you could just do: par(tck=1) before calling the plotting functions. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Erin Hodgess Sent: Wednesday, January 18, 2012 6:19 PM To: r-help@r-project.org Subject: [R] graph paper look Dear R People: Short of doing a series of ablines, is there a way to produce graph paper in R please? Thanks, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
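For instance, a minimal sketch of the par(tck = 1) approach, with grid() added for finer lines as the other replies suggested:

```r
par(tck = 1)      # draw tick marks as full-length lines across the plot
plot(1:10, 10:1)  # any plot now sits on a graph-paper-style grid

# optional finer grid lines on top
grid(nx = 20, ny = 20, col = "lightgray")
```

Setting tck back to its default (par(tck = NA)) restores ordinary tick marks for later plots.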
Re: [R] Read and show Bitmap Images
See the rasterImage function to do the plotting. If you need to read the image in, then I would start with the EBImage package from Bioconductor (though there are others as well). -----Original Message----- From: r-help-boun...@r-project.org On Behalf Of Alaios Sent: Monday, January 16, 2012 12:05 AM To: R-help@r-project.org Subject: [R] Read and show Bitmap Images Dear all, I am looking for a function that can plot bitmap images; by plotting I mean a function that can read an image's matrix structure with integers and assign colors. Could you please suggest what I can do for plotting these images? I would like to thank you for your reply. B.R. Alex
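A minimal sketch of plotting a numeric matrix as an image with rasterImage; the matrix here is invented, and integer-valued images would first need rescaling into [0, 1]:

```r
m <- matrix(runif(100), 10, 10)  # stand-in for an image matrix with values in [0, 1]

plot(0:1, 0:1, type = "n", xlab = "", ylab = "", axes = FALSE)
rasterImage(as.raster(m), 0, 0, 1, 1)  # grey levels taken from the matrix values
```

For integer data, something like as.raster((m - min(m)) / diff(range(m))) maps the values onto grey levels; colors can be assigned instead by indexing a palette with the integer codes.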
Re: [R] Display numbers on map
You might consider using the state.vbm map that is now part of the maptools package. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jeffrey Joh Sent: Tuesday, January 17, 2012 3:37 AM To: r-help@r-project.org Subject: [R] Display numbers on map I have a text file with states and numbers. I would like to display each number that corresponds to a state on a map. I am trying to use the maps package, but it doesn't show Alaska or Hawaii. Do you have suggestions on how to do this? Jeffrey [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how do I make a movie out of a timeseries of 2D data?
If you need the animation in a file outside of R (or possibly in R) then look at the animation package. This allows you quite a few options on how to save an animation, some of which depend on outside programs, but options mean that if you don't have one of those programs there are other ways to do this. If you just want to explore this within R then the development version of the TeachingDemos package (on R-Forge) has added an animate control to the tkexamp function. Just create a function that takes the time index as the argument and creates the plot you want for each time, then run tkexamp using the animate control for the time argument. You will see a new window with the plot for the 1st time index along with a slider that will let you move through time and a button that will step through the remaining times automatically (you can specify the speed when running tkexamp). -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Michael Sent: Thursday, January 12, 2012 7:38 AM To: r-help; r-sig-finance Subject: [R] how do I make a movie out of a timeseries of 2D data? Hi all, I have an array of 1 x 200 x 200 numbers... which is a time-series of 200x200 2D data... The 1st dimension is the time index. Is there a way to make a movie out of these data - i.e. playback 1 frames(200x200) at a playback rate per second? Thanks a lot! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Options for generating editable figures?
I have had clients who also wanted to make little changes to the graphs (mostly changing colors or line widths). Most, after doing this a couple of times, have been happy to give me better descriptions of what they want so I can just do it correctly the first time. I mostly give them the graphs in .wmf or .emf format; however, I have found that if I create the file and send it to them, most have problems getting it into Word or PowerPoint. Instead I usually copy and paste it into a Word document and send the Word document to them; they can then copy and paste from there to their presentation or report. Of course this is only an option if you have MS Word on the same computer you are working on. With those files, double clicking takes the user into a basic editor where they can change colors, line widths, etc. However, sometimes opening that editor will redo all the text, so what started as changing one line color also requires them to reorient all the axis and tick labels. Inkscape is a much more capable program for doing these kinds of edits, and for basic editing it is fairly straightforward, so for your description of options below, I would suggest that you make them learn Inkscape if they really want to edit the graphs themselves. Inkscape can also import pdf files (though it is an import rather than a simple open, and you often need to ungroup a bunch of objects before editing them), so that may be another option for you. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -----Original Message----- From: r-help-boun...@r-project.org On Behalf Of Allen McBride Sent: Monday, January 02, 2012 7:51 PM To: r-help@r-project.org Subject: [R] Options for generating editable figures? Hello all, I'm using R to produce figures for people who want to be able to edit the figures directly, and who use PowerPoint a lot.
I use a Mac, and I'd appreciate any advice about how to approach this. Here's what I've come up with so far: 1) I can use xfig() and then ask them to install Inkscape to edit the files. Downsides are no transparency and a learning curve with Inkscape. 2) I can do the same as above but with svg() instead of xfig(). But for reasons I don't understand, when I use svg() I can't seem to edit the resulting figures' text objects in Inkscape. 3) I can try to install UniConvertor, which sounds like quite a task for someone of my modest skills. This would supposedly allow me to create .wmf files, which might (and I've read conflicting things about this) be importable into PowerPoint as editable graphics. 4) I found an old suggestion in the archives that an EPS could be imported into PowerPoint and made editable. This almost worked for me (using Inkscape to convert a cairo_ps()-generated file to EPS) -- but only using PowerPoint under Windows, and lots of vectors and all text were lost along the way. Am I on the right track? Am I missing any better pathways? I know similar questions have come up before, but the discussions I found in the archives were old, and maybe things have changed in recent years. Thanks for any advice! --Allen McBride R version: 2.13.1 Platform: Mac OS 10.7.2 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Array element is function of its position in the array
Does vnew <- vold[,,ks] accomplish what you want? -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -----Original Message----- From: r-help-boun...@r-project.org On Behalf Of Asher Meir Sent: Thursday, December 29, 2011 1:58 PM To: r-help@r-project.org Subject: [R] Array element is function of its position in the array I want to create a new array which selects values from an original array based on a function of the indices. That is: I want to create a new matrix Vnew[i,j,k] = Vold[i,j,ks] where ks is a function of the index elements i, j, k. I want to do this WITHOUT a loop. Call the function ksfunction, and the array dimensions nis, njs, nks. I can do this using a loop as follows: # Loop version: Vnew <- array(NA, c(nis, njs, nks)) for(i1 in 1:nis) for(j1 in 1:njs) for(k1 in 1:nks) Vnew[i1,j1,k1] <- Vold[i1,j1,ksfunction(i1,j1,k1)] I already know how to create an array of the ks's: ksarray[i,j,k] = ksfunction(i,j,k) # I have the array ksarray ready I don't want a loop because nis, njs, and nks are pretty large and it takes forever. Would appreciate help with this issue.
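When ks genuinely depends on all three indices, matrix indexing removes the loop entirely. A sketch, with small invented dimensions and an invented ksfunction for illustration:

```r
# hypothetical small dimensions and index function
nis <- 4; njs <- 3; nks <- 2
ksfunction <- function(i, j, k) ((i + j + k) %% nks) + 1
Vold <- array(rnorm(nis * njs * nks), c(nis, njs, nks))

# all (i, j, k) triples, with i varying fastest (matching array storage order)
idx <- expand.grid(i = 1:nis, j = 1:njs, k = 1:nks)
ks  <- ksfunction(idx$i, idx$j, idx$k)

# index Vold by an n-by-3 matrix of (i, j, ks) triples -- no loops
Vnew <- array(Vold[cbind(idx$i, idx$j, ks)], dim = c(nis, njs, nks))
```

This gives Vnew[i,j,k] = Vold[i,j,ksfunction(i,j,k)] in one vectorized step, since expand.grid varies its first factor fastest, matching how array() fills by dimension.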
Re: [R] [newbie] read row from file into vector
The scan function can be used to read a single row. If your file has multiple rows you can use the skip and nlines arguments to determine which row to read. With the what argument set to a single item (a number or string, depending on which you want) it will read each element on that row into a vector. If you want to do more of the hard work yourself you can read in a whole line as a single string using the readLines function, then use strsplit (or, possibly better, tools from the gsubfn package) to split that string into a vector (the unlist function may also be of help). -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -----Original Message----- From: r-help-boun...@r-project.org On Behalf Of Tom Roche Sent: Thursday, December 29, 2011 1:51 PM To: r-help@r-project.org Subject: [R] [newbie] read row from file into vector summary: how to read a row (not column) from a file into a vector (not a data frame)? details: I'm using $ lsb_release -ds Linux Mint Debian Edition $ uname -rv 3.0.0-1-amd64 #1 SMP Sun Jul 24 02:24:44 UTC 2011 $ R --version R version 2.14.1 (2011-12-22) I'm new to R (having previously used it only for graphics), but have worked in many other languages. I've got a CSV file from which I'd like to read the values from a single *row* into a vector. E.g., for a file such that $ head -n 2 ~/data/foo.csv | tail -n 1 5718,0.3,0.47,0,0,0,0,0,0,0,0,0.08,0.37,0,0,0.83,1.55,0,0,0,0,0,0,0,0,0,0.00,2.48,2.33,0.17,0,0,0,0,0,0,0.00,10.69,0.18,0,0,0,0 I'd like to be able to populate a vector 'v' s.t. v[1]=5718, ... v[43]=0. I can't seem to do that with, e.g., read.csv(...) or scan(...), both of which seem column-oriented. What am I missing?
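A sketch of both suggestions applied to the poster's file layout (the file name foo.csv is taken from the question):

```r
# read only the second row of foo.csv into a numeric vector
v <- scan("foo.csv", what = numeric(), sep = ",", skip = 1, nlines = 1)
v[1]  # 5718 for the sample row shown in the question

# the readLines/strsplit route gives the same result
line <- readLines("foo.csv", n = 2)[2]
v2 <- as.numeric(unlist(strsplit(line, ",")))
```

skip = 1 jumps past the first row and nlines = 1 stops after one row, so only the requested line is parsed; what = numeric() tells scan to treat every field as a number.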
Re: [R] p values in lmer
This takes me back to listening to a professor lament about researchers who would spend years collecting their data, then negate all that effort because they insist on using tools that are quick rather than correct. So, before dismissing the use of pvals.fnc you might ask how long it takes to run relative to how long it took to collect the data, and how important the answer is. If you feel the need to compute p-values multiple times, then you may need to rethink your approach (model selection based on repeated p-values results in p-values that are meaningless at best). If you consider the above and still feel the need for a quick p-value rather than a correct one, then you can use the SnowsCorrectlySizedButOtherwiseUselessTestOfAnything function from the TeachingDemos package. It is quick (but be sure to fully read the documentation). -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of arunkumar Sent: Thursday, December 22, 2011 9:13 PM To: r-help@r-project.org Subject: [R] p values in lmer hi How to get p-values for the lmer function other than pvals.fnc(), since it takes a long time to execute - Thanks in Advance Arun -- View this message in context: http://r.789695.n4.nabble.com/p-values-in-lmer-tp4227434p4227434.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Renaming Within A Function
The error is because you are trying to assign to the result of a get call, and nobody has programmed that (hence "could not find function") because it is mostly (if not completely) meaningless to do so. It is not completely clear what you want to accomplish, but there is probably a better way to accomplish it. Preferable is to create and modify the data object fully within the function, then return that object (and let the caller of the function worry about assigning it). Some things to read that may be enlightening if you really feel the need to have your function modify existing objects: library(fortunes); fortune(236) And http://cran.r-project.org/doc/Rnews/Rnews_2001-3.pdf (the article in the Programmer's Niche) -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Pete Brecknock Sent: Thursday, December 22, 2011 12:15 PM To: r-help@r-project.org Subject: [R] Renaming Within A Function I am trying to rename column names in a dataframe within a function. I am seeing an error (listed below) that I don't understand. I would be grateful for an explanation of what I am doing wrong and how I should rewrite the function to be able to rename my variables. Thanks.

# Test Function
myfunc <- function(var){
  d <- c(1, 2, 3, 4, 5)
  dts <- seq(as.Date("2011-01-01"), by = "month", length.out = length(d))
  assign(paste(var, ".df", sep = ""), cbind(dts, d))
  names(get(paste(var, ".df", sep = ""))) <- c("Dates", "Data")
}

# Call Function
myfunc("X")

# Error Message
Error in names(get(paste(var, ".df", sep = ""))) <- c("Dates", "Data") :
  could not find function "get<-"

-- View this message in context: http://r.789695.n4.nabble.com/Renaming-Within-A-Function-tp4226368p4226368.html Sent from the R help mailing list archive at Nabble.com.
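The "probably better way" described above might look like the following sketch; the function and column names follow the original post, but the rewrite itself is illustrative:

```r
# Build the object fully inside the function and return it;
# the caller decides what name to assign the result to.
myfunc <- function() {
  d   <- c(1, 2, 3, 4, 5)
  dts <- seq(as.Date("2011-01-01"), by = "month", length.out = length(d))
  data.frame(Dates = dts, Data = d)   # names set at creation, no get() needed
}

X.df <- myfunc()   # assignment happens in the caller, not via assign()/get()
```

This sidesteps the assign()/get() pair entirely, which is what makes the `get<-` error disappear.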
Re: [R] How to Fit a Set of Lines Parametrized by a Number
This looks like a hierarchical Bayes type problem. There are a few packages that do Bayes estimation or link to external tools (like OpenBUGS) to do this. You would just set up each of the relationships as you define below: y is a function of a(K), b(K), x, and e, where e comes from a normal distribution with mean 0 and variance sigma^2; a(K) follows the relationship that you show, along with something similar for b(K). Then you just need prior distributions for alpha, beta (and the same for b(K)), and sigma^2, and let it run. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Lorenzo Isella Sent: Wednesday, December 21, 2011 8:59 AM To: r-h...@stat.math.ethz.ch Subject: [R] How to Fit a Set of Lines Parametrized by a Number Dear All, It is not very difficult, in R, to perform a linear fit y=Ax+B on a single set of data. However, imagine that you have several datasets labelled by a number (real or integer, it does not matter) K. For each individual dataset it would make sense to resort to a linear fit, but now A and B both depend on K. In other words, you would like to fit all your data according to y=A(K)x+B(K). You already have an idea of the functional dependence of A and B on K (which involves other unknown parameters to estimate), e.g. A(K)=alpha+beta^K, with unknown parameters alpha and beta. How would you tackle this problem? Off the top of my head, if I have N datasets, I can only think about getting N estimates {A1,A2...AN} for the A parameter by fitting the N datasets individually. I would then resort e.g. to a Levenberg-Marquardt algorithm to determine the values of alpha and beta that best fit alpha+beta^K to my set {A1,A2...AN} for the corresponding N values of K. For B(K), I would follow exactly the same procedure. Does anybody know any better method? Any suggestion is welcome.
Cheers Lorenzo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
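As a non-Bayesian alternative, the two-stage procedure Lorenzo describes can also be collapsed into a single joint fit with nls(). The sketch below uses simulated data; all parameter names, true values, and starting values are illustrative:

```r
set.seed(1)
K <- rep(1:5, each = 20)                     # dataset label
x <- runif(100)
# Simulate y = A(K)x + B(K) with A(K) = 0.5 + 1.2^K, B(K) = 1.0 + 0.8^K
y <- (0.5 + 1.2^K) * x + (1.0 + 0.8^K) + rnorm(100, sd = 0.1)
dat <- data.frame(y, x, K)

# One joint fit of all datasets, estimating the four hyperparameters directly
fit <- nls(y ~ (aA + bA^K) * x + (aB + bB^K), data = dat,
           start = list(aA = 0.4, bA = 1.1, aB = 0.9, bB = 0.9))
coef(fit)
```

Fitting all datasets at once uses the information in the small-K and large-K groups jointly, rather than propagating the uncertainty of N separate slope estimates into a second-stage fit.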
Re: [R] Multiple plots in one subplot
Look at the layout function; it may do what you want. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of annek Sent: Thursday, December 15, 2011 11:36 PM To: r-help@r-project.org Subject: [R] Multiple plots in one subplot Hi, I am making a figure with six sub-plots using par(mfcol=c(2,3)). In the last sub-plot I want to have two graphs instead of one. I have tried using par(fig=c(x, y, z, v)) but this par seems to overwrite the first par. Is there a simple solution? Thanks! Anna -- View this message in context: http://r.789695.n4.nabble.com/Multiple-plots-in-one-subplot-tp4203525p4203525.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
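A minimal sketch of the layout() idea: five double-height panels, with two half-height panels stacked in the sixth position (the example data are arbitrary):

```r
m <- matrix(c(1, 1, 2, 2,    # column 1: panels 1 and 2
              3, 3, 4, 4,    # column 2: panels 3 and 4
              5, 5, 6, 7),   # column 3: panel 5, then 6 and 7 stacked
            nrow = 4)
layout(m)
for (i in 1:7) plot(rnorm(10), main = paste("plot", i))
```

Because layout() assigns regions from a single matrix, it replaces par(mfcol=) entirely instead of fighting with it the way par(fig=) does.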
Re: [R] fundamental guide to use of numerical optimizers?
This really depends on more than just the optimizer; a lot can depend on what the data look like and what question is being asked. In bootstrapping it is possible to get bootstrap samples for which there is no unique correct answer to converge to. For example, if there is a category that ends up with no data due to the bootstrap but you still want to estimate a parameter for that category, then there are an infinite number of possible answers that are all equal in likelihood, so there will be a lack of convergence on that parameter. A stratified bootstrap or semi-parametric bootstrap can be used to avoid this problem (but may change the assumptions being made as well), or you can just throw out all those samples that don't have a full answer (which could be what your presenter did). -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Paul Johnson Sent: Thursday, December 15, 2011 9:38 AM To: R-help Subject: [R] fundamental guide to use of numerical optimizers? I was in a presentation of optimizations fitted with both MPlus and SAS yesterday. In a batch of 1000 bootstrap samples, between 300 and 400 of the estimations did not converge. The authors spoke as if this were the ordinary cost of doing business, and pointed to some publications in which the nonconvergence rate was as high or higher. I just don't believe that's right, and if some problem is posed so that the estimate is not obtained in such a large sample of applications, it either means the problem is badly asked or badly answered. But I've got no traction unless I can actually do better. Perhaps I can use this opportunity to learn about R functions like optim, or perhaps maxLik. From reading r-help, it seems to me there are some basic tips for optimization, such as: 1.
It is wise to scale the data so that all columns have the same range before running an optimizer. 2. With estimates of variance parameters, don't try to estimate sigma directly; instead estimate log(sigma), because that puts the domain of the solution onto the whole real line. 3. With estimates of proportions, estimate the logit instead, for the same reason. Are these mistaken generalizations? Are there other tips that everybody ought to know? I understand this is a vague question; perhaps the answers are just in the folklore. But if somebody has written them out, I would be glad to know. -- Paul E. Johnson Professor, Political Science 1541 Lilac Lane, Room 504 University of Kansas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
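Tip 2 above can be sketched with optim(): optimize over log(sigma) so the search is unconstrained while sigma itself stays positive (simulated data; all names are illustrative):

```r
set.seed(2)
x <- rnorm(50, mean = 3, sd = 2)

# Negative log-likelihood of N(mu, sigma), parameterized via log(sigma)
nll <- function(p) {
  mu    <- p[1]
  sigma <- exp(p[2])                 # p[2] is log(sigma): any real value is valid
  -sum(dnorm(x, mu, sigma, log = TRUE))
}

fit <- optim(c(0, 0), nll)
c(mu = fit$par[1], sigma = exp(fit$par[2]))
```

Without the log transform, the optimizer can wander into sigma <= 0 and get NaN from the likelihood, which is one common source of "did not converge" reports.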
Re: [R] nice report generator?
Duncan, If you are taking suggestions for expanding the tables package (looks great) then I would suggest some way to get the tables into MS products. If I create a full output/report myself then I am happy to work in LaTeX, but much of what I do is to produce tables and graphs to clients that don't know LaTeX and just want something that they can copy and paste into powerpoint or word. For this I have been using the R2wd package (and the wdTable function for the tables). I would love to have some toolset that I could use your tables package to create the main table, then transfer it fairly simply to word or excel. I don't care much about the fluff of how the table looks (coloring rows or columns, line widths, etc.) just getting it into a table (not just the text version). One possibility is just an as.matrix method that would produce something that I could feed to wdTable. Or just a textual representation of the table with columns separated by tabs so that it could be copied to the clipboard then pasted into excel or word (I would then let the client deal with all the tweaks on the appearance). Thanks, -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Duncan Murdoch Sent: Thursday, December 08, 2011 4:52 PM To: Tal Galili Cc: r-help Subject: Re: [R] nice report generator? On 11-12-08 1:37 PM, Tal Galili wrote: Helloe dear Duncan, Gabor, Michael and others, Do you think it could be (reasonably) possible to create a bridge between a cast_df object from the {reshape} package into a table in Duncan's new {tables} package? I'm not that familiar with the reshape package (and neither it nor reshape2 appears to have a vignette to give me an overview), so I don't have any idea if that makes sense. The table package is made to work on dataframes, and only dataframes. It converts them into matrices with lots of attributes, so that the print methods can put nice labels on. 
But it's strictly rectangular to rectangular in the kinds of conversions it does, and from the little I know about reshape, it works on more general arrays, converting them to and from dataframes. That would allow one to do pivot-table like operations on an object using {reshape}, and then display it (as it would have been in excel - or better) using the {tables} package. You'll have to give an example of what you want to do. Duncan Murdoch Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- On Thu, Dec 8, 2011 at 5:24 PM, Michaelcomtech@gmail.com wrote: Hi folks, In addition to Excel style tables, it would be great to have Excel 2010 Pivot Table in R... Any thoughts? Thanks a lot! On Thu, Dec 8, 2011 at 4:49 AM, Tal Galilital.gal...@gmail.com wrote: I think it would be *great *if an extension of Duncan's new tables package could include themes and switches as are seen in the video Gabor just linked to. Tal On Thu, Dec 8, 2011 at 6:58 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote: On Wed, Dec 7, 2011 at 11:42 PM, Michaelcomtech@gmail.com wrote: Do you have an example...? Thanks a lot! See this video: http://www.woopid.com/video/1388/Format-as-Table -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
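Greg's tab-separated suggestion above can be sketched with write.table(); the "clipboard" connection is Windows-only, and the example table is an arbitrary stand-in:

```r
tab <- head(mtcars[, 1:3])            # stand-in for the real table
# Tab-separated text pastes cleanly into Excel or Word as a table.
# On Windows you can write straight to the clipboard:
#   write.table(tab, "clipboard", sep = "\t", col.names = NA)
# Elsewhere, write a file and copy from there:
write.table(tab, "tab.txt", sep = "\t", col.names = NA)
```

col.names = NA keeps the header row aligned with the data columns when a column of row names is present, which is what spreadsheet paste expects.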
Re: [R] map at fips level using multiple variables
Colors probably are not the best for so many levels and combinations. Look at the symbols function (or the my.symbols and subplot functions in the TeachingDemos package) for ways to add symbols to a map showing multiple variables. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of bby2...@columbia.edu Sent: Wednesday, December 07, 2011 9:14 PM To: David Winsemius Cc: r-help@r-project.org Subject: Re: [R] map at fips level using multiple variables Hi David, Sorry it sounds vague. Here is my current code, which gives the distribution of family size at the US county level. You will see on a US map the family size distribution represented by different colors. Now if I have another variable, income, which has 3 categories (<50k, 50k-80k, >80k), how do I show 5x3 categories on the map? I guess I could always create a third variable to do this. Just wondering, maybe there is a function to do this readily? Thank you! Bonnie Yuan

y <- data1$size
x11()
hist(y, nclass = 15)   # histogram of y
fivenum(y)             # specify the cut-off points of y
y.colorBuckets <- as.numeric(cut(y, c(1, 2, 3, 6)))
# legend showing the cut-off points
legend.txt <- c("0-1", "1-2", "2-3", "3-6", "6")
colorsmatched <- y.colorBuckets[match(county.fips$fips, fips[, 1])]
x11()
map("county", col = colors[colorsmatched], fill = TRUE, resolution = 0)
map("state", col = "white", fill = FALSE, add = TRUE, lty = 1, lwd = 0.2)
title("Family Size")
legend("bottom", legend.txt, horiz = TRUE, fill = colors, cex = 0.7)

Quoting David Winsemius dwinsem...@comcast.net: On Dec 7, 2011, at 6:12 PM, bby2...@columbia.edu wrote: Hi, I just started playing with the county FIPS feature in the maps package, which allows geospatial visualization of variables at the US county level. Pretty cool. Got code? I did some search but couldn't find an answer to this question -- how can I map more than 2 variables on a US map? "2 variables" is a bit on the vague side for programming purposes. For example, you can map by the breakdown of income or family size.
How do you further break down based on the values of both variables and show them at the county FIPS level? "Breakdown" suggests a factor construct. If so, then: ?interaction But the "show" part of the question remains very vague. Can't you be a bit more specific? What DO you want? -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
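David's ?interaction pointer can be sketched as follows: combine the two cuts into a single factor and then color (or choose symbols) by that combined factor. The cut points and data are illustrative:

```r
size   <- c(1, 2, 5, 3, 6)
income <- c(40, 60, 90, 55, 45) * 1000

size.f   <- cut(size,   breaks = c(0, 1, 2, 3, 6, Inf))          # 5 levels
income.f <- cut(income, breaks = c(0, 50, 80, Inf) * 1000)       # 3 levels

both <- interaction(size.f, income.f)   # one factor, 5 x 3 = 15 levels
nlevels(both)
```

The combined factor can then index a vector of 15 colors (or, per the answer above, 15 symbol types) in the existing map code.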
Re: [R] nice report generator?
Richard, I have looked at SWord before, but to my knowledge it does not deal directly with the tabular objects created by the tables package (please correct me if I am wrong). These objects do have a matrix as the main data, but the attributes are different from the usual dimnames. There are functions in tables that will then take this structure and print it out in a nice way with the headers, or work with the latex function to create a nice table for LaTeX, but there are not (yet) tools for doing this in MS products. SWord and R2wd (and other tools) could transfer the data just fine, but then the column and row names/headers would still need to be put in by hand, which negates the whole convenience of using these tools. One option would be to have a function in tables that converted to a regular matrix with some form of meaningful dimnames that could then be used with R2wd or SWord or odfWeave or other tools (that was one of my suggestions). -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 From: Richard M. Heiberger [mailto:r...@temple.edu] Sent: Wednesday, December 14, 2011 11:20 AM To: Greg Snow Cc: Duncan Murdoch; Tal Galili; r-help Subject: Re: [R] nice report generator? Greg, Please look at the SWord package. This package integrates MS Word with R in a manner similar to the SWeave integration of LaTeX with R. Download SWord from rcom.univie.ac.athttp://rcom.univie.ac.at If you have a recent download of RExcel from the RAndFriends installer, then you will already have SWord on your machine. Rich On Wed, Dec 14, 2011 at 12:39 PM, Greg Snow greg.s...@imail.orgmailto:greg.s...@imail.org wrote: Duncan, If you are taking suggestions for expanding the tables package (looks great) then I would suggest some way to get the tables into MS products. 
If I create a full output/report myself then I am happy to work in LaTeX, but much of what I do is to produce tables and graphs to clients that don't know LaTeX and just want something that they can copy and paste into powerpoint or word. For this I have been using the R2wd package (and the wdTable function for the tables). I would love to have some toolset that I could use your tables package to create the main table, then transfer it fairly simply to word or excel. I don't care much about the fluff of how the table looks (coloring rows or columns, line widths, etc.) just getting it into a table (not just the text version). One possibility is just an as.matrix method that would produce something that I could feed to wdTable. Or just a textual representation of the table with columns separated by tabs so that it could be copied to the clipboard then pasted into excel or word (I would then let the client deal with all the tweaks on the appearance). Thanks, -Original Message- From: r-help-boun...@r-project.orgmailto:r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.orgmailto:r-help-boun...@r-project.org] On Behalf Of Duncan Murdoch Sent: Thursday, December 08, 2011 4:52 PM To: Tal Galili Cc: r-help Subject: Re: [R] nice report generator? On 11-12-08 1:37 PM, Tal Galili wrote: Helloe dear Duncan, Gabor, Michael and others, Do you think it could be (reasonably) possible to create a bridge between a cast_df object from the {reshape} package into a table in Duncan's new {tables} package? I'm not that familiar with the reshape package (and neither it nor reshape2 appears to have a vignette to give me an overview), so I don't have any idea if that makes sense. The table package is made to work on dataframes, and only dataframes. It converts them into matrices with lots of attributes, so that the print methods can put nice labels on. 
But it's strictly rectangular to rectangular in the kinds of conversions it does, and from the little I know about reshape, it works on more general arrays, converting them to and from dataframes. That would allow one to do pivot-table like operations on an object using {reshape}, and then display it (as it would have been in excel - or better) using the {tables} package. You'll have to give an example of what you want to do. Duncan Murdoch Contact Details:--- Contact me: tal.gal...@gmail.commailto:tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.comhttp://www.talgalili.com/ (Hebrew) | www.biostatistics.co.ilhttp://www.biostatistics.co.il/ (Hebrew) | www.r-statistics.comhttp://www.r-statistics.com/ (English) -- On Thu, Dec 8, 2011 at 5:24 PM, Michaelcomtech@gmail.commailto:comtech@gmail.com wrote: Hi folks, In addition to Excel style tables, it would be great to have Excel 2010 Pivot Table in R... Any thoughts? Thanks a lot! On Thu, Dec 8, 2011 at 4:49 AM, Tal Galilital.gal
Re: [R] nice report generator?
There is also the problem that SWord's license does not allow for commercial use. R2wd, write.table, and odfWeave don't have this restriction. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 From: Richard M. Heiberger [mailto:r...@temple.edu] Sent: Wednesday, December 14, 2011 11:20 AM To: Greg Snow Cc: Duncan Murdoch; Tal Galili; r-help Subject: Re: [R] nice report generator? Greg, Please look at the SWord package. This package integrates MS Word with R in a manner similar to the SWeave integration of LaTeX with R. Download SWord from rcom.univie.ac.athttp://rcom.univie.ac.at If you have a recent download of RExcel from the RAndFriends installer, then you will already have SWord on your machine. Rich On Wed, Dec 14, 2011 at 12:39 PM, Greg Snow greg.s...@imail.orgmailto:greg.s...@imail.org wrote: Duncan, If you are taking suggestions for expanding the tables package (looks great) then I would suggest some way to get the tables into MS products. If I create a full output/report myself then I am happy to work in LaTeX, but much of what I do is to produce tables and graphs to clients that don't know LaTeX and just want something that they can copy and paste into powerpoint or word. For this I have been using the R2wd package (and the wdTable function for the tables). I would love to have some toolset that I could use your tables package to create the main table, then transfer it fairly simply to word or excel. I don't care much about the fluff of how the table looks (coloring rows or columns, line widths, etc.) just getting it into a table (not just the text version). One possibility is just an as.matrix method that would produce something that I could feed to wdTable. Or just a textual representation of the table with columns separated by tabs so that it could be copied to the clipboard then pasted into excel or word (I would then let the client deal with all the tweaks on the appearance). 
Thanks, -Original Message- From: r-help-boun...@r-project.orgmailto:r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.orgmailto:r-help-boun...@r-project.org] On Behalf Of Duncan Murdoch Sent: Thursday, December 08, 2011 4:52 PM To: Tal Galili Cc: r-help Subject: Re: [R] nice report generator? On 11-12-08 1:37 PM, Tal Galili wrote: Helloe dear Duncan, Gabor, Michael and others, Do you think it could be (reasonably) possible to create a bridge between a cast_df object from the {reshape} package into a table in Duncan's new {tables} package? I'm not that familiar with the reshape package (and neither it nor reshape2 appears to have a vignette to give me an overview), so I don't have any idea if that makes sense. The table package is made to work on dataframes, and only dataframes. It converts them into matrices with lots of attributes, so that the print methods can put nice labels on. But it's strictly rectangular to rectangular in the kinds of conversions it does, and from the little I know about reshape, it works on more general arrays, converting them to and from dataframes. That would allow one to do pivot-table like operations on an object using {reshape}, and then display it (as it would have been in excel - or better) using the {tables} package. You'll have to give an example of what you want to do. Duncan Murdoch Contact Details:--- Contact me: tal.gal...@gmail.commailto:tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.comhttp://www.talgalili.com/ (Hebrew) | www.biostatistics.co.ilhttp://www.biostatistics.co.il/ (Hebrew) | www.r-statistics.comhttp://www.r-statistics.com/ (English) -- On Thu, Dec 8, 2011 at 5:24 PM, Michaelcomtech@gmail.commailto:comtech@gmail.com wrote: Hi folks, In addition to Excel style tables, it would be great to have Excel 2010 Pivot Table in R... Any thoughts? Thanks a lot! 
On Thu, Dec 8, 2011 at 4:49 AM, Tal Galilital.gal...@gmail.commailto:tal.gal...@gmail.com wrote: I think it would be *great *if an extension of Duncan's new tables package could include themes and switches as are seen in the video Gabor just linked to. Tal On Thu, Dec 8, 2011 at 6:58 AM, Gabor Grothendieck ggrothendi...@gmail.commailto:ggrothendi...@gmail.com wrote: On Wed, Dec 7, 2011 at 11:42 PM, Michaelcomtech@gmail.commailto:comtech@gmail.com wrote: Do you have an example...? Thanks a lot! See this video: http://www.woopid.com/video/1388/Format-as-Table -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.comhttp://gmail.com/ __ R-help@r-project.orgmailto:R-help@r-project.org mailing list https://stat.ethz.ch/mailman
Re: [R] axis thickness in plot()
Often when someone wants lines (axes) in R plots to be thicker or thinner, it is because they are producing the plots at the wrong size, then changing the size of the plot in some other program (like MS Word), and the lines do not look as nice. If this is your case, then the better approach is to produce the original graph at the appropriate size; then you don't need to worry about the effects of resizing. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of AlexC Sent: Tuesday, December 06, 2011 9:35 AM To: r-help@r-project.org Subject: [R] axis thickness in plot() Hello, I am trying to increase the thickness of the axes in plot() without resorting to paint programs. I see posts on that topic for the xyplot function, but want to see if I can do it with plot() because I've already set up my graph script using that. I thought I could use the axis() function and specify lwd=thickness or lwd.axis=, but that does not work like it does for lwd.ticks. If anyone has an idea -- here's the script:

windows(width = 7, height = 7)
plot(data$Winter, data$NbFirstBroods, ylab = "number of breeding pairs",
     xlab = "winter harshness", cex = 1.5, cex.lab = 1.5, cex.axis = 1.5,
     font.axis = 2, axes = FALSE)
points(data$Winter, data$NbFirstBroods, cex = 1.5, col = "black", pch = 19)
abline(lm(data$NbFirstBroods ~ data$Winter), col = "red", lwd = 4)

I tried axis(1, lwd.axis = 3, lwd.ticks = 3), for example, and also axis(2, ...) when adding the y axis; the x and y axes are disconnected. Thank you for your kind help in advance, Alexandre -- View this message in context: http://r.789695.n4.nabble.com/axis-thickness-in-plot-tp4165430p4165430.html Sent from the R help mailing list archive at Nabble.com.
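Both points can be sketched together: open the device at its final size, then draw the axes yourself with a heavier lwd (file name, sizes, and lwd values are all illustrative):

```r
# Open the device at the size the figure will actually be used
png("fig.png", width = 7, height = 7, units = "in", res = 300)
plot(1:10, 1:10, axes = FALSE, xlab = "x", ylab = "y")
axis(1, lwd = 3)   # lwd thickens the axis line as well as its ticks
axis(2, lwd = 3)
box(lwd = 3)       # joins the separate x and y axes into a full frame
dev.off()
```

The "disconnected axes" complaint in the question is what box() addresses: axis(1) and axis(2) each draw only their own segment, so the frame has to be closed explicitly.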
Re: [R] Change the limits of a plot a posteriori
The zoomplot function in the TeachingDemos package can be used for this (it actually redoes the entire plot, but with new limits). This will generally work for a quick exploration, but for quality plots it is suggested to create the 1st plot with the correct range to begin with. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of jcano Sent: Thursday, December 01, 2011 11:12 AM To: r-help@r-project.org Subject: [R] Change the limits of a plot a posteriori Hi all How can I change the limits (xlim or ylim) in a plot that has been already created? For example, consider this naive example curve(dbeta(x,2,4)) curve(dbeta(x,8,13),add=T,col=2) When adding the second curve, it goes off the original limits computed by R for the first graph, which are roughly, c(0,2.1) I know two obvious solutions for this, which are: 1) passing a sufficiently large parameter e.g. ylim=c(0,4) to the first graphic curve(dbeta(x,2,4),ylim=c(0,4)) curve(dbeta(x,8,13),add=T,col=2) or 2) switch the order in which I plot the curves curve(dbeta(x,8,13),col=2) curve(dbeta(x,2,4),add=T) but I guess if there is any way of adjusting the limits of the graphic a posteriori, once you have a plot with the undesired limits, forcing R to redraw it with the new limits, but without having to execute again the curve commands Hope I made myself clear Best regards and thank you very much in advance -- View this message in context: http://r.789695.n4.nabble.com/Change-the- limits-of-a-plot-a-posteriori-tp4129750p4129750.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. 
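With the TeachingDemos package installed, the a-posteriori version of the example in the question might look like this sketch (argument names follow my reading of zoomplot's xlim/ylim interface):

```r
library(TeachingDemos)

curve(dbeta(x, 2, 4))
curve(dbeta(x, 8, 13), add = TRUE, col = 2)   # clipped at the old ylim

zoomplot(c(0, 1), ylim = c(0, 4))             # redraw with the new limits
```

As noted above, zoomplot re-executes the recorded plot with new limits, so it is fine for exploration but for publication graphics it is better to set ylim correctly in the first call.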
Re: [R] rearrange set of items randomly
If you don't want to go with the simple method mentioned by David and Ted, or you just want some more theory, you can check out http://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle and implement that. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of flokke Sent: Monday, November 07, 2011 2:09 PM To: r-help@r-project.org Subject: [R] rearrange set of items randomly Dear all, I hope that this question is not too weird, I will try to explain it as well as I can. I have to write a function for a school project and one limitation is that I may not use the built-in function sample(). At one point in the function I would like to resample/rearrange the items of my sample (so I would want to use sample, but I am not allowed to do so), so I have to come up with something else that does the same as the built-in function sample(). The only thing that sample() does is rearrange the items of a sample, so I searched the internet for a function that does that to be able to use it, but I cannot find anything that could help me. Can maybe someone help me with this? I would be very grateful. Cheers, Maria -- View this message in context: http://r.789695.n4.nabble.com/rearrange-set-of-items-randomly-tp4013723p4013723.html Sent from the R help mailing list archive at Nabble.com.
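To make the pointer concrete, here is a minimal Fisher–Yates sketch that avoids sample() entirely and uses only runif(); the function name fy_shuffle is mine, not part of any package.

```r
# Fisher–Yates shuffle without sample(): walk from the end of the
# vector, swapping each element with a uniformly chosen earlier one.
fy_shuffle <- function(x) {
  n <- length(x)
  if (n < 2) return(x)
  for (i in n:2) {
    # uniform integer in 1..i (runif never returns the endpoints,
    # but guard against j = 0 anyway)
    j <- max(1L, ceiling(runif(1) * i))
    tmp  <- x[i]
    x[i] <- x[j]
    x[j] <- tmp
  }
  x
}

set.seed(1)
fy_shuffle(1:10)
```

The result is always a permutation of the input, so sorting the output recovers the original vector.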
Re: [R] drawing ellipses in R
Those formulas are the standard way to convert from polar coordinates to Cartesian coordinates. The polar coordinates are 'r', which is the radius or distance from the center point, and 'theta', which is the angle (0 points in the positive x direction). If r is constant and theta covers a full cycle then you will get a circle. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of mms...@comcast.net Sent: Monday, October 31, 2011 10:50 PM To: r-h...@stat.math.ethz.ch Subject: [R] drawing ellipses in R Hello, I have been following the thread dated Monday, October 9, 2006 when Kamila Naxerova asked a question about plotting elliptical shapes. Can you explain the equations for X and Y? I believe they used the parametric form of x and y (x = r cos(theta), y = r sin(theta)). I don't know what r is here. Can you explain 1) the origin of these equations and 2) what is r? Sincerely, Mary A. Marion [[alternative HTML version deleted]]
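As a small illustration of those parametric equations (my own sketch, not the code from the 2006 thread): a constant r traces a circle, and using different scale factors a and b on the two axes turns it into an ellipse.

```r
theta <- seq(0, 2 * pi, length.out = 361)  # one full cycle of the angle

# Constant radius r = 1: a circle
r <- 1
x_circ <- r * cos(theta)
y_circ <- r * sin(theta)

# Different semi-axes a and b: an ellipse
a <- 3
b <- 1
x_ell <- a * cos(theta)
y_ell <- b * sin(theta)

# plot(x_ell, y_ell, type = "l", asp = 1)  # asp = 1 keeps the shape honest
```

Every circle point satisfies x^2 + y^2 = r^2, which is a quick way to check the conversion.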
Re: [R] multivariate random variable
Your question is a bit too general to give a useful answer. One possible answer to your question is: mrv <- matrix( runif(1000), ncol=10 ) which generates multivariate random observations, but is unlikely to be what you are really trying to accomplish. There are many tools for generating multivariate random data, including Metropolis-Hastings, Gibbs sampling, rejection sampling, conditional generation, copulas, and many others; which one will be best (or which combination will be best) depends on what you are actually trying to accomplish. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Anera Salucci Sent: Tuesday, November 01, 2011 5:22 AM To: r-help@r-project.org Subject: [R] multivariate random variable Dear All, How can I generate multivariate random variable (not multivariate normal ) I am in urgent [[alternative HTML version deleted]]
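As one concrete instance of the copula approach mentioned above (a sketch with made-up margins, base R only): draw correlated normals, push them through pnorm() to get uniform margins, then apply whatever inverse CDFs you want.

```r
set.seed(42)
n <- 5000
Sigma <- matrix(c(1,   0.7,
                  0.7, 1), nrow = 2)             # dependence on the normal scale

# correlated N(0,1) pairs: iid normals times the Cholesky factor of Sigma
z <- matrix(rnorm(2 * n), ncol = 2) %*% chol(Sigma)

u  <- pnorm(z)                 # Gaussian copula: each column now uniform(0,1)
x1 <- qexp(u[, 1], rate = 2)   # exponential margin (illustrative choice)
x2 <- qgamma(u[, 2], shape = 3)  # gamma margin (illustrative choice)

cor(x1, x2)   # positive, though attenuated relative to 0.7 by the transforms
```

The margins here are arbitrary; any quantile function (qbeta, qpois, an empirical quantile, ...) can be substituted while the copula controls the dependence.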
Re: [R] Export to .txt
Look at the txtStart function in the TeachingDemos package. It works like sink but also includes the commands as well as the output, though I have never tried it with browser() (and it does not always include the results of errors). Another option is to use some type of editor that links with R, such as emacs/ESS or Tinn-R (or another), and then save the entire transcript. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of stat.kk Sent: Tuesday, November 01, 2011 4:15 PM To: r-help@r-project.org Subject: [R] Export to .txt Hi, I would like to export all my workspace (even with the evaluation of commands) to a text file. I know about the sink() function but it doesn't work as I would like. My R function looks like this: there are instructions for the user displayed by cat() commands and browser() commands for fulfilling them. While using the sink() command the instructions don't display. Can anyone help me with a command equivalent to the File - Save to file... option? Thank you very much. -- View this message in context: http://r.789695.n4.nabble.com/Export-to-txt-tp3965699p3965699.html Sent from the R help mailing list archive at Nabble.com.
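For completeness, here is what plain base R can and cannot do here (my own sketch): sink(..., split = TRUE) writes output to a file while still echoing it to the console, which addresses the "instructions don't display" problem, but unlike TeachingDemos::txtStart it does not record the commands themselves.

```r
out <- tempfile(fileext = ".txt")

sink(out, split = TRUE)   # split = TRUE: write to the file AND show on screen
cat("Please inspect the object, then type c to continue\n")
print(summary(1:10))
sink()                    # stop diverting output

readLines(out)[1]         # the cat() line was captured in the file
```

The commands (cat, print, ...) are absent from the file; capturing those as well is exactly what txtStart adds.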
Re: [R] subplot strange behavoir
I see the problem. I fixed this bug for version 2.8 of TeachingDemos, but have not submitted the new version to CRAN yet (I thought that I had fixed this earlier, but apparently it is still only in the development version). An easy fix is to install version 2.8 from R-forge (install.packages("TeachingDemos", repos="http://R-Forge.R-project.org")) and then it should work for you. Sorry about not seeing this earlier. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of emorway Sent: Friday, October 21, 2011 4:21 PM To: r-help@r-project.org Subject: Re: [R] subplot strange behavoir Hello Dr. Snow, With regard to your response from earlier this month: When I copy and paste your code I get what is expected, the 2 subplots line up on the same y-value. What version of R are you using, which version of subplot? What platform? I'm still troubled by the fact that layout and subplot (from TeachingDemos) are not playing nicely together on my machine. sessionInfo(): #R version 2.13.2 (2011-09-30) #Platform: x86_64-pc-mingw32/x64 (64-bit) #other attached packages: #[1] TeachingDemos_2.7 I'd really like to get this working on my machine as it seems to be working on yours. While I previously tried a simple example for the initial forum post, I'm curious whether the real plot I'm trying to make works on your machine. Should you happen to have a spare moment (and I'm not pushing my luck), I've attached 4 small data files, 1 text file containing the R commands I'm trying to run (including 'layout' and 'subplot', called R_Commands_Plot_MT3D_Analytical_Comparison_For_Paper.txt), and the incorrect tiff output I'm getting on my machine. I've directed all paths in the R code to c:/temp/ so everything should quickly work if files are dropped there.
Should it work on your machine as we would expect, does anything come to mind for how to fix it on my machine? Very Respectfully, Eric http://r.789695.n4.nabble.com/file/n3926941/AnalyticalDissolvedSorbedConcAt20PoreVols.txt AnalyticalDissolvedSorbedConcAt20PoreVols.txt http://r.789695.n4.nabble.com/file/n3926941/AnalyticalEffluentConcentration.txt AnalyticalEffluentConcentration.txt http://r.789695.n4.nabble.com/file/n3926941/Conc_Breakthru_at_100cm.txt Conc_Breakthru_at_100cm.txt http://r.789695.n4.nabble.com/file/n3926941/Conc_Profile_20T.txt Conc_Profile_20T.txt http://r.789695.n4.nabble.com/file/n3926941/R_Commands_Plot_MT3D_Analytical_Comparison_For_Paper.txt R_Commands_Plot_MT3D_Analytical_Comparison_For_Paper.txt http://r.789695.n4.nabble.com/file/n3926941/NonEquilibrium_ForPaper.tif NonEquilibrium_ForPaper.tif -- View this message in context: http://r.789695.n4.nabble.com/subplot-strange-behavoir-tp3875917p3926941.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Irregular 3d objects with rgl
You could use the rgl package and plot a sprite at each of your points with the color based on the concentration: plume$col <- cut(plume$conc, c(-1,0.01,0.02,0.3,0.7,1), labels=c('blue','green','yellow','orange','red')) plume2 <- plume theta <- atan2(plume2$y-mean(plume2$y), plume2$x-mean(plume2$x)) slice <- pi/4 < theta & theta < 3*pi/4 plume2$y[slice] <- plume2$y[slice] + 3 library(rgl) open3d() sprites3d( plume2$x, plume2$y, plume2$z, color=as.character(plume$col), lit=FALSE, radius=1) It looks better with more points in each direction and a smaller radius on the sprites. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of emorway Sent: Friday, October 14, 2011 6:15 PM To: r-help@r-project.org Subject: [R] Irregular 3d objects with rgl Hello, While exploring if rgl is along the lines of what I need, I checked out demo(rgl) and didn't find quite what I'm looking for, and am therefore seeking additional help/suggestions. The application is geared towards displaying a 3D rendering of a contaminant plume in the subsurface with the following highlights: Once the plume was rendered as a 3D object, a pie-like wedge could be removed (or cut away), exposing the higher concentrations within the plume as 'hotter' colors. About the closest example I could find is here: http://mclaneenv.com/graphicdownloads/plume.jpg Whereas this particular rendering shows a bullet-like object where 3/4 of the object is removed, I would like to try and show something where 3/4 of the object remains, and where the object has been cut away the colors would show concentrations within the plume, just as in the example referenced above. It would seem most software capable of this type of thing is proprietary (and perhaps for good reason if it is a difficult problem to solve). I've put together a very simple 6x6x6 cube with non-zero values internal to it representing the plume.
I wondering if an isosurface where conc = 0.01 can be rendered in 3D and then if a bite or wedge can be removed from the 3d object exposing the higher concentrations inside as discussed above? x-c(1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6) y-c(1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4,5,5,5,5,5,5,6,6,6,6,6,6,1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4,5,5,5,5,5,5,6,6,6,6,6,6,1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4,5,5,5,5,5,5,6,6,6,6,6,6,1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4,5,5,5,5,5,5,6,6,6,6,6,6,1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4,5,5,5,5,5,5,6,6,6,6,6,6,1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4,5,5,5,5,5,5,6,6,6,6,6,6) z-c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6) 
conc-c(0,0,0,0,0,0,0,0,0.1,0.1,0,0,0,0.1,1,1,0.1,0,0,0.1,0.5,1,0.1,0,0,0,0.2,0.2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.05,0.1,0,0,0,0.05,0.8,0.8,0.05,0,0,0.05,0.4,0.8,0.05,0,0,0,0.1,0.1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.05,0,0,0,0,0.6,0.6,0.02,0,0,0,0.2,0.5,0.02,0,0,0,0.05,0.05,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.05,0.2,0,0,0,0,0,0.05,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.02,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0) plume-data.frame(cbind(x=x,y=y,z=z,conc=conc)) if it helps to view the concentrations in layer by layer tabular form: Layer 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.10 0.10 0.00 0.00 0.00 0.10 1.00 0.50 0.20 0.00 0.00 0.10 1.00 1.00 0.20 0.00 0.00 0.00 0.10 0.10 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Layer 2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.05 0.05 0.00 0.00 0.00 0.05 0.80 0.40 0.10 0.00 0.00 0.10 0.80 0.80 0.10 0.00 0.00 0.00 0.05 0.05 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Layer 3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.60 0.20 0.05 0.00 0.00 0.05 0.60 0.50 0.05 0.00 0.00 0.00 0.02 0.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Layer 4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.05 0.00 0.00 0.00 0.00 0.00 0.20 0.05 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Layer 5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Re: [R] US States percentage change plot
Unless your audience is mainly interested in Texas and California and is completely content to ignore Rhode Island, I would suggest that you look at the state.vbm map in the TeachingDemos package, which works with the maptools package. The example there shows coloring based on a variable. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Michael Charles Bailey I Sent: Wednesday, October 12, 2011 6:46 PM To: r-help@r-project.org Subject: [R] US States percentage change plot Hi, I would like to make a plot of the US states (or lower 48) that are colored based upon a percentage change column. Ideally, it would gradually be more blue the larger the positive change, and more red the more negative the change. The data I have looks like: State Percent.Change 1 Alabama 0.004040547 2 Alaska -0.000202211 3 Arizona -0.002524567 4 Arkansas -0.008525333 5 California 0.001828754 6 Colorado 0.06150 I have read the help for the maps library and similar plots online but can't grasp how to map the Percent.Change column to the map. Thanks in advance, Michael Bailey [[alternative HTML version deleted]]
Re: [R] monotonic factors
One approach would be to code dummy variables for your factor levels: have d1 equal to 0 for 'low' and 1 for 'med' and 'high', then have d2 equal to 1 for 'high' and 0 otherwise. For linear regression there are functions that will fit a model with all non-negative coefficients, but I don't know of anything like that for glms, so one option is to fit with all the dummy variables, then if any of the estimated coefficients are negative remove that variable (force it to 0) and refit. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jeffrey Pollock Sent: Wednesday, October 12, 2011 6:29 AM To: r-help@r-project.org Subject: [R] monotonic factors Hello all, I have an ordered factor that I would like to include in the linear predictor of a binomial glm, where the estimated coefficients are constrained to be monotonic. Does anyone know how to do this? I've tried using an ordered factor but this does not have the desired effect; an (artificial) example of this follows: n <- 100 strings <- sample(c("low", "med", "high"), n, TRUE) x.ordered <- ordered(strings, c("low", "med", "high")) x.unordered <- factor(strings) pr <- ifelse(strings == "low", 0.4, ifelse(strings == "med", 0.3, 0.2)) y <- rbinom(n, 1, pr) mod.ordered <- glm(y ~ x.ordered, binomial) mod.unordered <- glm(y ~ x.unordered, binomial) summary(mod.ordered) summary(mod.unordered)
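The dummy-variable coding described in the reply looks like this in practice (a sketch on simulated data; the names d1 and d2 follow the description above):

```r
set.seed(7)
n <- 500
f  <- factor(sample(c("low", "med", "high"), n, replace = TRUE),
             levels = c("low", "med", "high"))

d1 <- as.numeric(f %in% c("med", "high"))  # step up once you leave 'low'
d2 <- as.numeric(f == "high")              # additional step up to 'high'

pr <- plogis(-1 + 0.6 * d1 + 0.6 * d2)     # a true monotone effect
y  <- rbinom(n, 1, pr)

fit <- glm(y ~ d1 + d2, family = binomial)
coef(fit)
# If a fitted step coefficient comes out negative, drop that dummy and
# refit, which forces the corresponding step to zero (monotone fit).
```

The level effects are then intercept, intercept + d1, and intercept + d1 + d2, so non-negative coefficients on d1 and d2 guarantee monotonicity.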
Re: [R] Chi-Square test and survey results
The chisq.test function is expecting a contingency table: basically one column should have the count of respondents and the other column should have the count of non-respondents (yours looks like it is the total instead of the non-respondents), so your data are wrong to begin with. A significant chi-square here just means that the proportion responding differs in some of the regions; that does not mean that the sample is representative (or not representative). What is more important (and not in the data or standard tests) is whether there is a relationship between why someone chose to respond and the outcomes of interest. If you are concerned with different proportions responding then you could do post-stratification to correct for the inequality when computing other summaries or tests (though region 6 will still give you problems; you will need to make some assumptions, possibly combine it with another region that is similar). Throwing away data is rarely, if ever, beneficial. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of ghe...@mathnmaps.com Sent: Tuesday, October 11, 2011 1:32 PM To: r-help@r-project.org Subject: [R] Chi-Square test and survey results An organization has asked me to comment on the validity of their recent all-employee survey. Survey responses, by geographic region, compared with the total number of employees in each region, were as follows: ByRegion All.Employees Survey.Respondents Region_1 735 142 Region_2 500 83 Region_3 897 78 Region_4 717 133 Region_5 167 48 Region_6 309 0 Region_7 806 125 Region_8 627 122 Region_9 858 177 Region_10 851 160 Region_11 336 52 Region_12 1823 312 Region_13 80 9 Region_14 774 121 Region_15 561 24 Region_16 834 134 How well does the survey represent the employee population?
Chi-square test says, not very well: chisq.test(ByRegion) Pearson's Chi-squared test data: ByRegion X-squared = 163.6869, df = 15, p-value < 2.2e-16 By striking three under-represented regions (3, 6, and 15), we get a more reasonable, although still not convincing, result: chisq.test(ByRegion[setdiff(1:16,c(3,6,15)),]) Pearson's Chi-squared test data: ByRegion[setdiff(1:16, c(3, 6, 15)), ] X-squared = 22.5643, df = 12, p-value = 0.03166 This poses several questions: 1) Looking at a side-by-side barchart (proportion of responses vs. proportion of employees, per region), the pattern of survey responses appears, visually, to match fairly well the pattern of employees. Is this a case where we trust the numbers and not the picture? 2) Part of the problem, ironically, is that there were too many responses to the survey. If we had only one-tenth the responses, but in the same proportions by region, the chi-square statistic would look much better (though with a warning about possible inaccuracy): data: data.frame(ByRegion$All.Employees, 0.1 * (ByRegion$Survey.Respondents)) X-squared = 17.5912, df = 15, p-value = 0.2848 Is there a way of reconciling a large response rate with an unrepresentative response profile? Or is the bad news that the survey will give very precise results about a very ill-specified sub-population? (Of course, I would put it in softer terms, like "you need to assess the degree of homogeneity across different regions".) 3) Is chi-squared really the right measure of how representative the survey is? Thanks for any help you can give - hope these questions make sense - George H.
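To illustrate the first point of the reply, using the counts from the original post: chisq.test wants respondents and non-respondents side by side, not respondents and totals.

```r
all.emp <- c(735, 500, 897, 717, 167, 309, 806, 627,
             858, 851, 336, 1823, 80, 774, 561, 834)
resp    <- c(142, 83, 78, 133, 48, 0, 125, 122,
             177, 160, 52, 312, 9, 121, 24, 134)

# Proper contingency table: respondents vs NON-respondents per region
tab <- cbind(respondents = resp, nonrespondents = all.emp - resp)

res <- chisq.test(tab)  # tests whether the response *rate* differs by region
res$p.value
```

With region 6 contributing zero respondents out of 309 employees, the test rejects equal response rates decisively, which still says nothing by itself about whether respondents resemble non-respondents on the outcomes of interest.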
Re: [R] stop()
Replace stop() with break to see if that does what you want (you may also want to include cat() or warning() to indicate the early stopping). -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Doran, Harold Sent: Tuesday, October 11, 2011 11:32 AM To: r-help@r-project.org Subject: [R] stop() Suppose I have a function, such as the toy example below: myFun <- function(x, max.iter = 5) { for(i in 1:10){ result <- x + i; iter <- i; if(iter == max.iter) stop('Max reached') }; result } I can of course do this: myFun(10, max.iter = 11) However, if I reach the maximum number of iterations before my algorithm has finished (in my real application there are EM steps for a mixed model), I actually want the function to return the value of result up to that point. Currently, using stop(), I would get: myFun(10, max.iter = 4) Error in myFun(10, max.iter = 4) : Max reached But in this toy case the function should return the value of result up to iteration 4. Not sure how I can adjust this. Thanks, Harold [[alternative HTML version deleted]]
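Applying that suggestion to the toy function (my rewrite, not Harold's original): break leaves the loop but lets the function fall through and return result, while warning() flags the early exit.

```r
myFun <- function(x, max.iter = 5) {
  result <- x
  for (i in 1:10) {
    result <- x + i
    if (i == max.iter) {
      warning("Max reached")  # flag the early stop without discarding 'result'
      break                   # exit the loop; execution continues after it
    }
  }
  result
}

myFun(10, max.iter = 11)                   # never hits the cap: returns 20
suppressWarnings(myFun(10, max.iter = 4))  # stops early: returns 14
```

Unlike stop(), which unwinds the call and loses result, break only terminates the loop, so the value accumulated so far is still returned.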
Re: [R] How to draw 4 random weights that sum up to 1?
You probably want to generate data from a Dirichlet distribution. There are some functions in packages that will do this and give you more background, or you can just generate 4 numbers from an exponential (or gamma) distribution and divide them by their sum. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Alexander Engelhardt Sent: Monday, October 10, 2011 10:11 AM To: r-help Subject: [R] How to draw 4 random weights that sum up to 1? Hey list, This might be a more general question and not that R-specific. Sorry for that. I'm trying to draw a random vector of 4 numbers that sum up to 1. My first approach was something like: a <- runif(1) b <- runif(1, max=1-a) c <- runif(1, max=1-a-b) d <- 1-a-b-c but this kind of distorts the results, right? Would the following be a good approach? w <- sample(1:100, 4, replace=TRUE) w <- w/sum(w) I'd prefer a general algorithm-kind of answer to a specific R function (if there is any), although a function name would help too, if I can source-dive. Thanks in advance, Alex
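Greg's exponential/gamma recipe takes two lines (a minimal sketch): rexp draws are Gamma(shape = 1), and normalizing independent gamma draws by their sum gives a Dirichlet vector; with all shapes equal to 1 that is the uniform distribution over the simplex.

```r
set.seed(123)
w <- rexp(4)     # four independent Exp(1) = Gamma(shape = 1) draws
w <- w / sum(w)  # normalize: one draw from the flat Dirichlet(1,1,1,1)
w
sum(w)           # 1, up to floating point
```

Substituting rgamma(4, shape = alpha) for rexp(4) gives Dirichlet weights with concentration alpha instead of the flat case.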
Re: [R] How to draw 4 random weights that sum up to 1?
As an interesting extension to David's post, try: M4.e <- matrix(rexp(4,1), ncol=4) instead of the uniform, and rerun the rest of the code (note the limits on the x-axis). With 3 dimensions and the restriction we can plot in 2 dimensions to compare: library(TeachingDemos) m3.unif <- matrix(runif(3000), ncol=3) m3.unif <- m3.unif/rowSums(m3.unif) m3.exp <- matrix(rexp(3000,1), ncol=3) m3.exp <- m3.exp/rowSums(m3.exp) dev.new() triplot(m3.unif) dev.new() triplot(m3.exp) Now compare the 2 plots on the density of the points near the corners. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius Sent: Monday, October 10, 2011 12:05 PM To: Uwe Ligges Cc: r-help; Alexander Engelhardt Subject: Re: [R] How to draw 4 random weights that sum up to 1? On Oct 10, 2011, at 12:44 PM, Uwe Ligges wrote: On 10.10.2011 18:10, Alexander Engelhardt wrote: Hey list, This might be a more general question and not that R-specific. Sorry for that. I'm trying to draw a random vector of 4 numbers that sum up to 1. My first approach was something like: a <- runif(1) b <- runif(1, max=1-a) c <- runif(1, max=1-a-b) d <- 1-a-b-c but this kind of distorts the results, right? Would the following be a good approach? 
w <- sample(1:100, 4, replace=TRUE) w <- w/sum(w) Yes, although better combine both ways to: w <- runif(4) w <- w / sum(w) For the non-statisticians in the audience like myself who didn't know what that distribution might look like (it being difficult to visualize densities on your 3-dimensional manifold in 4-space), here is my effort to get an appreciation: M4 <- matrix(runif(4), ncol=4) M4 <- M4/rowSums(M4) # just a larger realization of Ligges' advice colMeans(M4) [1] 0.2503946 0.2499594 0.2492118 0.2504342 plot(density(M4[,1])) lines(density(M4[,2]),col="red") lines(density(M4[,3]),col="blue") lines(density(M4[,4]),col="green") plot(density(rowSums(M4[,1:2]))) plot(density(rowSums(M4[,1:3]))) plot(density(rowSums(M4[,2:4]))) # rather cool results, noting that these are a reflection around 0.5 of the single vector densities. Uwe Ligges I'd prefer a general algorithm-kind of answer to a specific R function (if there is any), although a function name would help too, if I can source-dive. -- David Winsemius, MD West Hartford, CT
Re: [R] subplot strange behavoir
When I copy and paste your code I get what is expected: the 2 subplots line up on the same y-value. What version of R are you using, which version of subplot? What platform? -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of emorway Sent: Wednesday, October 05, 2011 1:40 PM To: r-help@r-project.org Subject: [R] subplot strange behavoir Hello, Below is some example code that should reproduce an error I'm encountering while trying to create a tiff plot with two subplots. If I run just the following bit of code through the R GUI, the result is what I'd like to have appear in the saved tiff image: x <- seq(0:20) y <- c(1,1,2,2,3,4,5,4,3,6,7,1,1,2,2,3,4,5,4,3,6) plot(x,y,type="l",las=1,ylim=c(0,12)) subplot(edm.sub(x[seq(1:5)],y[seq(1:5)]),x=4,y=9,size=c(1,1.5)) subplot(edm.sub(x[seq(15,20,by=1)],y[seq(15,20,by=1)]),x=17,y=9,size=c(1,1.5)) However, if expanding on this code with: edm.sub <- function(x,y){plot(x,y,col="red",frame.plot=F, las=1,xaxs="i",yaxs="i",type="b", ylim=c(0,6),xlab="",ylab="")} png("c:/temp/lookat.tif",res=120,height=600,width=1200) layout(matrix(c(1,2),2,2,byrow=TRUE),c(1.5,2.5),respect=TRUE) plot(seq(1:10),seq(1:10),type="l",las=1,col="blue") plot(x,y,type="l",las=1,ylim=c(0,12)) subplot(edm.sub(x[seq(1:5)],y[seq(1:5)]),x=4,y=9,size=c(1,1.5)) subplot(edm.sub(x[seq(15,20,by=1)],y[seq(15,20,by=1)]),x=17,y=9,size=c(1,1.5)) dev.off() One will notice the second subplot is out of position (notice the y-coordinate is the same for both subplots... y=9): http://r.789695.n4.nabble.com/file/n3875917/lookat.png If I try to 'guess' a new y-coordinate for the second subplot, say y=10: png("c:/temp/lookat.tif",res=120,height=600,width=1200) layout(matrix(c(1,2),2,2,byrow=TRUE),c(1.5,2.5),respect=TRUE) plot(seq(1:10),seq(1:10),type="l",las=1,col="blue") plot(x,y,type="l",las=1,ylim=c(0,12)) 
subplot(edm.sub(x[seq(1:5)],y[seq(1:5)]),x=4,y=9,size=c(1,1.5)) subplot(edm.sub(x[seq(15,20,by=1)],y[seq(15,20,by=1)]),x=17,y=10,size=c(1,1.5)) dev.off() R kicks back the following message: Error in plot.new() : plot region too large Am I mis-using subplot? Thanks, Eric -- View this message in context: http://r.789695.n4.nabble.com/subplot-strange-behavoir-tp3875917p3875917.html Sent from the R help mailing list archive at Nabble.com.