[R] Is there any performance difference between subset() and list comprehension?
Hello,

Suppose that you have a data frame 'df' with variables 'V1', 'V2', 'V3', etc. Is there any (performance) difference (except the difference of the return types) between the following two computations?

    subset(df, V1 0, V2)

and

    df$V2[df$V1 0]

Best Regards,
hyunjo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] grep or other complex string matching approach to capture necessary information...
On Sep 26, 2009, at 11:40 AM, John Kane wrote:

?subset

    problems <- c("Water damage", "Water off", "water pipes damaged", "leaking water")
    damaged <- subset(house_info, house_info[,1]==problems[1] | house_info[,1]==problems[2] | house_info[,1]==problems[3] | house_info[,1]==problems[4])

or am I misunderstanding the question?

or perhaps %in% which probably does the job more elegantly but I forget the syntax at the moment.

    problems <- c("Water damage", "Water off", "water pipes damaged", "leaking water")
    damaged <- subset(house_info, house_info[,1] %in% problems)
    str(damaged)
    'data.frame': 49 obs. of 2 variables:
     $ water_evaluation.water_evaluation_selection.: Factor w/ 5 levels "No water damage",..: 5 3 5 2 5 3 5 3 5 5 ...
     $ house_number: num 276 594 591 376 229 428 248 237 534 517 ...

--- On Fri, 9/25/09, Jason Rupert jasonkrup...@yahoo.com wrote:

From: Jason Rupert jasonkrup...@yahoo.com
Subject: [R] grep or other complex string matching approach to capture necessary information...
To: R-help@r-project.org
Received: Friday, September 25, 2009, 1:58 PM

Say I have the following data:

    house_number <- floor(runif(100, 200, 600))
    water_evaluation <- c("No water damage", "Water damage", "Water On", "Water off", "water pipes damaged", "leaking water")
    water_evaluation_selection <- floor(runif(100, 1, 6))
    house_info <- data.frame(water_evaluation[water_evaluation_selection], house_number)

And, that I only want to pull out the ones with negative water evaluations, i.e. Water damage, water pipes damaged, and leaking water.

Should/could I use grep in order to pull the house numbers out of house_info with those negative water evaluations? I guess I want to know the house numbers from house_info where the water evaluation is negative. Is there a way to use grep or another R function in order to acquire that information?

Thank you again in advance for any insights.
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
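The two approaches discussed in this thread (exact matching with %in% and pattern matching with grep-style functions) can be compared in a minimal sketch. The data are rebuilt as in the original post; the column name `evaluation`, the vector `negative` and the chosen pattern are assumptions made for illustration only.

```r
# Rebuild data in the shape of the original post (values are random).
set.seed(1)
house_number <- floor(runif(100, 200, 600))
water_evaluation <- c("No water damage", "Water damage", "Water On",
                      "Water off", "water pipes damaged", "leaking water")
water_evaluation_selection <- floor(runif(100, 1, 6))
# 'evaluation' is a hypothetical column name chosen for readability.
house_info <- data.frame(evaluation = water_evaluation[water_evaluation_selection],
                         house_number,
                         stringsAsFactors = FALSE)

# Exact matching against the set of negative evaluations.
negative <- c("Water damage", "water pipes damaged", "leaking water")
by_set <- house_info$house_number[house_info$evaluation %in% negative]

# Pattern matching: case-insensitive "damage" or "leak",
# excluding entries that start with "No ".
by_pattern <- house_info$house_number[
  grepl("damage|leak", house_info$evaluation, ignore.case = TRUE) &
  !grepl("^no ", house_info$evaluation, ignore.case = TRUE)]
```

For a fixed, known set of labels %in% is probably the more robust choice; a pattern only pays off when the labels vary in spelling.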
Re: [R] Nested select
Thanks. It works.
[R] Conditional operation on data frame, shift/roll of vector
Conditionally, when Ind of a certain row is 1, I want to get the sum or delta of Val in that row and the row above.

    Val Ind          Val Ind Del
     10   0           10   0  NA
     11   0           11   0  NA
     13   1   --->    13   1  24 or 2
     16   0           16   0  NA

A simple way, I guess, is to get a shifted vector of Val (say, c(NA, 10, 11, 13)), add it to or subtract it from Val, and then logically AND the result with Ind. Which function provides the shift operation on the vector Val? Any better way to do this is also welcome.

Thanks.
Re: [R] data frame's column names not the same as in CSV
See the check.names argument in the help file for read.table.

On Sat, Sep 26, 2009 at 1:58 AM, Derek Foo kc.de...@gmail.com wrote:
> Hello,
>
> I am trying to read in a csv file with a column such as
> \\LS01\Processor(_Total)\% Processor Time with the command
> read.csv(file). However, the column name in the resulting data frame is
> changed to X..LS01.Processor._TotalProcessor.Time. Strangely, when I
> experimented with just reading the csv with the header flag set to false,
> the text was read correctly, the same as in the raw file.
>
> I am wondering if anyone has encountered a similar problem. If so, I
> would really appreciate it if you could share your insight.
>
> Best Regards,
> Derek
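A small sketch of what the check.names argument changes. The file contents and the second column name `value` are made up for illustration; only the first header is taken from the question.

```r
# Write a small CSV whose first header is not a syntactically valid R name.
csv <- tempfile(fileext = ".csv")
writeLines(c('"\\\\LS01\\Processor(_Total)\\% Processor Time",value',
             '1.5,10',
             '2.5,20'), csv)

# Default: headers are passed through make.names() and get mangled.
mangled <- names(read.csv(csv))

# check.names = FALSE keeps the headers exactly as they are in the file.
raw <- names(read.csv(csv, check.names = FALSE))

print(mangled)
print(raw)
```

Columns with non-syntactic names then have to be accessed with backticks or by character index, e.g. `df[["\\\\LS01\\Processor(_Total)\\% Processor Time"]]`.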
Re: [R] synchronisation of time series data using interpolation
I created separate text files for the 2 data sets. I entered the following commands:

    library(zoo)
    library(chron)
    z1 <- read.zoo(textConnection("/path/to/test1.txt"), header=FALSE, sep=",", FUN=times)
    z2 <- read.zoo(textConnection("/path/to/test2.txt"), header=FALSE, sep=",", FUN=times)
    z3 <- window(na.approx(merge(z1, z2)), time(z1))
    plot(z3$z1, z3$z2)
    Error in plot.window(xlim, ylim, log, asp, ...) : need finite 'xlim' values
    In addition: Warning messages:
    1: no non-missing arguments to min; returning Inf in: min(x)
    2: no non-missing arguments to max; returning -Inf in: max(x)
    3: no non-missing arguments to min; returning Inf in: min(x)
    4: no non-missing arguments to max; returning -Inf in: max(x)

The resultant graph window was blank, so I entered the following command:

    plot(z3$z1, z3$z2, xlim=c(0,100), ylim=c(0,100))

The graph window showed a y axis (labelled 'z3$z1') and an x axis (labelled 'Index'). I do not understand the instruction "...to use window to pick off...".
[R] Multiple comparisons for coxph survival analysis model
Hello, all R-users!

I am working on fitting a survival analysis model using the coxph function for the Cox proportional hazards regression model. The data look as usual:

    group    block  death  censor
    Group1       1      4       1
    Group1       1     12       1
    ...
    Group2      30      4       1
    Group2      30      4       1
    ...
    Group3      57     16       1
    Group3      57     16       1

I need to compare survival among the particular groups. Fitting works normally:

    cph.1 <- coxph(Surv(death, censor) ~ group + cluster(block), data = seedlings)
    summary(cph.1)

    Call:
    coxph(formula = Surv(death, censor) ~ group + cluster(block), data = seedlings)

      n= 27000
                 coef exp(coef) se(coef) robust se     z    p
    groupGroup2 0.436      1.55   0.0539     0.296  1.47 0.14
    groupGroup3 3.048     21.06   0.0439     0.283 10.77 0.00

                exp(coef) exp(-coef) lower .95 upper .95
    groupGroup2      1.55     0.6467     0.865      2.76
    groupGroup3     21.06     0.0475    12.100     36.67

    Rsquare= 0.38   (max possible= 0.997 )
    Likelihood ratio test= 12892  on 2 df,   p=0
    Wald test            = 271  on 2 df,   p=0
    Score (logrank) test = 16164  on 2 df,   p=0,   Robust = 84.2  p=0

I have obtained tests of significance for differences between the second/third group and the first (reference) group, but I want to compare each group with each other, not only all groups with the first one! So I need to use some multiple comparison method. I have tried the multcomp library, which I normally use for glm models. But it hasn't worked:

    summary(glht(cph.1, linfct=mcp(group="Tukey")))
    Error in glht.matrix(model = list(coefficients = c(0.435824045783883, :
      'ncol(linfct)' is not equal to 'length(coef(model))'

So I tried a different approach using the contrast matrix:

    summary(glht(cph.1, linfct = contrMat(coef(cph.1), type="Tukey")))

             Simultaneous Tests for General Linear Hypotheses

    Multiple Comparisons of Means: Tukey Contrasts

    Fit: coxph(formula = Surv(death, censor) ~ group + cluster(block), data = seedlings)

    Linear Hypotheses:
                                   Estimate Std. Error z value Pr(>|z|)
    groupGroup2 - groupGroup3 == 0   2.6117     0.1781   14.66   <2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Well, I have achieved a comparison between the 2nd and 3rd group, but this time the 1st group is missing. In glm the reference group is expressed as the intercept, so comparing with it is comparing with the intercept. But there is no intercept in coxph!

Please, is there any way to accomplish full multiple comparisons in coxph?

Thank you in advance!
Pavel Kur
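One way to think about the missing comparisons is that, without an intercept, every pairwise difference is still a linear combination of the two reported coefficients (Group1 being the reference level). The base-R sketch below computes Wald tests for all three pairs by hand. It is an illustration, not the multcomp method: the coefficients are those printed in the post, but the covariance matrix `V` is HYPOTHETICAL (its off-diagonal entry is invented), and Holm adjustment is swapped in for the Tukey-style simultaneous adjustment.

```r
# Coefficients as reported in the post (Group1 is the reference level).
beta <- c(groupGroup2 = 0.436, groupGroup3 = 3.048)

# HYPOTHETICAL covariance matrix: the diagonal uses the robust standard
# errors from the post; the off-diagonal value is invented for illustration.
V <- matrix(c(0.296^2, 0.070,
              0.070,   0.283^2), nrow = 2, byrow = TRUE,
            dimnames = list(names(beta), names(beta)))

# One row per pairwise comparison, one column per model coefficient.
K <- rbind("Group2 - Group1" = c( 1, 0),
           "Group3 - Group1" = c( 0, 1),
           "Group3 - Group2" = c(-1, 1))

est <- drop(K %*% beta)               # contrast estimates
se  <- sqrt(diag(K %*% V %*% t(K)))   # their standard errors
z   <- est / se
p   <- 2 * pnorm(-abs(z))             # two-sided Wald p-values

res <- data.frame(estimate = est, se = se, z = z,
                  p.holm = p.adjust(p, method = "holm"))
print(res)
```

With a fitted model one would take `beta <- coef(cph.1)` and `V <- vcov(cph.1)` instead. A matrix such as `K` can presumably also be supplied as the `linfct` argument of glht(), which avoids the mcp() route that failed above.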
Re: [R] Is there any performance difference between subset() and list comprehension?
On Sat, 26 Sep 2009 15:26:12 +0900 You Hyun Jo youhyu...@gmail.com wrote:

YHJ> Is there any (performance) difference (except the difference of
YHJ> the return types)
YHJ> between the following two computations?

Try it yourself. ?system.time is useful for that purpose.

Stefan
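Following the suggestion to use system.time, a timing comparison might look like the sketch below. The comparison operator was lost from the archived question, so `V1 > 0` is an assumption, and the data frame is simulated.

```r
set.seed(42)
n  <- 1e6
df <- data.frame(V1 = rnorm(n), V2 = rnorm(n), V3 = rnorm(n))

# Assumption: the condition in the original post was "V1 > 0".
t_subset <- system.time(a <- subset(df, V1 > 0, V2))
t_index  <- system.time(b <- df$V2[df$V1 > 0])

# Elapsed seconds for each approach (values depend on the machine).
print(c(subset = t_subset[["elapsed"]], index = t_index[["elapsed"]]))

# subset() returns a one-column data frame, indexing returns a vector.
str(a)
str(b)
```

A single run is noisy; repeating the timing several times (or using a benchmarking package) gives a more reliable picture.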
[R] Problem with downloading workspace file from a web address
Dear All,

To load a previously saved workspace, one can do the following:

    load("/path/to/the/saved/workspace/file")

However, if the path to the saved workspace file is a web address, one gets the following error:

    Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
    In addition: Warning message:
    In readChar(con, 5L, useBytes = TRUE) :
      cannot open compressed file 'http://phhs80.googlepages.com/workspace20090922', probable reason 'No such file or directory'

To circumvent this problem, one can download the saved workspace file to a local folder with download.file() and the option mode="wb".

My question is: should load() not have the same mode option, so that everything could be done with load() alone (and not with two instructions: download.file() and load())?

Thanks in advance,
Paul
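As far as the documentation of load() goes, its file argument may also be a connection, so wrapping the address in url() may make the separate download step unnecessary. The sketch below uses a local file:// URL (Unix-style path assumed) so that it runs offline; the file and object names are made up.

```r
# Save a small workspace file to stand in for the remote one.
rda <- tempfile(fileext = ".RData")
x <- 1:10
save(x, file = rda)
rm(x)

# load() accepts a connection, so a URL can be wrapped in url().
# A local file:// URL is used here so the sketch runs offline; an http://
# address would be passed to url() in the same way.
con <- url(paste0("file://", rda))
loaded <- load(con)
close(con)

print(loaded)  # names of the objects that were restored
```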
Re: [R] data frame's column names not the same as in CSV
On Sat, 26 Sep 2009 01:58:38 -0400 Derek Foo kc.de...@gmail.com wrote:

DF> I am trying to read in a csv file with column such as
DF> \\LS01\Processor(_Total)\% Processor Time with the command
DF> read.csv(file). However, the column name in the resulted data
DF> frame is changed to X..LS01.Processor._TotalProcessor.Time.

You should maybe specify a unique separator for the columns which does not occur in your column name strings. Otherwise things might get messed up. It is not clear what the separator is in your example since you did not show the numbers. Probably it is \ so you have to specify it as such.

?read.csv

Stefan
[R] Adding variables
Hi,

For very large matrices, is this the most efficient way to add two variables together?

#
attach(attenu)
new <- rowSums(cbind(mag, station))
#

Also, could I be directed to some resources for working with very large datasets?

Thanks
Re: [R] simulating a model
Dear Rafael,

first of all, your simulation works, at least in a technical sense, so I don't understand what you mean by "can't simulate it properly".

Second, your SIR-based model is quite different from the SIR models I know (e.g. http://en.wikipedia.org/wiki/SIR_Model). The R code, however, seems to be technically correct when compared with your system of equations.

To help you solve your problem, we need more information, e.g. how the equations were derived, where the parameters come from, what the underlying process is and, most importantly, why you think that the outcome is wrong.

In addition, I guess that your simulation time is too long compared with the speed of the process. Try something like

    times = c(from=0, to=1, by=0.01)

Thomas

Rafael Moral wrote:
> Dear useRs,
>
> I have written an ecological model, based on the epidemiology SIR model.
> I've been trying to simulate it in R. However, I can't simulate it
> properly. Two guesses: my script isn't right; I'm not setting the
> parameters properly.
>
> I have uploaded an image of the model here:
> http://img24.imageshack.us/img24/743/imagemutr.jpg
>
> The script I am using is as follows:
>
> require(simecol)
> mod1 <- new("odeModel",
>   main = function(time, init, parms) {
>     x <- init
>     p <- parms
>     dx1 <- p["K"] - p["alpha"]*x[1]*x[2] - p["gamma"]*x[1]
>     dx2 <- x[1]*x[2]*(p["alpha"] - p["beta"])
>     dx3 <- p["beta"]*x[1]*x[2] + p["gamma"]*x[1]
>     list(c(dx1, dx2, dx3))
>   },
>   times = c(from=0, to=100, by=0.1),
>   parms = c(K=100, alpha=0.3, gamma=0.5, beta=0.2),
>   init = c(S=500, V=100, R=0),
>   solver = "lsoda"
> )
> plot(sim(mod1))
>
> Thanks in advance!
> Rafael.
[R] Mixed font in lattice xyplot lables
Hi all,

can anyone suggest a reason why my xlab is plotting this text at opposite ends of the axis? I would like my label to read:

Moran's I

...but with the I in italics. For some reason the two parts separate and are positioned at opposite ends of the axis.

Thank you

    library(lattice)
    dat <- data.frame(x = rnorm(10), y = rnorm(10))
    xyplot(y ~ x, dat, xlab=expression("Moran's ", italic(I)))
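A likely explanation, offered as a sketch rather than a confirmed diagnosis: expression() with two arguments creates an expression vector of length 2, which lattice appears to treat as two separate labels spread along the axis. Wrapping both parts in a plotmath paste() yields a single label. The object names below are made up for illustration.

```r
# Two separate expressions: this vector has length 2.
two_labels <- expression("Moran's ", italic(I))

# One expression that pastes the plain and the italic part together.
one_label <- expression(paste("Moran's ", italic(I)))

print(length(two_labels))
print(length(one_label))

# With lattice installed, the combined label would be used as:
# library(lattice)
# dat <- data.frame(x = rnorm(10), y = rnorm(10))
# xyplot(y ~ x, dat, xlab = one_label)
```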
Re: [R] Conditional operation on data frame, shift/roll of vector
On Sep 26, 2009, at 11:46 AM, jiangrm wrote:

> Conditionally, when Ind of a certain row is 1, want to get sum or
> delta of Val in that row and 1 row above.
>
> Val Ind          Val Ind Del
>  10   0           10   0  NA
>  11   0           11   0  NA
>  13   1   --->    13   1  24 or 2
>  16   0           16   0  NA
>
> A simple way I guess is to get shifted vector of Val (say, c(NA, 10,
> 11, 13)), add to or minus from Val, then and logically AND with Ind.

?diff

    df1 <- data.frame(Val=c(10,11,13,16), Ind=c(0,0,1,0))
    c(NA, diff(df1$Val))[df1$Ind==1]
    [1] 2

> Which function provides the shift operation of the vector Val?

?"["   # with a suitable index vector
?lag   # for time series

> Also welcomed if any better way to do this. Thanks.

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] synchronisation of time series data using interpolation
Your files do not have data appropriate to your commands. Since you did not provide the data (see last line of every message to r-help) there is not much more that can be said.

On Sat, Sep 26, 2009 at 4:24 AM, e-letter inp...@gmail.com wrote:
> I created separate text files for the 2 data sets. I enter the
> following comands:
>
> library(zoo)
> library(chron)
> z1 <- read.zoo(textConnection("/path/to/test1.txt"), header=FALSE, sep=",", FUN=times)
> z2 <- read.zoo(textConnection("/path/to/test2.txt"), header=FALSE, sep=",", FUN=times)
> z3 <- window(na.approx(merge(z1, z2)), time(z1))
> plot(z3$z1, z3$z2)
> Error in plot.window(xlim, ylim, log, asp, ...) : need finite 'xlim' values
> In addition: Warning messages:
> 1: no non-missing arguments to min; returning Inf in: min(x)
> 2: no non-missing arguments to max; returning -Inf in: max(x)
> 3: no non-missing arguments to min; returning Inf in: min(x)
> 4: no non-missing arguments to max; returning -Inf in: max(x)
>
> The resultant graph window was blank, so I entered the following command
>
> plot(z3$z1, z3$z2, xlim=c(0,100), ylim=c(0,100))
>
> The graph window showed y axis (labelled 'z3$z1') and x axis (labelled
> 'Index'). I do not understand the instruction "...to use window to pick
> off..."
[R] R as a web service
Dear R-helpers,

I have been asked about the possibility of developing a web-distributed scoring system: a model is created in a central location, users fill in a form in their browsers, and the central server calls this model and returns a YES/NO answer to them.

I am tempted to use R for this assignment. I have used Rapache for similar tasks, but I am afraid that it is too much of a novelty for many backward-looking IT departments. For a number of reasons, a Java-based infrastructure (tomcat, web services, etc.) would be much more palatable for them.

My wishlist is as follows:

* Minimal infrastructure changes in case of (statistical) model updates or changes.
* Solid management of concurrency, so that simultaneous connections do not interfere with each other.
* Maximum efficiency, so that new connections do not require a fresh R startup.

Any ideas on how to achieve this? Any documentation available?

Best regards,
Carlos J. Gil Bellosta
http://www.datanalytics.com
[R] multiple lattice, xyplot levelplot on same page
Dear R-users,

I'd like to place an xyplot() at the top of a page and a levelplot() at the bottom of the same page, and have the x-axes be the same. I've come close to finding a solution through Rarchive, and can produce an upside-down version of what I'd like (levelplot() on the top - see code below). However, the following error occurs when I try to plot the xyplot() at the top:

    Error in prepanel.default.function(x = 0:10, y = c(0, 1, 4, 9, 16, 25, :
      element 1 is empty;
      the part of the args list of 'length' being evaluated was:
      (subscripts)

Any pointers in the right direction would be much appreciated.

#OS: Windows XP 2002 SP3; R: 2.9.2; lattice 0.17-25; latticeExtra 0.6-1

Thanks and regards,
Ky

###
#Rcode for xyplot and lattice plot on the same page.
library(lattice)
library(latticeExtra)

#xyplot
x1 <- 0:10
x2 <- x1^2
p1 <- xyplot(x2 ~ x1
    , par.settings = list(layout.width = list(panel=1, ylab = 2
        , axis.left =1.0, left.padding=1
        , ylab.axis.padding=1, axis.panel=1)))

#levelplot
y.df <- data.frame(y1 = rep(x1, times = 3)
    , y2 = rep(c('E1', 'E2', 'E3'), each = length(x1))
    , y3 = c(x1, x1+2, x1-1))
p2 <- levelplot(y3 ~ y1*y2, data = y.df
    , par.settings = list(layout.width = list(panel=1, ylab = 2
        , axis.left =1.0, left.padding=1
        , ylab.axis.padding=1, axis.panel=1)))

#Printing the plots on the same page
#This is what I found on an Rarchive post (thank-you)
#it works if the levelplot (p2) is at the top of the page
#i.e.
update(c(p1, p2, x.same = TRUE)
    , layout = c(1, 2)
    , ylab = list(c("p1", "p2")
        , y = c(1/4, 3/4))
    , par.settings = list(layout.heights = list(panel = c(1, 1))))

#however, the following error appears if the order is reversed (which is what I would like)
update(c(p2, p1, x.same = TRUE)
    , layout = c(1, 2)
    , ylab = list(c("p2", "p1")
        , y = c(1/4, 3/4))
    , par.settings = list(layout.heights = list(panel = c(1, 1))))

The following error appears:
#Error in prepanel.default.function(x = 0:10, y = c(0, 1, 4, 9, 16, 25, :
#  element 1 is empty;
#  the part of the args list of 'length' being evaluated was:
#  (subscripts)

Also, I seem to have lost control of par settings such as las = 1

#---
Dr Ky L. Mathews
Co-ordinator, CIMMYT ICARDA Communications Project
Research Fellow, Plant Breeding Institute, The University of Sydney, Australia
Re: [R] synchronisation of time series data using interpolation
Test1 file contained data set 1, test2 contained data set 2
[R] merging columns from a large data set
Hi all!

I am trying to merge very large data sets. Here fullset1 contains 13 data sets. Each data set has columns which I need to merge. I am trying to merge columns 2, 6, 10, 14, ... till the end for all 13 data sets in fullset1, but I am only getting the 2nd column:

    Rchan1 = sapply(1:length(fullset1), function(i) exprs(fullset1[[i]])[ ,2])
    Rch1 = as.list(Rchan1)
    red1 = do.call(cbind, Rch1)

Likewise, I need to merge columns 3, 7, 11, 15, ... till the end for all 13 data sets in fullset1, and I am getting the 3rd column only:

    Gchan1 = sapply(1:length(fullset1), function(i) exprs(fullset1[[i]])[ ,3])
    Gch1 = as.list(Gchan1)
    green1 = do.call(cbind, Gch1)

I am stuck! Please help!

Cheers!
Amit
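The code in the question indexes with `[ , 2]`, which selects only column 2; picking every fourth column needs an index vector such as seq(2, ncol(m), by = 4). A minimal sketch follows, using a HYPOTHETICAL list of plain matrices (`mats`) in place of `exprs(fullset1[[i]])`; the helper name `every_fourth` is also made up.

```r
# HYPOTHETICAL stand-in for fullset1: a list of numeric matrices.
# In the original post each matrix would come from exprs(fullset1[[i]]).
set.seed(1)
mats <- replicate(3, matrix(rnorm(5 * 17), nrow = 5), simplify = FALSE)

# Select columns start, start + 4, start + 8, ... from one matrix.
every_fourth <- function(m, start) {
  m[, seq(start, ncol(m), by = 4), drop = FALSE]
}

# Apply to every data set and bind the selected columns side by side.
red   <- do.call(cbind, lapply(mats, every_fourth, start = 2))
green <- do.call(cbind, lapply(mats, every_fourth, start = 3))

print(dim(red))
print(dim(green))
```

Using lapply() rather than sapply() keeps each result as a matrix, so do.call(cbind, ...) receives a proper list of matrices.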
Re: [R] packGrob and dynamic resizing
Hi,

I just tried a fourth variant, closer to what ggplot2 uses (I think): to each grob is assigned a viewport with row and column positions (in my example during their construction, with ggplot2 upon editing), and they're all plotted in a given grid.layout.

The timing is poor compared to pushing and upping viewports (twice as long). Why would that be?

All the best,
baptiste

(the full, self-contained comparison file is attached, run as: R --vanilla -f comparison.r )

# below is version 4 only

makeContentInVp <- function(d){
  content <- as.character(unlist(c(d)))
  nc <- ncol(d)
  nr <- nrow(d)

  n2nm <- function(nr, nc){
    expand.grid(seq(1, nr), seq(1, nc))
  }

  vp.ind <- n2nm(nr, nc)

  textii <- function(d, gp=gpar(), name="content-label-"){
    function(ii)
      textGrob(label=d[ii], gp=gp,
               name=paste(name, ii, sep=""),
               vp=viewport(layout.pos.row=vp.ind[ii, 1],
                           layout.pos.col=vp.ind[ii, 2]))
  }

  makeOneLabel <- textii(d=content, gp=gpar(col="blue"))

  lg <- lapply(seq_along(content), makeOneLabel)

  list(lg=lg, nrow=nrow(d), ncol=ncol(d))
}

## table4 uses grobs that already have a viewport assigned
table4 <- function(content){
  padding <- unit(4, "mm")
  lg <- content$lg
  ## retrieve the widths and heights of all textGrobs
  wg <- lapply(lg, grobWidth)   # list of grob widths
  hg <- lapply(lg, grobHeight)  # list of grob heights
  ## concatenate these units
  widths.all <- do.call(unit.c, wg)   # all grob widths
  heights.all <- do.call(unit.c, hg)  # all grob heights
  ## matrix-like operations on units to define the table layout
  widths <- colMax.units(widths.all, content$ncol)    # all column widths
  heights <- rowMax.units(heights.all, content$nrow)  # all row heights

  vp <- viewport(layout=grid.layout(content$nrow, content$ncol,
                   w=widths+padding, h=heights+padding))
  grid.draw(gTree(children=do.call(gList, lg), vp=vp))
}

# uncomment for timing
d <- head(iris)
#d <- iris
content2 <- makeContentInVp(d)

# grid.newpage()
# system.time(table3(content))
##    user  system elapsed
##   4.422   0.091   4.787

grid.newpage()
system.time(table4(content2))
##    user  system elapsed
##   8.810   0.184   9.555

2009/9/25 hadley wickham h.wick...@gmail.com:
> This matches my experience with ggplot2 - I have been gradually moving
> away from frameGrob and packGrob because doing the placement myself is
> much faster (and for most of the cases I'm interested in, the full
> power of packGrob is not needed)
>
> Hadley
>
> --
> http://had.co.nz/
Re: [R] synchronisation of time series data using interpolation
On Sat, Sep 26, 2009 at 9:08 AM, e-letter inp...@gmail.com wrote:
> Test1 file contained data set 1, test2 contained data set 2

It's not clear to me what you are referring to. The data in your initial post do not exhibit this problem and there is no data in any of your subsequent posts in this thread. Here is what happens when I run it with your data -- no errors:

> Lines1 <- "time,datum
+ 01:00:00,500
+ 01:00:15,600
+ 01:00:30,750
+ 01:00:45,720
+ 01:01:00,700
+ 01:01:15,725
+ 01:01:30,640
+ 01:01:45,710"
>
> Lines2 <- "time,datum
+ 01:00:12,20
+ 01:01:01,55
+ 01:01:55,22"
>
> library(zoo)
> library(chron)
>
> z1 <- read.zoo(textConnection(Lines1), header = TRUE, sep = ",", FUN = times)
> z2 <- read.zoo(textConnection(Lines2), header = TRUE, sep = ",", FUN = times)
>
> z3 <- window(na.approx(merge(z1, z2)), time(z1))
> plot(z3$z1, z3$z2)
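For readers without zoo and chron installed, the idea behind the merge/na.approx/window steps (interpolating the second series at the time stamps of the first) can be sketched in base R with approx(). This is a swapped-in technique, not the zoo code above; the helper `to_sec` and the variable names are made up, and the data are the ones shown in the reply.

```r
# Convert "hh:mm:ss" strings to seconds since midnight.
to_sec <- function(hms) {
  p <- do.call(rbind, lapply(strsplit(hms, ":"), as.numeric))
  p[, 1] * 3600 + p[, 2] * 60 + p[, 3]
}

# The two series from the thread.
t1 <- to_sec(c("01:00:00", "01:00:15", "01:00:30", "01:00:45",
               "01:01:00", "01:01:15", "01:01:30", "01:01:45"))
y1 <- c(500, 600, 750, 720, 700, 725, 640, 710)
t2 <- to_sec(c("01:00:12", "01:01:01", "01:01:55"))
y2 <- c(20, 55, 22)

# Linearly interpolate series 2 at the time stamps of series 1.
# Time stamps outside the range of t2 are returned as NA.
y2_at_t1 <- approx(t2, y2, xout = t1)$y

synced <- data.frame(seconds = t1, series1 = y1, series2 = y2_at_t1)
print(synced)
```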
Re: [R] Spliting columns, strings or reg exp returning substrings
The colsplit function in the reshape package does this really easily.

--ista

-- Forwarded message --
From: Dry, Jonathan R jonathan@astrazeneca.com
To: r-help@R-project.org
Date: Fri, 25 Sep 2009 15:01:46 +0100
Subject: [R] Spliting columns, strings or reg exp returning substrings

Currently, as the first column in a data frame, I have string values in the format xx_yy. I want to create a new column with just the substring xx (for each row in turn). Three possible ways to do this might be:

(1) split the string by '_' using strsplit and paste the first of the resulting variables into a new column, but I have been unable to do this for each row of my data frame in turn (trying to use apply);
(2) split the column into two based on '_', but I am not sure if this is possible;
(3) use a regular expression to return the substring up to the '_', but I am unsure how to make a regular expression return the substring it matches in R.

Any ideas on all three counts would be gratefully received.
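The three approaches listed in the question can each be sketched in base R, without the reshape package. The data frame and its column names (`id`, `prefix1`, `left`, `right`, `prefix3`) are made up for illustration.

```r
df <- data.frame(id = c("ab_cd", "ef_gh", "ij_kl"), stringsAsFactors = FALSE)

# (1) strsplit() returns a list; take the first piece of each element.
df$prefix1 <- sapply(strsplit(df$id, "_", fixed = TRUE), `[`, 1)

# (2) split the column into two columns at once.
parts <- do.call(rbind, strsplit(df$id, "_", fixed = TRUE))
df$left  <- parts[, 1]
df$right <- parts[, 2]

# (3) regular expression: drop the underscore and everything after it.
df$prefix3 <- sub("_.*$", "", df$id)

print(df)
```

Approach (2) as written assumes every value contains exactly one underscore; otherwise the pieces have unequal lengths and rbind() recycles them.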
Re: [R] Adding variables
Hi Jim,

I might be missing something but your command gives the error:

    Error in rowSums(mag) : 'x' must be an array of at least two dimensions

#
data(attenu)
attach(attenu)
rowSums(mag) + rowSums(station)
attenu$new <- rowSums(cbind(mag, station))
#

Thanks

On Sat, Sep 26, 2009 at 4:30 PM, jim holtman jholt...@gmail.com wrote:
> Probably more efficient if you remove the 'cbind' which would create a
> combined matrix. Use the following:
>
> rowSums(mag) + rowSums(station)
>
> On Sat, Sep 26, 2009 at 11:16 AM, tzygmund mcfarlane
> tzygm...@googlemail.com wrote:
>> Hi,
>>
>> For very large matrices, is this the most efficient way to add two
>> variables together?
>>
>> #
>> attach(attenu)
>> new <- rowSums(cbind(mag, station))
>> #
>>
>> Also, could I be directed to some resources for working with very
>> large datasets?
>>
>> Thanks
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
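The error arises because rowSums() expects a two-dimensional object, while `mag` is a plain vector. For two vectors, element-wise `+` gives the same result without building an intermediate matrix. The sketch below uses HYPOTHETICAL numeric columns `a` and `b`, because `station` in the built-in attenu data is a factor and cannot be added meaningfully.

```r
# HYPOTHETICAL numeric columns; note that in the built-in attenu data
# 'station' is a factor, so it is not used here.
set.seed(1)
d <- data.frame(a = rnorm(1e5), b = rnorm(1e5))

# rowSums() needs a two-dimensional input, hence the cbind().
via_rowsums <- rowSums(cbind(d$a, d$b))

# Element-wise addition of two vectors needs neither cbind() nor rowSums().
via_plus <- d$a + d$b

print(all.equal(via_rowsums, via_plus))
```

One difference worth keeping in mind: rowSums() has an na.rm argument, whereas `+` propagates NA.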
[R] Function source: desired characteristics
Hi,

We've been calling the function source (package base) from the Tinn-R editor to send files, marked blocks and selections to the R interpreter because it avoids a lot of problems related to input/output synchronization in the Rgui output. The new RGedit plugin is also using this function in this way. We (Jakson Aquino and I) are just finishing a new version of a plugin for Vim (Vim-R-plugin2) which also uses this resource.

So, we would like to propose two small changes in this function:

1. The max.deparse parameter could be a global option with 150 as the default value. Why? It will avoid the need to send this parameter repeatedly, which causes visual pollution in the console.

2. A new parameter (for example: new.line.echo) to allow the user to define whether a new blank line between the output and the subsequent input is desired when echo=T.

Example: suppose we have in the editor the three lines below:

a=rnorm(10)
a
sort(a)

and we would like to send them to the R interpreter (file, block or selection).

The current output is (using Vim-R-plugin2):
---
> source('/tmp/.Rsource-jcfaria', echo=TRUE, max.deparse=50)

> a=rnorm(10)

> a
 [1]  0.08648104 -1.74996635  0.61027538  0.42042031 -0.02025884 -0.39891256
 [7] -0.30219635 -0.84476668  1.06341674 -0.12030620

> sort(a)
 [1] -1.74996635 -0.84476668 -0.39891256 -0.30219635 -0.12030620 -0.02025884
 [7]  0.08648104  0.42042031  0.61027538  1.06341674

How it could be (desired):
---
> source('/tmp/.Rsource-jcfaria', echo=TRUE)
> a=rnorm(10)
> a
 [1]  0.08648104 -1.74996635  0.61027538  0.42042031 -0.02025884 -0.39891256
 [7] -0.30219635 -0.84476668  1.06341674 -0.12030620
> sort(a)
 [1] -1.74996635 -0.84476668 -0.39891256 -0.30219635 -0.12030620 -0.02025884
 [7]  0.08648104  0.42042031  0.61027538  1.06341674

We think that new.line.echo and max.deparse could both be global options:

max.deparse = 150 (default)
new.line.echo = FALSE (default)

Why? To get a clearer output!

In this way the args of this function would become:
---
function (file, local = FALSE, echo = verbose, print.eval = echo,
    verbose = getOption("verbose"), prompt.echo = getOption("prompt"),
->  max.deparse.length = getOption("max.deparse"),
->  new.line.echo = getOption("new.line.echo"),
    chdir = FALSE, encoding = getOption("encoding"),
    continue.echo = getOption("continue"), skip.echo = 0,
    keep.source = getOption("keep.source"))

The extra \n is located at line 142 of the current source function:

    cat("\n", dep, if (do.trunc)...

For GUI/editor developers these changes would make it possible to send simpler instructions and to build standard interfaces. We think it would be a bad idea to create a custom version of the source function, because these changes would be of benefit to other people and projects.

Is it possible to create the new.line.echo argument and to put it and max.deparse among the global options? We would appreciate the position of users and the Core Team.

All the best,
--
///\\\///\\\///\\\///\\\///\\\///\\\///\\\///\\\
Jose Claudio Faria
Estatistica - prof. Titular
UESC/DCET/Brasil
joseclaudio.fa...@gmail.com
///\\\///\\\///\\\///\\\///\\\///\\\///\\\///\\\
Re: [R] Conditional operation on data frame, shift/roll of vector
On Sep 26, 2009, at 12:52 PM, David Winsemius wrote:

> On Sep 26, 2009, at 11:46 AM, jiangrm wrote:
>
>> Conditionally, when Ind of a certain row is 1, want to get sum or delta of Val in that row and 1 row above.
>>
>>     Val Ind          Val Ind Del
>>      10   0           10   0  NA
>>      11   0           11   0  NA
>>      13   1   --->    13   1  24 or 2
>>      16   0           16   0  NA
>>
>> A simple way I guess is to get shifted vector of Val (say, c(NA, 10, 11, 13)), add to or minus from Val, then logically AND with Ind.
>
> ?diff
>
>     df1 <- data.frame(Val=c(10,11,13,16), Ind=c(0,0,1,0))
>     c(NA, diff(df1$Val))[df1$Ind==1]
>     [1] 2

I suppose I ought to answer the question more fully. One approach using indexing is to use the logical vector produced by df1$Ind==1 on both sides of an assignment operation at once, to determine which of the values of the above set of differences get transferred:

    df1$Del[df1$Ind==1] <- c(NA, diff(df1$Val))[df1$Ind==1]
    df1
      Val Ind Del
    1  10   0  NA
    2  11   0  NA
    3  13   1   2
    4  16   0  NA

ifelse might also provide a solution. Something along the lines of:

    df1$Del3 <- ifelse(df1$Ind == 1,
                       c(NA, df1$Val[2:nrow(df1)] - df1$Val[1:(nrow(df1)-1)]),
                       NA)

But that seems so Baroque that I think you will agree that the indexing method is preferable in this question.

>> Which function provides the shift operation of the vector Val?
>
>     ?"["    # with a suitable index vector
>     ?lag    # for time series
>
>> Also welcomed if any better way to do this. Thanks.

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
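The question also asked for the sum variant (24 in the example). A minimal sketch that builds the shifted vector explicitly and derives both columns from it; the column names Del and Sum and the helper variables prev and hit are just illustrative:

```r
df1 <- data.frame(Val = c(10, 11, 13, 16), Ind = c(0, 0, 1, 0))

prev <- c(NA, head(df1$Val, -1))   # Val shifted down by one row: NA 10 11 13
hit  <- df1$Ind == 1               # rows where the operation applies

df1$Del <- ifelse(hit, df1$Val - prev, NA)   # difference with the row above
df1$Sum <- ifelse(hit, df1$Val + prev, NA)   # sum with the row above
df1
```

Building prev once keeps the "shift" step visible, and the same vector serves both operations.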
Re: [R] Looking for a textbook that is more concise than Applied Linear Statistical Models (2004 version)
Check out Simon Wood's Generalized Additive Models: An Introduction with R. It's actually a lot more than its title suggests, with linear model theory and related use of R in chapter 1 (and GLMs, GAMs, mixed models and GAMMs in subsequent chapters, plus an appendix on matrix algebra). Google for more info.

On Sat, Sep 26, 2009 at 9:45 AM, Peng Yu pengyu...@gmail.com wrote:

> Hi, I know this is a little bit off-topic on this list, but I can't find a more appropriate forum to ask. If there is a high-quality forum for statistics textbook discussion, please let me know.
>
> I am reading Applied Linear Statistical Models. One drawback I see in this book is that it discusses many examples, which is too distracting. Numbers are given in those examples, and comments are buried in them. If I skip the examples, I miss some important points. But if I don't skip them, it takes me too much time to finish the book (this book is 1000 pages long).
>
> However, I feel that the main points in the book could be written concisely in matrix form. Although the book includes a matrix formulation, it doesn't use it extensively. For example, the examples are not written with abstract matrices (I mean just using symbols, such as A, to represent a matrix).
>
> I'm wondering if there is a well-written book that is more concise than Applied Linear Statistical Models but covers roughly the same topics?
>
> Regards, Peng
Re: [R] grep or other complex string matching approach to capture necessary information...
?subset

    problems <- c("Water damage", "Water off", "water pipes damaged", "leaking water")
    damaged <- subset(house_info, house_info[,1]==problems[1] |
                                  house_info[,1]==problems[2] |
                                  house_info[,1]==problems[3] |
                                  house_info[,1]==problems[4])

or am I misunderstanding the question? Or perhaps %in%, which probably does the job more elegantly, but I forget the syntax at the moment.

--- On Fri, 9/25/09, Jason Rupert jasonkrup...@yahoo.com wrote:

> From: Jason Rupert jasonkrup...@yahoo.com
> Subject: [R] grep or other complex string matching approach to capture necessary information...
> To: R-help@r-project.org
> Received: Friday, September 25, 2009, 1:58 PM
>
> Say I have the following data:
>
>     house_number <- floor(runif(100, 200, 600))
>     water_evaluation <- c("No water damage", "Water damage", "Water On", "Water off", "water pipes damaged", "leaking water")
>     water_evaluation_selection <- floor(runif(100, 1, 6))
>     house_info <- data.frame(water_evaluation[water_evaluation_selection], house_number)
>
> And that I only want to pull out the ones with negative water evaluations, i.e. Water damage, water pipes damaged, and leaking water.
>
> Should/could I use grep in order to pull the house numbers out of house_info with those negative water evaluations? I guess I want to know the house numbers from house_info where the water evaluation is negative. Is there a way to use grep or another R function in order to acquire that information?
>
> Thank you again in advance for any insights.
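To make the two options concrete, here is a small sketch of the %in% route next to a grepl() route, using the three "negative" evaluations named in the question. The column name evaluation, the seed, and the regular expression are my own choices for illustration; note also that floor(runif(100, 1, 6)) only produces 1 to 5, so "leaking water" is never drawn from the data as constructed.

```r
set.seed(42)
house_number <- floor(runif(100, 200, 600))
water_evaluation <- c("No water damage", "Water damage", "Water On",
                      "Water off", "water pipes damaged", "leaking water")
sel <- floor(runif(100, 1, 6))
house_info <- data.frame(evaluation = water_evaluation[sel],
                         house_number = house_number,
                         stringsAsFactors = FALSE)

# Exact matching against a known set of strings
negative <- c("Water damage", "water pipes damaged", "leaking water")
by_set <- house_info$house_number[house_info$evaluation %in% negative]

# Pattern matching; the anchors keep "No water damage" from matching
pattern <- "^water damage$|pipes damaged|leaking"
by_grep <- house_info$house_number[grepl(pattern, house_info$evaluation,
                                         ignore.case = TRUE)]
```

When the full set of labels is known in advance, %in% is the simpler and less error-prone choice; grepl() earns its keep when the labels vary in ways a pattern can describe.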
Re: [R] Mixed font in lattice xyplot lables
Hi,

I think you are feeding two expressions to xlab instead of one. Try this instead:

    xyplot(y ~ x, dat, xlab = expression("Moran's " * italic(I)))

HTH,
baptiste

2009/9/26 Andrewjohnclose a.j.cl...@ncl.ac.uk:

> Hi all, can anyone suggest a reason as to why my xlab is plotting this text at opposite ends of the axis? I would like to represent my label like this: Moran's I ...but with the I in italics. For some reason the two parts separate and are positioned at opposite ends of the axis.
>
> Thank you
>
>     library(lattice)
>     dat <- data.frame(x = rnorm(10), y = rnorm(10))
>     xyplot(y ~ x, dat, xlab = expression("Moran's ", italic(I)))
>
> --
> View this message in context: http://www.nabble.com/Mixed-font-in-lattice-xyplot-lables-tp25626332p25626332.html
> Sent from the R help mailing list archive at Nabble.com.
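The difference between the two calls is easy to check without plotting anything: a comma inside expression() creates a second element, while * joins the pieces into a single plotmath expression. A short sketch (the variable names two and one are arbitrary):

```r
two <- expression("Moran's ", italic(I))   # comma: an expression vector of length 2
one <- expression("Moran's " * italic(I))  # '*' juxtaposes the parts: length 1

length(two)
length(one)
```

With two elements, the label is treated as two separate labels, which would explain why they end up at opposite ends of the axis.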
Re: [R] Retain current graphs in figure
Depends on the graphing system. For base graphics have a look at ?points, ?lines and par(new = TRUE) (see ?par) for various options. ggplot2 is designed pretty much to do this, so you might want to have a look at its documentation. Not sure about lattice as I don't use it.

--- On Thu, 9/24/09, Natalie Wong smartcookie...@live.com wrote:

> From: Natalie Wong smartcookie...@live.com
> Subject: [R] Retain current graphs in figure
> To: r-help@r-project.org
> Received: Thursday, September 24, 2009, 11:31 PM
>
> I want to know, how do I retain the current plot and axes properties such that subsequent graphing commands add to the existing graph. Thank you very much!!
>
> --
> View this message in context: http://www.nabble.com/Retain-current-graphs-in-figure-tp25606069p25606069.html
> Sent from the R help mailing list archive at Nabble.com.
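For base graphics, a minimal sketch of the idea: plot() starts a new figure, while lines() and points() draw into the existing one and reuse its axes. The data here are invented, and the pdf(NULL) line is only there so the sketch runs without opening a window; drop it (and dev.off()) to see the plot interactively.

```r
pdf(NULL)                      # null device, no file is written

x <- seq(0, 2 * pi, length.out = 50)
plot(x, sin(x), type = "l", ylim = c(-1, 1))   # sets up the axes
lines(x, cos(x), lty = 2)                      # added to the same axes
points(x[c(1, 25, 50)], sin(x[c(1, 25, 50)]), pch = 19)

invisible(dev.off())
```

Because later layers do not rescale the axes, it helps to set xlim and ylim in the first plot() call so that everything added afterwards fits.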
Re: [R] panel.text question
Hello,

Thanks for your suggestion. It works in my simplified example. However, it didn't work in my real code, probably because I neglected to include the group argument in the example. I apologize for that. Below is the real code; if you need the actual data I can include it too.

    # this works well
    xyplot(PaCO2 ~ time | group, group = animal, layout = c(3,1,1), aspect = 1,
           panel = function(...) {
             panel.loess(...)
             panel.superpose(...)
           },
           data = pig, subset = time > 5 & time < 181,
           xlab = 'Time (minutes)', ylab = 'PaCO2 (mmHg)')

    # this gives the following error in each of the plot panels:
    # Error in using packet 1 data, X argument missing with no default
    xyplot(PaCO2 ~ time | group, group = animal, layout = c(3,1,1), aspect = 1,
           panel = function(x, y, subscripts, ...) {
             panel.loess(...)
             panel.superpose(...)
             panel.text(100, 110, label = c(' ', 'p=0.007', 'p=0.006')[tail(subscripts, 1)])
           },
           data = pig, subset = time > 5 & time < 181,
           xlab = 'Time (minutes)', ylab = 'PaCO2 (mmHg)')

Thanks tremendously for your help. I don't know why it's so hard just to add some text!

Osman

Osman O. Al-Radi, MD, MSc, FRCSC
Staff Cardiovascular Surgeon
Co-medical director, Tissue Bank
The Hospital for Sick Children
University of Toronto, Canada

On Thu, Sep 24, 2009 at 2:18 PM, Henrique Dallazuanna www...@gmail.com wrote:

> Try this:
>
>     xyplot(y ~ x | a,
>            panel = function(x, y, subscripts, ...) {
>              panel.loess(x, y)
>              panel.text(0, 2, label = c('best','better','bad','worst')[tail(subscripts, 1)/100])
>            })
>
> On Thu, Sep 24, 2009 at 2:45 PM, Osman Al-Radi osman.al.r...@gmail.com wrote:
>
>> Dear R-help,
>>
>> I would like to add text to each of four panels in a plot generated by xyplot in the lattice library. A sample code is given below; the plot generated has the first label repeated in all panels! How can I get the labels to be different in each panel?
>>
>>     library(lattice)
>>     x <- rnorm(400)
>>     y <- rnorm(400)
>>     a <- gl(4, 100)
>>     xyplot(y ~ x | a,
>>            panel = function(...) {
>>              panel.loess(...)
>>              panel.text(0, 2, label = c('best','better','bad','worst'))
>>            })
>>
>> Thanks
>> Osman
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
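One likely cause of the error above: once x, y and subscripts are named in the panel function's signature, they are no longer part of `...`, so panel.loess(...) and panel.superpose(...) never receive them. A hedged sketch of one way around it, passing them on explicitly and using lattice's panel.number() to pick the label. The data frame d, its columns, and the label position are made up for illustration; this is not the poster's pig data.

```r
library(lattice)

set.seed(1)
d <- data.frame(x   = rep(1:20, 6),
                grp = gl(3, 40),     # conditioning variable: 3 panels
                id  = gl(6, 20))     # grouping variable within panels
d$y <- rnorm(120)
labs <- c(" ", "p=0.007", "p=0.006")

p <- xyplot(y ~ x | grp, data = d, groups = id, layout = c(3, 1),
            panel = function(x, y, subscripts, ...) {
              panel.loess(x, y, ...)                      # x and y passed on explicitly
              panel.superpose(x, y, subscripts = subscripts, ...)
              panel.text(10, 2, labels = labs[panel.number()])
            })
# print(p) draws the plot
```

Indexing the labels by panel.number() avoids any arithmetic on subscripts, which only works when every panel holds the same number of rows.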
[R] Lattice, stripplot (xyplot), plotting data with median line, numeric x-axis
All,

On p.52 of Deepayan Sarkar's Lattice book there is a nice plot showing residuals with median lines superimposed for various groups:

    library(lattice)
    stripplot(sqrt(abs(residuals(lm(yield ~ variety + year + site)))) ~ site,
              data = barley, groups = year, jitter.data = TRUE,
              type = c("p", "a"), fun = median)

Suppose we wanted to make a similar plot for a numeric x-axis. Is there any way to do this with stripplot, or does one have to use xyplot and presumably panel functionality to get the median line? This does not work:

    barley$site.numeric <- as.numeric(barley$site)
    stripplot(sqrt(abs(residuals(lm(yield ~ variety + year + site)))) ~ site.numeric,
              data = barley, groups = year, jitter.data = TRUE,
              type = c("p", "a"), fun = median)

Any tips much appreciated. For my data I had made my x-axis a factor, but forgot that this doesn't work since the intervals are not equally spaced.

Thanks!
David
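One possibility, offered as an untested-on-real-data sketch rather than a definitive answer: xyplot() accepts the same type = c("p", "a") and fun arguments, and has jitter.x in place of jitter.data, so a numeric x-axis may need no custom panel function at all. The copy b and the column site.numeric are my own names; with real data the numeric column would hold the actual, unequally spaced positions.

```r
library(lattice)

b <- lattice::barley
b$site.numeric <- as.numeric(b$site)   # stand-in for a genuinely numeric x

p <- xyplot(sqrt(abs(residuals(lm(yield ~ variety + year + site)))) ~ site.numeric,
            data = b, groups = year,
            jitter.x = TRUE,            # jitter the points horizontally
            type = c("p", "a"),         # points plus a line through a per-x summary
            fun = median)               # summary used for the "a" line
# print(p) draws the plot
```

If the line does not come out as expected, a custom panel function that calls panel.average(x, y, fun = median, horizontal = FALSE) would be the next thing to try.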
Re: [R] Downloading data from from internet
Bogaso wrote:

Thanks Duncan for your input. However I could not install the package RHTMLForms; it says it is not available:

    install.packages("RHTMLForms", repos = "http://www.omegahat.org/R")
    Warning in install.packages("RHTMLForms", repos = "http://www.omegahat.org/R") :
      argument 'lib' is missing: using 'C:\Users\Arrun's\Documents/R/win-library/2.9'
    Warning message:
    In getDependencies(pkgs, dependencies, available, lib) :
      package ‘RHTMLForms’ is not available

I found this package on the net: http://www.omegahat.org/RHTMLForms/ However it is a gz file, which I could not use as I am a Windows user. Can you please provide me an alternate source?

Hi Bogaso.

Yes, I made the package available in source form with the expectation that people who were interested in using it would find out how to build it for themselves. I have made a binary version of the package available for R-2.9.*, so install.packages() will work for you on Windows.

However, you can use the source form of the package as a Windows user; you just have to install it. That involves finding out how to do this (either with Uwe's Windows package building service or by installing the tools that Brian Ripley and Duncan Murdoch have spent time making available to make this easier).

Generally (i.e. not pointing fingers at anyone in particular), I do wish Windows users would learn how to do things for themselves and not put further burden on people who provide them with free software and free advice to also provide them with binary versions of easily installed packages. It does take time for us to maintain different operating systems and to create binaries. Running Windows and not being able to install R packages from source is a choice, not a technical limitation.

D.

Thanks,

Duncan Temple Lang wrote:

Bogaso wrote:

Thank you so much for the help. However I need a little more help.
In the site "http://www.rateinflation.com/consumer-price-index/usa-historical-cpi.php", if I scroll down there is an option "Historical CPI Index For USA". Next, if I click on "Get Data", another table pops up, without any significant change in the address bar. This table holds more data, starting from 1999. Can you please help me get the values of this table?

Hi again

Well, this is a little bit more involved, as this is an HTML form, and so we need to be able to emulate submitting a form with values for the different parameters the form expects, along with ensuring they are correct inputs. Ordinarily, this would involve looking at the source of the HTML document, finding the relevant form element, getting its action attribute and all its inputs, and figuring out the possible inputs. This is straightforward but involved. But we have an R package that does this reasonably well in an automated form. This is RHTMLForms from the www.omegahat.org/R repository. We can install it with

    install.packages("RHTMLForms", repos = "http://www.omegahat.org/R")

Then

    library(RHTMLForms)
    ff = getHTMLFormDescription("http://www.rateinflation.com/consumer-price-index/usa-historical-cpi.php")

    # The form we want is the third one. We can determine this
    # from the names of the parameters.
    # So we request that this form description be turned into an R function
    g = createFunction(ff[[3]])

    # Now we call this.
    xx = g(2001, 2008)

    # This returns the content of an HTML document
    # so we parse it and then pass this to readHTMLTable()
    # This is why we have methods for
    library(XML)
    doc = htmlParse(xx, asText = TRUE)
    tbls = readHTMLTable(doc)

    # we want the last of the tables.
    tbls[[length(tbls)]]

So hopefully that helps solve your problem and introduces another Omegahat package that we hope people find through Google. The RHTMLForms package is an approach to the poor man's Web services (HTML forms) rather than the REST and SOAP services that are becoming more relevant each day.
The RCurl and SSOAP address the latter. D. Thanks Duncan Temple Lang wrote: Thanks for explaining this, Charlie. Just for completeness and to make things a little easier, the XML package has a function named readHTMLTable() and you can call it with a URL and it will attempt to read all the tables in the page. tbls = readHTMLTable('http://www.rateinflation.com/consumer-price-index/usa-cpi.php') yields a list with 10 elements, and the table of interest with the data is the 10th one. tbls[[10]] The function does the XPath voodoo and sapply() work for you and uses some heuristics. There are various controls one can specify and also various methods for working with sub-parts of the HTML document directly. D. cls59 wrote: Bogaso wrote: Hi all, I want to download data from those two different sources, directly into R : http://www.rateinflation.com/consumer-price-index/usa-cpi.php http://eaindustry.nic.in/asp2/list_d.asp First one is CPI of US and 2nd one is WPI of India. Can
Re: [R] Downloading data from from internet
Duncan Temple Lang wrote: However, you can use the source form of the package as a Windows user; you just have to install it. That involves finding out how to do this (either with Uwe's Windows package building service or by installing the tools that Brian Ripley and Duncan Murdoch have spent time making available to more easily use.) As a footnote to this, the tools required to enable package building on Windows are available at: http://www.murdoch-sutherland.com/Rtools/ Download and run the installer for your version of R. Make sure you allow the installer to modify your PATH. After installing the tools, you should be able to build and install most packages from within R via: install.packages( 'packageName', type = 'source' ) Duncan Temple Lang wrote: Generally (i.e. not pointing fingers at any one in particular), I do wish Windows users would learn how to do things for themselves and not put further burden on people who provide them with free software and free advice to also provide them with binary versions of easily installed packages. It does take time for us to maintain different operating systems and to create binaries. Running Windows and not being able to install R packages from source is a choice, not a technical limitation. D. I echo this sentiment as well-- but personally I believe this is mostly a symptom of Microsoft's decision to provide such a sorry excuse for a command line in Windows. Most Windows users never even consider building from source because it's not something that their operating system is capable of doing out of the box. This problem is further exacerbated by the fact that most IT departments go to such ridiculous lengths to lock their users out of Windows in an attempt to secure it. For example, I couldn't install Rtools on my workstation at the university even if I wanted to-- luckily all of our computers can dual boot into Linux. 
The lack of a decent command line prestocked with common tools, such as Perl and a C compiler, is the main reason I consider Windows an operating system of last resort. Here endeth the rant. -Charlie - Charlie Sharpsteen Undergraduate Environmental Resources Engineering Humboldt State University -- View this message in context: http://www.nabble.com/Downloading-data-from-from-internet-tp25568930p25627641.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Looking for a textbook that is more concise than Applied Linear Statistical Models (2004 version)
Hi, I know this is a little bit off-topic on this list, but I can't find a more appropriate forum to ask. If there is a high-quality forum for statistics textbook discussion, please let me know.

I am reading Applied Linear Statistical Models. One drawback I see in this book is that it discusses many examples, which is too distracting. Numbers are given in those examples, and comments are buried in them. If I skip the examples, I miss some important points. But if I don't skip them, it takes me too much time to finish the book (this book is 1000 pages long).

However, I feel that the main points in the book could be written concisely in matrix form. Although the book includes a matrix formulation, it doesn't use it extensively. For example, the examples are not written with abstract matrices (I mean just using symbols, such as A, to represent a matrix).

I'm wondering if there is a well-written book that is more concise than Applied Linear Statistical Models but covers roughly the same topics?

Regards, Peng
Re: [R] Is there any performance difference between subset() and list comprehension?
Thanks, Stefan. I tested the expressions over a set of data frames of various sizes. The results show that 2) and 3) are faster than 1), especially over a data frame with a large number of columns. The third one is probably the best.

    1) subset(df, V1 > 0, V2)  or  subset(df, V1 > 0, V2)$V2
    2) df[df$V1 > 0.5, "V2"]
    3) df$V2[df$V1 > 0]

== TESTS ==

Each expression was timed with system.time(<expression>, gcFirst=T); the columns are user, system and elapsed time in seconds.

1. test over 100*10 matrix
   df <- as.data.frame.matrix(matrix(runif(1000),100))

    expression                     user  system elapsed
    subset(df, V1 > 0.5, V2)      0.260   0.044   0.302
    subset(df, V1 > 0.5, V2)$V2   0.256   0.044   0.300
    df[df$V1 > 0.5, "V2"]         0.100   0.016   0.117
    df$V2[df$V1 > 0.5]            0.104   0.012   0.117

2. test over 10*100 matrix
   df <- as.data.frame.matrix(matrix(runif(1000),10))

    expression                     user  system elapsed
    subset(df, V1 > 0.5, V2)      0.04    0.00    0.04
    subset(df, V1 > 0.5, V2)$V2   0.040   0.000   0.042
    df[df$V1 > 0.5, "V2"]         0.012   0.000   0.011
    df$V2[df$V1 > 0.5]            0.012   0.000   0.011

3. test over 1*1000 matrix
   df <- as.data.frame.matrix(matrix(runif(1000),1))

    expression                     user  system elapsed
    subset(df, V1 > 0.5, V2)      0.008   0.000   0.008
    subset(df, V1 > 0.5, V2)$V2   0.004   0.000   0.005
    df[df$V1 > 0.5, "V2"]         0.004   0.000   0.001
    df$V2[df$V1 > 0.5]            0.004   0.000   0.001

4. test over 100*10 matrix
   df <- as.data.frame.matrix(matrix(runif(1000),100))

    expression                     user  system elapsed
    subset(df, V1 > 0.5, V2)      0.336   0.000   0.336
    subset(df, V1 > 0.5, V2)$V2   0.332   0.000   0.330
    df[df$V1 > 0.5, "V2"]         0.004   0.000   0.005
    df$V2[df$V1 > 0.5]            0       0       0

5. test over 10*100 matrix
   df <- as.data.frame.matrix(matrix(runif(1000),10))

    expression                     user  system elapsed
    subset(df, V1 > 0.5, V2)     26.698   0.000  26.698
    subset(df, V1 > 0.5, V2)$V2  26.678   0.004  26.678
    df[df$V1 > 0.5, "V2"]         0.060   0.000   0.057
    df$V2[df$V1 > 0.5]            0       0       0

2009/9/26 Stefan Grosse singularit...@gmx.net

> On Sat, 26 Sep 2009 15:26:12 +0900 You Hyun Jo youhyu...@gmail.com wrote:
>
> YHJ> Is there any (performance) difference (except the difference of
> YHJ> the return types)
> YHJ> between the following two computations?
>
> Try it yourself. ?system.time is useful for that purpose.
>
> Stefan
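For anyone who wants to repeat the comparison, here is a compact sketch that times the three forms on one data frame and confirms they return the same values. The matrix size, the seed, the threshold and the use of `>` are assumptions for illustration, not the settings of the tests above.

```r
set.seed(1)
df <- as.data.frame(matrix(runif(1e5 * 10), ncol = 10))   # columns V1..V10

t1 <- system.time(r1 <- subset(df, V1 > 0.5, V2)$V2)   # subset(), then extract
t2 <- system.time(r2 <- df[df$V1 > 0.5, "V2"])         # matrix-style indexing
t3 <- system.time(r3 <- df$V2[df$V1 > 0.5])            # index the column vector

timings <- rbind(subset = t1, bracket = t2, dollar = t3)[, 1:3]
timings
```

The third form only ever touches two column vectors, which fits the observation above that the gap grows with the number of columns.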
[R] questions on csv reading
Hi,

Is there any official way to determine the colClasses of a data.frame? Why does POSIXct have such a strange class structure? Why is colClasses "ordered" not allowed (and doesn't work)?

Background
==========
I am writing a chunked csv reader that provides the functionality of read.table for large files (in the next version of package ff). In chunked reading, one wants to learn the colClasses from the data.frame returned for the first chunk and submit this as argument colClasses= to the following chunks (following calls to read.table). For most column types

    colClasses <- sapply(data.frame, class)

works fine. However, two column types have more than one class:

- ordered has c("ordered", "factor"): currently we can't tell read.table that a column is an ordered factor
- POSIXct has c("POSIXt", "POSIXct"): here the LESS specific class POSIXt is in the first position and would win in class dispatch over the MORE specific class POSIXct. Why?

Jens Oehlschlägel
Re: [R] Adding variables
I assumed (since you did not provide reproducible code) that 'mag' was a matrix. If 'station' is a matrix, then

    mag + rowSums(station)

will work. If that does not work, then you need to tell us what your data objects are.

On Sat, Sep 26, 2009 at 11:39 AM, tzygmund mcfarlane tzygm...@googlemail.com wrote:

> Hi Jim, I might be missing something but your command gives the error:
>
>     Error in rowSums(mag) : 'x' must be an array of at least two dimensions
>
>     #
>     data(attenu)
>     attach(attenu)
>     rowSums(mag) + rowSums(station)
>     attenu$new <- rowSums(cbind(mag, station))
>     #
>
> Thanks
>
> On Sat, Sep 26, 2009 at 4:30 PM, jim holtman jholt...@gmail.com wrote:
>
>> Probably more efficient if you remove the 'cbind', which would create a combined matrix. Use the following:
>>
>>     rowSums(mag) + rowSums(station)
>>
>> On Sat, Sep 26, 2009 at 11:16 AM, tzygmund mcfarlane tzygm...@googlemail.com wrote:
>>
>>> Hi, For very large matrices, is this the most efficient way to add two variables together?
>>>
>>>     #
>>>     attach(attenu)
>>>     new <- rowSums(cbind(mag, station))
>>>     #
>>>
>>> Also, could I be directed to some resources for working with very large datasets?
>>>
>>> Thanks

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
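Since the columns of attenu are plain vectors rather than matrices, ordinary vectorized addition is enough, and rowSums() on a single vector fails exactly as reported. A small sketch; note that I use the numeric column dist in place of station here, because station is a factor in attenu and cannot be summed:

```r
data(attenu)

# Element-wise addition of two numeric columns; no intermediate matrix needed
new_plus <- attenu$mag + attenu$dist

# The cbind/rowSums route gives the same numbers but builds a matrix first
new_rs <- rowSums(cbind(attenu$mag, attenu$dist))

# rowSums() needs at least two dimensions, so a bare vector is an error
res <- try(rowSums(attenu$mag), silent = TRUE)
```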
[R] multiclass SVM (e1071 package): number of estimated models
Hi,

I ran a multiclass SVM on the iris data, which contains 3 classes (manual page 52). According to the manual, the implementation uses the one-against-one approach: k*(k-1)/2 binary classifiers are trained. However, I am getting only two models instead of three (only two columns of support vectors and coefficients). What am I missing?

Thanks a lot for the help,
John

Below is the code.

    library(e1071)
    data(iris)
    x <- subset(iris, select = -Species)
    y <- iris$Species
    model <- svm(x, y)
    model$SV
    model$coefs

--
View this message in context: http://www.nabble.com/multiclass-SVM-%28e1071-package%29%3A-number-of-estimated-models-tp25624020p25624020.html
Sent from the R help mailing list archive at Nabble.com.
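A possible explanation, as far as I understand the storage layout that e1071 inherits from libsvm: the columns of coefs do not correspond to models. Each support vector belongs to one class and takes part in the k - 1 pairwise classifiers involving that class, so coefs has k - 1 columns, while the number of fitted binary classifiers shows up in rho, which holds one intercept per pair. A sketch to check this (guarded so that it does nothing if e1071 is not installed):

```r
if (requireNamespace("e1071", quietly = TRUE)) {
  x <- subset(iris, select = -Species)
  y <- iris$Species
  model <- e1071::svm(x, y)

  k <- nlevels(y)                       # 3 classes
  n_coef_cols <- ncol(model$coefs)      # expected k - 1 columns
  n_models    <- length(model$rho)      # expected k * (k - 1) / 2 intercepts
}
```

So for iris, two coefficient columns and three intercepts together are consistent with three one-against-one classifiers.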
Re: [R] Downloading data from from internet
Here are three different approaches:

1. Using the first link as an example, on Windows you can copy the data and headers from IE (won't work in Firefox) to Excel, and from there to the clipboard again, and then in R:

       library(zoo)
       DF <- read.delim("clipboard")
       z <- zooreg(c(t(DF[5:1, 2:13])), start = as.yearmon("2005-01"), freq = 12)

2. On any platform you can read it straight into R:

       L <- readLines("http://www.rateinflation.com/consumer-price-index/usa-cpi.php")

   and then use the character manipulation functions (grep, sub, gsub, substr) and as.numeric to parse out the data, or

3. on any platform, use the XML package, adapting the code in this post: https://stat.ethz.ch/pipermail/r-help/2009-July/203063.html

On Thu, Sep 24, 2009 at 9:34 AM, Bogaso bogaso.christo...@gmail.com wrote:

> Hi all,
>
> I want to download data from these two different sources directly into R:
>
> http://www.rateinflation.com/consumer-price-index/usa-cpi.php
> http://eaindustry.nic.in/asp2/list_d.asp
>
> The first one is the CPI of the US and the second one is the WPI of India. Can anyone please give any clue how to download them directly into R? I want to make them zoo objects for further analysis.
>
> Thanks,
>
> --
> View this message in context: http://www.nabble.com/Downloading-data-from-from-internet-tp25568930p25568930.html
> Sent from the R help mailing list archive at Nabble.com.
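To give a feel for approach 2, here is a sketch of the character-manipulation step on a tiny inline HTML fragment instead of the live page. The markup and every number in it are invented for illustration; the real page will need its own patterns. Base ts() stands in for zooreg() so that no extra package is required.

```r
# Invented stand-in for lines returned by readLines(); newest year first
html <- c("<table>",
          "<tr><td>2005</td><td>1.5</td><td>2.5</td></tr>",
          "<tr><td>2004</td><td>3.5</td><td>4.5</td></tr>",
          "</table>")

rows  <- grep("<td>", html, fixed = TRUE, value = TRUE)   # keep data rows only
cells <- trimws(gsub("<[^>]+>", " ", rows))               # replace tags by spaces
m <- do.call(rbind, lapply(strsplit(cells, " +"), as.numeric))

# Reverse the rows into chronological order, drop the year column,
# and unroll row by row, as in c(t(DF[5:1, 2:13])) above
z <- ts(c(t(m[2:1, -1])), start = 2004, frequency = 2)
```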
Re: [R] questions on csv reading
2009/9/26 Jens Oehlschlägel oehl_l...@gmx.de:

> Hi,
>
> Is there any official way to determine the colClasses of a data.frame? Why does POSIXct have such a strange class structure? Why is colClasses "ordered" not allowed (and doesn't work)?
>
> Background
> ==========
> I am writing a chunked csv reader that provides the functionality of read.table for large files (in the next version of package ff). In chunked reading, one wants to learn the colClasses from the data.frame returned for the first chunk and submit this as argument colClasses= to the following chunks (following calls to read.table). For most column types
>
>     colClasses <- sapply(data.frame, class)
>
> works fine. However, two column types have more than one class:
>
> ordered has c("ordered", "factor"): currently we can't tell read.table that a column is an ordered factor

Possibly more complex than one would wish, but it is possible to do this:

    Lines <- "A
    B
    D
    C"
    setOldClass("ordered")
    setAs("character", "ordered", function(from) ordered(from))
    DF <- read.table(textConnection(Lines), colClasses = "ordered")
    str(DF)

> POSIXct has c("POSIXt", "POSIXct"): here the LESS specific class POSIXt is in the first position and would win in class dispatch over the MORE specific class POSIXct. Why?

It's a historical error that is too late to correct now. See the discussion in Chambers' recent book.
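For the simple column types, the chunked pattern described in the question can be sketched as follows. The file, its contents and the chunk size are made up, and taking only the first element of class() is a simplification that sidesteps, rather than solves, the multi-class cases discussed above.

```r
# A small temporary csv file standing in for a large one
csv <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:6,
                     x  = c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5),
                     g  = letters[1:6]),
          csv, row.names = FALSE)

# First chunk: let read.csv guess the column types
first <- read.csv(csv, nrows = 3, stringsAsFactors = FALSE)
cls <- vapply(first, function(col) class(col)[1], character(1))

# Second chunk: skip the header plus the 3 rows already read, reuse the classes
rest <- read.csv(csv, skip = 4, nrows = 3, header = FALSE,
                 col.names = names(first), colClasses = cls,
                 stringsAsFactors = FALSE)

all_rows <- rbind(first, rest)
```

Passing colClasses to the later chunks avoids re-guessing the types, which is both faster and guards against a chunk whose values happen to look like a different type.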
Re: [R] R as a web service
Carlos J. Gil Bellosta wrote:
> Dear R-helpers,
> I have been asked about the possibility of developing a web distributed scoring system: a model is created in a central location, users fill a form in their browsers, and the central server calls this model and returns a YES/NO answer to them. I am tempted to use R for this assignment. I have used Rapache for similar tasks, but I am afraid that it is too much of a novelty for many backward-looking IT departments. For a number of reasons, a Java based infrastructure (tomcat, web services, etc.) would be much more palatable for them. My wishlist is as follows:
> * Minimal infrastructure changes in case of (statistical) model updates or changes.
> * Solid management of concurrency, so that simultaneous connections do not interfere with each other.
> * Maximum efficiency, so that new connections do not require a fresh R startup.
> Any ideas on how to achieve this? Any documentation available?

Hi Carlos -- See RWebServices

http://www.bioconductor.org/packages/bioc/html/RWebServices.html

as one possible solution. This produces a Java-based SOAP front end for tomcat (probably good for the IT guys) with tasks dispatched to a series of java-embedded R 'workers' to handle concurrency (probably not so good for the IT guys, as this requires maintaining the infrastructure for the service / worker communication and for handling gracefully the demise of workers). The workers are persistent, and can have their R implementation changed independent of the web service (though that is not necessarily best practice). The relevant vignettes are 'Enabling packages as web services' and 'Installing and testing...'. This is in ongoing development, so use R-devel (and the appropriate RWebServices).

Martin

> Best regards,
> Carlos J. Gil Bellosta http://www.datanalytics.com

-- Martin Morgan Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
[R] implementation of matrix logarithm (inverse of matrix exponential)
Dear R users, Has anyone implemented the inverse of the matrix exponential (expm in the package Matrix)? Matlab has both logm and expm, but there is only expm in R. Cheers, Mimosa
Re: [R] data frame's column names not the same as in CSV
At 1:58 AM -0400 9/26/09, Derek Foo wrote:
> Hello, I am trying to read in a csv file with a column such as \\LS01\Processor(_Total)\% Processor Time with the command read.csv(file). However, the column name in the resulting data frame is changed to X..LS01.Processor._TotalProcessor.Time. Strangely,

Not so strange. Data can be anything, but column names are names of variables. In R, as in most (all? many?) computer languages, variable names have rules they must follow. Yours don't follow R's rules. See Gabor's response to learn how to tell R to ignore the rules (in this particular instance). You will find, however, that later on, when you want to use those variables, it will be more difficult to use variables whose names do not follow the rules.

> when I experimented with just reading the csv with the header flag set to false, the text was read correctly, the same as in the raw file. I am wondering if anyone has encountered a similar problem. If so, I would really appreciate it if you could share your insight.
> Best Regards, Derek

-- Don MacQueen Environmental Protection Department Lawrence Livermore National Laboratory Livermore, CA, USA 925-423-1062
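A small sketch of the renaming behaviour described above. The CSV text is made up, and the use of check.names = FALSE is an assumption about what "telling R to ignore the rules" refers to; Gabor's response is not part of this thread excerpt.

```r
# Made-up CSV content whose first header is not a syntactically valid R name
csv_text <- "\\\\LS01\\Processor(_Total)\\% Processor Time,other\n12.5,1\n40.0,2\n"

# Default: read.csv() passes the header through make.names()
default_names <- names(read.csv(text = csv_text))

# check.names = FALSE keeps the header exactly as it appears in the file
df <- read.csv(text = csv_text, check.names = FALSE)
raw_names <- names(df)

print(default_names)
print(raw_names)

# Non-syntactic names must then be quoted or used as strings when accessed
first_col <- df[[raw_names[1]]]
```

This also illustrates the warning in the reply: with the raw name, every later access needs backticks or `[[`.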
Re: [R] evaluate a set of symbols within an IF statement
?any

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of zubin
Sent: Friday, September 25, 2009 6:00 PM
To: r-help@r-project.org
Subject: [R] evaluate a set of symbols within an IF statement

> Hello, writing some R code to cleanse a data set: if one of the following set of symbols is identified, then perform some actions. Trying to write the minimum code to do this.
>
> tname = "VIX"
> checkticker = c("VIX", "TYX", "TNX", "IRX")
> if (tname == checkticker) {
>   # perform some operations
> }
>
> The result I get is:
>
> tname == checkticker
> [1] TRUE FALSE FALSE FALSE
>
> How do I evaluate this whole list to a single boolean TRUE or FALSE? If any of these are true the whole statement is TRUE, else FALSE. This only seems to work for the first ticker; the rest don't perform the operations within the if block.
>
> tname = "IRX"
> tname == checkticker
> [1] FALSE FALSE FALSE TRUE
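The terse pointer to ?any can be spelled out roughly as follows. This is a minimal sketch using the ticker values from the question; the variable names `matches` and `is_member` are introduced here for illustration.

```r
tname <- "IRX"
checkticker <- c("VIX", "TYX", "TNX", "IRX")

# Element-wise comparison gives one logical value per ticker
matches <- tname == checkticker

# any() collapses the vector to a single TRUE/FALSE, which is what if() expects
if (any(matches)) {
  message(tname, " is in the list")
}

# %in% expresses the same membership test directly
is_member <- tname %in% checkticker
```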
Re: [R] implementation of matrix logarithm (inverse of matrix exponential)
Try: expm( - M)

On Sat, Sep 26, 2009 at 5:06 PM, Mimosa Zeus mimosa1...@yahoo.fr wrote:
> Dear R users, Does anyone has implemented the inverse of the matrix exponential (expm in the package Matrix)? In Matlab, there're logm and expm, there's only expm in R. Cheers Mimosa
Re: [R] evaluate a set of symbols within an IF statement
Hi zubin,

Try also

tname = "VIX"
checkticker = c("VIX", "TYX", "TNX", "IRX")
is.element(tname, checkticker)
# [1] TRUE

HTH,
Jorge
Re: [R] implementation of matrix logarithm (inverse of matrix exponential)
On Sat, 26 Sep 2009, Gabor Grothendieck wrote:
> Try: expm( - M)

Mimosa probably meant to say 'the inverse function'. I do not see one in R.

Chuck

Charles C. Berry (858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
Re: [R] implementation of matrix logarithm (inverse of matrix exponential)
OK. Try this:

library(Matrix)
M <- matrix(c(2, 1, 1, 2), 2); M
     [,1] [,2]
[1,]    2    1
[2,]    1    2

# log of expm(M) is original matrix M
with(eigen(expm(M)), vectors %*% diag(log(values)) %*% t(vectors))
     [,1] [,2]
[1,]    2    1
[2,]    1    2

On Sat, Sep 26, 2009 at 6:24 PM, Charles C. Berry cbe...@tajo.ucsd.edu wrote:
> Mimosa probably meant say 'the inverse function'. I do not see one in R.
> Chuck
Re: [R] implementation of matrix logarithm (inverse of matrix exponential)
On Sat, 26 Sep 2009, Gabor Grothendieck wrote:
> OK. Try this:
>
> library(Matrix)
> M <- matrix(c(2, 1, 1, 2), 2); M
>      [,1] [,2]
> [1,]    2    1
> [2,]    1    2

Right. expm( M ) is diagonalizable. But for

M <- matrix( c(0,1,0,0), 2 )

you get the wrong result. Maybe I should have added that I do not see the machinery in R for dealing with Jordan blocks.

HTH, Chuck

> # log of expm(M) is original matrix M
> with(eigen(expm(M)), vectors %*% diag(log(values)) %*% t(vectors))
>      [,1] [,2]
> [1,]    2    1
> [2,]    1    2

Charles C. Berry (858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
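To make the Jordan-block objection concrete, here is a small base-R sketch. It is an illustration, not part of the original exchange: it avoids the Matrix package by using the fact that this particular M is nilpotent (M %*% M is zero), so the exponential series stops after two terms and the logarithm of I + M is M itself. The names `E`, `eig_log` and `true_log` are introduced here.

```r
# Chuck's example: a nilpotent matrix, so M %*% M is the zero matrix
M <- matrix(c(0, 1, 0, 0), 2)
stopifnot(all(M %*% M == 0))

# Because M %*% M == 0, the exponential series terminates: exp(M) = I + M
E <- diag(2) + M

# Eigen-based "logarithm" exactly as in the earlier reply
eig_log <- with(eigen(E), vectors %*% diag(log(values)) %*% t(vectors))

# For this nilpotent case the log series also terminates: log(I + M) = M
true_log <- M

# The eigen-based result does not recover M
print(eig_log)
print(max(abs(eig_log - true_log)))
```

The eigenvalues of E are both 1, so the eigen-based formula can only see log(1) and loses the off-diagonal information.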
Re: [R] implementation of matrix logarithm (inverse of matrix exponential)
Often one uses matrix logarithms on symmetric positive definite matrices, so the assumption of being symmetric is sufficient in many cases.

On Sat, Sep 26, 2009 at 7:28 PM, Charles C. Berry cbe...@tajo.ucsd.edu wrote:
> Right. expm( M ) is diagonalizable. But for M <- matrix( c(0,1,0,0), 2 ) you get the wrong result. Maybe I should have added that I do not see the machinery in R for dealing with Jordan blocks.
> HTH, Chuck
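For the symmetric positive definite case mentioned above, the eigen-based approach can be wrapped up as in the following sketch. The function names `logm_spd` and `expm_sym` are hypothetical (they are not from the Matrix package), and the companion exponential is built in base R only so that the round trip can be checked without extra packages.

```r
# Matrix logarithm of a symmetric positive definite matrix via its eigendecomposition.
# logm_spd is a hypothetical name, not a function from any package.
logm_spd <- function(A) {
  stopifnot(isSymmetric(A))
  e <- eigen(A, symmetric = TRUE)
  stopifnot(all(e$values > 0))  # log needs strictly positive eigenvalues
  e$vectors %*% diag(log(e$values), nrow(A)) %*% t(e$vectors)
}

# Companion exponential for symmetric input, used only to check the round trip
expm_sym <- function(A) {
  e <- eigen(A, symmetric = TRUE)
  E <- e$vectors %*% diag(exp(e$values), nrow(A)) %*% t(e$vectors)
  (E + t(E)) / 2  # remove rounding-level asymmetry
}

M <- matrix(c(2, 1, 1, 2), 2)
round_trip <- logm_spd(expm_sym(M))
print(all.equal(round_trip, M))
```

The explicit checks make the function fail loudly on input outside its assumptions instead of returning a wrong answer.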
Re: [R] implementation of matrix logarithm (inverse of matrix exponential)
Sylvester's formula (http://en.wikipedia.org/wiki/Sylvester%27s_formula) applies to a square matrix A = S L solve(S), where L = a diagonal matrix and S = matrix of eigenvectors. Let f be an analytic function [for which f(A) is well defined]. Then f(A) = S f(L) solve(S).

We can code this as follows:

sylvester <- function(x, f) {
  n <- nrow(x)
  eig <- eigen(x)
  vi <- solve(eig$vectors)
  with(eig, (vectors * rep(f(values), each = n)) %*% vi)
}
logm <- function(x) sylvester(x, log)

Example:

A <- matrix(1:4, 2)
eA <- expm(A)
logm(eA)

With Chuck Berry's example, we get the following:

M <- matrix( c(0,1,0,0), 2 )
sylvester(M, log)
Error in solve.default(eig$vectors) : system is computationally singular: reciprocal condition number = 1.00208e-292

This is a perfectly sensible answer in this case. We get the same result from sylvester(M, exp), though expm(M) works fine.

A better algorithm for this could be obtained by studying the code for expm in the Matrix package and the references in the associated help page.

Hope this helps.
Spencer

Gabor Grothendieck wrote:
> Often one uses matrix logarithms on symmetric positive definite matrices so the assumption of being symmetric is sufficient in many cases.

-- Spencer Graves, PE, PhD President and Chief Operating Officer Structure Inspection and Monitoring, Inc. 751 Emerson Ct. San José, CA 95126 ph: 408-655-4567
[R] 3D to 2D projection
Is there a method that I can use to convert 3D coordinates into 2D? I was looking at persp and trans3d. Are those the ones I should be looking at? Thanks ../Murli
Re: [R] 3D to 2D projection
On Sep 26, 2009, at 10:07 PM, Nair, Murlidharan T wrote:
> Is there a method that I can use to convert 3D coordinates into 2D?

Yes.

> I was looking at persp and trans3d. Are those the ones I should be looking at ?

Yes.

> Thanks ../Murli

-- David Winsemius, MD Heritage Laboratories West Hartford, CT
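A short sketch of how the two functions named above fit together. The surface, the viewing angles and the projected points are all made up for illustration; only persp() and trans3d() come from the thread.

```r
# Made-up surface to have something to draw
x <- seq(-1, 1, length.out = 20)
y <- x
z <- outer(x, y, function(a, b) a^2 + b^2)

# persp() draws the surface and returns the 4x4 viewing transformation matrix
pmat <- persp(x, y, z, theta = 30, phi = 30)

# trans3d() projects 3D points to the 2D coordinates of that plot
pts2d <- trans3d(x = c(0, 1), y = c(0, 1), z = c(0, 2), pmat = pmat)

# The projected points can be added with ordinary 2D graphics functions
points(pts2d, col = "red", pch = 19)
```

The result of trans3d() is a list with components x and y, so it can be passed to points(), lines() or text() directly.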
[R] Summary/Bootstrap for Design library's lrm function
Can anyone tell me what I might be doing incorrectly for an ordinal logistic regression with lrm? I cannot get R (2.9.1) to run summary, nor will it let me bootstrap to validate.

### Y is a 5 value measure with a range from 1-5, the independent variables are the same. N=75 but when we knock out the NAs it comes down to 51

lrm(formula = Y ~ permemp + rev + gconec + scorpstat, data = data, na.action = na.delete, var.penalty = "simple")

## It will give me coefficients and residuals, but nothing else really. When I try to enter summary it gives me this error message ##

summary(bigassmall)
Error in summary.Design(bigassmall) : could not find function "Varcov"

## So I thought I'd try to find a back door in, manually bootstrapping to verify, then getting values that way, and I get this error message ##

validate(bigassmall, method = "boot", B = 50)
Error in validate.lrm(bigassmall, method = "boot", B = 50) : fit did not use x=T,y=T

Any clue as to what I'm doing wrong? Any help would be much appreciated.

Karl
PhD Student, Political Science University of California at Irvine
[R] Read in multiple datasets
Hello, all: I have twenty datasets named data1.csv, data2.csv, …, data20.csv. I am trying to read all of them into R using a loop and the function read.table(), but I don't know how to handle the names of the datasets. Has anybody encountered a similar problem? Or do you have any suggestions? Your help would be greatly appreciated. Legen
Re: [R] Read in multiple datasets
input <- lapply(1:20, function(.file) read.csv(paste('data', .file, '.csv', sep='')))

This will create a list of 20 elements, with the data frame from each file as one element of the list.

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
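The one-liner above can be tried out end to end with the following sketch. It is only an illustration: it writes three small made-up files to a temporary directory (rather than the twenty real ones), and the names `tmp_dir`, `files` and `combined` are introduced here.

```r
# Write three small made-up files to a temporary directory so the sketch is runnable
tmp_dir <- tempfile("datasets")
dir.create(tmp_dir)
for (i in 1:3) {
  write.csv(data.frame(id = i, value = i * 10),
            file.path(tmp_dir, paste0("data", i, ".csv")),
            row.names = FALSE)
}

# Build the file names, then read each file into one element of a list
files <- file.path(tmp_dir, paste0("data", 1:3, ".csv"))
input <- lapply(files, read.csv)
names(input) <- paste0("data", 1:3)

# Optionally stack the pieces, assuming all files share the same columns
combined <- do.call(rbind, input)
print(combined)
```

Naming the list elements keeps track of which data frame came from which file.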
[R] Converting SAS Data code to R.
I am contemplating bringing in and merging three NHANES-III datasets from the National Center for Health Statistics that are fixed format with record length=3348, line counts around 20,000, and described by SAS DATA steps. I have downloaded and linked similar datasets from the Continuous NHANES public data releases, but never ones with this many variables at once. In the prior effort I managed the task by some cut-paste-editing from the SAS code file into a corresponding read.fwf R call, but the earlier NHANES-III data is far more voluminous than the more recent Continuous version. I am wondering if anyone has experience with such a process and would be willing to share some advice?

The SAS code can be seen here:
ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Datasets/NHANES/NHANESIII/1A/adult.sas

The main code file Data step starts out...

FILENAME ADULT "D:\Questionnaire\DAT\ADULT.DAT" LRECL=3348;
*** LRECL includes 2 positions for CRLF, assuming use of PC SAS;
DATA WORK;
INFILE ADULT MISSOVER;
LENGTH
SEQN 7
DMPFSEQ 5
DMPSTAT 3
DMARETHN 3
DMARACER 3
DMAETHNR 3
HSSEX 3

The corresponding positions in the INPUT section are

INPUT
SEQN 1-5
DMPFSEQ 6-10
DMPSTAT 11
DMARETHN 12
DMARACER 13
DMAETHNR 14
HSSEX 15

The note about CRLF appears to be implying that those characters are being counted as part of the length of the first variable, SEQN, but that there are only 5 meaningful positions. I suppose I can find out by trial and error how to read such files, but it would save me some time if anyone in the audience has worked through this on this data before.

One thought would be to import the data with the SAS work-alike program, WKS (which I have not used before), and then to read it in with read.xport from the foreign library. That would obviate the need to understand the character position issue, but probably has a time commitment to get it up and running and learn how to use it.

Another thought would be to parse the fixed width SAS Data step code into pieces and build a data.frame, and then extract the row.names, col.names, and colClasses from that centralized structure.

David Winsemius, MD Heritage Laboratories West Hartford, CT
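The last idea, a central table of variable names and positions that drives read.fwf, might look like the sketch below. The positions are copied from the INPUT lines quoted above; the two records are made up (they are not NHANES data), and the names `spec`, `records` and `dat` are introduced here for illustration.

```r
# Column positions copied from the INPUT statement quoted above
spec <- data.frame(
  name  = c("SEQN", "DMPFSEQ", "DMPSTAT", "DMARETHN", "DMARACER", "DMAETHNR", "HSSEX"),
  start = c(1, 6, 11, 12, 13, 14, 15),
  end   = c(5, 10, 11, 12, 13, 14, 15),
  stringsAsFactors = FALSE
)

# read.fwf() wants field widths rather than start/end positions
spec$width <- spec$end - spec$start + 1

# Two made-up records (not real NHANES data), 15 characters each
records <- c("000030000111121", "000040000221212")

dat <- read.fwf(textConnection(records),
                widths = spec$width,
                col.names = spec$name)
print(dat)
```

If the INPUT statement skips character positions between variables, read.fwf accepts negative widths for the columns to be skipped, so the same table could be extended with gap entries.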
Re: [R] Summary/Bootstrap for Design library's lrm function
On Sep 26, 2009, at 5:11 PM, kkr...@uci.edu wrote:
> Can anyone tell me what I might be doing incorrectly for an ordinal logistic regression for lrm? I cannot get R(2.9.1)to run either summary nor will it let me bootstrp to validate.
>
> ### Y is a 5 value measure with a range from 1-5, the independent variables are the same. N=75 but when we knock out the NAs it comes down to 51
>
> lrm(formula = Y ~ permemp + rev + gconec + scorpstat, data = data, na.action = na.delete, var.penalty = "simple")
>
> ## It will give me coefficients and residuals, but nothing else really. When I try to enter summary it gives me this error message##
>
> summary(bigassmall)
> Error in summary.Design(bigassmall) : could not find function "Varcov"

Frank has answered this question a couple of times in the last month. He has moved his active effort away from Design over to the rms package. In the process the Varcov function got left out of Design. He posted a replacement. I thought he was going to put it back into a fixed version, so the first thing I would check is whether your version is outdated. If updating Hmisc and Design does not work (and it did work for me), then see Frank's posting:

https://stat.ethz.ch/pipermail/r-help/2009-September/211306.html

... which also worked for me before I updated.

> ##So I thought I'd try to find a back door in, manually bootstrapping to verify then getting values that way and I get this error message##
>
> validate(bigassmall, method = "boot", B = 50)
> Error in validate.lrm(bigassmall, method = "boot", B = 50) : fit did not use x=T,y=T

?lrm

That seems to be a fairly explanatory error message. Looking at your call to lrm, which I infer from your code snippets and error messages was assigned to bigassmall, it certainly does not appear that you have set x=T and y=T.

> Any clue as to what I'm doing wrong? any help would be much appreciated.
> Karl PhD Student, Political Science University of California at Irvine

David Winsemius, MD Heritage Laboratories West Hartford, CT
Re: [R] Teach me how to transpose in R
*Hum*

bbb = t(as.matrix(data2))

? Good luck,
milton

On Sun, Sep 27, 2009 at 12:39 AM, Hyo Lee totem...@gmail.com wrote:
> Hi guys, I need your help!! My goal is to make a csv file from an ncdf file. This is the code I've used:
>
> hyo = open.ncdf("C:/CRUTEM3.nc")
> hyo
> [1] file C:/CRUTEM3.nc has 4 dimensions:
> [1] longitude Size: 72
> [1] latitude Size: 36
> [1] unspecified Size: 1
> [1] t Size: 1916
> [1]
> [1] file C:/CRUTEM3.nc has 1 variables:
> [1] float temp[longitude,latitude,unspecified,t] Longname:Temperature T Missval:2.0004008175e+20
> data2 = get.var.ncdf(hyo)
> write.csv(data2, file = "C:/ple.csv")
>
> But the problem is, I expected this data to be 17000 * 72 (row * col), but it is the other way around: 72 * 17000. Because the maximum column number in Excel is 16383, this csv file doesn't show all the data. Obviously, I need to transpose the matrix. I tried to use the transpose function but failed.
>
> bbb = t(data2)
> Error in t.default(data2) : argument is not a matrix
> ccc = t(hyo)
> ccc
> [1] file has dimensions:
> Error in if (nc$ndims > 0) for (i in 1:nc$ndims) { : argument is of length zero
>
> Teach me how to deal with this problem. Thank you very much.
> -Hyo
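One possible way to get a 72-column table, sketched under assumptions that the thread does not confirm: that data2 is a three-dimensional array ordered [longitude, latitude, t] (which would explain why t() reports that it is not a matrix), and that the desired layout is one column per longitude. A small made-up array stands in for the real data.

```r
# Small stand-in for the array returned by get.var.ncdf(): [longitude, latitude, t]
a <- array(seq_len(4 * 3 * 2), dim = c(4, 3, 2))

# Move longitude to the last dimension, then flatten the other two into rows
m <- matrix(aperm(a, c(2, 3, 1)), ncol = dim(a)[1])
print(dim(m))

# Row index for latitude j at time k is j + (k - 1) * n_latitude
n_lat <- dim(a)[2]
print(m[2 + (2 - 1) * n_lat, 4] == a[4, 2, 2])

# write.csv(m, file = "C:/ple.csv")  # path taken from the original post
```

With the real dimensions this would give 36 * 1916 rows and 72 columns, which stays within the spreadsheet column limit mentioned in the question.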