Re: [R] BUG: atan(1i) / 5 = NaN+Infi ?
On Fri, Sep 13, 2024 at 1:10 PM Duncan Murdoch wrote:

> On 2024-09-13 8:53 a.m., Jonathan Dushoff wrote:
> >> Message: 4
> >> Date: Thu, 12 Sep 2024 11:21:02 -0400
> >> From: Duncan Murdoch
> >> That's not the correct formula, is it? I think the result should be
> >> x * Conj(y) / Mod(y)^2 .
> > Correct, sorry. And thanks.
> >> So that would involve * and / , not just real arithmetic.
> > Not an expert, but I don't see it. Conj and Mod seem to be numerically
> > straightforward real-like operations. We do those, and then multiply
> > one complex number by one real quotient.
> Are you sure? We aren't dealing with real numbers and complex numbers
> here, we're dealing with those sets extended with infinities and other
> weird things.

Definitely not sure, just thought I would suggest it as a possibility.

> So for example if y is some kind of infinite complex number, then 1/y
> should come out to zero, and if x is finite, the final result of x/y
> should be zero.
> But if we evaluate x/y as (x / Mod(y)^2) * Conj(y), won't we get a NaN
> from zero times infinity?

Yes, and it's not trivial to work around, so probably not worth it.

Thanks,

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
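Duncan's zero-times-infinity concern can be made concrete with a small sketch (my own illustration, not from the thread; the outputs in the infinite case depend on how a given build of R handles complex infinities, so none are asserted there):

```r
# the real-arithmetic route discussed above: compute x / y as
# x * Conj(y) / Mod(y)^2
div_real <- function(x, y) x * Conj(y) / Mod(y)^2

# ordinary finite values: agrees with R's own complex division
div_real(1 + 2i, 3 - 4i)   # -0.2+0.4i, same as (1 + 2i) / (3 - 4i)

# an infinite divisor: mathematically x / y should be 0 for finite x,
# but Mod(y)^2 is Inf, and x * Conj(y) mixes 0 * Inf terms, inviting NaNs
y_inf <- complex(real = Inf, imaginary = 0)
div_real(1 + 0i, y_inf)
```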
Re: [R] BUG: atan(1i) / 5 = NaN+Infi ?
> Message: 4
> Date: Thu, 12 Sep 2024 11:21:02 -0400
> From: Duncan Murdoch
> That's not the correct formula, is it? I think the result should be
> x * Conj(y) / Mod(y)^2 .

Correct, sorry. And thanks.

> So that would involve * and / , not just real arithmetic.

Not an expert, but I don't see it. Conj and Mod seem to be numerically
straightforward real-like operations. We do those, and then multiply one
complex number by one real quotient.
Re: [R] BUG: atan(1i) / 5 = NaN+Infi ?
> In this case, I do think we should look into the consequences of indeed
> distinguishing
> *
> and
> /
> from their respective current {1. coerce to complex, 2. use complex arith}
> arithmetic.

I'm wondering whether – if this indeed gets opened up – it might also make
sense to calculate x / y using real arithmetic (as x*y / |y|²)

Jonathan
[R] R
Hi, I found your email on a website. Can I ask some questions about R,
please?

Many thanks,
Jonathan
[R] Best settings for RStudio video recording?
Folks: I was wondering if you all would suggest some helpful RStudio
configurations that make recording a session via e.g. Zoom the most useful
for students doing remote learning. Thoughts?

--j

--
Jonathan A. Greenberg, PhD
Randall Endowed Professor and Associate Professor of Remote Sensing
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Natural Resources & Environmental Science
University of Nevada, Reno
1664 N Virginia St MS/0186, Reno, NV 89557
Phone: 415-763-5476
https://www.gearslab.org/
[R] Checking for a proper "stop" statement...
Folks: Consider the following two use cases:

goodfunction <- function() { stop("Something went wrong...") }
# vs.
badfunction <- function() { notgood() }

Is there a way for me to test if the functions make use of a stop()
statement WITHOUT modifying the stop() output (assume I can't mod the
function containing the stop() statement itself)? For "goodfunction" the
answer is TRUE, for "badfunction" the answer is FALSE. Both return an
error, but only one does it "safely". I thought the answer might lie in a
tryCatch statement but I'm having a hard time figuring out how to do this
test.

--j

--
Jonathan A. Greenberg, PhD
Randall Endowed Professor and Associate Professor of Remote Sensing
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Natural Resources & Environmental Science
University of Nevada, Reno
1664 N Virginia St MS/0186, Reno, NV 89557
Phone: 415-763-5476
http://www.unr.edu/nres
Gchat: jgrn...@gmail.com, Skype: jgrn3007
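One possible approach (my own sketch, not an answer from the list): rather than catching the error at run time, statically walk the function body and report whether stop() is called anywhere in it. This only detects literal stop() calls, not conditions signalled by functions the body calls.

```r
# walk the unevaluated body of f and return TRUE if any call is to stop()
uses_stop <- function(f) {
  found <- FALSE
  walk <- function(e) {
    if (is.call(e)) {
      if (identical(e[[1]], as.name("stop"))) found <<- TRUE
      for (a in as.list(e)[-1]) walk(a)   # recurse into the arguments
    }
  }
  walk(body(f))
  found
}

goodfunction <- function() { stop("Something went wrong...") }
badfunction  <- function() { notgood() }

uses_stop(goodfunction)  # TRUE
uses_stop(badfunction)   # FALSE
```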
[R] [R-pkgs] New package: rDotNet
I’ve published a package on CRAN called ‘rDotNet’. rDotNet allows R to
access .NET libraries. From R one can:

* create .NET objects
* call member functions
* call class functions (i.e. static members)
* access and set properties
* access indexing members

The package will run with either mono on OS X / Linux or the Microsoft
.NET VM on Windows.

Find the source and description of the package at:
https://github.com/tr8dr/.Net-Bridge/blob/master/src/R/rDotNet/

And the CRAN link:
https://cran.r-project.org/web/packages/rDotNet/index.html

The package is stable and has been in use for some years, but is only now
packaged up for public use on CRAN. Feel free to contact me with questions
or suggestions on GitHub or by email.

Regards
--
Jonathan Shore

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages
Re: [R] Odd results from rpart classification tree
Thanks Terry! I managed to figure that out shortly after posting (as is
the way!). Adding an additional covariate that splits below one of the x
branches but not the other, and pushes the class proportion over 0.5,
means the x split is retained.

However, I now have another conundrum, this time with rpart in anova mode...

library(rpart)
test_split <- function(offset) {
  y <- c(rep(0,10), rep(0.5,2)) + offset
  x <- c(rep(0,10), rep(1,2))
  if (is.null(rpart(y ~ x, minsplit=1, cp=0, xval=0)$splits)) 0 else 1
}
sum(replicate(1000, test_split(0)))    # 1000, i.e. always splits
sum(replicate(1000, test_split(0.5)))  # 2-12, i.e. splits only sometimes...

Adding a constant to y and getting different trees is a bit strange,
particularly stochastically. Will see if I can track down a copy of the
CART book.

Jonathan

From: Therneau, Terry M., Ph.D. [thern...@mayo.edu]
Sent: 16 May 2017 00:43
To: r-help@r-project.org; Marshall, Jonathan
Subject: Re: Odd results from rpart classification tree

You are mixing up two of the steps in rpart: 1. how to find the best
candidate split, and 2. evaluation of that split. With the "class" method
we use the information or Gini criterion for step 1. The code finds a
worthwhile candidate split at 0.5 using exactly the calculations you
outline. For step 2 the criterion is the "decision theory" loss. In your
data the estimated rate is 0 for the left node and 15/45 = .333 for the
right node. As a decision rule both predict y=0 (since both are < 1/2).
The split predicts 0 on the left and 0 on the right, so does nothing.

The CART book (Breiman, Friedman, Olshen and Stone), on which rpart is
based, highlights the difference between odds-regression (for which the
final prediction is a percent, and error is Gini) and classification. For
the former, treat y as continuous.

Terry T.

On 05/15/2017 05:00 AM, r-help-requ...@r-project.org wrote:
> The following code produces a tree with only a root. However, clearly
> the tree with a split at x=0.5 is better. rpart doesn't seem to want to
> produce it.
>
> Running the following produces a tree with only root.
>
> y <- c(rep(0,65), rep(1,15), rep(0,20))
> x <- c(rep(0,70), rep(1,30))
> f <- rpart(y ~ x, method='class', minsplit=1, cp=0.0001,
>            parms=list(split='gini'))
>
> Computing the improvement for a split at x=0.5 manually:
>
> n <- length(y)
> obs_L <- y[x<.5]
> obs_R <- y[x>.5]
> n_L <- sum(x<.5)
> n_R <- sum(x>.5)
> gini <- function(p) {sum(p*(1-p))}
> impurity_root <- gini(prop.table(table(y)))
> impurity_L <- gini(prop.table(table(obs_L)))
> impurity_R <- gini(prop.table(table(obs_R)))
> impurity <- impurity_root * n - (n_L*impurity_L + n_R*impurity_R) # 2.880952
>
> Thus, an improvement of 2.88 should result in a split. It does not.
>
> Why?
>
> Jonathan
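Terry's two-step distinction can be seen directly with the data from the original post (a small sketch of my own): with method="class" the candidate split survives step 1 but is discarded at step 2 because both children would predict class 0, while treating y as continuous, as he suggests, keeps the split.

```r
library(rpart)

y <- c(rep(0, 65), rep(1, 15), rep(0, 20))
x <- c(rep(0, 70), rep(1, 30))

# "class" method: the decision-theory evaluation discards the split,
# since both children predict class 0
f_class <- rpart(y ~ x, method = "class", minsplit = 1, cp = 0.0001,
                 parms = list(split = "gini"))
is.null(f_class$splits)   # TRUE: root-only tree

# treat y as continuous: the split at x = 0.5 is retained
f_anova <- rpart(y ~ x, method = "anova", minsplit = 1, cp = 0.0001)
is.null(f_anova$splits)   # FALSE
```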
[R] Odd results from rpart classification tree
The following code produces a tree with only a root. However, clearly the
tree with a split at x=0.5 is better. rpart doesn't seem to want to
produce it.

Running the following produces a tree with only root.

y <- c(rep(0,65), rep(1,15), rep(0,20))
x <- c(rep(0,70), rep(1,30))
f <- rpart(y ~ x, method='class', minsplit=1, cp=0.0001,
           parms=list(split='gini'))

Computing the improvement for a split at x=0.5 manually:

n <- length(y)
obs_L <- y[x<.5]
obs_R <- y[x>.5]
n_L <- sum(x<.5)
n_R <- sum(x>.5)
gini <- function(p) {sum(p*(1-p))}
impurity_root <- gini(prop.table(table(y)))
impurity_L <- gini(prop.table(table(obs_L)))
impurity_R <- gini(prop.table(table(obs_R)))
impurity <- impurity_root * n - (n_L*impurity_L + n_R*impurity_R) # 2.880952

Thus, an improvement of 2.88 should result in a split. It does not.

Why?

Jonathan
[R] [R-pkgs] New package: ggghost 0.1.0 - Capture the spirit of your ggplot2 calls
Greetings, R users!

I am pleased to announce the release of my first CRAN package: ggghost.

https://cran.r-project.org/web/packages/ggghost
https://github.com/jonocarroll/ggghost

Features:

- Minimal user-space overhead for implementation; p %g<% ggplot(dat, aes(x,y))
- ggplot2 components added to the plot object (p <- p + geom_point()) are
  stored in a list within p, and evaluation delayed
- The incoming data is captured and retained for reproducibility
- The list of calls can be added to (+), subtracted from (-, via regex),
  and subset
- The list of calls can be inspected (via summary)
- The data and calls can be recovered from the object p even if removed
  from the workspace.

Provides a solution to a question posed here:
https://twitter.com/JennyBryan/status/755417584359632896

Whether the pun name or the R code came first is a secret that dies with
me. I welcome any feedback or suggestions you may have.

Kind regards,

- Jonathan Carroll.
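Based only on the calls named in the feature list above, a minimal usage sketch might look like this (the exact printing/recovery behaviour is the package's, not shown here):

```r
library(ggplot2)
library(ggghost)

dat <- data.frame(x = 1:10, y = (1:10)^2)

# capture the ggplot call instead of evaluating it immediately
p %g<% ggplot(dat, aes(x, y))
p <- p + geom_point()   # the component is stored, evaluation delayed

summary(p)              # inspect the list of stored calls
p                       # printing finally evaluates the accumulated calls
```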
[R] 2x2x2 rm ANOVA, varying results
Hello,

I ran a 2x2x2 repeated measures ANOVA which turned out fine:

                    Df Sum Sq Mean Sq F value  Pr(>F)
Attend               1 0.5540 0.55402  7.0374 0.01079 *
PercGrp              1 0.0058 0.00580  0.0737 0.78719
Pres                 1 0.1794 0.17944  2.2794 0.13766
Attend:PercGrp       1 0.0017 0.00172  0.0218 0.88324
Attend:Pres          1 0.0189 0.01894  0.2406 0.62598
PercGrp:Pres         1 0.0534 0.05344  0.6789 0.41405
Attend:PercGrp:Pres  1 0.0046 0.00464  0.0590 0.80912
Residuals           48 3.7788 0.07872

However, when I run the main effects alone (from the same dataset), I get
a different set of results than what was originally shown, e.g.

> anova(lm(main ~ Attend + PercGrp + Pres))
Analysis of Variance Table

Response: main
          Df Sum Sq Mean Sq F value   Pr(>F)
Attend     1 0.5540 0.55402  7.4682 0.008561 **
PercGrp    1 0.0058 0.00580  0.0782 0.780849
Pres       1 0.1794 0.17944  2.4189 0.125942
Residuals 52 3.8575 0.07418

I also get different results when I run the interactions alone too.
Curious to know why this is.

Thanks,
Jon
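The df pattern above (48 vs 52 residual df) can be reproduced with made-up balanced data (a sketch only; it uses a plain between-subjects lm rather than the poster's repeated-measures fit, just to show the mechanics): dropping the interaction terms pools their sums of squares into the residual, which changes the residual mean square that every F ratio is divided by.

```r
# hypothetical balanced 2x2x2 data with 56 observations
set.seed(1)
d <- data.frame(A = gl(2, 28), B = gl(2, 14, 56), C = gl(2, 7, 56),
                y = rnorm(56))

full <- anova(lm(y ~ A * B * C, data = d))  # with all interactions
main <- anova(lm(y ~ A + B + C, data = d))  # main effects only

# identical sequential (Type I) sums of squares for the main effects...
all.equal(full["A", "Sum Sq"], main["A", "Sum Sq"])   # TRUE

# ...but different residual df (48 vs 52) and residual mean square,
# hence different F values and p-values for the same effects
c(full["Residuals", "Df"], main["Residuals", "Df"])
```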
[R] Problems with pooling Multiply Imputed datasets of a multilevel logistic model, using MICE
I am having problems with the MICE package in R, particularly with
pooling the imputed data sets. I am running a multilevel binomial logistic
regression, with Level 1 - topic (participant response to 10 questions on
different topics, e.g. T_Darkness, T_Day) nested within Level 2 -
individuals. The model is created using R2MLwiN; the formula is

fit1 <- runMLwiN(c(probit(T_Darkness, cons), probit(T_Day, cons),
                   probit(T_Light, cons), probit(T_Night, cons),
                   probit(T_Rain, cons), probit(T_Rainbows, cons),
                   probit(T_Snow, cons), probit(T_Storms, cons),
                   probit(T_Waterfalls, cons), probit(T_Waves, cons)) ~ 1,
                 D = c("Mixed", "Binomial", "Binomial", "Binomial",
                       "Binomial", "Binomial", "Binomial", "Binomial",
                       "Binomial", "Binomial", "Binomial"),
                 estoptions = list(EstM = 0), data = data)

Unfortunately, there is missing data in all of the Level 1 (topic)
responses. I have been using the mice package (CRAN) to multiply impute
the missing values. I can fit the model to the imputed datasets, using

fitMI <- with(MI.Data, runMLwiN(c(probit(T_Darkness, cons),
                  probit(T_Day, cons), probit(T_Light, cons),
                  probit(T_Night, cons), probit(T_Rain, cons),
                  probit(T_Rainbows, cons), probit(T_Snow, cons),
                  probit(T_Storms, cons), probit(T_Waterfalls, cons),
                  probit(T_Waves, cons)) ~ 1,
                D = c("Mixed", "Binomial", "Binomial", "Binomial",
                      "Binomial", "Binomial", "Binomial", "Binomial",
                      "Binomial", "Binomial", "Binomial"),
                estoptions = list(EstM = 0), data = data))

However, when I come to pool the analyses with

pool(fitMI)

it fails, with the error:

Error in pool(with(tempData, runMLwiN(c(probit(T_Darkness, cons),
probit(T_Day, :  Object has no coef() method.

I am not sure why it is saying there is no coefficient, as the analyses
of the individual MI datasets provide both fixed parts (coefficients) and
random parts (covariances). Any help with what is going wrong would be
much appreciated.
I should warn you that this is my first foray into using R and multilevel
modelling. Also I know there is an MLwiN package ([REALCOM][2]) that can
do this, but I don't have the background to use the MLwiN software
outside of R.

Thanks,
Johnny

R reproducible example

Libraries used:

library(R2MLwiN)
library(mice)

Subset of data:

T_Darkness <- c(0, 1, 0, 0, 0, 0, 0, 1, 0, 0, NA, 0, 0, 0, NA, 1, 0, NA,
  NA, 1, 0, 0, 0, 1, 0, 0, 0, NA, 0, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 1, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, NA, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, NA, 1, 0)
T_Day <- c(0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0,
  NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, NA, 0, 0,
  0, 0, NA, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, NA, NA, 0)
T_Light <- c(0, 0, NA, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1,
  0, 0, 0, 1, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 1, NA, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0)
T_Night <- c(0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0,
  0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
  0, 0, 0, 0, NA, 0, NA, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, NA, 0, 0)
T_Rain <- c(1, 0, 0, 1, 1, 0, 0, NA, 0, 1, 0, 0, 1, 0, 0, 0, 0, NA, 0, 0,
  1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, NA, 0, 0, 0, 0,
  1, 0, 0, 0, NA, 1, NA, 0, 0, 0, 0, 1, NA, 1, 0, 0, 0, 0, 1, NA, 0, 0)
T_Rainbows <- c(1, 1, 1, 1, 0, 1, 0, 1, 0, 1, NA, 1, 1, 0, 0, 1, 0, NA, 0,
  1, 0, NA, 0, 1, 0, 0, 0, 0, 0, NA, 0, 0, 0, NA, 1, 1, 1, 0, 0, 1, 1, 0,
  0, 0, 0, 0, 1, 0, 1, 1, 1, 1, NA, 1, 0, 1, NA, 0, 0, 1, 0, 1, 1, 1, 0, 1)
T_Snow <- c(0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, NA, 0, 0, 1, 0, 0, 0, 0, 0,
  0, 0, 0, 1, 1, 0, 0, 0, NA, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1,
  0, 0, 0, 0, NA, 0, 0, 1, NA, 1, 0, 1, 1, 0, 0, 0, 0, 0, NA, 0, 0, 0)
T_Storms <- c(0, 0, 0, 1, 1, 1, 0, 1, 0, 1, NA, 0, 0, 0, 0, 1, 0, NA, 0,
  0, 1, 0, 0, NA, 1, 1, NA, 0, 0, NA, 0, 1, 0, NA, 1, 0, 1, 0, 0, 0, 0, 0,
  0, 1, 0, 0, 1, 0, 0, 0, 1, 0, NA, 1, 0, NA, 0, 0, 0, 1, 1, 0, 1, NA, NA, 1)
T_Waterfalls <- c(0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0,
  0, 0, 1, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0,
  NA, 0, 0, 0, 0, 0, NA, 0, 1, 0, NA, 1, 0, 1, 0, 0, 0, NA, 0, 0, 0, NA, NA, 0)
T_Waves <- c(0, 1, 0, 1, 1, 0, 1, NA, 0, 0, NA, 0, 0, 0, NA, 1, 0, 0, 0,
  0, 1, 0, NA, 0, NA, 0, 0, NA, 0, 0, 0, 0, 0, 0, NA, 1, 0, 0, 0, 1, 0, 0,
  NA, 0, 1, 0, 0, 0, 0, 0, 1, 1, NA, 1, 1, NA, 0, 0, 0, NA, 0, 0, 0, NA, 0, 0)

data <- data.frame(T_Darkness, T_Day, T_Light, T_Night, T_Rain,
                   T_Rainbows, T_Snow, T_Storms, T_Waterfalls, T_Waves)
data$cons <- 1

Data imputed using mice with:

MI.Data <- mice(data, m = 5, maxit = 50, meth = 'pmm', seed = 500)
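mice's pool() needs the fitted objects to supply coef() and vcov() methods, which the runMLwiN results apparently lack. One workaround (a sketch only; how you extract each fit's estimates and squared standard errors from the runMLwiN objects is left as an assumption to verify) is to combine each parameter by hand with Rubin's rules:

```r
# Rubin's rules for combining one parameter across m imputed-data fits.
# `est` is the vector of m point estimates; `within` the m squared SEs.
pool_rubin <- function(est, within) {
  m     <- length(est)
  qbar  <- mean(est)                 # pooled point estimate
  ubar  <- mean(within)              # average within-imputation variance
  b     <- var(est)                  # between-imputation variance
  total <- ubar + (1 + 1/m) * b      # total variance
  c(estimate = qbar, se = sqrt(total))
}

# toy usage with five hypothetical imputations' estimates and squared SEs
est    <- c(0.51, 0.48, 0.55, 0.50, 0.47)
within <- c(0.04, 0.05, 0.04, 0.05, 0.04)
pool_rubin(est, within)
```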
[R] R Licensing Question
Hello,

I have found a list of all software licenses supported by CRAN at the
following site:

https://svn.r-project.org/R/trunk/share/licenses/license.db

There is also the list of commonly used licenses here:

https://cran.r-project.org/web/licenses/

I have tried to read through some of these licenses, but I am not a
lawyer and some of the legal jargon is difficult to get through. I have a
simple question: are there any packages available on CRAN whose license
requires that every use of the package (e.g. in an analysis) be made open
source as well? I have never heard of this being the case, and it does
not appear to be true for any of the most commonly used licenses, but
from what I understand it would be possible for someone to create a
license with this requirement.

I apologize for the mass email if this is not the best forum for this
question, but I could not find an answer elsewhere.

Thank you,
Jonathan

__
Jonathan Gellar
Statistician
Mathematica Policy Research
1100 First Street NE, 12th Floor
Washington, DC 20002
Re: [R] Replace NaN with value from the same row
Ok, I will do, thanks for your help.

J

> Subject: RE: [R] Replace NaN with value from the same row
> From: jdnew...@dcn.davis.ca.us
> Date: Sun, 18 Oct 2015 12:55:14 -0700
> To: jonathanrear...@outlook.com
> CC: r-help@r-project.org
>
> You should (re-)read the intro document that comes with R, "An
> Introduction to R". Pay particular attention to sections 2., 2.7, and 5.2.
>
> The "idx" variable that I defined is a vector in the current environment
> (in your case apparently a local function environment). It is not a
> column in your data frame. You should look at it using the str function.
> (You might need to print the result of str, or use the debug capability
> of R to single-step through your function and then use str. Read the
> help at ?debug.)
>
> The df[ idx, "offset" ] notation uses the logical indexing and string
> indexing concepts in section 2.7 to select a subset of the rows and one
> column of the data frame.
> ---
> Jeff Newmiller
> Research Engineer (Solar/Batteries/Software/Embedded Controllers)
> ---
> Sent from my phone. Please excuse my brevity.
>
> On October 18, 2015 12:24:42 PM PDT, Jonathan Reardon wrote:
> > Hi, sorry to be a pain. Would you be kind enough to briefly explain
> > what the lines are doing? From what I can gather,
> > 'idx <- is.na( df$mean )' is making a new column called 'idx', finds
> > the NaN values and inserts the boolean TRUE in the respective cell.
> > df[ idx, "mean" ] <- df[ idx, "offset" ]  << I am unsure what this
> > is doing exactly.
> > Jon
> >
> >> Subject: RE: [R] Replace NaN with value from the same row
> >> From: jdnew...@dcn.davis.ca.us
> >> Date: Sun, 18 Oct 2015 12:09:02 -0700
> >> To: jonathanrear...@outlook.com
> >>
> >> The Posting Guide mentioned at the bottom of every email in the list
> >> tells you that such an option is in your email software, which I know
> >> nothing about. Most software lets you choose the format as part of
> >> composing each email, but some software will let you set a default
> >> format to use for each email address (so all your emails to e.g.
> >> r-help@r-project.org will be plain text).
> >>
> >> On October 18, 2015 11:29:51 AM PDT, Jonathan Reardon wrote:
> >> > How do I send an email in plain text format and not HTML?
> >> > I tried:
> >> > idx <- is.na( df$mean )
> >> > df[ idx, "mean" ] <- df[ idx, "offset" ]
> >> > I got the error message:
> >> > In is.na(df$mean) : is.na() applied to non-(list or vector) of type
> >> > 'NULL'
> >> > Jon
> >> >
> >> >> Subject: Re: [R] Replace NaN with value from the same row
> >> >> From: jdnew...@dcn.davis.ca.us
> >> >> Date: Sun, 18 Oct 2015 11:06:44 -0700
> >> >> To: jonathanrear...@outlook.com; r-help@r-project.org
> >> >>
> >> >> Next time send your email using plain text format rather than HTML
> >> >> so we see what you saw.
> >> >>
> >> >> Try
> >> >>
> >> >> idx <- is.na( df$mean )
> >> >> df[ idx, "mean" ] <- df[ idx, "offset" ]
> >> >>
> >> >> BTW there is a commonly-used function called df, so you might
> >> >> improve clarity by using DF for your temporary data frame name.
[R] Replace NaN from 1 column with a value from the same row
Hi everyone,

Ignore my previous post; I realised that the rows and columns I typed into
the email were unreadable, sincere apologies for this.

A simple question, but I cannot figure this out. I have a data frame with
4 columns (onset, offset, outcome, mean):

df <- data.frame(onset   = c(72071, 142598, 293729),
                 offset  = c(72503, 143030, 294161),
                 outcome = c(1, 1, 1),
                 mean    = c(7244615, NaN, 294080))

For each 'NaN' in the mean column, I want to replace that NaN with the
'offset' value in the same row. I tried:

df$mean <- replace(df$mean, is.na(df$mean), df$offset)

but I get the error message: 'number of items to replace is not a
multiple of replacement length'. I'm assuming this is because it is
trying to insert the whole 'offset' column into my one NaN cell. Is this
a correct interpretation of the error message? Can anyone tell me how to
replace any mean-row NaNs with the offset value from that very same row?
I don't want to use any pasting etc. as this needs to be used as part of
a function working over a larger data set than the one shown here.

Cheers,
Jonathan
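The logical-indexing answer given later in this thread can be sketched against exactly this data frame:

```r
df <- data.frame(onset   = c(72071, 142598, 293729),
                 offset  = c(72503, 143030, 294161),
                 outcome = c(1, 1, 1),
                 mean    = c(7244615, NaN, 294080))

idx <- is.na(df$mean)            # TRUE where mean is NaN (is.na() is TRUE for NaN too)
df$mean[idx] <- df$offset[idx]   # copy offset into mean on just those rows
df$mean                          # 7244615 143030 294080
```

Because `idx` is a logical vector recycled over the rows, both sides of the assignment pick out the same rows, so each NaN is replaced by the offset from its own row.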
[R] Replace NaN with value from the same row
Hi everyone,

A simple question, but I cannot figure this out. I have a data frame with
4 columns (onset, offset, outcome, mean):

    onset  offset outcome    mean
8   72071   72503       1 7244615
15 142598  143030       1     NaN
30 293729  294161       1  294080

For each 'NaN' in the mean column, I want to replace that NaN with the
'offset' value in the same row. Intended outcome:

    onset  offset outcome    mean
8   72071   72503       1 7244615
15 142598  143030       1  143030
30 293729  294161       1  294080

I have tried:

df$mean <- replace(df$mean, is.na(df$mean), df$offset)

but I get the error message: 'number of items to replace is not a
multiple of replacement length'. I'm assuming this is because it is
trying to insert the whole 'offset' column into my one NaN cell. Is this
a correct interpretation of the error message? Can anyone tell me how to
replace any mean-row NaNs with the offset value from that very same row?
I don't want to use any pasting etc. as this needs to be used as part of
a function working over a larger dataset than the one shown here.

Cheers,
Jonathan
Re: [R] kknn::predict and kknn$fitted.values
In thinking about this 'problem' last night, I found the 'solution'. Any
NN algorithm needs to keep track of all the data it is given, both X and
Y data, otherwise how could it find and report the nearest neighbour!
When predicting (i.e. predict.kknn) it will find the closest match
(nearest neighbour), which, for a point from the original dataset /is
that point/!

In contrast, the kknn$fitted.values are derived from some cross-validation
approach; likely either finding the nearest point with non-zero distance,
or building a model without that point and seeing where it falls.
Otherwise, it wouldn't be possible to report the accuracy of the model
using only a single dataset.

I will retest the algorithm using a split training/test dataset to better
understand how predict.kknn selects a model from the suite generated by
train.kknn (my original question). I assume it chooses
kknn$best.parameters, but want to verify this.

Hopefully that clarifies the issue. I post here in case future users have
a similar question. Thanks to any who took the time to think about this!

Jonathan
[R] kknn::predict and kknn$fitted.values
I am noticing that there is a difference between the fitted.values
returned by train.kknn, and the values returned using predict with the
same model and dataset. For example:

> data(glass)
> tmp <- train.kknn(Type ~ ., glass, kmax=1, kernel="rectangular", distance=1)
> tmp$fitted.values
[[1]]
  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1
 [62] 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 1 2 2 1 2 2 5 2 2 2 6 2 2 2 2 2 2 2 2 2 2 2 2
[123] 2 2 2 2 3 2 2 2 5 5 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 2 3 3 3 3 3 3 2 3 7 5 5 5 5 5 5 5 5 5 5 2 5 6 6 6 6 6 6 6
[184] 2 6 7 7 2 6 7 7 7 7 7 7 7 7 7 7 7 7 5 7 7 7 7 7 7 7 7 7 7 7 7
attr(,"kernel")
[1] rectangular
attr(,"k")
[1] 1
Levels: 1 2 3 5 6 7

> predict(tmp, glass)
  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [62] 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[123] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 5 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6
[184] 6 6 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7
Levels: 1 2 3 5 6 7

When I check the confusion matrices for these I see that fitted.values is
showing some confusion, that is, as if it were a true fit, whereas
predict is returning the exact answers.

> table(tmp$fitted.values[[1]], glass$Type)

       1  2  3  5  6  7
    1 69  4  0  0  0  0
    2  1 67  2  1  1  1
    3  0  1 15  0  0  0
    5  0  3  0 11  0  1
    6  0  1  0  0  8  1
    7  0  0  0  1  0 26

> table(predict(tmp, glass), glass$Type)

       1  2  3  5  6  7
    1 70  0  0  0  0  0
    2  0 76  0  0  0  0
    3  0  0 17  0  0  0
    5  0  0  0 13  0  0
    6  0  0  0  0  9  0
    7  0  0  0  0  0 29

Can anyone clarify what fitted.values and predict actually do? I would
have expected they would give the same output. Thanks...
Jonathan
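The split-sample retest proposed in the reply above can be sketched like this (my own sketch; it assumes the glass data shipped with kknn, and the split fraction and seed are arbitrary):

```r
library(kknn)

data(glass)   # the dataset used in the question above
set.seed(42)
idx   <- sample(nrow(glass), floor(0.7 * nrow(glass)))
train <- glass[idx, ]
test  <- glass[-idx, ]

fit <- train.kknn(Type ~ ., train, kmax = 1, kernel = "rectangular",
                  distance = 1)

# accuracy on data the model has never seen: an honest estimate, unlike
# predicting back onto the training set (where 1-NN is trivially perfect)
mean(predict(fit, test) == test$Type)
```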
[R] R command to open a file "browser" on Windows and Mac?
Folks: Is there an easy function to open a Finder window (on Mac) or a
Windows Explorer window (on Windows) given an input folder? A lot of
times I want to be able to see my working directory via a file browser.
Is there a good R hack to do this?

--j
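One possible cross-platform sketch (my own, assuming the standard `open` and `xdg-open` shell tools exist on macOS and Linux respectively):

```r
# open `path` in the platform's file browser; defaults to the working dir
open_dir <- function(path = getwd()) {
  sysname <- Sys.info()[["sysname"]]
  if (sysname == "Windows") {
    shell.exec(path)                    # Windows Explorer
  } else if (sysname == "Darwin") {
    system2("open", shQuote(path))      # macOS Finder
  } else {
    system2("xdg-open", shQuote(path))  # most Linux desktop environments
  }
  invisible(path)
}
```

Usage: `open_dir()` pops up the current working directory; `open_dir("~/Documents")` opens a specific folder.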
Re: [R] Correlation question
Of course! Thank you, I knew I was missing something painfully obvious. Its seems, then, that this line 1-sum((cars$dist-fitted.wrong)^2)/sum((cars$dist-mean(cars$dist))^2) is finding something other than the traditional correlation. I found this in a lecture introducing correlation, but , now, I'm not sure what it is. It does do a better job of showing that the fitted.wrong variable is not a good prediction of the distance. On Feb 21, 2015, at 4:36 PM, Kehl Dániel wrote: > Hi, > > try > > cor(fitted.right,fitted.wrong) > > should give 1 as both are a linear function of speed! Hence > cor(cars$dist,fitted.right)^2 and cor(x=cars$dist,y=fitted.wrong)^2 must be > the same. > > HTH > d > > Feladó: R-help [r-help-boun...@r-project.org] ; meghatalmazó: Jonathan > Thayn [jth...@ilstu.edu] > Küldve: 2015. február 21. 22:42 > To: r-help@r-project.org > Tárgy: [R] Correlation question > > I recently compared two different approaches to calculating the correlation > of two variables, and I cannot explain the different results: > > data(cars) > model <- lm(dist~speed,data=cars) > coef(model) > fitted.right <- model$fitted > fitted.wrong <- -17+5*cars$speed > > > When using the OLS fitted values, the lines below all return the same R2 > value: > > 1-sum((cars$dist-fitted.right)^2)/sum((cars$dist-mean(cars$dist))^2) > cor(cars$dist,fitted.right)^2 > (sum((cars$dist-mean(cars$dist))*(fitted.right-mean(fitted.right)))/(49*sd(cars$dist)*sd(fitted.right)))^2 > > > However, when I use my estimated parameters to find the fitted values, > "fitted.wrong", the first equation returns a much lower R2 value, which I > would expect since the fit is worse, but the other lines return the same R2 > that I get when using the OLS fitted values. 
> > 1-sum((cars$dist-fitted.wrong)^2)/sum((cars$dist-mean(cars$dist))^2) > cor(x=cars$dist,y=fitted.wrong)^2 > (sum((cars$dist-mean(cars$dist))*(fitted.wrong-mean(fitted.wrong)))/(49*sd(cars$dist)*sd(fitted.wrong)))^2 > > > I'm sure I'm missing something simple, but can someone explain the difference > between these two methods of finding R2? Thanks. > > Jon
[R] Correlation question
I recently compared two different approaches to calculating the correlation of two variables, and I cannot explain the different results: data(cars) model <- lm(dist~speed,data=cars) coef(model) fitted.right <- model$fitted fitted.wrong <- -17+5*cars$speed When using the OLS fitted values, the lines below all return the same R2 value: 1-sum((cars$dist-fitted.right)^2)/sum((cars$dist-mean(cars$dist))^2) cor(cars$dist,fitted.right)^2 (sum((cars$dist-mean(cars$dist))*(fitted.right-mean(fitted.right)))/(49*sd(cars$dist)*sd(fitted.right)))^2 However, when I use my estimated parameters to find the fitted values, "fitted.wrong", the first equation returns a much lower R2 value, which I would expect since the fit is worse, but the other lines return the same R2 that I get when using the OLS fitted values. 1-sum((cars$dist-fitted.wrong)^2)/sum((cars$dist-mean(cars$dist))^2) cor(x=cars$dist,y=fitted.wrong)^2 (sum((cars$dist-mean(cars$dist))*(fitted.wrong-mean(fitted.wrong)))/(49*sd(cars$dist)*sd(fitted.wrong)))^2 I'm sure I'm missing something simple, but can someone explain the difference between these two methods of finding R2? Thanks. Jon
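A compact demonstration of the point made in the replies: `cor()` is invariant under affine transformations of either argument, so any linear function of `speed` gives the same squared correlation, whereas the 1 - SSE/SST formula equals `cor()^2` only for the least-squares fit because it also penalizes miscalibration:

```r
data(cars)
model <- lm(dist ~ speed, data = cars)
fitted.right <- fitted(model)
fitted.wrong <- -17 + 5 * cars$speed  # a different linear function of speed

# cor() is location/scale invariant, so both fitted series agree:
r2.cor.right <- cor(cars$dist, fitted.right)^2
r2.cor.wrong <- cor(cars$dist, fitted.wrong)^2

# 1 - SSE/SST matches cor^2 only for the OLS fit; the miscalibrated
# predictions have a larger SSE, so this measure drops:
sst <- sum((cars$dist - mean(cars$dist))^2)
r2.sse.right <- 1 - sum((cars$dist - fitted.right)^2) / sst
r2.sse.wrong <- 1 - sum((cars$dist - fitted.wrong)^2) / sst
```

In other words, `cor()^2` measures how well the predictions *could* fit after an optimal linear recalibration; 1 - SSE/SST measures how well they fit as-is.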
[R] prediction intervals for robust regression
I have created robust regression models using least trimmed squares and MM-regression (using the R package robustbase). I am now looking to create prediction intervals for the predicted results. While I have seen some discussion in the literature about confidence intervals on the estimates for robust regression, I haven't had much success in finding out how to create prediction intervals for the results. I was wondering if anyone would be able to provide some direction on how to create these prediction intervals in the robust regression setting. Thanks, Jonathan Burns Sr. Statistician General Dynamics Information Technology Medicare & Medicaid Solutions One West Pennsylvania Avenue Baltimore, MD 21204 (410)-842-1594 jonathan.bur...@gdit.com www.gdit.com
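One common approach (by no means the only one) is a residual-resampling bootstrap. The sketch below uses `MASS::rlm()` as a stand-in for `robustbase::lmrob()`, since MASS ships with R; the same recipe applies to any robust fitter. The choice of `cars`, the prediction point, and the number of replicates are all illustrative assumptions.

```r
# Residual-resampling bootstrap prediction interval for a robust fit.
# MASS::rlm() stands in for robustbase::lmrob(); data and settings are
# illustrative only.
library(MASS)
set.seed(42)
fit   <- rlm(dist ~ speed, data = cars, method = "MM")
newpt <- data.frame(speed = 21)

res <- residuals(fit)
B   <- 200
boot_pred <- replicate(B, {
  # refit on a perturbed response (fitted values + resampled residuals)
  dd <- data.frame(speed = cars$speed,
                   dist  = fitted(fit) + sample(res, replace = TRUE))
  f_star <- rlm(dist ~ speed, data = dd, method = "MM")
  # predicted mean from the refit plus a fresh resampled residual
  # approximates the distribution of a NEW observation at newpt:
  predict(f_star, newpt) + sample(res, 1)
})
pi95 <- quantile(boot_pred, c(0.025, 0.975))
```

This captures both parameter uncertainty (via the refits) and residual scatter (via the added residual), but it inherits the usual caveats of residual bootstraps under heteroscedasticity.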
Re: [R] Using PCA to filter a series
This is exactly what I was looking for. Thank you. Jonathan Thayn On Oct 3, 2014, at 10:32 AM, David L Carlson wrote: > You can reconstruct the data from the first component. Here's an example > using singular value decomposition on the original data matrix: > >> d <- cbind(d1, d2, d3, d4) >> d.svd <- svd(d) >> new <- d.svd$u[,1] * d.svd$d[1] > > new is basically your cp1. If we multiply it by each of the loadings, we can > create reconstructed values based on the first component: > >> dnew <- sapply(d.svd$v[,1], function(x) new * x) >> round(head(dnew), 1) > [,1] [,2] [,3] [,4] > [1,] 119.3 134.1 135.7 134.6 > [2,] 104.2 117.2 118.6 117.6 > [3,] 109.7 123.3 124.8 123.8 > [4,] 109.3 122.9 124.3 123.3 > [5,] 105.8 119.0 120.4 119.4 > [6,] 111.5 125.4 126.9 125.8 >> head(d) > d1 d2 d3 d4 > [1,] 113 138 138 134 > [2,] 108 115 120 115 > [3,] 105 127 129 120 > [4,] 103 127 129 120 > [5,] 109 119 120 117 > [6,] 115 126 126 123 > >> diag(cor(d, dnew)) > [1] 0.9233742 0.9921703 0.9890085 0.9910287 > > Since you want a single variable to stand for all four, you could scale new > to the mean: > >> newd <- new*mean(d.svd$v[,1]) >> head(newd) > [1] 130.9300 114.3972 120.3884 119.9340 116.1588 122.3983 > > ----- > David L Carlson > Department of Anthropology > Texas A&M University > College Station, TX 77840-4352 > > > > -Original Message- > From: Jonathan Thayn [mailto:jth...@ilstu.edu] > Sent: Thursday, October 2, 2014 11:11 PM > To: David L Carlson > Cc: r-help@r-project.org > Subject: Re: [R] Using PCA to filter a series > > I suppose I could calculate the eigenvectors directly and not worry about > centering the time-series, since they essentially the same range to begin > with: > > vec <- eigen(cor(cbind(d1,d2,d3,d4)))$vector > cp <- cbind(d1,d2,d3,d4)%*%vec > cp1 <- cp[,1] > > I guess there is no way to reconstruct the original input data using just the > first component, though, is there? 
Not the original data in its entirety, just > one time-series that is representative of the general pattern. Possibly > something like the following, but with just the first component: > > o <- cp%*%solve(vec) > > Thanks for your help. It's been a long time since I've played with PCA. > > Jonathan Thayn > > > > > On Oct 2, 2014, at 4:59 PM, David L Carlson wrote: > >> I think you want to convert your principal component to the same scale as >> d1, d2, d3, and d4. But the "original space" is a 4-dimensional space in >> which d1, d2, d3, and d4 are the axes, each with its own mean and standard >> deviation. Here are a couple of possibilities >> >> # plot original values for comparison >>> matplot(cbind(d1, d2, d3, d4), pch=20, col=2:5) >> # standardize the pc scores to the grand mean and sd >>> new1 <- scale(pca$scores[,1])*sd(c(d1, d2, d3, d4)) + mean(c(d1, d2, d3, >>> d4)) >>> lines(new1) >> # Use least squares regression to predict the row means for the original >> four variables >>> new2 <- predict(lm(rowMeans(cbind(d1, d2, d3, d4))~pca$scores[,1])) >>> lines(new2, col="red") >> >> ----- >> David L Carlson >> Department of Anthropology >> Texas A&M University >> College Station, TX 77840-4352 >> >> >> >> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On >> Behalf Of Don McKenzie >> Sent: Thursday, October 2, 2014 4:39 PM >> To: Jonathan Thayn >> Cc: r-help@r-project.org >> Subject: Re: [R] Using PCA to filter a series >> >> >> On Oct 2, 2014, at 2:29 PM, Jonathan Thayn wrote: >> >>> Hi Don. I would like to "de-rotate" the first component back to its >>> original state so that it aligns with the original time-series. My goal is >>> to create a "cleaned", or a "model" time-series from which noise has been >>> removed. >> >> Please cc the list with replies. It's considered courtesy plus you'll get >> more help that way than just from me. >> >> Your goal sounds almost metaphorical, at least to me. 
Your first axis >> "aligns" with the original time series already in that it captures the >> dominant variation >> across all four. Beyond that, there are many approaches to signal/noise >> relations within time-series analysis. I am not a good source of help on >> these, and you probably need a statistical consult
Re: [R] Using PCA to filter a series
I suppose I could calculate the eigenvectors directly and not worry about centering the time-series, since they essentially cover the same range to begin with: vec <- eigen(cor(cbind(d1,d2,d3,d4)))$vector cp <- cbind(d1,d2,d3,d4)%*%vec cp1 <- cp[,1] I guess there is no way to reconstruct the original input data using just the first component, though, is there? Not the original data in its entirety, just one time-series that is representative of the general pattern. Possibly something like the following, but with just the first component: o <- cp%*%solve(vec) Thanks for your help. It's been a long time since I've played with PCA. Jonathan Thayn On Oct 2, 2014, at 4:59 PM, David L Carlson wrote: > I think you want to convert your principal component to the same scale as d1, > d2, d3, and d4. But the "original space" is a 4-dimensional space in which > d1, d2, d3, and d4 are the axes, each with its own mean and standard > deviation. Here are a couple of possibilities > > # plot original values for comparison >> matplot(cbind(d1, d2, d3, d4), pch=20, col=2:5) > # standardize the pc scores to the grand mean and sd >> new1 <- scale(pca$scores[,1])*sd(c(d1, d2, d3, d4)) + mean(c(d1, d2, d3, d4)) >> lines(new1) > # Use least squares regression to predict the row means for the original four > variables >> new2 <- predict(lm(rowMeans(cbind(d1, d2, d3, d4))~pca$scores[,1])) >> lines(new2, col="red") > > - > David L Carlson > Department of Anthropology > Texas A&M University > College Station, TX 77840-4352 > > > > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf Of Don McKenzie > Sent: Thursday, October 2, 2014 4:39 PM > To: Jonathan Thayn > Cc: r-help@r-project.org > Subject: Re: [R] Using PCA to filter a series > > > On Oct 2, 2014, at 2:29 PM, Jonathan Thayn wrote: > >> Hi Don. I would like to "de-rotate" the first component back to its original >> state so that it aligns with the original time-series. 
My goal is to create >> a "cleaned", or a "model" time-series from which noise has been removed. > > Please cc the list with replies. It's considered courtesy plus you'll get > more help that way than just from me. > > Your goal sounds almost metaphorical, at least to me. Your first axis > "aligns" with the original time series already in that it captures the > dominant variation > across all four. Beyond that, there are many approaches to signal/noise > relations within time-series analysis. I am not a good source of help on > these, and you probably need a statistical consult (locally?), which is not > the function of this list. > >> >> >> Jonathan Thayn >> >> >> >> On Oct 2, 2014, at 2:33 PM, Don McKenzie wrote: >>> >>> On Oct 2, 2014, at 12:18 PM, Jonathan Thayn wrote: >>> >>>> I have four time-series of similar data. I would like to combine these >>>> into a single, clean time-series. I could simply find the mean of each >>>> time period, but I think that using principal components analysis should >>>> extract the most salient pattern and ignore some of the noise. 
I can >>>> compute components using princomp >>>> >>>> >>>> d1 <- c(113, 108, 105, 103, 109, 115, 115, 102, 102, 111, 122, 122, 110, >>>> 110, 104, 121, 121, 120, 120, 137, 137, 138, 138, 136, 172, 172, 157, 165, >>>> 173, 173, 174, 174, 119, 167, 167, 144, 170, 173, 173, 169, 155, 116, 101, >>>> 114, 114, 107, 108, 108, 131, 131, 117, 113) >>>> d2 <- c(138, 115, 127, 127, 119, 126, 126, 124, 124, 119, 119, 120, 120, >>>> 115, 109, 137, 142, 142, 143, 145, 145, 163, 169, 169, 180, 180, 174, 181, >>>> 181, 179, 173, 185, 185, 183, 183, 178, 182, 182, 181, 178, 171, 154, 145, >>>> 147, 147, 124, 124, 120, 128, 141, 141, 138) >>>> d3 <- c(138, 120, 129, 129, 120, 126, 126, 125, 125, 119, 119, 122, 122, >>>> 115, 109, 141, 144, 144, 148, 149, 149, 163, 172, 172, 183, 183, 180, 181, >>>> 181, 181, 173, 185, 185, 183, 183, 184, 182, 182, 181, 179, 172, 154, 149, >>>> 156, 156, 125, 125, 115, 139, 140, 140, 138) >>>> d4 <- c(134, 115, 120, 120, 117, 123, 123, 128, 128, 119, 119, 121, 121, >>>> 114, 114, 142, 145, 145, 144, 145, 145, 167, 172, 172, 179, 179, 179, 182, >>>> 182, 182, 182, 182, 184, 184, 182, 184, 183, 183, 181, 179, 172, 149, 149, >>>> 149, 149, 124, 124, 119, 131, 135, 135, 134) >>>> >>>> >>>>
[R] Using PCA to filter a series
I have four time-series of similar data. I would like to combine these into a single, clean time-series. I could simply find the mean of each time period, but I think that using principal components analysis should extract the most salient pattern and ignore some of the noise. I can compute components using princomp d1 <- c(113, 108, 105, 103, 109, 115, 115, 102, 102, 111, 122, 122, 110, 110, 104, 121, 121, 120, 120, 137, 137, 138, 138, 136, 172, 172, 157, 165, 173, 173, 174, 174, 119, 167, 167, 144, 170, 173, 173, 169, 155, 116, 101, 114, 114, 107, 108, 108, 131, 131, 117, 113) d2 <- c(138, 115, 127, 127, 119, 126, 126, 124, 124, 119, 119, 120, 120, 115, 109, 137, 142, 142, 143, 145, 145, 163, 169, 169, 180, 180, 174, 181, 181, 179, 173, 185, 185, 183, 183, 178, 182, 182, 181, 178, 171, 154, 145, 147, 147, 124, 124, 120, 128, 141, 141, 138) d3 <- c(138, 120, 129, 129, 120, 126, 126, 125, 125, 119, 119, 122, 122, 115, 109, 141, 144, 144, 148, 149, 149, 163, 172, 172, 183, 183, 180, 181, 181, 181, 173, 185, 185, 183, 183, 184, 182, 182, 181, 179, 172, 154, 149, 156, 156, 125, 125, 115, 139, 140, 140, 138) d4 <- c(134, 115, 120, 120, 117, 123, 123, 128, 128, 119, 119, 121, 121, 114, 114, 142, 145, 145, 144, 145, 145, 167, 172, 172, 179, 179, 179, 182, 182, 182, 182, 182, 184, 184, 182, 184, 183, 183, 181, 179, 172, 149, 149, 149, 149, 124, 124, 119, 131, 135, 135, 134) pca <- princomp(cbind(d1,d2,d3,d4)) plot(pca$scores[,1]) This seems to have created the clean pattern I want, but I would like to project the first component back into the original axes. Is there a simple way to do that? Jonathan B. Thayn
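The projection asked for here can be written directly from what `princomp()` returns: `scores %*% t(loadings)` rotates the components back into the original axes, and adding `pca$center` undoes the centering. The sketch below uses simulated series standing in for d1..d4 so it runs standalone; the rank-1 version is the "denoised" reconstruction.

```r
# Reconstructing data from princomp() components, back in the original axes.
# Simulated series stand in for the d1..d4 vectors from the post.
set.seed(1)
signal <- 120 + 10 * sin(seq(0, 2 * pi, length.out = 52))  # shared pattern
d <- sapply(1:4, function(i) signal + rnorm(52, sd = 3))   # four noisy copies

pca     <- princomp(d)
centers <- matrix(pca$center, nrow(d), ncol(d), byrow = TRUE)

# Sanity check: all components together reproduce the data exactly.
full <- pca$scores %*% t(unclass(pca$loadings)) + centers

# Keeping only component 1 gives a rank-1, denoised version of each series:
rank1 <- outer(pca$scores[, 1], unclass(pca$loadings)[, 1]) + centers
```

Note that the arbitrary sign of the first eigenvector cancels in the outer product, so `rank1` is well defined either way.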
Re: [R] Building R for better performance
All, I've attached the actual benchmark that TACC and I used. I've also attached a paper I wrote covering this in a little more detail. The paper specifies the hardware configuration I used. Let me know if you have any other questions. Regards, Jonathan Anspach Sr. Software Engineer Intel Corp. jonathan.p.ansp...@intel.com 713-751-9460 From: henrik.bengts...@gmail.com [mailto:henrik.bengts...@gmail.com] On Behalf Of Henrik Bengtsson Sent: Thursday, September 11, 2014 9:18 AM To: Anspach, Jonathan P Cc: arnaud gaboury; r-help@r-project.org Subject: Re: [R] Building R for better performance You'll find R-benchmark-25.R, which I assume is the same and the proper pointer to use, at http://r.research.att.com/benchmarks/ Henrik I'm out of the office today, but will resend it tomorrow. Jonathan Anspach Intel Corp. Sent from my mobile phone. On Sep 11, 2014, at 3:49 AM, "arnaud gaboury" <arnaud.gabo...@gmail.com> wrote: >>> I got the benchmark script, which I've attached, from Texas Advanced >>> Computing Center. Here are my results (elapsed times, in secs): > > > Where can we get the benchmark script?
Re: [R] Building R for better performance
Yes, that's the original. Then TACC increased the matrix sizes for their tests. Jonathan Anspach Intel Corp. Sent from my mobile phone. On Sep 11, 2014, at 9:18 AM, "Henrik Bengtsson" <h...@biostat.ucsf.edu> wrote: You'll find R-benchmark-25.R, which I assume is the same and the proper pointer to use, at http://r.research.att.com/benchmarks/ Henrik I'm out of the office today, but will resend it tomorrow. Jonathan Anspach Intel Corp. Sent from my mobile phone. On Sep 11, 2014, at 3:49 AM, "arnaud gaboury" <arnaud.gabo...@gmail.com> wrote: >>> I got the benchmark script, which I've attached, from Texas Advanced >>> Computing Center. Here are my results (elapsed times, in secs): > > > Where can we get the benchmark script?
Re: [R] Building R for better performance
I'm out of the office today, but will resend it tomorrow. Jonathan Anspach Intel Corp. Sent from my mobile phone. On Sep 11, 2014, at 3:49 AM, "arnaud gaboury" wrote: >>> I got the benchmark script, which I've attached, from Texas Advanced >>> Computing Center. Here are my results (elapsed times, in secs): > > > Where can we get the benchmark script?
[R] table over a matrix dimension...
R-helpers: I'm trying to determine the frequency of characters for a matrix applied to a single dimension, and generate a matrix as an output. I've come up with a solution, but it appears inelegant -- I was wondering if there is an easier way to accomplish this task: # Create a matrix of "factors" (characters): random_characters=matrix(sample(letters[1:4],1000,replace=TRUE),100,10) # Applying the table() function doesn't work properly, because not all rows # have ALL of the factors, so I get a list output: apply(random_characters,1,table) # Hacked solution: unique_values = letters[1:4] countsmatrix <- t(apply(random_characters,1,function(x,unique_values) { counts=vector(length=length(unique_values)) for(i in seq(unique_values)) { counts[i] = sum(x==unique_values[i]) } return(counts) }, unique_values=unique_values )) # Gets me the output I want but requires two nested loops (apply and for() ), so # not efficient for very large datasets. ### Is there a more elegant solution to this? --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
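One standard fix for the "ragged table()" problem described above (a sketch; it assumes the level set is known up front, as it is here): coerce each row to a factor with an explicit level set, so `table()` always returns a count, possibly zero, for every level and `apply()` can simplify to a matrix.

```r
set.seed(7)
random_characters <- matrix(sample(letters[1:4], 1000, replace = TRUE), 100, 10)

# factor(x, levels = ...) forces table() to report every level, so each
# row yields a length-4 named vector and apply() binds them into a matrix:
countsmatrix <- t(apply(random_characters, 1,
                        function(x) table(factor(x, levels = letters[1:4]))))
```

Each row of `countsmatrix` sums to the row length (10 here), with columns named a through d.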
[R] StatET and R 3.1.0
R-helpers: I posted a message to the statet listserv, but I thought I'd ask here as well since it is one of the major R developer environments-- has anyone gotten the StatET plugin for Eclipse working with R 3.1.0 yet? Any tricks? I did manage to get rj updated to 2.0 via: install.packages(c("rj", "rj.gd"), repos="http://download.walware.de/rj-2.0", type="source") But the plugin is throwing an error (using the last maintenance update at http://download.walware.de/eclipse-4.3/testing): Launching the R Console was cancelled, because it seems starting the R engine failed. Please make sure that R package 'rj' (1.1 or compatible) is installed and that the R library paths are set correctly for the R environment configuration 'R'. --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
Re: [R] Ignore escape characters in a string...
Thanks all, I'll try some of these suggestions out but it seems like a raw string ability could come in helpful -- there aren't any packages out there that have this capability? --j On Tue, Apr 8, 2014 at 1:23 PM, Jeff Newmiller wrote: > What is wrong with > > winpath <- readLines("clipboard") > > ? > > If you want to show that as a literal in your code, then don't bother > assigning it to a variable, but let it echo to output and copy THAT and put > it in your source code. > > There is also file.choose()... > > --- > Jeff Newmiller > Research Engineer (Solar/Batteries/Software/Embedded Controllers) > --- > Sent from my phone. Please excuse my brevity. > > On April 8, 2014 8:00:03 AM PDT, Jonathan Greenberg wrote: >>R-helpers: >> >>One of the minor irritations I have is copying paths from Windows >>explorer, which look like: >> >>C:\Program Files\R\R-3.0.3 >> >>and using them in a setwd() statement, since the "\" is, of course, >>interpreted as an escape character. I have to, at present, manually >>add in the double slashes or reverse them. >> >>So, I'd like to write a quick function that takes this path: >> >>winpath <- "C:\Program Files\R\R-3.0.3" >> >>and converts it to a ready-to-go R path -- is there a way to have R >>IGNORE escape characters in a character vector? >> >>Alternatively, is there some trick to using a copy/paste from Windows >>explorer I'm not aware of? >> >>--j > -- Jonathan A. 
Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
[R] Ignore escape characters in a string...
R-helpers: One of the minor irritations I have is copying paths from Windows explorer, which look like: C:\Program Files\R\R-3.0.3 and using them in a setwd() statement, since the "\" is, of course, interpreted as an escape character. I have to, at present, manually add in the double slashes or reverse them. So, I'd like to write a quick function that takes this path: winpath <- "C:\Program Files\R\R-3.0.3" and converts it to a ready-to-go R path -- is there a way to have R IGNORE escape characters in a character vector? Alternatively, is there some trick to using a copy/paste from Windows explorer I'm not aware of? --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
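For the record, R itself later addressed the "raw string ability" wished for in this thread: R 4.0.0 (well after this 2014 exchange) added raw string literals, and `chartr()` converts backslashes to the forward slashes R accepts on every platform. A sketch:

```r
# Since R 4.0.0, raw string literals pass backslashes through verbatim,
# so a path pasted from Windows Explorer needs no escaping:
winpath <- r"(C:\Program Files\R\R-3.0.3)"

# Convert to forward slashes, which R accepts on all platforms:
posixpath <- chartr("\\", "/", winpath)

# Interactively, scan() also reads a pasted line with no escape
# processing (uncomment and paste the path at the prompt):
# winpath <- scan(what = "character", n = 1, sep = "\n", quiet = TRUE)
```

On R older than 4.0.0 the raw-string line is a parse error, so the `scan()` route (or `readLines("clipboard")` on Windows, as suggested in the reply) was the workaround at the time.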
[R] numeric to factor via lookup table
R-helpers: Hopefully this is an easy one. Given a lookup table: mylevels <- data.frame(ID=1:10,code=letters[1:10]) And a set of values (note these do not completely cover the mylevels range): values <- c(1,2,5,5,10) How do I convert values to a factor object, using the mylevels to define the correct levels (ID matches the values), and code is the label? --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
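This is exactly what the `levels` and `labels` arguments of `factor()` are for: `levels` declares the valid IDs and `labels` renames them, and all ten levels are retained even though `values` only uses a few of them.

```r
mylevels <- data.frame(ID = 1:10, code = letters[1:10])
values   <- c(1, 2, 5, 5, 10)

# levels= maps each value to its ID; labels= relabels with the codes.
# Unused IDs stay in the level set, as the poster wants.
f <- factor(values, levels = mylevels$ID, labels = mylevels$code)
```

Values outside the `levels` set would become `NA`, which is usually the desired behavior for a lookup.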
[R] Determine breaks based on a break type...
R-helpers: I was wondering, given a vector of data, if there is a way to calculate the break points based on the breaks= parameter from histogram, but skipping all the other calculations (all I want is the breakpoints, not the frequencies). I can, of course, simply run the histogram and extract the breaks component: mybreaks <- hist(runif(100))$breaks But is there a faster way to do this, if this is all I want? --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
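Two incremental improvements are possible here. `hist(..., plot = FALSE)` skips the drawing but still tabulates counts; to skip the counting too, call `pretty()` with the same arguments `hist()` uses internally for its default "Sturges" breaks (this mirrors the code path in `hist.default`):

```r
set.seed(123)
x <- runif(100)

# plot = FALSE avoids drawing but still counts:
b.hist <- hist(x, plot = FALSE)$breaks

# Calling pretty() directly computes only the break points; these are
# the same arguments hist.default passes for its default breaks:
b.direct <- pretty(range(x), n = nclass.Sturges(x), min.n = 1)
```

For other break rules, `nclass.scott()` and `nclass.FD()` play the same role as `nclass.Sturges()`.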
Re: [R] Overriding predict based on newdata...
David: Thanks! Is it generally frowned upon (if I'm incorporating this into a package) to "override" a generic function like "predict", even if I plan on making it a pass-through function (same parameters, and if the data type doesn't match my "weird" data type, it will simply pass the parameters through to the generic S3 "predict")? --j On Mon, Mar 17, 2014 at 4:08 AM, David Winsemius wrote: > S3 classes only dispatch on the basis of the first parameter class. That was > one of the reasons for the development of S4-classed objects. You say you > have the expectation that the object is of a class that has an ordinary > `predict` method, presumably S3 in character, so you probably need to write a > function that will mask the existing method. You would rewrite the existing > test for the existence of 'newdata' and the definition of the new > function would persist through the rest of the session and could be > source()-ed in further sessions. > > -- > David. > > > On Mar 16, 2014, at 2:09 PM, Jonathan Greenberg wrote: > >> R-helpers: >> >> I'm having some trouble with this one -- I figure because I'm a bit of >> a noob with S3 classes... Here's my challenge: I want to write a >> custom predict statement that is triggered based on the presence and >> class of a *newdata* parameter (not the "object" parameter). The >> reason is I am trying to write a custom function based on an oddly >> formatted dataset that has been assigned an R class. If the predict >> function "detects" it (class(newdata) == "myweirdformat") it does a >> conversion of the newdata to what most predict statements expect (e.g. >> a dataframe) and then passes the converted dataset along to the >> generic predict statement. If newdata is missing or is not of the odd >> class it should just pass everything along to the generic predict as >> usual. >> >> What would be the best way to approach this problem? 
Since (my >> understanding) is that predict is dispatched based on the object >> parameter, this is causing me confusion -- my object should still >> remain the model, I'm just allowing a new data type to be fed into the >> predict model(s). >> >> Cheers! >> >> --j >> >> -- >> Jonathan A. Greenberg, PhD >> Assistant Professor >> Global Environmental Analysis and Remote Sensing (GEARS) Laboratory >> Department of Geography and Geographic Information Science >> University of Illinois at Urbana-Champaign >> 259 Computing Applications Building, MC-150 >> 605 East Springfield Avenue >> Champaign, IL 61820-6371 >> Phone: 217-300-1924 >> http://www.geog.illinois.edu/~jgrn/ >> AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 > > David Winsemius > Alameda, CA, USA > -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
[R] Overriding predict based on newdata...
R-helpers: I'm having some trouble with this one -- I figure because I'm a bit of a noob with S3 classes... Here's my challenge: I want to write a custom predict statement that is triggered based on the presence and class of a *newdata* parameter (not the "object" parameter). The reason is I am trying to write a custom function based on an oddly formatted dataset that has been assigned an R class. If the predict function "detects" it (class(newdata) == "myweirdformat") it does a conversion of the newdata to what most predict statements expect (e.g. a dataframe) and then passes the converted dataset along to the generic predict statement. If newdata is missing or is not of the odd class it should just pass everything along to the generic predict as usual. What would be the best way to approach this problem? Since (my understanding) is that predict is dispatched based on the object parameter, this is causing me confusion -- my object should still remain the model, I'm just allowing a new data type to be fed into the predict model(s). Cheers! --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
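A sketch of the masking-wrapper idea discussed in the replies: convert a recognized `newdata` up front, then fall through to the normal S3 dispatch on the model object. The wrapper name `predict_any`, the class name `myweirdformat`, and the converter are all illustrative; a package author would more likely keep the conversion in an `as.data.frame` method for the odd class rather than mask `predict` itself.

```r
# Wrapper that intercepts a recognized newdata class, converts it, and
# delegates to the ordinary predict() dispatch on `object`.
predict_any <- function(object, newdata, ...) {
  if (!missing(newdata) && inherits(newdata, "myweirdformat")) {
    newdata <- as.data.frame(unclass(newdata))  # hypothetical conversion
  }
  if (missing(newdata)) predict(object, ...)
  else predict(object, newdata = newdata, ...)
}

# Toy check with an lm() model and a fake "myweirdformat" object:
m  <- lm(dist ~ speed, data = cars)
nd <- structure(list(speed = c(5, 10)), class = "myweirdformat")
p  <- predict_any(m, nd)
```

Because the wrapper ends by calling the generic `predict()`, every model class that already has a predict method keeps working unchanged.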
Re: [R] Building R for better performance
CE, Sorry for the delay. I haven't installed any additional packages, so I don't know the answer to your question. Let me look into it and get back to you. Regards, Jonathan Anspach Sr. Software Engineer Intel Corp. jonathan.p.ansp...@intel.com 713-751-9460 -Original Message- From: ce [mailto:zadi...@excite.com] Sent: Wednesday, March 05, 2014 8:54 PM To: r-help@r-project.org; Anspach, Jonathan P Subject: Re: [R] Building R for better performance Hi Jonathan, I think most people would be interested in such a tool, because the main complaint about R is its slowness for some operations and big data. Even though the Intel software is paid, I could install it for free since I am not selling any software and work for a non-profit. I compiled it successfully on my openSUSE. My question is: after make install, do I need to give special options to install.packages, or will they be compiled with icc automatically? Regards CE -Original Message- From: "Anspach, Jonathan P" [jonathan.p.ansp...@intel.com] Date: 03/05/2014 12:28 AM To: "r-help@r-project.org" Subject: [R] Building R for better performance Greetings, I'm a software engineer with Intel. Recently I've been investigating R performance on Intel Xeon and Xeon Phi processors and RH Linux. I've also compared the performance of R built with the Intel compilers and Intel Math Kernel Library to a "default" build (no config options) that uses the GNU compilers. To my dismay, I've found that the GNU build always runs on a single CPU core, even during matrix operations. The Intel build runs matrix operations on multiple cores, so it is much faster on those operations. Running the benchmark-2.5 on a 24 core Xeon system, the Intel build is 13x faster than the GNU build (21 seconds vs 275 seconds). Unfortunately, this advantage is not documented anywhere that I can see. Building with the Intel tools is very easy. Assuming the tools are installed in /opt/intel/composerxe, the process is simply (in bash shell): $ . 
/opt/intel/composerxe/bin/compilervars.sh intel64 $ ./configure --with-blas="-L/opt/intel/composerxe/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lm" --with-lapack CC=icc CFLAGS=-O2 CXX=icpc CXXFLAGS=-O2 F77=ifort FFLAGS=-O2 FC=ifort FCFLAGS=-O2 $ make $ make check My questions are: 1) Do most system admins and/or R installers know about this performance difference, and use the Intel tools to build R? 2) Can we add information on the advantage of building with the Intel tools, and how to do it, to the installation instructions and FAQ? I can post my data if anyone is interested. Thanks, Jonathan Anspach Sr. Software Engineer Intel Corp. jonathan.p.ansp...@intel.com 713-751-9460 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
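Since the performance gap above comes almost entirely from the BLAS used for matrix operations, a quick sanity check from within R is to time a single dense cross-product: on a build linked against a multithreaded BLAS (such as MKL) this should be dramatically faster than on a default build with the reference BLAS. A minimal sketch:

```r
# Time one BLAS-heavy operation to compare builds; matrix size is arbitrary.
set.seed(42)
n <- 500
a <- matrix(rnorm(n * n), n, n)
t_cross <- system.time(b <- crossprod(a))["elapsed"]  # b = t(a) %*% a via BLAS
cat("crossprod of a", n, "x", n, "matrix took", t_cross, "seconds\n")
```

On recent R versions, `sessionInfo()` also reports which BLAS/LAPACK libraries the running build is actually using.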
Re: [R] Building R for better performance
Simon, Thanks for the information and links. First of all, did you ever resolve your problem? If not, did you file an issue in Intel Premier Support? That's the best way to bring it to our attention. If you don't want to do that I can try to get a compiler or MKL support engineer to look at your Intel Developer Zone discussion. I have no experience with OS X, so I wouldn't be much help. I got the benchmark script, which I've attached, from Texas Advanced Computing Center. Here are my results (elapsed times, in secs):

                                                       gcc build (default)  icc/MKL build
Creation, transp., deformation of a 5000x5000 matrix          3.25               2.95
5000x5000 normal distributed random matrix ^1000              5.13               1.52
Sorting of 14,000,000 random values                           1.61               1.64
5600x5600 cross-product matrix (b = a' * a)                  97.44               0.56
Linear regr. over a 4000x4000 matrix (c = a \ b')            46.06               0.49
FFT over 4,800,000 random values                              0.65               0.61
Eigenvalues of a 1200x1200 random matrix                      5.55               1.37
Determinant of a 5000x5000 random matrix                     34.18               0.55
Cholesky decomposition of a 6000x6000 matrix                 37.07               0.47
Inverse of a 3200x3200 random matrix                         29.49               0.57
3,500,000 Fibonacci numbers calculation (vector calc)         1.31               0.38
Creation of a 6000x6000 Hilbert matrix (matrix calc)          0.77               0.99
Grand common divisors of 400,000 pairs (recursion)            0.63               0.56
Creation of a 1000x1000 Toeplitz matrix (loops)               2.24               2.34
Escoufier's method on a 90x90 matrix (mixed)                  9.55               6.02
Total                                                       274.93              21.01

Regards, Jonathan Anspach Sr. Software Engineer Intel Corp. jonathan.p.ansp...@intel.com 713-751-9460 -Original Message- From: Simon Zehnder [mailto:szehn...@uni-bonn.de] Sent: Wednesday, March 05, 2014 3:55 AM To: Anspach, Jonathan P Cc: r-help@r-project.org Subject: Re: [R] Building R for better performance Jonathan, I myself tried something like this - comparing gcc, clang and intel on a Mac. 
From my experiences in HPC on the university cluster (where we also use the Xeon Phi, Landeshochleistungscluster University RWTH Aachen), the Intel compiler has better code optimization in regard to vectorisation, etc. (clang is up to now suffering from a not yet implemented OpenMP library). Here is a revolutionanalytics article about this topic: http://blog.revolutionanalytics.com/2010/06/performance-benefits-of-multithreaded-r.html As I usually use the Rcpp package for C++ extensions this could give me further performance. Though, I already failed when trying to compile R with the Intel compiler and linking against the MKL (see my topic in the Intel developer zone: http://software.intel.com/en-us/comment/1767418 and my threads on the R-User list: https://stat.ethz.ch/pipermail/r-sig-mac/2013-November/010472.html). So, to your questions: 1) I think that most admins do not even use the Intel compiler to compile R - this seems to me rare. There are some people I know they do and I think they could be aware of it - but these are only a few. As R is growing in usage and I do know from regional user meetings that very large companies start using it in their BI units - this should be of interest. 2) I would really welcome this step because compilation with intel (especially on a Mac) and linking to the MKL seems to be delicate. I am interested in the data - so if it is possible send it via the list or directly to my account. Fur
[R] Building R for better performance
Greetings, I'm a software engineer with Intel. Recently I've been investigating R performance on Intel Xeon and Xeon Phi processors and RH Linux. I've also compared the performance of R built with the Intel compilers and Intel Math Kernel Library to a "default" build (no config options) that uses the GNU compilers. To my dismay, I've found that the GNU build always runs on a single CPU core, even during matrix operations. The Intel build runs matrix operations on multiple cores, so it is much faster on those operations. Running the benchmark-2.5 on a 24 core Xeon system, the Intel build is 13x faster than the GNU build (21 seconds vs 275 seconds). Unfortunately, this advantage is not documented anywhere that I can see. Building with the Intel tools is very easy. Assuming the tools are installed in /opt/intel/composerxe, the process is simply (in bash shell): $ . /opt/intel/composerxe/bin/compilervars.sh intel64 $ ./configure --with-blas="-L/opt/intel/composerxe/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lm" --with-lapack CC=icc CFLAGS=-O2 CXX=icpc CXXFLAGS=-O2 F77=ifort FFLAGS=-O2 FC=ifort FCFLAGS=-O2 $ make $ make check My questions are: 1) Do most system admins and/or R installers know about this performance difference, and use the Intel tools to build R? 2) Can we add information on the advantage of building with the Intel tools, and how to do it, to the installation instructions and FAQ? I can post my data if anyone is interested. Thanks, Jonathan Anspach Sr. Software Engineer Intel Corp. jonathan.p.ansp...@intel.com 713-751-9460 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] install packages from R-forge SVN
R-helpers: I was curious if anyone developed a package/approach to installing packages directly from the R-forge SVN subsystem (rather than waiting for it to build)? I can, of course, SVN it command line but I was hoping for an install.packages("svn://") sort of approach. Cheers! --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
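R-Forge does expose its nightly builds through an ordinary package repository, so there is no need for an `install.packages("svn://...")` mechanism; you can point the normal machinery at the R-Forge repository URL. A minimal sketch (the package name is a placeholder, and the call is wrapped in a helper purely for illustration):

```r
# Install a package straight from the R-Forge repository rather than CRAN.
# "somePackage" is a hypothetical placeholder name.
install_rforge <- function(pkg) {
  install.packages(pkg, repos = "http://R-Forge.R-project.org")
}

# The contrib URL the call would resolve to, for source packages:
src_url <- contrib.url("http://R-Forge.R-project.org", type = "source")
```

Note this installs the most recent *built* package, so it can still lag the SVN head slightly if a build has not yet run.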
[R] Checking for and adding "..." arguments to a function...
R-helpers: I'm guessing this is an easy one for some of you, but I'm a bit stumped. Given some arbitrary function (doesn't matter what it does): myfunction <- function(a,b,c) { return(a+b+c) } I want to test this function for the presence of the ellipses ("...") and, if they are missing, create a new function that has them: myfunction <- function(a,b,c,...) { return(a+b+c) } So, 1) how do I test for whether a function has an ellipses argument and, 2) how do I "append" the ellipses to the argument list if they do exist? Note that the test/modification should be done without invoking the function, e.g. I'm not asking how to test for this WITHIN the function, I'm asking how to test "myfunction" directly as an R object. Thanks! --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
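Both steps asked about above can be done by manipulating the function's formal argument list directly, without ever invoking the function: `formals()` answers question 1, and assigning an extended `formals()` (using `alist(... = )` for the valueless dots argument) answers question 2. A minimal sketch:

```r
# 1) Test whether a function object has a ... argument.
has_dots <- function(f) "..." %in% names(formals(f))

# 2) Return a copy of the function with ... appended if it was missing.
#    alist(... = ) creates the argument with no default value.
add_dots <- function(f) {
  if (!has_dots(f)) formals(f) <- c(formals(f), alist(... = ))
  f
}

myfunction <- function(a, b, c) a + b + c
myfunction2 <- add_dots(myfunction)   # now function(a, b, c, ...)
```

Because `add_dots()` works on the function as an R object, the body is untouched; the new `...` simply absorbs extra arguments the body never uses.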
Re: [R] Official way to set/retrieve options in packages?
Wanted to re-start this thread a bit, since I'm still not exactly sure the best approach to my problem -- basically, the parameters I'm try to make persistent are installation locations of a particular command line program that is not installed along with an R package I'm working on (GDAL, for those of you who are interested in the specifics). The function tries to dummy-proof this process by doing a (mostly) brute-force search of the user's drive for the program location the first time it executes, and then stores this information (the path to a given executable) in an option for use with other functions. This search process can take some time, so I'd prefer to have this option set in a semi-permanent way (so it persists between sessions). Now, Brian Ripley suggested modifying the .Rprofile, but Bert Guntner suggested this might not be a welcome behavior. Given that, on an operating system level, there are often per-program directories for preferences, would it follow that it might make sense to store package-options in some standardized location? If so, where might this be? Would it make sense to drop then in the package directory? Is this a discussion that should move over to r-developers? --j On Sat, Jun 1, 2013 at 4:57 PM, Prof Brian Ripley wrote: > On 01/06/2013 22:44, Anthony Damico wrote: >> hope this helps.. :) >> >> # define an object `x` >> x <- list( "any value here" , 10 ) >> >> # set `myoption` to that object >> options( "myoption" = x ) >> >> # retrieve it later (perhaps within a function elsewhere in the package) >> ( y <- getOption( myoption ) ) >> >> >> it's nice to name your options `mypackage.myoption` so users know what >> package the option is associated with in case they type `options()` >> >> >> here's the `.onLoad` function in the R survey package. 
notice how the >> options are only set *if* they don't already exist-- > > But a nicer convention is that used by most packages in R itself: if the > option is not set, the function using it assumes a suitable default. > That would make sense for all the FALSE defaults below. > > Note though that this is not 'persistent': users have to set options in > their startup files (see ?Startup). There is no official location to > store package configurations. Users generally dislike software saving > settings in their own file space so it seems very much preferable to use > the standard R mechanisms (.Rprofile etc). > >> >>> survey:::.onLoad >> >> function (...) >> { >> if (is.null(getOption("survey.lonely.psu"))) >> options(survey.lonely.psu = "fail") >> if (is.null(getOption("survey.ultimate.cluster"))) >> options(survey.ultimate.cluster = FALSE) >> if (is.null(getOption("survey.want.obsolete"))) >> options(survey.want.obsolete = FALSE) >> if (is.null(getOption("survey.adjust.domain.lonely"))) >> options(survey.adjust.domain.lonely = FALSE) >> if (is.null(getOption("survey.drop.replicates"))) >> options(survey.drop.replicates = TRUE) >> if (is.null(getOption("survey.multicore"))) >> options(survey.multicore = FALSE) >> if (is.null(getOption("survey.replicates.mse"))) >> options(survey.replicates.mse = FALSE) >> } >> >> >> >> >> >> On Sat, Jun 1, 2013 at 4:01 PM, Jonathan Greenberg wrote: >> >>> R-helpers: >>> >>> Say I'm developing a package that has a set of user-definable options that >>> I would like to be persistent across R-invocations (they are saved >>> someplace). Of course, I can create a little text file to be written/read, >>> but I was wondering if there is an "officially sanctioned" way to do this? >>> I see there is an options() and getOptions() function, but I'm unclear how >>> I would use this in my own package to create/save new options for my >>> particular package. Cheers! >>> >>> --j >>> >>> -- >>> Jonathan A. 
Greenberg, PhD >>> Assistant Professor >>> Global Environmental Analysis and Remote Sensing (GEARS) Laboratory >>> Department of Geography and Geographic Information Science >>> University of Illinois at Urbana-Champaign >>> 607 South Mathews Avenue, MC 150 >>> Urbana, IL 61801 >>> Phone: 217-300-1924 >>> http://www.geog.illinois.edu/~jgrn/ >>> AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 >>> >>>
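As a postscript to the thread above: modern R (>= 4.0) did later add a sanctioned per-package location via `tools::R_user_dir()`, which addresses exactly the "where do I persist a slow-to-compute setting" problem. A sketch of caching a discovered executable path between sessions; the package name `"mypkg"` and file name are placeholders:

```r
# Persist a single setting (e.g. a discovered GDAL path) between sessions
# using the per-package config directory R itself designates.
save_pkg_option <- function(value, pkg = "mypkg") {
  dir <- tools::R_user_dir(pkg, which = "config")
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  saveRDS(value, file.path(dir, "gdal_path.rds"))
}

load_pkg_option <- function(pkg = "mypkg") {
  f <- file.path(tools::R_user_dir(pkg, which = "config"), "gdal_path.rds")
  if (file.exists(f)) readRDS(f) else NULL   # NULL triggers a fresh search
}
```

This keeps the expensive drive scan to the first run only, without touching the user's `.Rprofile`.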
[R] Multimodal multidimensional optimization
Hello all I've been performing a series of multidimensional optimizations (3 variables) using the optim() function. Recently, I noticed that the solution is rarely unimodal. Is there a package or function that handles multimodal multidimensional optimizations? I really appreciate any suggestions, I'm quite a bit beyond my expertise here. Thanks. Jonathan B. Thayn, Ph.D. Ridgely Fellow of Geography Department of Geography Geology Illinois State University Felmley Hall of Science, Rm 200A Normal, IL 61790 jth...@ilstu.edu my.ilstu.edu/~jthayn [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
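A common first answer to the question above is a multi-start strategy: run `optim()` from many random starting points and keep the best (or all distinct) local optima. A minimal sketch on a deliberately bimodal 2-D test function (the function and start ranges are illustrative only):

```r
# Multi-start optim(): run from many random starts, keep the best result.
set.seed(1)
f <- function(p) (p[1]^2 - 1)^2 + p[2]^2   # two minima, at (+1, 0) and (-1, 0)

starts <- matrix(runif(40, -2, 2), ncol = 2)          # 20 random 2-D starts
fits   <- apply(starts, 1, function(s) optim(s, f))   # Nelder-Mead by default
best   <- fits[[which.min(sapply(fits, `[[`, "value"))]]
```

Inspecting the full `fits` list (rather than just `best`) reveals how many distinct modes the starts found, which is often the real question with multimodal problems.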
Re: [R] Error: C stack usage is too close to the limit when using list.files()
Thanks all -- ok, so the symbolic link issue is a distinct possibility, but fundamentally doesn't solve the issue since most users will have symbolic links on their machines SOMEPLACE, so a full drive scan will run into these issues -- is list.files calling find, or is it using a different algorithm? This seems like a shortcoming in the list.files algorithm -- is there a better solution (short of a System call, which I'm still not sure will work on Macs without Xcode -- a colleague of mine did NOT have Xcode, and reported not being able to run find from the command line) -- perhaps a different package? --j On Fri, Sep 27, 2013 at 3:08 PM, William Dunlap wrote: > Toss a couple of extra files in there and you will see the output grow > exponentially. > > % touch dir/IMPORTANT_1 dir/subdir/IMPORTANT_2 > > and in R those two new files cause 82 more strings to appear in list.file's > output: > >> nchar(list.files("dir", recursive=TRUE)) > [1] 11 18 33 40 55 62 77 84 99 106 121 128 143 150 165 172 187 194 > 209 > [20] 216 231 238 253 260 275 282 297 304 319 326 341 348 363 370 385 392 407 > 414 > [39] 429 436 451 458 473 480 495 502 517 524 539 546 561 568 583 590 605 612 > 627 > [58] 634 649 656 671 678 693 700 715 722 737 744 759 766 781 788 803 810 825 > 832 > [77] 847 854 869 876 891 898 901 > > 'find', by default, does not following symbolic links. > > % find dir > dir > dir/subdir > dir/subdir/IMPORTANT_2 > dir/subdir/linkToUpperDir > dir/IMPORTANT_1 > > The -L option makes it follow them, but it won't follow loops: > > % find -L dir > dir > dir/subdir > dir/subdir/IMPORTANT_2 > find: File system loop detected; `dir/subdir/linkToUpperDir' is part of the > same file system loop as `dir'. 
> dir/IMPORTANT_1 > > Bill Dunlap > Spotfire, TIBCO Software > wdunlap tibco.com > > >> -----Original Message- >> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On >> Behalf >> Of William Dunlap >> Sent: Friday, September 27, 2013 12:56 PM >> To: Jonathan Greenberg; r-help >> Subject: Re: [R] Error: C stack usage is too close to the limit when using >> list.files() >> >> Do you have some symbolic links that make loops in your file system? >> list.files() has problems with such loops and find does not. E.g., on a >> Linux box: >> >> % cd /tmp >> % mkdir dir dir/subdir >> % cd dir/subdir >> % ln -s ../../dir linkToUpperDir >> % cd /tmp >> % R --quiet >> > list.files("dir", recursive=TRUE, full=TRUE) >> [1] >> "dir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToU >> pperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkT >> oUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/li >> nkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdi >> r/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/su >> bdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir >> /subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpper >> Dir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUp >> perDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkTo >> UpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/lin >> kToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir" >> > system("find dir") >> dir >> dir/subdir >> dir/subdir/linkToUpperDir >> >> Bill Dunlap >> Spotfire, TIBCO Software >> wdunlap tibco.com >> >> >> > -Original Message- >> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] >> > On Behalf >> > Of Jonathan Greenberg >> > Sent: 
Friday, September 27, 2013 12:13 PM >> > To: r-help >> > Subject: [R] Error: C stack usage is too close to the limit when using >> > list.files() >> > >> > R-helpers: >> > >> > I'm running a file search on my entire drive (Mac OS X) using: >> > >> > files_found <- >> > list.files(dir="/",pattern=somepattern,recursive=TRUE,full.names=TRUE) >> > where somepattern is a search pattern (which I have confirmed via a >> > unix "find / -name somepattern" only returns ~ 3 results). >> > >> > I keep getting an error: >> > >> > Error: C stack usage is too close
Re: [R] Error: C stack usage is too close to the limit when using list.files()
Ben: I'd like to avoid using that (previous version of my code solved it in that way) -- I would like cross-platform compatibility and I am pretty sure, along with Windows, vanilla Macs don't come with "find" either unless XCode has been installed. Is the list.files() code itself recursive when using recursive=TRUE (so it has one recursion per bottom-folder)? --j P.S. I recognized that in my initial post I indicated using "dir" as the parameter -- it should have been "path" (the error occurred through the correct usage of list.files(path="/",...) That'll teach me not to copy/paste from my code... On Fri, Sep 27, 2013 at 2:36 PM, Ben Bolker wrote: > Jonathan Greenberg illinois.edu> writes: > >> >> R-helpers: >> >> I'm running a file search on my entire drive (Mac OS X) using: >> >> files_found <- > list.files(dir="/",pattern=somepattern,recursive=TRUE,full.names=TRUE) >> where somepattern is a search pattern (which I have confirmed via a >> unix "find / -name somepattern" only returns ~ 3 results). >> >> I keep getting an error: >> >> Error: C stack usage is too close to the limit >> >> when running this command. Any ideas on 1) how to fix this or 2) if >> there is an alternative to using list.files() to accomplish this >> search without resorting to an external package? > > I assuming that using > > system("find / -name somepattern") > > (possibly with intern=TRUE) isn't allowed? (I don't know what you're > trying to do, but if you don't need it to work on Windows-without-cygwin, > this should work across most Unix variants (although a "-print" might > be required) > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Jonathan A. 
Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error: C stack usage is too close to the limit when using list.files()
R-helpers: I'm running a file search on my entire drive (Mac OS X) using: files_found <- list.files(dir="/",pattern=somepattern,recursive=TRUE,full.names=TRUE) where somepattern is a search pattern (which I have confirmed via a unix "find / -name somepattern" only returns ~ 3 results). I keep getting an error: Error: C stack usage is too close to the limit when running this command. Any ideas on 1) how to fix this or 2) if there is an alternative to using list.files() to accomplish this search without resorting to an external package? Cheers! --jonathan -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
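As the rest of this thread establishes, the crash comes from symlink loops that `list.files(recursive = TRUE)` follows forever, while `find` (without `-L`) does not. One pure-R workaround, sketched below and not a drop-in replacement for `list.files()`, is a recursive walker that tracks the normalized path of every directory already visited, so a symlink cycle terminates instead of recursing until the C stack is exhausted:

```r
# Loop-safe recursive file listing: remember normalized directories already
# seen on the current path so symlink cycles terminate.
list_files_safe <- function(path, pattern = NULL, seen = character()) {
  real <- normalizePath(path, mustWork = FALSE)  # resolves symlinks
  if (real %in% seen) return(character())        # cycle detected: stop here
  seen <- c(seen, real)

  entries <- list.files(path, full.names = TRUE)
  isdir   <- file.info(entries)$isdir %in% TRUE  # %in% TRUE treats NA as FALSE
  dirs    <- entries[isdir]
  files   <- entries[!isdir]
  if (!is.null(pattern)) files <- files[grepl(pattern, basename(files))]

  c(files, unlist(lapply(dirs, list_files_safe, pattern = pattern, seen = seen)))
}
```

This trades speed for safety (one `normalizePath()` per directory), which is usually acceptable for a whole-drive search that would otherwise crash.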
Re: [R] Confusing behaviour in data.table: unexpectedly changing variable
Thanks for your help, and sorry for mis-posting. JD On Wed, Sep 25, 2013 at 3:18 AM, Matthew Dowle wrote: > Very sorry to hear this bit you. If you need a copy of names before > changing them by reference : > oldnames <- copy(names(DT)) > This will be documented and it's on the bug list to do so. copy is needed in > other circumstances too, see ?copy. > More details here : > http://stackoverflow.com/questions/18662715/colnames-being-dropped-in-data-table-in-r > http://stackoverflow.com/questions/15913417/why-does-data-table-update-namesdt-by-reference-even-if-i-assign-to-another-v __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Confusing behaviour in data.table: unexpectedly changing variable
I got bitten badly when a variable I created for the purpose of recording an old set of names changed when I didn't think I was going near it. I'm not sure if this is a desired behaviour, or documented, or warned about. I read the data.table intro and the FAQ, and also ?setnames. Ben Bolker created a minimal reproducible example: library(data.table) DT = data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9) names(DT) ## [1] "x" "y" "v" oldnames <- names(DT) print(oldnames) ## [1] "x" "y" "v" setnames(DT, LETTERS[1:3]) print(oldnames) ## [1] "A" "B" "C" -- McMaster University Department of Biology http://lalashan.mcmaster.ca/theobio/DushoffLab/index.php/Main_Page https://twitter.com/jd_mathbio __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R-3.0.1 g77 errors
Sadly, I am limited to the Solaris 10 system. I wish that I could use Linux, the world uses it. -- Jonathan M. Prigot Partners Healthcare Systems On Wed, 2013-09-18 at 17:20 +0200, Simon Zehnder wrote: > On my systems Linux Scientific and Mac OS X I use as well for the F77 the > gfortran compiler and this works. You could give it a trial. > > Best > > Simon > > On Sep 18, 2013, at 3:14 PM, "Prigot, Jonathan" wrote: > > > I am trying to build R-3.0.1 on our SPARC Solaris 10 system, but it > > fails part way through with g77 errors. Has anyone run into this? Any > > suggestions? For what it's worth, R-2.15.1 is the last one to build > > error free for us. > > === > > Jon Prigot > > > > R is now configured for sparc-sun-solaris2.10 > > > > Source directory: . > > Installation directory:/usr/local > > > > C compiler:gcc -std=gnu99 -g -O2 > > Fortran 77 compiler: g77 -g -O2 > > > > C++ compiler: g++ -g -O2 > > Fortran 90/95 compiler:gfortran > > Obj-C compiler: > > > > Interfaces supported: X11, tcltk > > External libraries:readline, ICU > > Additional capabilities: PNG, JPEG, TIFF, NLS > > Options enabled: shared BLAS, R profiling > > > > Recommended packages: yes > > > > make > > ... > > > > g77 -fPIC -g -O2 -ffloat-store -c dlamch.f -o dlamch.o > > dlamch.f: In function `dlamch': > > dlamch.f:89: warning: > > INTRINSIC DIGITS, EPSILON, HUGE, MAXEXPONENT, > > ^ > > Reference to unimplemented intrinsic `DIGITS' at (^) (assumed EXTERNAL) > > dlamch.f:89: > > INTRINSIC DIGITS, EPSILON, HUGE, MAXEXPONENT, > > ^ > > Invalid declaration of or reference to symbol `digits' at (^) [initially > > seen at (^)] > > dlamch.f:89: warning: > > INTRINSIC DIGITS, EPSILON, HUGE, MAXEXPONENT, > > ^ > > Reference to unimplemented intrinsic `EPSILON' at (^) (assumed EXTERNAL) > > > > -- > > Jonathan M. 
Prigot > > Partners Healthcare Systems > > > > > > > > The information in this e-mail is intended only for th...{{dropped:18}} > > The information in this e-mail is intended only for the person to whom it is addressed. If you believe this e-mail was sent to you in error and the e-mail contains patient information, please contact the Partners Compliance HelpLine at http://www.partners.org/complianceline . If the e-mail was sent to you in error but does not contain patient information, please contact the sender and properly dispose of the e-mail. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R-3.0.1 g77 errors
I am trying to build R-3.0.1 on our SPARC Solaris 10 system, but it fails part way through with g77 errors. Has anyone run into this? Any suggestions? For what it's worth, R-2.15.1 is the last one to build error free for us. === Jon Prigot R is now configured for sparc-sun-solaris2.10 Source directory: . Installation directory:/usr/local C compiler:gcc -std=gnu99 -g -O2 Fortran 77 compiler: g77 -g -O2 C++ compiler: g++ -g -O2 Fortran 90/95 compiler:gfortran Obj-C compiler: Interfaces supported: X11, tcltk External libraries:readline, ICU Additional capabilities: PNG, JPEG, TIFF, NLS Options enabled: shared BLAS, R profiling Recommended packages: yes make ... g77 -fPIC -g -O2 -ffloat-store -c dlamch.f -o dlamch.o dlamch.f: In function `dlamch': dlamch.f:89: warning: INTRINSIC DIGITS, EPSILON, HUGE, MAXEXPONENT, ^ Reference to unimplemented intrinsic `DIGITS' at (^) (assumed EXTERNAL) dlamch.f:89: INTRINSIC DIGITS, EPSILON, HUGE, MAXEXPONENT, ^ Invalid declaration of or reference to symbol `digits' at (^) [initially seen at (^)] dlamch.f:89: warning: INTRINSIC DIGITS, EPSILON, HUGE, MAXEXPONENT, ^ Reference to unimplemented intrinsic `EPSILON' at (^) (assumed EXTERNAL) -- Jonathan M. Prigot Partners Healthcare Systems The information in this e-mail is intended only for the person to whom it is addressed. If you believe this e-mail was sent to you in error and the e-mail contains patient information, please contact the Partners Compliance HelpLine at http://www.partners.org/complianceline . If the e-mail was sent to you in error but does not contain patient information, please contact the sender and properly dispose of the e-mail. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] library() and install.packages() no longer working ("Access is denied" error)
In the last week, SOMETHING on my system must have changed, because this is what happens when trying to library() or install.packages() on R 3.0.1 x64 on a Windows 2008 R2 server:

> library("raster")
Error in normalizePath(path.expand(path), winslash, mustWork) :
  path[1]="D:/Users/[UID]/Documents/R/win-library/3.0": Access is denied

> install.packages("raster")
Installing package into ‘D:/Users/[UID]/Documents/R/win-library/3.0’
(as ‘lib’ is unspecified)
trying URL 'http://ftp.osuosl.org/pub/cran/bin/windows/contrib/3.0/raster_2.1-49.zip'
Content type 'application/zip' length 2363295 bytes (2.3 Mb)
opened URL
downloaded 2.3 Mb

Error in normalizePath(path.expand(path), winslash, mustWork) :
  path[1]="D:\Users\[UID]\Documents\R\win-library\3.0": Access is denied
In addition: Warning message:
In normalizePath(path.expand(path), winslash, mustWork) :
  path[1]="D:/Users/[UID]/Documents/R/win-library/3.0": Access is denied

The permissions on that directory APPEAR to be correct (I can add files/folders, rename them, delete them), but alas R continues to give me these errors. Both the users and the sysadmin claim nothing was changed, but clearly something did. As a heads up, I did try removing PATHTO/win-library/3.0 and re-ran install.packages("raster"), at which point R asked me "Would you like to use a personal directory instead?" I clicked yes. It then asked "Would you like to create a personal library 'D:/Users/[UID]/Documents/R/win-library/3.0' to install packages into?" I clicked yes. The mirror browser shows up and I select a mirror. A 3.0 directory is created, but I got the same error, and when examining the (new) 3.0 directory, nothing is created inside of it. Any ideas what this could be caused by?

--j

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
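Not an answer from the thread, just a small diagnostic sketch for separating R's view of a library directory from the filesystem's. The directory used below is a temporary stand-in for the real win-library path.

```r
# Sketch: verify that the R process itself (not just Explorer) can write to
# a library directory before install.packages() tries to.
check_lib_writable <- function(lib) {
  if (!file.exists(lib)) {
    return(sprintf("'%s' does not exist", lib))
  }
  # file.access(): 0 = permission granted, -1 = denied (can be unreliable
  # with Windows ACLs, which is why we also probe with a real file below)
  if (file.access(lib, mode = 2) != 0) {
    return(sprintf("'%s' is not writable by this R process", lib))
  }
  probe <- file.path(lib, ".r-write-test")
  ok <- file.create(probe)
  if (ok) file.remove(probe)
  if (ok) "writable" else "file creation failed"
}

lib <- file.path(tempdir(), "test-lib")
dir.create(lib, showWarnings = FALSE)
check_lib_writable(lib)  # "writable" when the process has real write access
```

If the probe file fails even though Explorer can create files there, the problem is with the R process's token (e.g. a mapped drive not visible to an elevated session), not with the directory's ACLs.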
[R] ifelse question (I'm not sure why this is working)...
R-helpers:

One of my intrepid students came up with a solution to a problem where they need to write a function that takes a vector x and a "scalar" d, and returns the indices of the vector x where x %% d is equal to 0 (x is evenly divisible by d). I thought I had a good handle on the potential solutions, but one of my students sent me a function that WORKS, but for the life of me I can't figure out WHY. Here is the solution:

remainderFunction <- function(x, d) {
    ifelse(x %% d == 0, yes = return(which(x %% d == 0)), no = return(NULL))
}
remainderFunction(x = c(23:47), d = 3)

I've never seen an ifelse statement used that way, and I was fully expecting it to NOT work, or to place the output of which(x %% d == 0) in each location where the statement x %% d == 0 was true. Any ideas on deconstructing this?

--j

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
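For what it's worth, here is one reading of the mechanism (a sketch, not an authoritative deconstruction): the yes= argument is a promise evaluated in the calling function's frame, so the embedded return() exits remainderFunction itself the moment ifelse() first needs that argument.

```r
# Sketch: the yes= promise is evaluated in f's frame, so the first time
# ifelse() touches it, return() exits the *function* -- ifelse() itself
# never finishes and its per-element recycling never happens.
f <- function(x, d) {
  ifelse(x %% d == 0, yes = return(which(x %% d == 0)), no = return(NULL))
}

# The same behavior made explicit, without ifelse():
g <- function(x, d) {
  hits <- which(x %% d == 0)
  if (length(hits) > 0) return(hits)
  NULL
}

x <- 23:47
identical(f(x, 3), g(x, 3))  # TRUE: both return the indices of multiples of 3
```

If no element passes the test, ifelse() instead evaluates no=, so return(NULL) fires, which is why the function also "works" in the empty case.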
[R] Vectorized version of colMeans/rowMeans for higher dimension arrays?
For matrices, colMeans/rowMeans are quick, vectorized functions. But say I have a higher dimensional array:

moo <- array(runif(400*9*3), dim = c(400, 9, 3))

And I want to get the mean along the 2nd dimension. I can, of course, use apply:

moo1 <- apply(moo, c(1, 3), mean)

But this is not a vectorized operation (so it doesn't execute as quickly). How would one vectorize this operation (if possible)? Is there an array equivalent of colMeans/rowMeans?

--j

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
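One possibility (not from the original post): colMeans() itself handles arrays via its dims argument, so combined with aperm() the dimension-2 mean can be computed without apply().

```r
# Sketch: colMeans(x, dims = 1) averages over the FIRST dimension of an
# array. Permuting the target dimension into first position with aperm()
# therefore gives a vectorized equivalent of apply(moo, c(1, 3), mean).
moo <- array(runif(400 * 9 * 3), dim = c(400, 9, 3))

mean_along_dim2 <- function(a) {
  colMeans(aperm(a, c(2, 1, 3)), dims = 1)  # result has dim c(400, 3)
}

moo1 <- apply(moo, c(1, 3), mean)  # slow reference version
moo2 <- mean_along_dim2(moo)
all.equal(moo1, moo2)              # TRUE, up to floating-point error
```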
Re: [R] Parallel version of Map(rather, mapply)
Hi Saptarshi:

There are quite a few parallel mapply's out there -- my recommendation is to use the foreach package, since it allows you to be flexible in the parallel backend, and you don't have to write two statements (a sequential and a parallel statement) -- if a parallel backend is registered, it will use that; otherwise it'll execute in sequential mode.

--j

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf of Saptarshi Guha [saptarshi.g...@gmail.com]
Sent: Wednesday, August 28, 2013 1:24 PM
To: R-help@r-project.org
Subject: [R] Parallel version of Map(rather, mapply)

Hello,

I find Map to be a nice interface to mapply. However, Map calls mapply, which in turn calls .Internal(mapply(...)). Is there a parallel version of mapply (like mclapply) or do I need to write this myself?

Regards
Saptarshi
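For reference, the parallel package (bundled with R since 2.14.0) provides mcmapply(), a multicore analogue of mapply(); it is fork-based, so it only actually parallelizes on Unix-alikes. A minimal sketch:

```r
# Sketch: parallel::mcmapply() is an mc.cores-aware drop-in for mapply().
library(parallel)

slow_add <- function(a, b) {
  Sys.sleep(0.01)  # stand-in for real per-element work
  a + b
}

res <- mcmapply(slow_add, 1:10, 101:110, mc.cores = 2)
res  # 102 104 106 ... 120, same values as mapply(slow_add, 1:10, 101:110)
```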
Re: [R] Determining the maximum memory usage of a function
Jim:

Thanks, but I'm looking for something that can be used somewhat automatically -- the function in question would be user-provided and passed to my "chunking" algorithm, so in this case it would be the end-user (not me) who would have to embed these -- would

Rprof(memory.profiling=TRUE)
# my function
Rprof(NULL)

... and then taking the max of the "tseries" output be a reasonable approach? If so, which of the three outputs (vsize.small, vsize.large, nodes) would be best compared against the available memory?

Cheers!

--j

On Thu, Jun 20, 2013 at 10:07 AM, jim holtman wrote:
> What I would do is to use "memory.size()" to get the amount of memory being
> used. Do a call at the beginning of the function to determine the base, and
> then at other points in the code to see what the difference from the base is
> and keep track of the maximum difference. I am not sure if just getting the
> memory usage at the end would be sufficient since there may be some garbage
> collection in between, or you might be creating some large objects and then
> deleting/reusing them. So keep track after large chunks of code to see what
> is happening.
>
> On Thu, Jun 20, 2013 at 10:45 AM, Jonathan Greenberg wrote:
>>
>> Folks:
>>
>> I apologize for the cross-posting between r-help and r-sig-hpc, but I
>> figured this question was relevant to both lists. I'm writing a
>> function to be applied to an input dataset that will be broken up into
>> chunks for memory management reasons and for parallel execution. I am
>> trying to determine, for a given function, what the *maximum* memory
>> usage during its execution is (which may not be the beginning or the
>> end of the function, but somewhere in the middle), so I can "plan" for
>> the chunk size (e.g. have a table of chunk size vs. max memory usage).
>>
>> Is there a trick for determining this?
>>
>> --j
>>
>> --
>> Jonathan A. Greenberg, PhD
>> Assistant Professor
>> Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
>> Department of Geography and Geographic Information Science
>> University of Illinois at Urbana-Champaign
>> 607 South Mathews Avenue, MC 150
>> Urbana, IL 61801
>> Phone: 217-300-1924
>> http://www.geog.illinois.edu/~jgrn/
>> AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
>
> --
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
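A rough alternative to the Rprof() route, sketched here under the assumption that only R-heap allocations matter (C-level or external memory is invisible to it): gc(reset = TRUE) zeroes the "max used" watermarks before the user-supplied function runs, and a closing gc() reads the peaks back.

```r
# Sketch: wrap an arbitrary user-supplied function and report the peak
# R-heap footprint seen while it ran, via gc()'s "max used" watermarks.
peak_mem_mb <- function(f, ...) {
  invisible(gc(reset = TRUE))   # reset the Ncells/Vcells "max used" columns
  result <- f(...)
  g <- gc()                     # last column is "max used" in Mb
  list(result = result, peak_mb = sum(g[, ncol(g)]))
}

big_alloc <- function(n) sum(runif(n))  # transiently allocates ~8*n bytes
out <- peak_mem_mb(big_alloc, 1e6)
out$peak_mb  # peak R-heap footprint (Mb) observed while big_alloc ran
```

Note the peak includes whatever the session already held before the call; for chunk-size planning you would subtract a baseline measured the same way on a no-op function.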
[R] Determining the maximum memory usage of a function
Folks:

I apologize for the cross-posting between r-help and r-sig-hpc, but I figured this question was relevant to both lists. I'm writing a function to be applied to an input dataset that will be broken up into chunks for memory management reasons and for parallel execution. I am trying to determine, for a given function, what the *maximum* memory usage during its execution is (which may not be the beginning or the end of the function, but somewhere in the middle), so I can "plan" for the chunk size (e.g. have a table of chunk size vs. max memory usage).

Is there a trick for determining this?

--j

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
Re: [R] Instant search for R documentation
Hi Spencer,

Thanks for the pointers.

> > I just wanted to share with you that we made a website over the weekend
> > that allows "instant search" of the R documentation on CRAN, see:
> > www.Rdocumentation.org. It's a first version, so any
> > feedback/comments/criticism most welcome.
>
> Interesting. Are you aware of the following:
>
> * help.start()

Of course, but I think checking R documentation online instead of with the built-in R help function could provide some extra benefits. First, you are capable of searching through the latest version of all R packages, even those that are not installed on your device. This makes it not only a help tool, but also a tool for discovery (the fact that you can see search results while typing in the search box increases the "discovery element" further). Second, I added the discussion system Disqus. For every function and package, Disqus allows users to ask questions, add extra examples to the documentation, etc. This could become an added value (conditional, of course, on people actually using it :-)

> * The R Wiki (the fourth item under "Documentation" on the
> left at "r-project.org"). Might you want to consider merging your
> "rdocumentation.org" with this?

That would be an interesting option, but I guess that's not up to me to decide ;-).

> - NOTE: The R Wiki unfortunately has not gotten the
> attention and development I believe it deserves. I'm not sure why this
> is. The standard Wikipedia gets many contributors. One difference I
> noticed is that to edit this, one needs to login. That's not true for
> Wikimedia projects. Beyond that, the standard Mediawiki markup
> language can intimidate some people. Fortunately, difficulties in using
> the Mediawiki software will soon be reduced. One of the primary
> priorities of the software development team at the Wikimedia Foundation
> is modifying the Mediawiki software to include a beta version of a
> visual (WYSIWYG) editor. I saw a demo of a beta version of this a month
> ago. It's already available for limited use, but I don't think it's
> quite ready yet. I think it might be wise to check for it later this year.
>
> * The "sos" package with its vignette for searching CRAN
> packages and getting the result sorted to place first the package with
> the most matches.

I was unaware of the sos package, looks very nice, thank you for sharing!

> Hope this helps.
> Spencer

Best regards,

Jonathan
[R] Instant search for R documentation
Hi, I just wanted to share with you that we made a website over the weekend that allows "instant search" of the R documentation on CRAN, see: www.Rdocumentation.org. It's a first version, so any feedback/comments/criticism most welcome. Best regards, Jonathan
Re: [R] Official way to set/retrieve options in packages?
What would be an example of setting, saving, and re-loading an option to a user's .Rprofile -- and would this be a no-no in a CRAN package?

--j

On Sat, Jun 1, 2013 at 4:57 PM, Prof Brian Ripley wrote:
> On 01/06/2013 22:44, Anthony Damico wrote:
> > hope this helps.. :)
> >
> > # define an object `x`
> > x <- list( "any value here" , 10 )
> >
> > # set `myoption` to that object
> > options( "myoption" = x )
> >
> > # retrieve it later (perhaps within a function elsewhere in the package)
> > ( y <- getOption( "myoption" ) )
> >
> > it's nice to name your options `mypackage.myoption` so users know what
> > package the option is associated with in case they type `options()`
> >
> > here's the `.onLoad` function in the R survey package. notice how the
> > options are only set *if* they don't already exist--
>
> But a nicer convention is that used by most packages in R itself: if the
> option is not set, the function using it assumes a suitable default.
> That would make sense for all the FALSE defaults below.
>
> Note though that this is not 'persistent': users have to set options in
> their startup files (see ?Startup). There is no official location to
> store package configurations. Users generally dislike software saving
> settings in their own file space, so it seems very much preferable to use
> the standard R mechanisms (.Rprofile etc).
>
> > > survey:::.onLoad
> >
> > function (...)
> > {
> >     if (is.null(getOption("survey.lonely.psu")))
> >         options(survey.lonely.psu = "fail")
> >     if (is.null(getOption("survey.ultimate.cluster")))
> >         options(survey.ultimate.cluster = FALSE)
> >     if (is.null(getOption("survey.want.obsolete")))
> >         options(survey.want.obsolete = FALSE)
> >     if (is.null(getOption("survey.adjust.domain.lonely")))
> >         options(survey.adjust.domain.lonely = FALSE)
> >     if (is.null(getOption("survey.drop.replicates")))
> >         options(survey.drop.replicates = TRUE)
> >     if (is.null(getOption("survey.multicore")))
> >         options(survey.multicore = FALSE)
> >     if (is.null(getOption("survey.replicates.mse")))
> >         options(survey.replicates.mse = FALSE)
> > }
> >
> > On Sat, Jun 1, 2013 at 4:01 PM, Jonathan Greenberg wrote:
> >
> >> R-helpers:
> >>
> >> Say I'm developing a package that has a set of user-definable options that
> >> I would like to be persistent across R-invocations (they are saved
> >> someplace). Of course, I can create a little text file to be written/read,
> >> but I was wondering if there is an "officially sanctioned" way to do this?
> >> I see there is an options() and getOption() function, but I'm unclear how
> >> I would use this in my own package to create/save new options for my
> >> particular package. Cheers!
> >>
> >> --j
> >>
> >> --
> >> Jonathan A. Greenberg, PhD
> >> Assistant Professor
> >> Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
> >> Department of Geography and Geographic Information Science
> >> University of Illinois at Urbana-Champaign
> >> 607 South Mathews Avenue, MC 150
> >> Urbana, IL 61801
> >> Phone: 217-300-1924
> >> http://www.geog.illinois.edu/~jgrn/
> >> AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
>
> --
> Brian D. Ripley, rip...@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of
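A sketch of the convention Prof. Ripley describes: no .onLoad needed, because every consumer of the option supplies its own default. The option name mypackage.verbose is made up for illustration.

```r
# Sketch: read an option with a default; nothing is ever written to the
# user's file space, and users opt in via options() in their .Rprofile.
verbose_enabled <- function() {
  # getOption() returns its second argument when the option is unset
  isTRUE(getOption("mypackage.verbose", default = FALSE))
}

verbose_enabled()                  # FALSE: option unset, default applies
options(mypackage.verbose = TRUE)  # a user opts in, e.g. in .Rprofile
verbose_enabled()                  # TRUE
```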
[R] Official way to set/retrieve options in packages?
R-helpers:

Say I'm developing a package that has a set of user-definable options that I would like to be persistent across R-invocations (they are saved someplace). Of course, I can create a little text file to be written/read, but I was wondering if there is an "officially sanctioned" way to do this? I see there is an options() and getOption() function, but I'm unclear how I would use this in my own package to create/save new options for my particular package. Cheers!

--j

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
[R] RCurl: using ls instead of NLST
R-helpers:

I'm trying to retrieve the contents of a directory from an ftp site (ideally, the file/folder names as a character vector):

ftp://e4ftl01.cr.usgs.gov/MOTA/MCD12C1.005/  # (MODIS data)

Where I get the following error via RCurl:

require("RCurl")
url <- "ftp://e4ftl01.cr.usgs.gov/MOTA/MCD12C1.005/"
filenames = getURL(url, ftp.use.epsv=FALSE, ftplistonly=TRUE)
> Error in function (type, msg, asError = TRUE) : RETR response: 550

Through some sleuthing, it turns out the ftp site does not support NLST (which RCurl is using), but will use "ls" to list the directory contents -- is there any way to use "ls" remotely on this site?

Thanks!

--j

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
Re: [R] points overlay axis
Apologies to John - I should have thought to give an example. However, xpd is what I was looking for. Thanks for the help!

On 14 May 2013 14:55, David Carlson wrote:
> Let's try again after restraining Outlook's desire to use html.
>
> set.seed(42)
> dat <- matrix(c(runif(48), 0, 0), 25, 2, byrow=TRUE)
>
> # Complete plot symbol on axes, but axis on top
> plot(dat, xaxs="i", yaxs="i", pch=16, col="red", xpd=TRUE)
>
> # Complete plot symbol on axes with symbol on top
> plot(dat, xaxs="i", yaxs="i", type="n")
> points(dat, xaxs="i", yaxs="i", pch=16, col="red", xpd=TRUE)
>
> David L Carlson
> Associate Professor of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
> Behalf Of John Kane
> Sent: Tuesday, May 14, 2013 7:47 AM
> To: Jonathan Phillips; r-help@r-project.org
> Subject: Re: [R] points overlay axis
>
> Probably, but since we don't know what you are doing, it is very hard to give
> any advice.
>
> Please read this for a start
> https://github.com/hadley/devtools/wiki/Reproducibility and give us a clear
> statement of the problem
>
> Thanks
>
> John Kane
> Kingston ON Canada
>
> > -----Original Message-----
> > From: 994p...@gmail.com
> > Sent: Tue, 14 May 2013 13:34:35 +0100
> > To: r-help@r-project.org
> > Subject: [R] points overlay axis
> >
> > Hi,
> > I'm trying to do quite a simple task, but I'm stuck.
> >
> > I've set xaxs = 'i' as I want the origin to be (0,0), but
> > unfortunately I have points that are sat on the axis. R draws the
> > axis over the points, which hides the points somewhat and looks unsightly.
> > Is there any way of getting a point to be drawn over the axis?
> >
> > Thanks,
> > Jon Phillips
[R] R help: Batch read files based on names in a list
I am currently reading in a series of files, applying the same functions to them one at a time, and then merging the resulting data frames, e.g.:

MyRows <- c("RowA", "RowB", "RowC")

File1_DF <- read.delim("DirectoryToFiles\\File1_Folder\\File1.txt",
    stringsAsFactors=FALSE, check.names=FALSE)
File1_DF <- as.data.frame(t(File1_DF[MyRows,]))
File1_DF <- as.data.frame(t(File1_DF))
mergeDF <- merge(mergeDF, File1_DF, by.x = "Row.names", by.y = "row.names")

File2_DF <- read.delim("DirectoryToFiles\\File2_Folder\\File2.txt",
    stringsAsFactors=FALSE, check.names=FALSE)
File2_DF <- as.data.frame(t(File2_DF[MyRows,]))
File2_DF <- as.data.frame(t(File2_DF))
mergeDF <- merge(mergeDF, File2_DF, by.x = "Row.names", by.y = "row.names")

...etc

I want to know if I can use a list of the filenames c("File1", "File2", "File3") etc. and apply a function to do this in a more automated fashion? This would involve using the list value in the directory path to read in the file, i.e.:

MyFilesValue_DF <- read.delim("DirectoryToFolders\\MyFilesValue_Folder\\MyFilesValue.txt",
    stringsAsFactors=FALSE, check.names=FALSE)

Any help appreciated
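One hedged sketch of the automated version (the directory layout, file names, and row names below are invented for the demo): wrap the read-and-reshape steps in a helper keyed by the file name, lapply() over the names, and fold the pieces together with Reduce(merge, ...).

```r
# Sketch: build each path from the file name, read and reshape in one
# helper, then merge everything on an explicit key column.
read_one <- function(name, base_dir, rows) {
  path <- file.path(base_dir, paste0(name, "_Folder"), paste0(name, ".txt"))
  df <- read.delim(path, stringsAsFactors = FALSE, check.names = FALSE)
  out <- as.data.frame(t(df[rows, ]))          # same transpose step as before
  out <- cbind(Sample = rownames(out), out)    # explicit key for merging
  names(out)[-1] <- paste(name, names(out)[-1], sep = "_")  # keep names unique
  rownames(out) <- NULL
  out
}

# --- demo data: two files laid out the way the question describes ---
base <- tempdir()
for (nm in c("File1", "File2")) {
  dir.create(file.path(base, paste0(nm, "_Folder")), showWarnings = FALSE)
  d <- data.frame(S1 = 1:3, S2 = 4:6, row.names = c("RowA", "RowB", "RowC"))
  write.table(d, file.path(base, paste0(nm, "_Folder"), paste0(nm, ".txt")),
              sep = "\t", quote = FALSE)
}

files <- c("File1", "File2")
dfs <- lapply(files, read_one, base_dir = base, rows = c("RowA", "RowB"))
merged <- Reduce(function(a, b) merge(a, b, by = "Sample"), dfs)
```

Adding a third file is then just one more name in `files`; no code changes.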
[R] points overlay axis
Hi, I'm trying to do quite a simple task, but I'm stuck. I've set xaxs = 'i' as I want the origin to be (0,0), but unfortunately I have points that are sat on the axis. R draws the axis over the points, which hides the points somewhat and looks unsightly. Is there any way of getting a point to be drawn over the axis? Thanks, Jon Phillips
[R] Stepwise regression for multivariate case in R?
Hi!

I am trying to make a stepwise regression in the multivariate case, using Wilks' Lambda test. I've tried this:

> greedy.wilks(cbind(Y1,Y2) ~ . , data=my.data)

But it only returns:

Error in model.frame.default(formula = X[, j] ~ grouping, drop.unused.levels = TRUE) :
  variable lengths differ (found for 'grouping')

What can be wrong here? I have checked, and all variables in my.data are of the same length.

//Jonathan
Re: [R] Singular design matrix in rq
Roger:

Doh! Just realized I had that error in the code -- raw_data is the same as mydata, so it should be:

mydata <- read.csv("singular.csv")
plot(mydata$predictor, mydata$response)
# A big cloud of points, nothing too weird

summary(mydata)
# No NAs:
#        X             response          predictor
#  Min.   :    1   Min.   :    0.0   Min.   : 0.000
#  1st Qu.:12726   1st Qu.:  851.2   1st Qu.: 0.000
#  Median :25452   Median : 2737.0   Median : 0.000
#  Mean   :25452   Mean   : 3478.0   Mean   : 5.532
#  3rd Qu.:38178   3rd Qu.: 5111.6   3rd Qu.: 5.652
#  Max.   :50903   Max.   :26677.8   Max.   :69.342

fit_spl <- rq(response ~ bs(predictor, df=15), tau=1, data=mydata)
# Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix

--j

On Fri, Apr 19, 2013 at 8:15 AM, Koenker, Roger W wrote:
> Jonathan,
>
> This is not what we call a reproducible example... what is raw_data? Does
> it have something to do with mydata? What is i?
>
> Roger
>
> url: www.econ.uiuc.edu/~roger    Roger Koenker
> email: rkoen...@uiuc.edu         Department of Economics
> vox: 217-333-4558                University of Illinois
> fax: 217-244-6678                Urbana, IL 61801
>
> On Apr 16, 2013, at 2:58 PM, Greenberg, Jonathan wrote:
> > Quantreggers:
> >
> > I'm trying to run rq() on a dataset I posted at:
> > https://docs.google.com/file/d/0B8Kij67bij_ASUpfcmJ4LTFEUUk/edit?usp=sharing
> > (it's a 1500kb csv file named "singular.csv") and am getting the following error:
> >
> > mydata <- read.csv("singular.csv")
> > fit_spl <- rq(raw_data[,1] ~ bs(raw_data[,i],df=15),tau=1)
> > Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix
> >
> > Any ideas what might be causing this or, more importantly, suggestions
> > for how to solve this? I'm just trying to fit a smoothed hull to the top
> > of the data cloud (hence the large df).
> >
> > Thanks!
> >
> > --jonathan
> >
> > --
> > Jonathan A. Greenberg, PhD
> > Assistant Professor
> > Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
> > Department of Geography and Geographic Information Science
> > University of Illinois at Urbana-Champaign
> > 607 South Mathews Avenue, MC 150
> > Urbana, IL 61801
> > Phone: 217-300-1924
> > http://www.geog.illinois.edu/~jgrn/
> > AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
[R] Singular design matrix in rq
Quantreggers:

I'm trying to run rq() on a dataset I posted at:
https://docs.google.com/file/d/0B8Kij67bij_ASUpfcmJ4LTFEUUk/edit?usp=sharing
(it's a 1500kb csv file named "singular.csv") and am getting the following error:

mydata <- read.csv("singular.csv")
fit_spl <- rq(raw_data[,1] ~ bs(raw_data[,i],df=15),tau=1)
> Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix

Any ideas what might be causing this or, more importantly, suggestions for how to solve this? I'm just trying to fit a smoothed hull to the top of the data cloud (hence the large df).

Thanks!

--jonathan

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
Re: [R] Check if a character vector can be coerced to numeric?
Yep, type.convert was exactly what I was looking for (with as.is=TRUE). Thanks!

On Thu, Mar 21, 2013 at 1:31 PM, Prof Brian Ripley wrote:
> On 21/03/2013 18:20, Jonathan Greenberg wrote:
>> Given an arbitrary set of character vectors:
>>
>> myvect1 <- c("abc","3","4")
>> myvect2 <- c("2","3","4")
>>
>> I would like to develop a function that will convert any vectors that can
>> be PROPERLY converted to a numeric (myvect2) into a numeric, but leaves
>> character vectors which cannot be converted (myvect1) alone. Is there any
>> simple way to do this (e.g. some function that tests if a vector is
>> coercible to a numeric before doing so)?
>>
>> --j
>
> ?type.convert
>
> It does depend what you mean by 'properly'. Can
> "123.456789012344567890123455" be converted 'properly'? [See the NEWS for
> R-devel.]
>
> --
> Brian D. Ripley, rip...@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK, Fax: +44 1865 272595

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
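To make the accepted answer concrete, a sketch of the as.is = TRUE pattern. One wrinkle worth noting: a vector of whole numbers comes back as integer, not double.

```r
# Sketch: type.convert() parses the vector if every element is
# numeric-looking, and otherwise returns it as character (as.is = TRUE
# prevents the fallback-to-factor behavior).
maybe_numeric <- function(v) type.convert(v, as.is = TRUE)

maybe_numeric(c("2", "3", "4"))    # integer: 2 3 4
maybe_numeric(c("2.5", "3"))       # double: 2.5 3.0
maybe_numeric(c("abc", "3", "4"))  # left alone as character
```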
[R] Check if a character vector can be coerced to numeric?
Given an arbitrary set of character vectors:

myvect1 <- c("abc","3","4")
myvect2 <- c("2","3","4")

I would like to develop a function that will convert any vectors that can be PROPERLY converted to a numeric (myvect2) into a numeric, but leaves character vectors which cannot be converted (myvect1) alone. Is there any simple way to do this (e.g. some function that tests if a vector is coercible to a numeric before doing so)?

--j

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
[R] Issue with matrices within nested for-loops
Greetings! I am trying to compare simulated environmental conditions from a model against a recruitment time series for a species of crab by first dropping 5 data points, and then using the remainder to attempt to simulate the missing data as a measure of best fit and using the following code: all.mat<-as.matrix(comb,ncol=ncol(comb),nrow=nrow(comb)) obs<-as.matrix(R2,24,1) mod<-all.mat results<-numeric(ncol(mod)) for(i in mod) { x<-mod[,i] resid <- matrix(NA, 1000, 5) for(k in 1:1000) { sub<-sample(1:24,19) fit<-lm(obs~x,subset=sub) cf<-coef(fit) p <- cf[1] + cf[2] * x[-sub] resid[k,] <- obs[-sub] - p } results[i] <- mean(resid^2) } where* R2* is a 24x1 matrix with recruitment data, *comb* was a cbind() object combining two matrices and *all.mat* is the final 565x24 matrix of modeled environmental scenarios. When the script is run the first 99 scenarios are processed properly and I get readable output. At scenario 100 however, I get this message: *Error in na.omit.data.frame(list(obs = c(0.414153096303487, 1.39649463342491, : subscript out of bounds* Which I understand to mean that the bounds of the indicated vector/matrix have been violated. I am however at a loss as to how to resolve this. Any advice would be appreciated Cheers! JR -- Jonathan Richar Doctoral candidate UAF SFOS Fisheries Division 17101 Pt. Lena Loop Rd. University of Alaska Fairbanks Juneau, AK 99801 Phone: (907) 796-5459 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
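One likely culprit in the code above is `for(i in mod)`, which iterates over the values stored in the matrix rather than over column indices, so `mod[,i]` fails as soon as a stored value used as a subscript exceeds the number of columns. A hedged sketch of the intended loop structure (assuming one scenario per column, as the original indexing implies):

```r
results <- numeric(ncol(mod))
for (i in seq_len(ncol(mod))) {  # loop over column indices 1..ncol(mod)
  x <- mod[, i]
  # ... the resampling and lm() fitting from the original code goes here,
  # filling resid[k, ] for k in 1:1000 ...
  results[i] <- mean(resid^2)
}
```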
[R] Mixed models with missing data
Hi, I am creating a mixed model based on an experiment where each subject has 2 repeats. In some instances there is data for only one of a given subject's repeats; for most there is data for both. Can I still justify having subject as a random effect? Thanks, Jonathan
[R] mixed effects regression with non-independent data
Hi, I will be performing mixed effects regression on subjects' total scores from 2-player games (prisoner's dilemma). I am aware that including both players' scores from a game will cause problems due to non-independence. Is there a way to deal with this apart from randomly picking one subject from each game for the analysis (and so losing half the data)? Is there a way to introduce this into the model instead, perhaps as a random effect? Thanks, Jonathan
[R] Inserting rows of interpolated data
Dear help list - I have light data with 5-min time-stamps. I would like to insert four 1-min time-stamps between each row and interpolate the light data on each new row. To do this I have come up with the following code: lightdata <- read.table("Test_light_data.csv", header = TRUE, sep = ",") # read data file into object "lightdata" library(chron) mins <- data.frame(times(1:1439/1440)) # generate a dataframe of 24 hours of 1-min timestamps Nth.delete <- function(dataframe, n)dataframe[-(seq(n, to=nrow(dataframe), by=n)),] # function for deleting nth row empty <- data.frame("1/9/13", Nth.delete(mins, 5), "NA") # delete all 5-min timestamps in a new dataframe colnames(empty) <- c("date", "time", "light") # add correct column name to empty timestamp dataframe newdata <- rbind(lightdata, empty) I get the following error message: Warning message: In `[<-.factor`(`*tmp*`, ri, value = c(0.000694, 0.00139, : invalid factor level, NAs generated Digging into this a little, I can see that the two time columns are doing what I need and APPEAR to be similar in format: > head(lightdata) datetime light 1 1/9/13 0:00:00 -0.00040925 2 1/9/13 0:05:00 -0.00023386 3 1/9/13 0:10:00 -0.00032155 4 1/9/13 0:15:00 -0.00017539 5 1/9/13 0:20:00 -0.00029232 6 1/9/13 0:25:00 -0.00038002 > head(empty) date time light 1 1/9/13 00:01:00NA 2 1/9/13 00:02:00NA 3 1/9/13 00:03:00NA 4 1/9/13 00:04:00NA 5 1/9/13 00:06:00NA 6 1/9/13 00:07:00NA but they clearly are not as far as R is concerned, as shown by str: > str(lightdata) 'data.frame': 288 obs. of 3 variables: $ date : Factor w/ 1 level "1/9/13": 1 1 1 1 1 1 1 1 1 1 ... $ time : Factor w/ 288 levels "0:00:00","0:05:00",..: 1 2 3 4 5 6 7 8 9 10 ... $ light: num -0.000409 -0.000234 -0.000322 -0.000175 -0.000292 ... > str(empty) 'data.frame': 1152 obs. of 3 variables: $ date : Factor w/ 1 level "1/9/13": 1 1 1 1 1 1 1 1 1 1 ... $ time :Class 'times' atomic [1:1152] 0.000694 0.001389 0.002083 0.002778 0.004167 ... .. 
..- attr(*, "format")= chr "h:m:s" $ light: Factor w/ 1 level "NA": 1 1 1 1 1 1 1 1 1 1 ... In the first (original) dataframe, light is a factor, while in the dataframe of generated timestamps, the timestamps are actually still in fractions of a day. Presumably this is why rbind is not working? Can anyone help? By the way, I know I can use na.approx in zoo to do the eventual interpolation of the light data. It's getting there that has me stumped for now. Many thanks, Jon (new R user). __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
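Since rbind() needs compatible column classes, one hedged fix (column names here follow the str() output, which itself shows a date/datetime naming inconsistency that also needs reconciling) is to align the classes before binding:

```r
# Convert the chron 'times' column to "HH:MM:SS" character so it matches,
# and make light a real numeric NA rather than the string "NA"
empty$time      <- format(empty$time)  # fractions of a day -> "HH:MM:SS"
empty$light     <- NA_real_
lightdata$time  <- as.character(lightdata$time)
lightdata$light <- as.numeric(as.character(lightdata$light))
newdata <- rbind(lightdata, empty)
```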
[R] Count of Histogram Bins using Shingles with lattice
I know that I can get a count of histogram bins in base R with plot=FALSE. However, I'd like to do the same thing with lattice. The problem is that I've set up shingles, and I'd like to get the count within each bin within each shingle. plot=FALSE doesn't seem to do it.
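Lacking a lattice-native option, one workaround (a sketch with made-up example data) is to apply base hist() with plot=FALSE separately to the observations falling in each shingle interval:

```r
library(lattice)

x  <- rnorm(500)                             # variable to histogram
z  <- rnorm(500)                             # conditioning variable
sh <- equal.count(z, number = 4, overlap = 0.1)

breaks <- seq(floor(min(x)), ceiling(max(x)), by = 0.5)
counts_by_shingle <- lapply(levels(sh), function(iv) {
  # levels() of a shingle returns each interval's endpoints
  hist(x[z >= iv[1] & z <= iv[2]], breaks = breaks, plot = FALSE)$counts
})
```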
Re: [R] Loading a list into the environment
Thanks all! list2env was exactly what I was looking for. As an FYI (and please correct me if I'm wrong), if you want to load a list into the current environment, use: myvariables <- list(a=1:10,b=20) loadenv <- list2env(myvariables ,envir=environment()) a b --j On Fri, Feb 1, 2013 at 5:49 PM, Rui Barradas wrote: > Hello, > > Something like this? > > myfun <- function(x, envir = .GlobalEnv){ > nm <- names(x) > for(i in seq_along(nm)) > assign(nm[i], x[[i]], envir) > } > > myvariables <- list(a=1:10,b=20) > > myfun(myvariables) > a > b > > > Hope this helps, > > Rui Barradas > > Em 01-02-2013 22:24, Jonathan Greenberg escreveu: > > R-helpers: >> >> Say I have a list: >> >> myvariables <- list(a=1:10,b=20) >> >> Is there a way to load the list components into the environment as >> variables based on the component names? i.e. by applying this theoretical >> function to myvariables I would have the variables a and b loaded into the >> environment without having to explicitly define them. >> >> --j >> >> -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Loading a list into the environment
R-helpers: Say I have a list: myvariables <- list(a=1:10,b=20) Is there a way to load the list components into the environment as variables based on the component names? i.e. by applying this theoretical function to myvariables I would have the variables a and b loaded into the environment without having to explicitly define them. --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] scaling of nonbinROC penalties - accurate classification with random data?
Accuracy for an ordinal gold standard should depend on the absolute values of the penalty matrix. So, I would like to ask, ought there to be some constraint on the values of the penalty matrix? For example, (a) should the penalty matrix always contain at least one penalty with a value of 1 and/or (b) should there be any other constraint on the sum of penalties in the matrix (e.g. should the matrix sum to some multiple of the number of categories), or (c) is one free to use arbitrarily-scaled penalty matrices? I apologise if I am wasting your time by making an obvious mistake. I am a clinician, not a statistician. So, I do not understand the mathematics. Thanks, in advance, for your help, Jonathan Williams
Re: [R] Adding a line to barchart
Great! This really helped! One quick follow-up -- is there a trick to placing a label wherever the line intersects the x-axis (either above or below the plot)? On Tue, Jan 22, 2013 at 11:49 PM, PIKAL Petr wrote: > Hi > This function adds line to each panel > > addLine <- function (a = NULL, b = NULL, v = NULL, h = NULL, ..., once = F) > { > tcL <- trellis.currentLayout() > k <- 0 > for (i in 1:nrow(tcL)) for (j in 1:ncol(tcL)) if (tcL[i, > j] > 0) { > k <- k + 1 > trellis.focus("panel", j, i, highlight = FALSE) > if (once) > panel.abline(a = a[k], b = b[k], v = v[k], h = h[k], > ...) > else panel.abline(a = a, b = b, v = v, h = h, ...) > trellis.unfocus() > } > } > > > addLine(v=2, col=2, lty=3) > > Petr > > > -Original Message- > > From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- > > project.org] On Behalf Of Jonathan Greenberg > > Sent: Tuesday, January 22, 2013 11:42 PM > > To: r-help > > Subject: [R] Adding a line to barchart > > > > R-helpers: > > > > I need a quick help with the following graph (I'm a lattice newbie): > > > > require("lattice") > > npp=1:5 > > names(npp)=c("A","B","C","D","E") > > barchart(npp,origin=0,box.width=1) > > > > # What I want to do, is add a single vertical line positioned at x = 2 > > that lays over the bars (say, using a dotted line). How do I go about > > doing this? > > > > --j > > > > -- > > Jonathan A. 
Greenberg, PhD > > Assistant Professor > > Global Environmental Analysis and Remote Sensing (GEARS) Laboratory > > Department of Geography and Geographic Information Science University > > of Illinois at Urbana-Champaign > > 607 South Mathews Avenue, MC 150 > > Urbana, IL 61801 > > Phone: 217-300-1924 > > http://www.geog.illinois.edu/~jgrn/ > > AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 > > > > [[alternative HTML version deleted]] > > > > __ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting- > > guide.html > > and provide commented, minimal, self-contained, reproducible code. > -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
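For the follow-up question, one possibility (an untested sketch in the same trellis.focus style as Petr's helper; the function name is made up) is to print a label inside the panel at the line's x-position:

```r
library(lattice)

addLineLabel <- function(v, lab, row = 1, col = 1, ...) {
  trellis.focus("panel", col, row, highlight = FALSE)
  ylim <- current.panel.limits()$ylim
  # pos = 3 draws the text just above the bottom panel edge at x = v
  panel.text(x = v, y = ylim[1], labels = lab, pos = 3, ...)
  trellis.unfocus()
}
# e.g., after drawing the barchart and the line at x = 2:
# addLineLabel(v = 2, lab = "x = 2")
```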
[R] Adding a line to barchart
R-helpers: I need a quick help with the following graph (I'm a lattice newbie): require("lattice") npp=1:5 names(npp)=c("A","B","C","D","E") barchart(npp,origin=0,box.width=1) # What I want to do, is add a single vertical line positioned at x = 2 that lays over the bars (say, using a dotted line). How do I go about doing this? --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] scaling of nonbinROC penalties
penalties in the matrix (e.g. should the matrix sum to some multiple of the number of categories), or (c) is one free to use arbitrarily-scaled penalty matrices for estimates of the accuracy of an ordinal gold standard? Thanks, in advance, for your help, Jonathan Williams [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Best way to coerce numerical data to a predetermined histogram bin?
Folks: Say I have a set of histogram breaks: breaks=c(1:10,15) # With bin ids: bin_ids=1:(length(breaks)-1) # and some data (note that some of it falls outside the breaks: data=runif(min=1,max=20,n=100) *** What is the MOST EFFICIENT way to "classify" data into the histogram bins (return the bin_ids) and, say, return NA if the value falls outside of the bins. By classify, I mean if the data value is greater than one break, and less than or equal to the next break, it gets assigned that bin's ID (note that length(breaks) = length(bin_ids)+1) Also note that, as per this example, the bins are not necessarily equal widths. I can, of course, cycle through each element of data, and then move through breaks, stopping when it finds the correct bin, but I feel like there is probably a faster (and more elegant) approach to this. Thoughts? --j -- Jonathan A. Greenberg, PhD Assistant Professor Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 217-300-1924 AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
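The usual vectorized answers here are cut() and findInterval(); cut() with labels = FALSE matches the stated semantics directly (right-closed intervals, NA outside the breaks):

```r
breaks <- c(1:10, 15)
data   <- runif(n = 100, min = 1, max = 20)

# Integer bin ids 1..(length(breaks)-1); values outside the breaks -> NA.
# Intervals are (b[i], b[i+1]] by default; a value exactly equal to the
# lowest break is NA unless include.lowest = TRUE is given.
bin_ids <- cut(data, breaks = breaks, labels = FALSE)
```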
[R] Anomalous outputs from rbeta when using two different random number seeds
Hi, in the code below, I am drawing 1000 samples from two beta distributions, each time using the same random number seed. Using set.seed(80) produces results I expect, in that the differences between the distributions are very small. Using set.seed(20) produces results I can't make sense of. Around half of the time, it behaves as with set.seed(80), but around half of the time, it behaves very differently, with a much wider distribution of differences between the two distributions. # Beta parameters #distribution 1 u1.a <- 285.14 u1.b <- 190.09 # distribution 2 u2.a <- 223.79 u2.b <- 189.11 #Good example: output is as expected set.seed(80); u1.good <- rbeta(1000, u1.a, u1.b) set.seed(80); u2.good <- rbeta(1000, u2.a, u2.b) #Bad example: output is different to expected set.seed(20); u1.bad <- rbeta(1000, u1.a, u1.b) set.seed(20); u2.bad <- rbeta(1000, u2.a, u2.b) # plot of distributions using set.seed(80), which behaves as expected plot(u2.good ~ u1.good, ylim=c(0.45, 0.70), xlim=c(0.45, 0.70)) abline(0,1) # plot of distributions using set.seed(20), which is different to expected plot(u2.bad ~ u1.bad, ylim=c(0.45, 0.70), xlim=c(0.45, 0.70)) abline(0,1) # plot of differences when using set.seed(80) plot(u1.good - u2.good, ylim=c(-0.2, 0.2)) abline(h=0) # plot of differences when using set.seed(20) plot(u1.bad - u2.bad, ylim=c(-0.2, 0.2)) abline(h=0) Could you explain why using set.seed(20) produces this chaotic pattern of behaviour? 
Many thanks, Jon -- Dr Jon Minton Research Associate Health Economics & Decision Science School of Health and Related Research University of Sheffield Times Higher Education University of the Year Tel: +44(0)114 222 0836 email: j.min...@sheffield.ac.uk http://www.shef.ac.uk/scharr/sections/heds http://scharrheds.blogspot.co.uk/
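A common explanation for this kind of behaviour is that rbeta() uses rejection-type algorithms that consume a variable number of underlying uniform deviates per draw, so two different parameterizations can fall out of sync partway through the random-number stream; whether that fully explains this case would need checking. If aligned streams are needed (e.g. for common-random-numbers comparisons), inversion guarantees exactly one uniform per deviate:

```r
# Each qbeta(runif(...)) deviate consumes exactly one uniform,
# so both sequences are driven by identical underlying random numbers
set.seed(20); u1 <- qbeta(runif(1000), 285.14, 190.09)
set.seed(20); u2 <- qbeta(runif(1000), 223.79, 189.11)
plot(u1 - u2)  # differences now vary smoothly, no bimodal split
```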
[R] How to change smoothing constant selection procedure for Winters Exponential Smoothing models?
Hello all, I am looking for some help in understanding how to change the way R optimizes the smoothing constant selection process for the HoltWinters function. I'm a SAS veteran but very new to R and still learning my way around. Here is some sample data and the current HoltWinters code I'm using: rawdata <- c(294, 316, 427, 487, 441, 395, 473, 423, 389, 422, 458, 411, 433, 454, 551, 623, 552, 520, 553, 510, 565, 547, 529, 526, 550, 577, 588, 606, 595, 622, 603, 672, 733, 793, 890, 830) timeseries_01 <- ts(rawdata, frequency=12, start=c(2009,1)) plot.ts(timeseries_01) m <- HoltWinters(timeseries_01, alpha = NULL, beta = NULL, gamma = TRUE, seasonal = c("multiplicative"), start.periods = 2, l.start = NULL, b.start = NULL, s.start = NULL) p <- predict(m, 24, prediction.interval = TRUE) plot(m, p) My problem is that I disagree with how R is choosing these smoothing constants and I would like to explore how some of the other methodologies listed in the OPTIM function [such as Nelder-Mead, BFGS, CG, L-BFGS-B, SANN, and Brent], but it is unclear to me how I would go about doing this. For example, the above code results in the following constants: alpha: 0.7952587 beta : 0.01382988 gamma: 1 However, using alternate software, I find that... alpha: 0.990 beta : 0.001 gamma: 0.001 ...actually fit this series much better, thus I would like to see if I can adjust R to reproduce this method of optimizing the three smoothing constants. Can anyone help? Thank you, Jonathan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
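Two hedged notes on the question above: stats::HoltWinters() optimizes the constants internally with optim(method = "L-BFGS-B") and exposes only optim.start and optim.control (not the optimizer method itself), so swapping in Nelder-Mead or the other methods would mean writing the objective and calling optim() by hand. Constants found with other software can, however, simply be supplied directly:

```r
# Fix the smoothing constants at externally chosen values;
# parameters given explicitly are not optimized
m2 <- HoltWinters(timeseries_01,
                  alpha = 0.990, beta = 0.001, gamma = 0.001,
                  seasonal = "multiplicative")
```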
Re: [R] Selecting the "non-attribute" part of an object
Thanks for all of these useful answers. Thanks also to Ben Bolker, who told me offline that c() is a general way to access the "main" part of an object (not tested). I also tried: > identical(matrix(tm), matrix(tmm)) [1] TRUE which also works, but does _not_ solve the problem Rolf warns about below (to my disappointment). JD On Thu, Nov 15, 2012 at 6:47 PM, Rolf Turner wrote: > I think that what you are looking for is: > all.equal(tm,tmm,check.attributes=FALSE) > But BEWARE: > m <- matrix(1:36,4,9) > mm <- matrix(1:36,12,3) > all.equal(m,mm,check.attributes=FALSE) > gives TRUE!!! I.e. sometimes attributes really are vital characteristics. > cheers, > Rolf Turner > On 16/11/12 08:52, Jonathan Dushoff wrote: >> I have two matrices, generated by R functions that I don't understand. >> I want to confirm that they're the same, but I know that they have >> different attributes. >> If I want to compare the dimnames, I can say >>> identical(attr(tm, "dimnames"), attr(tmm, "dimnames")) >> [1] FALSE >> or even: >>> identical(dimnames(tm), dimnames(tmm)) >> [1] FALSE >> But I can't find any good way to compare the "main" part of objects. >> What I'm doing now is: >>> tm_new <- tm >>> tmm_new <- tmm >>> attributes(tm_new) <- attributes(tmm_new) <- NULL >>> identical(tm_new, tmm_new) >> [1] TRUE >> But that seems very inaesthetic, besides requiring that I create two >> pointless objects. >> I have read ?attributes, ?attr and some web introductions to how R >> objects work, but have not found an answer. >> Thanks for any help. >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. 
[R] Selecting the "non-attribute" part of an object
I have two matrices, generated by R functions that I don't understand. I want to confirm that they're the same, but I know that they have different attributes. If I want to compare the dimnames, I can say > identical(attr(tm, "dimnames"), attr(tmm, "dimnames")) [1] FALSE or even: > identical(dimnames(tm), dimnames(tmm)) [1] FALSE But I can't find any good way to compare the "main" part of objects. What I'm doing now is: > tm_new <- tm > tmm_new <- tmm > attributes(tm_new) <- attributes(tmm_new) <- NULL > identical(tm_new, tmm_new) [1] TRUE But that seems very inaesthetic, besides requiring that I create two pointless objects. I have read ?attributes, ?attr and some web introductions to how R objects work, but have not found an answer. Thanks for any help. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Best R textbook for undergraduates
R-helpers: I'm sure this question has been asked and answered through the ages, but given there are some new textbooks out there, I wanted to re-pose it. For a course that will cover the application of R for general computing and spatial modeling, what textbook would be best to introduce computing with R to *undergrads*? I understand Bivand and Pebesma's book is fine for spatial work, but it appears to be for more advanced users -- I'd like a companion textbook that is better for complete beginners to ALL forms of programming (e.g. they don't know what an object is, a loop is, an if-then statement, etc). Suggestions? In particular, I'd like to hear from those of you who have TAUGHT classes using R. Thanks! --jonathan -- Jonathan A. Greenberg, PhD Assistant Professor Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 217-300-1924 AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Can I please be taken off the mailing list
[[alternative HTML version deleted]]
[R] Completely ignoring an error in a function...
The code base is a bit too complicated to paste in here, but the gist of my question is this: given I have a function myfunction <- function(x) { # Do something A # Do something B # Do something C } Say "#Do something B" returns this error: Error in cat(list(...), file, sep, fill, labels, append) : argument 2 (type 'list') cannot be handled by 'cat' A standard function would stop here. HOWEVER, I want, in this odd case, to say "keep going" to my function and have it proceeed to # Do something C. How do I accomplish this? I thought suppressWarnings() would do it but it doesn't appear to. Assume that debugging "Do something B" is out of the question. Why am I doing this? Because in my odd case, "Do something B" actually does what I needed it to, but returned an error that is irrelevant to my special case (it creates two files, failing on the second of the two files -- but the first file it creates is what I wanted and there is no current way to create that single file on its own without a lot of additional coding). --j -- Jonathan A. Greenberg, PhD Assistant Professor Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 217-300-1924 AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
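Note that suppressWarnings() only silences warnings; errors need try() or tryCatch(). Wrapping only the failing step lets execution continue past it. A minimal sketch (the stop() call stands in for the real "something B"):

```r
myfunction <- function(x) {
  # Do something A
  a <- x + 1
  # Do something B: the error is caught and ignored, not fatal
  try(stop("irrelevant error from step B"), silent = TRUE)
  # Do something C still runs
  a * 2
}
myfunction(1)  # returns 4
```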
Re: [R] Proposal: Package update log
Thanks to all for the responses and suggestions. I was primarily proposing a more detailed change log for packages on CRAN. To my mind, repositories like R-forge host packages more 'raw' than those on CRAN (i.e. CRAN seems to me to contain more 'finished' packages which occasionally are updated or added-to). Also, some packages on R-forge do not contain any information regarding changes/updates [I'm hesitant to offer an example because I'm really just a lemming in terms of R community stature...]. I guess what I'm saying is, the news(package = "yourPackageHere") function is not particularly useful currently because (in my *limited* experience) very few packages contain the news file and those which do, do not contain much in the way of description. Perhaps I'm being a bit too ambitious here, but I would just like to be able to see what has been changed (and why; if possible) each time a package is updated. It would seem, from my rather rudimentary understanding, that using current TeX/LaTeX based tools for the basis of package documentation lends itself to having a better, more organized, change log or news file based on the package manual table of contents (toc). For instance, it would be great if we had something like: news(package = "yourPackageHere", function = "functionOfInterest") which could display a log of changes/updates sequentially for the named function of interest. Admittedly, I have not created a package myself, but I do have some experience with LaTeX and it may be as simple as changing the preamble to existing TeX file templates or style files. In terms of enforcement; yes I agree it would require more work from the package authors, as well as managers/moderators of CRAN; but, if we expect each package to have a working help file, then why not a (meaningfully) working 'news' file. 
Respectfully, Jon Starkweather, PhD University of North Texas Research and Statistical Support http://www.unt.edu/rss/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Proposal: Package update log
I'm relatively new to R and would first like to sincerely thank all those who contribute to its development. Thank you. I would humbly like to propose a rule which creates a standard (i.e., strongly encouraged, mandatory, etc.) for authors to include a `change log' documenting specific changes for each update made to their package(s). The more detailed the description, the better; and it would be exceptionally useful if the document were available from the R-console (similar to the help function). In other words, I am suggesting that the `change log' file be included in the package(s) and preferably accessible from the R-console. I am aware that many packages available on CRAN have a `change log' or `news' page. However; not all packages have something like that and many which do, are not terribly detailed in conveying what has been updated or changed. I am also aware that package authors are not a particularly lazy group, sitting around with nothing to do. My proposal would likely add a non-significant amount of work to the already very generous (and appreciated) work performed by package authors, maintainers, etc. I do, however, believe it would be greatly appreciated and beneficial to have more detailed update information available from the R-console as some of us (users) update packages daily and are often left wondering what exactly has been updated. I did not post this to the R-devel list because I consider this proposal more of a documentation issue than a development issue. Also, I would like to see some discussion of this proposal from a varied pool of stakeholders (e.g., users and not just developers, package authors, package maintainers, etc.). 
Respectfully, Jon Starkweather, PhD University of North Texas Research and Statistical Support http://www.unt.edu/rss/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [R-sig-hpc] Quickest way to make a large "empty" file on disk?
Rui: Quick follow-up -- it looks like seek does do what I want (I see Simon suggested it some time ago) -- what do mean by "trash your disk"? What I'm trying to accomplish is getting parallel, asynchronous writes to a large binary image (just a binary file) working. Each node writes to a different sector of the file via mmap, "filling in" the values as the process runs, but the file needs to be pre-created before I can mmap it. Running a writeBin with a bunch of 0s would mean I'd basically have to write the file twice, but the seek/ff trick seems to be much faster. Do I risk doing some damage to my filesystem if I use seek? I see there is a strongly worded warning in the help for ?seek: "Use of seek on Windows is discouraged. We have found so many errors in the Windows implementation of file positioning that users are advised to use it only at their own risk, and asked not to waste the *R* developers' time with bug reports on Windows' deficiencies." --> there's no detail here on which errors people have experienced, so I'm not sure if doing something as simple as just "creating" a file using seek falls under the "discouraging" category. As a note, we are trying to work this up on both Windows and *nix systems, hence our wanting to have a single approach that works on both OSs. --j On Thu, Sep 27, 2012 at 3:49 PM, Rui Barradas wrote: > Hello, > > If you really need to trash your disk, why not use seek()? > > > fl <- file("Test.txt", open = "wb") > > seek(fl, where = 1024, origin = "start", rw = "write") > [1] 0 > > writeChar(character(1), fl, nchars = 1, useBytes = TRUE) > Warning message: > In writeChar(character(1), fl, nchars = 1, useBytes = TRUE) : > writeChar: more characters requested than are in the string - will > zero-pad > > close(fl) > > > File "Test.txt" is now 1Kb in size. 
>
> Hope this helps,
>
> Rui Barradas
>
> On 27-09-2012 20:17, Jonathan Greenberg wrote:
> > [...]
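For reference, the seek trick discussed above can be wrapped in a small helper. This is only a sketch of the idea from this thread (the helper name `preallocate` is made up here, and the ?seek warning about Windows still applies): seek past the intended end of the file, write a single zero byte, and the file takes on that length, sparsely where the filesystem supports it.

```r
# Sketch: pre-allocate a file of `size` bytes by seeking past the end
# and writing one zero byte (sparse where the filesystem supports it).
preallocate <- function(path, size) {
  fl <- file(path, open = "wb")
  on.exit(close(fl))
  seek(fl, where = size - 1, origin = "start", rw = "write")
  writeBin(as.raw(0), fl)  # the single real byte fixes the file length
  invisible(path)
}

f <- tempfile()
preallocate(f, 1024 * 1024)   # request 1 MiB
file.info(f)$size             # -> 1048576
```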
Re: [R] [R-sig-hpc] Quickest way to make a large "empty" file on disk?
Folks:

Asked this question some time ago, and found what appeared (at first) to be the best solution, but I'm now finding a new problem. First off, it seemed like ff as Jens suggested worked:

# outdata_ncells = the number of rows * number of columns * number of bands in an image:
out <- ff(vmode="double", length=outdata_ncells, filename=filename)
finalizer(out) <- close
close(out)

This was working fine until I attempted to set length to a VERY large number: outdata_ncells = 17711913600. This would create a file that is 131.964 GB. Big, but not obscenely so (and certainly not larger than the filesystem can handle). However, length appears to be restricted by .Machine$integer.max (I'm on a 64-bit Windows box):

> .Machine$integer.max
[1] 2147483647

Any suggestions on how to solve this problem for much larger file sizes?

--j

On Thu, May 3, 2012 at 10:44 AM, Jonathan Greenberg wrote:
> Thanks, all! I'll try these out. I'm trying to work up something that is
> platform independent (if possible) for use with mmap. I'll do some tests
> on these suggestions and see which works best. I'll try to report back in
> a few days. Cheers!
>
> --j
>
> 2012/5/3 "Jens Oehlschlägel"
>
>> Jonathan,
>>
>> On some filesystems (e.g. NTFS, see below) it is possible to create
>> 'sparse' memory-mapped files, i.e. reserving the space without the cost
>> of actually writing initial values. Package 'ff' does this automatically
>> and also allows access to the file in parallel. Check the example below
>> and see how big-file creation is immediate.
>>
>> Jens Oehlschlägel
>>
>> > library(ff)
>> > library(snowfall)
>> > ncpus <- 2
>> > n <- 1e8
>> > system.time(
>> +   x <- ff(vmode="double", length=n, filename="c:/Temp/x.ff")
>> + )
>>        User      System verstrichen
>>        0.01        0.00        0.02
>> > # check finalizer, with an explicit filename we should have a 'close' finalizer
>> > finalizer(x)
>> [1] "close"
>> > # if not, set it to 'close' in order to not let slaves delete x on slave shutdown
>> > finalizer(x) <- "close"
>> > sfInit(parallel=TRUE, cpus=ncpus, type="SOCK")
>> R Version: R version 2.15.0 (2012-03-30)
>>
>> snowfall 1.84 initialized (using snow 0.3-9): parallel execution on 2 CPUs.
>>
>> > sfLibrary(ff)
>> Library ff loaded.
>> Library ff loaded in cluster.
>>
>> Warnmeldung:
>> In library(package = "ff", character.only = TRUE, pos = 2, warn.conflicts = TRUE, :
>>   'keep.source' is deprecated and will be ignored
>> > sfExport("x") # note: do not export the same ff multiple times
>> > # explicitly opening avoids a gc problem
>> > sfClusterEval(open(x, caching="mmeachflush")) # opening with 'mmeachflush'
>> instead of 'mmnoflush' is a bit slower but prevents OS write storms when
>> the file is larger than RAM
>> [[1]]
>> [1] TRUE
>>
>> [[2]]
>> [1] TRUE
>>
>> > system.time(
>> +   sfLapply( chunk(x, length=ncpus), function(i){
>> +     x[i] <- runif(sum(i))
>> +     invisible()
>> +   })
>> + )
>>        User      System verstrichen
>>        0.00        0.00       30.78
>> > system.time(
>> +   s <- sfLapply( chunk(x, length=ncpus), function(i) quantile(x[i], c(0.05, 0.95)) )
>> + )
>>        User      System verstrichen
>>        0.00        0.00        4.38
>> > # for completeness
>> > sfClusterEval(close(x))
>> [[1]]
>> [1] TRUE
>>
>> [[2]]
>> [1] TRUE
>>
>> > csummary(s)
>>              5%  95%
>> Min.    0.04998 0.95
>> 1st Qu. 0.04999 0.95
>> Median  0.05001 0.95
>> Mean    0.05001 0.95
>> 3rd Qu. 0.05002 0.95
>> Max.    0.05003 0.95
>> > # stop slaves
>> > sfStop()
>>
>> Stopping cluster
>>
>> > # with the close finalizer we are responsible for deleting the file
>> > # explicitly (unless we want to keep it)
>> > delete(x)
>> [1] TRUE
>> > # remove r-side metadata
>> > rm(x)
>> > # truly free memory
>> > gc()
>>
>> Sent: Thursday, 3 May 2012, 00:23
>> From: "Jonathan Greenberg"
>> To: r-help, r-sig-...@r-project.org
>> Subject: [R-sig-hpc] Quickest way to make
Re: [R] list of funtions
Yes, I've seen it, and it's obviously the wrong thing to do or I'd be getting the result I'm looking for. But I can't see the correct way of doing it, i.e. I can't see any way of setting each element of the list to a function with a different 'form' value without using some 'i'-like variable in a loop. I don't think it's something obvious I've missed...

On 13 September 2012 18:06, Uwe Ligges wrote:
>
> On 13.09.2012 19:01, Jonathan Phillips wrote:
>> [...]
>>
>> Does anybody know how to stop the value of fs[[1]] being dependent on
>> the current value of 'i'?
>
> Er, you know that you have
>
> function(par) return(metaf(par,form=i,...))
>
> in your loop. If you want to have it independent of i, why do you
> specify it?
>
> Uwe Ligges

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] list of funtions
Hi,

I have a function called fitMicroProtein which takes a value called form; this can be any integer from 0-38. In a larger function I'm making (it's called Newton), the first thing I want to do is construct a list of functions where form is already set. So in pseudocode:

fs[[1]](...) <- fitMicroProtein(form=0,...)
fs[[2]](...) <- fitMicroProtein(form=1,...)
...

I've tried that and it doesn't work. Here's my code:

Newton <- function(metaf,par,niter,dealwith_NA,...)
{
    fs <- list()
    for(i in 0:(length(par)-1))
    {
        fs[[i+1]] <- function(par) return(metaf(par,form=i,...))
    }
    ...

and the problem is with the variable 'i'. If I use the debugger, I find that it is specifically this: when it makes fs[[1]] we have

fs[[1]] == function(par) return(metaf(par,form=0,...))

but the next thing it does is increment 'i', so fs[[1]] becomes

function(par) return(metaf(par,form=1,...))

where I want fs[[1]] to stay as

function(par) return(metaf(par,form=0,...))

Does anybody know how to stop the value of fs[[1]] being dependent on the current value of 'i'?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
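For what it's worth, the usual fix for this is to create each closure through a helper function, so every function captures its own copy of i; force() evaluates the argument before the loop variable can move on. A minimal sketch, using a made-up stand-in for metaf purely for illustration:

```r
# Stand-in for the poster's metaf, purely for illustration
metaf <- function(par, form) par + form

# force(i) evaluates i immediately, so each closure keeps the value
# of i it was created with instead of the loop's final value
make_fn <- function(i) {
  force(i)
  function(par) metaf(par, form = i)
}

fs <- lapply(0:4, make_fn)

fs[[1]](10)  # form = 0 -> 10
fs[[3]](10)  # form = 2 -> 12
```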
[R] Shading in prediction intervals
I have the following code for the minimum and maximum of my prediction interval:

> y.down=lines(x[x.order], set1.pred[,2][x.order], col=109)
> y.up=lines(x[x.order], set1.pred[,3][x.order], col=109)
domain=min(x):max(x)
polygon(c(domain,rev(domain)),c(y.up,rev(y.down)),col=109)

It doesn't seem to shade the right region; it gives me a trapezoid. Any help? Thanks!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
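One likely cause: lines() returns NULL invisibly, so y.up and y.down never hold the y-values, and min(x):max(x) need not line up with the x-coordinates of the bounds. A minimal sketch (with made-up data standing in for x and set1.pred) that shades between the sorted lower and upper bounds:

```r
# Made-up x values and prediction bounds standing in for set1.pred
set.seed(42)
x  <- runif(50, 0, 10)
lo <- 2 + 0.5 * x - 1   # lower prediction bound
hi <- 2 + 0.5 * x + 1   # upper prediction bound

ord <- order(x)         # sort by x so the polygon outline follows the curve
xs  <- x[ord]

plot(xs, (lo[ord] + hi[ord]) / 2, type = "n", xlab = "x", ylab = "y")
# Trace the upper bound left to right, then the lower bound right to left
polygon(c(xs, rev(xs)), c(hi[ord], rev(lo[ord])), col = "grey80", border = NA)
lines(xs, lo[ord], col = "red")
lines(xs, hi[ord], col = "red")
```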
[R] How do you learn/teach R?
Hi,

I am conducting a survey on how people learn and teach R. I think the results of the survey could lead to interesting insights that benefit the R community, and educational practice in particular. The data and results of the survey will be shared with everyone. Please help us by filling out the survey at http://www.Rcademy.org. It takes less than 5 minutes.

Thank you,
Jonathan Cornelissen
Doctoral researcher, KU Leuven

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] problem plotting in a grid
Hi all,

I'm trying to generate a grid of four plots. The first 2 appear just fine, but the final 2 will not appear in the grid, instead overwriting the first two. Any ideas on how to get them all in the same window would be greatly appreciated.

Cheers,
Jonathan

library(fields)
par(mfrow=c(2,2)) # 2x2 plot windows
plot(c(2,4),c(2,2)) # works fine
plot(c(2,4),c(2,2)) # works fine
x <- 1:4
y <- 5:10
z <- matrix(0,length(x),length(y))
z2 <- matrix(0,length(x),length(y))
for(i in 1:length(x)) {
  for (j in 1:length(y)) {
    z[i,j] <- sample(4:10,1)
    z2[i,j] <- sample(4:10,1)
  }
}
filled.contour(x,y,z,color.palette=topo.colors) # doesn't work
image.plot(x,y,z2,add=TRUE) # doesn't work

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
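The likely culprit: filled.contour() (and fields::image.plot() when drawing its color key) calls layout() internally, which wipes out par(mfrow). One workaround, sketched here without the color key, is to use image() plus contour(), which do respect mfrow:

```r
# image() + contour() respect par(mfrow); filled.contour() does not,
# because it calls layout() internally to place its color key
set.seed(1)
x <- 1:4
y <- 5:10
z  <- matrix(sample(4:10, length(x) * length(y), replace = TRUE),
             length(x), length(y))
z2 <- matrix(sample(4:10, length(x) * length(y), replace = TRUE),
             length(x), length(y))

par(mfrow = c(2, 2))
plot(c(2, 4), c(2, 2))                 # panel 1
plot(c(2, 4), c(2, 2))                 # panel 2
image(x, y, z, col = topo.colors(12))  # panel 3
contour(x, y, z, add = TRUE)           # contour lines over panel 3
image(x, y, z2, col = topo.colors(12)) # panel 4
```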
[R] fitting particular distributions
Hi there,

I have a question that is both about stats and R. Imagine two alternative stochastic processes:

a. Each subject has an exponentially distributed lifetime with parameter lambda. If I simulate 100 such subjects and rank their lifetimes, they'd look like this:

> plot(sort(rexp(100), decreasing=T))

b. The alternative process is slightly different. Imagine that the first subject obeys the same rule as (a) and therefore has an expected lifetime of 1/lambda. The second subject has half the lifetime of the first subject (i.e., 1/(2*lambda)), the third has one third of the lifetime of the first subject (i.e., 1/(3*lambda)), and so on. The distribution of the ranked data from process (b) would therefore be more concave than (a).

Now, here's my question: if I have a given dataset of subject lifetimes, how can I specify and test these alternative processes in R?

Any help will be greatly appreciated.

Jonathan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
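One way to make the comparison concrete, as a sketch (assuming process (b) means a single exponential draw T1 scaled by 1/k for subject k): simulate both processes and compare their ranked-lifetime curves; with a real dataset one could then compare maximized log-likelihoods of the two candidate models.

```r
set.seed(123)
n <- 100
lambda <- 1

# Process (a): i.i.d. exponential lifetimes, ranked
a <- sort(rexp(n, rate = lambda), decreasing = TRUE)

# Process (b): subject k lives T1 / k, where T1 ~ Exp(lambda)
t1 <- rexp(1, rate = lambda)
b  <- t1 / seq_len(n)   # already in decreasing order

# Ranked lifetimes on a log scale; (b) drops off faster (more concave)
plot(a, type = "l", log = "y", ylab = "lifetime (log scale)")
lines(b, col = "red")
```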
[R] help with filled.contour() -
Dear all,

I can't figure out a way to have more than one filled.contour() plot on a single page. I tried layout() and par(), but the way filled.contour() is written seems to override those commands. Any suggestions would be really appreciated.

Jonathan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
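One common workaround, sketched here under the assumption that the lattice package (shipped with R) is acceptable: lattice::levelplot() produces filled-contour-style panels that can be arranged on one page with print(..., split=), sidestepping filled.contour()'s internal layout() call. The data below is made up for illustration.

```r
library(lattice)

# Made-up surfaces on a 20 x 20 grid, purely for illustration
set.seed(1)
g <- expand.grid(x = 1:20, y = 1:20)
g$z1 <- sin(g$x / 3) + cos(g$y / 3)
g$z2 <- g$x * g$y / 100

p1 <- levelplot(z1 ~ x * y, data = g, col.regions = topo.colors(100))
p2 <- levelplot(z2 ~ x * y, data = g, col.regions = topo.colors(100))

# Two filled-contour-style panels side by side on one page:
# split = c(column, row, ncolumns, nrows)
print(p1, split = c(1, 1, 2, 1), more = TRUE)
print(p2, split = c(2, 1, 2, 1))
```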