[R] Help with shiny::reactiveFileReader()
Hello,

Is it possible to execute functions (outside the ui and server shiny environments) after reading data using reactiveFileReader()? For example, I'd like to fit a linear model on data read using reactiveFileReader() outside ui/server.

library(shiny)
library(dplyr)

bigData <- reactiveFileReader(1000, NULL, 'data.csv', read.csv)
fit <- lm(y ~., data = bigData())

ui <- function() { }
server <- function(input, output) { }

Thank you,
Lars.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
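One possible way to approach this, offered as a sketch rather than a definitive answer: the value returned by reactiveFileReader() is a reactive expression, so calling bigData() directly at the top level is expected to fail for lack of a reactive context. Wrapping the model fit in reactive() keeps it outside ui/server while deferring the actual lm() call until something reactive asks for it. The temporary CSV, its columns y and x, and the output id "coefs" are assumptions made only so the example is self-contained.

```r
library(shiny)

# Hypothetical stand-in for 'data.csv' so the sketch is self-contained
csv_path <- file.path(tempdir(), "data.csv")
write.csv(data.frame(y = c(1.1, 1.9, 3.2, 3.9), x = 1:4),
          csv_path, row.names = FALSE)

# Shared file reader: checks the file for changes every 1000 ms
bigData <- reactiveFileReader(1000, NULL, csv_path, read.csv)

# Defined outside ui/server; lm() only runs when a reactive consumer calls fit()
fit <- reactive(lm(y ~ ., data = bigData()))

ui <- fluidPage(verbatimTextOutput("coefs"))

server <- function(input, output) {
  # renderPrint() supplies the reactive context that fit() needs
  output$coefs <- renderPrint(coef(fit()))
}

# shinyApp(ui, server)  # uncomment to launch the app
```

Because fit is itself a reactive expression, it is re-evaluated only when the file changes and an output depends on it.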
Re: [R] Avoid duplication in dplyr::summarise
Exactly what I was looking for Eric, thanks! I agree on your second point.

Best,
Lars.

On Sat, Sep 9, 2017 at 9:02 AM, Eric Berger <ericjber...@gmail.com> wrote:

> Hi Lars,
> Two comments:
> 1. You can achieve what you want with a slight modification of your
> definition of s(), using the hint from the error message that you need an
> argument '.':
>
> s <- function(.) {
>   dplyr::summarise(., x1m = mean(X1),
>                    x2m = mean(X2),
>                    x3m = mean(X3),
>                    x4m = mean(X4))
> }
>
> 2. You have not given a great test case in how you set your two factors
> because the two group_by()'s will give the identical groupings. An
> alternative which confirms that the function s() does what you want might
> be:
>
> df <- data.frame(matrix(rnorm(40), 10, 4),
>                  f1 = base::sample(letters[1:3], 30, replace = TRUE),
>                  f2 = base::sample(letters[4:6], 30, replace = TRUE))
>
> HTH,
>
> Eric
>
> On Sat, Sep 9, 2017 at 1:52 PM, Edjabou Vincent <makl...@gmail.com> wrote:
>
>> Hi Lars
>>
>> I am not very sure what you really want. However, I am suggesting the
>> following code that enables you (1) to obtain the full summary of your data
>> and (2) to retrieve only the mean of the X values as a function of factors
>> f1 and f2.
>>
>> library(tidyverse)
>> library(psych)
>> df <- data.frame(matrix(rnorm(40), 10, 4),
>>                  f1 = gl(3, 10, labels = letters[1:3]),
>>                  f2 = gl(3, 10, labels = letters[4:6]))
>>
>> ## To get all summary of your data
>> df %>% gather(X_name, X_value, X1:X4) %>%
>>   group_by(f1, f2, X_name) %>%
>>   do(describe(.$X_value))
>>
>> ## To obtain only means of your data
>> df %>% gather(X_name, X_value, X1:X4) %>%
>>   group_by(f1, f2, X_name) %>%
>>   do(describe(.$X_value)) %>%
>>   select(mean) %>%   # You select only mean value
>>   spread(X_name, mean)
>>
>> Vincent
>>
>> Med venlig hilsen/ Best regards
>>
>> Edjabou Maklawe Essonanawe Vincent
>>
>> On Sat, Sep 9, 2017 at 12:30 PM, Lars Bishop <lars...@gmail.com> wrote:
>>
>> > Dear group,
>> >
>> > Is there a way I could avoid the sort of duplication illustrated below?
>> > i.e., I have the same dplyr::summarise function on different group_by
>> > arguments. So I'd like to create a single summarise function that could be
>> > applied to both. My attempt below fails.
>> >
>> > df <- data.frame(matrix(rnorm(40), 10, 4),
>> >                  f1 = gl(3, 10, labels = letters[1:3]),
>> >                  f2 = gl(3, 10, labels = letters[4:6]))
>> >
>> > df %>%
>> >   group_by(f1, f2) %>%
>> >   summarise(x1m = mean(X1),
>> >             x2m = mean(X2),
>> >             x3m = mean(X3),
>> >             x4m = mean(X4))
>> >
>> > df %>%
>> >   group_by(f1) %>%
>> >   summarise(x1m = mean(X1),
>> >             x2m = mean(X2),
>> >             x3m = mean(X3),
>> >             x4m = mean(X4))
>> >
>> > # My fail attempt
>> >
>> > s <- function() {
>> >   dplyr::summarise(x1m = mean(X1),
>> >                    x2m = mean(X2),
>> >                    x3m = mean(X3),
>> >                    x4m = mean(X4))
>> > }
>> >
>> > df %>%
>> >   group_by(f1) %>% s
>> > Error in s(.) : unused argument (.)
>> >
>> > Regards,
>> > Lars.
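To make Eric's first point concrete, here is a small self-contained sketch. The argument name `data` (instead of `.`), the 30-row test frame, and the way f2 is laid out are illustrative choices, not taken from the thread; the essential change is simply that the helper accepts the piped data frame as its first argument and passes it on to summarise().

```r
library(dplyr)

set.seed(1)
# 30 rows so that each factor level actually has observations;
# f2 is laid out so that it does not coincide with f1
df <- data.frame(matrix(rnorm(120), 30, 4),
                 f1 = gl(3, 10, labels = letters[1:3]),
                 f2 = gl(3, 5, 30, labels = letters[4:6]))

# The helper takes the (grouped) data as its first argument,
# which is what the pipe supplies
s <- function(data) {
  dplyr::summarise(data,
                   x1m = mean(X1),
                   x2m = mean(X2),
                   x3m = mean(X3),
                   x4m = mean(X4))
}

by_f1   <- df %>% group_by(f1) %>% s()
by_both <- df %>% group_by(f1, f2) %>% s()
```

The same helper now serves both groupings, and the original error goes away because s() no longer has an empty argument list.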
[R] Avoid duplication in dplyr::summarise
Dear group,

Is there a way I could avoid the sort of duplication illustrated below? i.e., I have the same dplyr::summarise function on different group_by arguments. So I'd like to create a single summarise function that could be applied to both. My attempt below fails.

df <- data.frame(matrix(rnorm(40), 10, 4),
                 f1 = gl(3, 10, labels = letters[1:3]),
                 f2 = gl(3, 10, labels = letters[4:6]))

df %>%
  group_by(f1, f2) %>%
  summarise(x1m = mean(X1),
            x2m = mean(X2),
            x3m = mean(X3),
            x4m = mean(X4))

df %>%
  group_by(f1) %>%
  summarise(x1m = mean(X1),
            x2m = mean(X2),
            x3m = mean(X3),
            x4m = mean(X4))

# My fail attempt

s <- function() {
  dplyr::summarise(x1m = mean(X1),
                   x2m = mean(X2),
                   x3m = mean(X3),
                   x4m = mean(X4))
}

df %>%
  group_by(f1) %>% s
Error in s(.) : unused argument (.)

Regards,
Lars.
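A different way to remove the repetition, not proposed in the thread and assuming a dplyr version that provides across() (1.0.0 or later), is to pass the grouping variables through to a single wrapper and let across() generate the four means. The wrapper name means_by and the output naming pattern are hypothetical; note that the resulting columns are named X1m, ..., X4m rather than x1m, ..., x4m.

```r
library(dplyr)

set.seed(1)
# Illustrative data: 30 rows, with f2 laid out so it differs from f1
df <- data.frame(matrix(rnorm(120), 30, 4),
                 f1 = gl(3, 10, labels = letters[1:3]),
                 f2 = gl(3, 5, 30, labels = letters[4:6]))

# Hypothetical wrapper: grouping variables are supplied via ...
means_by <- function(data, ...) {
  data %>%
    group_by(...) %>%
    summarise(across(X1:X4, mean, .names = "{.col}m"), .groups = "drop")
}

res1 <- means_by(df, f1)
res2 <- means_by(df, f1, f2)
```

This trades the explicit x1m = mean(X1) lines for a column selection, which may or may not be preferable depending on how many summaries are needed.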
[R] R Server - Resource Manager
Hi All,

I'd appreciate if you can point me to any good open source (and free) Resource Manager for R installed on a unix server. Essentially, I'm looking to have the ability to selectively allocate computational resources to individuals or groups who have access to this server. I understand R Studio Server Pro and MS R Server have this capability but they are not free. I'm looking for a free equivalent if available.

Thank you,
Lars.
Re: [R] Help please with error from nnet::multinom
Thanks, David. Sorry, do you mean this?

library(nnet)
set.seed(1)
ysim <- gl(3, 100)
y <- model.matrix(~ysim -1)
X <- matrix( 3 * runif(length(ysim)), nrow = 300, ncol = 3)
X_new <- matrix( 3 * runif(length(ysim)), nrow = 200, ncol = 3)

fit <- multinom(y ~ X, trace = FALSE)
pred <- predict(fit, setNames(data.frame(X_new), c("X1","X2","X3")), type = "probs")

Error in predict.multinom(fit, setNames(data.frame(X_new), c("X1", "X2", :
  NAs are not allowed in subscripted assignments
In addition: Warning message:
'newdata' had 200 rows but variables found have 300 rows

On Sun, Jun 26, 2016 at 3:46 PM, David Winsemius <dwinsem...@comcast.net> wrote:

> > On Jun 26, 2016, at 12:39 PM, Lars Bishop <lars...@gmail.com> wrote:
> >
> > Many thanks David. That works. It looks, then, like this error will always
> > occur in predict.multinom whenever the data argument is missing in the
> > multinom fit, but the data argument is optional as per documentation.
>
> I don't agree with that analysis. The problem occurs because of a mismatch
> of names in the new data argument. With your original code this runs
> without error:
>
> pred <- predict(fit, setNames(data.frame(X_new), c("X1","X2","X3")), type = "probs")
>
> --
> David.
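A self-contained sketch of the name-matching point, following David's data-frame version of the fit rather than the matrix version shown above. The reading offered here is an interpretation, not something stated in the thread: with y ~ X the only predictor name in the model is X, so a newdata frame with columns X1, X2, X3 does not supply it and the 300-row X from the workspace is picked up instead, which is what the warning reports. Fitting from a data frame makes the predictor names X1, X2, X3, and prediction then works for any newdata with those column names.

```r
library(nnet)

set.seed(1)
ysim <- gl(3, 100)
y <- model.matrix(~ ysim - 1)

# Predictors held in data frames, so the model terms are X1, X2, X3
X     <- data.frame(matrix(3 * runif(900), nrow = 300, ncol = 3))
X_new <- data.frame(matrix(3 * runif(600), nrow = 200, ncol = 3))

fit <- multinom(y ~ ., data = X, trace = FALSE)

# newdata must carry the same column names as the data used for fitting
pred <- predict(fit, newdata = setNames(X_new, names(X)), type = "probs")
dim(pred)
```

Each row of pred holds the three predicted class probabilities for one row of X_new.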
Re: [R] Help please with error from nnet::multinom
Many thanks David. That works. It looks, then, like this error will always occur in predict.multinom whenever the data argument is missing in the multinom fit, but the data argument is optional as per documentation.

Best,
Lars.

On Sun, Jun 26, 2016 at 3:14 PM, David Winsemius <dwinsem...@comcast.net> wrote:

> > On Jun 26, 2016, at 11:32 AM, Lars Bishop <lars...@gmail.com> wrote:
> >
> > Thanks Bert.
> >
> > But it doesn't complain when predict is used on X instead of X_new
> > (using nnet_7.3-12), which is even more puzzling to me:
> >
> > pred <- predict(fit, X, type = "probs")
>
> Indeed: There is a predict.multinom function and it does have 'probs' as
> an acceptable argument to type:
>
> I got success (or at least an absence of an error message) with:
>
> #--
> X <- data.frame(matrix( 3 * runif(length(ysim)), nrow = 300, ncol = 3))
> X_new <- data.frame(matrix( 3 * runif(length(ysim)), nrow = 200, ncol = 3))
> str(X)
>
> 'data.frame': 300 obs. of 3 variables:
>  $ X1: num 0.797 1.116 1.719 2.725 0.605 ...
>  $ X2: num 0.797 1.116 1.719 2.725 0.605 ...
>  $ X3: num 0.797 1.116 1.719 2.725 0.605 ...
>
> fit <- multinom(y ~ ., data=X, trace = FALSE)
> pred <- predict(fit, setNames(X_new, names(X)), type = "probs")
>
> > head(pred)
>       ysim1     ysim2     ysim3
> 1 0.3519378 0.3517418 0.2963204
> 2 0.3135513 0.3138573 0.3725915
> 3 0.3603779 0.3600461 0.2795759
> 4 0.3572297 0.3569498 0.2858206
> 5 0.3481512 0.3480128 0.3038360
> 6 0.3813310 0.3806118 0.2380572
>
> #
>
> David Winsemius
> Alameda, CA, USA
Re: [R] Help please with error from nnet::multinom
Thanks Bert.

But it doesn't complain when predict is used on X instead of X_new (using nnet_7.3-12), which is even more puzzling to me:

pred <- predict(fit, X, type = "probs")
head(pred)
      ysim1     ysim2     ysim3
1 0.3059421 0.3063284 0.3877295
2 0.3200219 0.3202551 0.3597230
3 0.3452414 0.3451460 0.3096125
4 0.3827077 0.3819603 0.2353320
5 0.2973288 0.2977994 0.4048718
6 0.3817027 0.3809759 0.2373214

Thanks again,
Lars.

On Sun, Jun 26, 2016 at 1:05 PM, Bert Gunter <bgunter.4...@gmail.com> wrote:

> Well, for one thing, there is no "probs" method for predict.nnet, at
> least in my version: nnet_7.3-12
>
> Cheers,
> Bert
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
[R] Help please with error from nnet::multinom
Hello,

I'd appreciate your help in spotting the reason for the error and warning messages below.

library(nnet)
set.seed(1)
ysim <- gl(3, 100)
y <- model.matrix(~ysim -1)
X <- matrix( 3 * runif(length(ysim)), nrow = 300, ncol = 3)
X_new <- matrix( 3 * runif(length(ysim)), nrow = 200, ncol = 3)

fit <- multinom(y ~ X, trace = FALSE)
pred <- predict(fit, X_new, type = "probs")

Error in predict.multinom(fit, X_new, type = "probs") :
  NAs are not allowed in subscripted assignments
In addition: Warning message:
'newdata' had 200 rows but variables found have 300 rows

Thanks,
Lars.
Re: [R] Issue installing packages - Linux
Thanks Jim. I don't think that is the issue...if anyone else can shed some light here, that would be much appreciated.

Regards
Lars.

On Saturday, 30 April 2016, Jim Lemon <drjimle...@gmail.com> wrote:

> Hi Lars,
> A mystery, but for the bodgy characters in your error message. Perhaps
> there is a problem with R trying to read a different character set
> from that used in the package.
>
> Jim
[R] Issue installing packages - Linux
Hello,

I can’t seem to be able to install packages on a redhat-linux-gnu. For instance, this is what happens when I try to install “bitops”. Any hint on what might be the issue would be much appreciated.

> sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux

> Sys.setenv(https_proxy="https://labproxy.com:8080;)
> install.packages("bitops", lib="mypath ")

Here I choose: 22: (HTTP mirrors) and then a mirror 16:Canada(ON)

* installing *source* package âbitopsâ ...
** package âbitopsâ successfully unpacked and MD5 sums checked
Error in readRDS(pfile) : error reading from connection
ERROR: lazy loading failed for package âbitopsâ

I’ve also tried from the shell (after downloading the package source)

$ R CMD INSTALL bitops_1.0-6.tar.gz
ERROR: cannot extract package from bitops_1.0-6.tar.gz

Thank you,
Lars.
[R] Interactive ggvis plot - question
Hello,

I'm trying to plot a histogram (or alternatively the density) of a variable, with a slider that displays a vertical line at the corresponding quantile of the distribution of the variable. That is, I want to interactively pick, for example, the median, and have the vertical line move to the corresponding value.

Is this possible to implement in ggvis? I read this comment from Hadley that says vertical lines are not implemented in ggvis (at least as of last year).

https://groups.google.com/forum/#!topic/ggvis/VpUYCvhbfX8

Any guidance would be greatly appreciated.

Best,
Lars.
Re: [R] Annoying startup error message
yes! Thanks!

On Fri, Feb 26, 2016 at 12:07 PM, Sarah Goslee <sarah.gos...@gmail.com> wrote:

> Perhaps you at one point added it to your .Rprofile so the package is
> loaded at startup. You can check by starting R from the command line
> with
> R --vanilla
> which doesn't load any of the profile files.
>
> https://stat.ethz.ch/R-manual/R-devel/library/base/html/Startup.html
>
> Sarah
Re: [R] Annoying startup error message
I understand the solution would be to unset my "SPARK_HOME" environment variable. I can do this with Sys.unsetenv(), but that only unsets it for the current session. How can I unset it permanently?

Sys.getenv("SPARK_HOME")
[1] "/Users/lars/Downloads/spark-1.6.0-bin-hadoop2.6/bin/spark"
Sys.unsetenv("SPARK_HOME")
Sys.getenv("SPARK_HOME")
[1] ""

Thanks again for your help!
Lars.

On Fri, Feb 26, 2016 at 2:11 PM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

> On 26/02/2016 12:07 PM, Sarah Goslee wrote:
>
>> Perhaps you at one point added it to your .Rprofile so the package is
>> loaded at startup. You can check by starting R from the command line
>> with
>> R --vanilla
>> which doesn't load any of the profile files.
>>
>> https://stat.ethz.ch/R-manual/R-devel/library/base/html/Startup.html
>
> Another possibility is that SparkR defines some S4 classes, and you have
> an object of that type in your workspace. To read it might need SparkR
> installed.
>
> The solution here is to run R --vanilla as above, or to delete the object
> from the workspace, or the whole workspace.
>
> Duncan Murdoch
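Since Sys.unsetenv() only affects the running session, the variable (and the library(SparkR) call) is presumably being re-applied by a startup file each time R launches. The sketch below searches the usual R startup files for mentions of either; the list of candidate files follows the ?Startup help page referenced above, while the helper name find_mentions and the search pattern are hypothetical. It does not cover values set outside R, for example in a shell profile.

```r
# Candidate startup files, per ?Startup (user-level and site-level)
candidates <- c(
  file.path(Sys.getenv("HOME"), ".Rprofile"),
  file.path(Sys.getenv("HOME"), ".Renviron"),
  file.path(getwd(), ".Rprofile"),
  file.path(R.home("etc"), "Rprofile.site"),
  file.path(R.home("etc"), "Renviron.site")
)

# Hypothetical helper: returns the lines of a file that match the pattern,
# or an empty character vector if the file does not exist
find_mentions <- function(path, pattern = "SparkR|SPARK_HOME") {
  if (!file.exists(path)) return(character(0))
  grep(pattern, readLines(path, warn = FALSE), value = TRUE)
}

hits <- lapply(candidates, find_mentions)
names(hits) <- candidates
hits <- Filter(length, hits)   # keep only files with at least one match
print(hits)
```

Removing or commenting out the matching lines in whichever file turns up should make the change permanent; if nothing is found, the variable is likely being set outside R.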
Re: [R] Annoying startup error message
Thank you Ulrik. I actually don't want to install SparkR, just don't want to have that error message when R starts. For some reason, R is trying to load the package every time it starts...

Thanks
Lars.

On Fri, Feb 26, 2016 at 11:53 AM, Ulrik Stervbo <ulrik.ster...@gmail.com> wrote:

> Hi Lars,
>
> The error tells you that SparkR is not installed.
>
> I believe you can install it like this:
>
> library(devtools)
> install_github("amplab-extras/SparkR-pkg", subdir="pkg")
>
> I took it from https://github.com/amplab-extras/SparkR-pkg and I haven't
> tried it myself.
>
> Hope this helps,
> Ulrik
[R] Annoying startup error message
Hello,

I just installed R version 3.2.3, and I'm getting the error message below every time I start R. I had SparkR installed in the prior version. I googled this problem, but didn't find anything useful.

Any help would be greatly appreciated.

Error in library(SparkR) : there is no package called ‘SparkR’
[R.app GUI 1.66 (7060) x86_64-apple-darwin13.4.0]

Best,
Lars.
Re: [R] Order of formula terms in model.matrix
Question: I(a*b) would work as long as both "a" and "b" are numeric. Is there a way I can force the behaviour of model.matrix when one of these variables is a factor (as in "f1:trt" from my example below)?

Specifically, based on my example below, I would like to always return the first matrix (where the first level of "f1" is omitted in the resulting matrix). This wouldn't happen if the user specifies the terms in reverse order (as per the second matrix).

Thanks again,
Lars.

> On Jan 17, 2016, at 2:53 PM, Lars Bishop <lars...@gmail.com> wrote:
>
> This is very helpful, thanks!
>
> Lars.
>
>> On Jan 17, 2016, at 1:34 PM, Charles C. Berry <ccbe...@ucsd.edu> wrote:
>>
>> On Sun, 17 Jan 2016, Lars Bishop wrote:
>>
>>> I'd appreciate your help on understanding the following.
>>>
>>> It is not very clear to me from the model.matrix documentation, why simply
>>> changing the order of terms in the formula may change the number of
>>> resulting columns. Please note I'm purposely not including main effects in
>>> the model formula in this case.
>>
>> IIRC, there are some heuristics involved harking back to the White Book. I
>> recall there have been discussions of whether and how this could be fixed
>> before on this list and or R-devel, but I cannot seem to lay my browser on
>> them right now.
>>
>>> set.seed(1)
>>> x1 <- rnorm(100)
>>> f1 <- factor(sample(letters[1:3], 100, replace = TRUE))
>>> trt <- sample(c(-1,1), 100, replace = TRUE)
>>> df <- data.frame(x1=x1, f1=f1, trt=trt)
>>>
>>> dim(model.matrix( ~ x1:trt + f1:trt, data = df))
>>> [1] 100 4
>>>
>>> dim(model.matrix(~ f1:trt + x1:trt, data = df))
>>> [1] 100 5
>>
>> By `x1:trt' I guess you mean the same thing as `I(x1*trt)'.
>>
>> If you use the latter form, the issue you raise goes away.
>>
>> Note that `I(some.expr)' gives you the ability to force the behavior of
>> model.matrix to be exactly what you want by suitably crafting `some.expr',
>> heuristics notwithstanding.
>>
>> HTH,
>>
>> Chuck
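One way to get the first matrix regardless of how a user orders the terms, offered as a sketch rather than an established recipe, is to build the columns explicitly instead of relying on the formula heuristics: code f1 with its treatment contrasts (dropping the first level), then multiply those columns by trt. The helper name build_design and the column labels are hypothetical.

```r
set.seed(1)
x1  <- rnorm(100)
f1  <- factor(sample(letters[1:3], 100, replace = TRUE))
trt <- sample(c(-1, 1), 100, replace = TRUE)
df  <- data.frame(x1 = x1, f1 = f1, trt = trt)

# Hypothetical helper: every column is constructed by hand,
# so formula term order cannot change the result
build_design <- function(d) {
  # Contrast-coded f1 without the intercept column (first level is the baseline)
  f1_contr <- model.matrix(~ f1, data = d)[, -1, drop = FALSE]
  cbind(`(Intercept)` = 1,
        `x1:trt`      = d$x1 * d$trt,
        f1_contr * d$trt)       # each contrast column scaled row-wise by trt
}

X <- build_design(df)
dim(X)
```

The result has the same four columns of values as model.matrix(~ x1:trt + f1:trt, data = df), although the last two column labels differ (f1b and f1c here).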
Re: [R] Order of formula terms in model.matrix
This is very helpful, thanks!

Lars.

> On Jan 17, 2016, at 1:34 PM, Charles C. Berry <ccbe...@ucsd.edu> wrote:
>
> On Sun, 17 Jan 2016, Lars Bishop wrote:
>
>> I’d appreciate your help on understanding the following.
>>
>> It is not very clear to me from the model.matrix documentation, why simply
>> changing the order of terms in the formula may change the number of
>> resulting columns. Please note I’m purposely not including main effects in
>> the model formula in this case.
>
> IIRC, there are some heuristics involved harking back to the White Book. I
> recall there have been discussions of whether and how this could be fixed
> before on this list and or R-devel, but I cannot seem to lay my browser on
> them right now.
>
>> set.seed(1)
>> x1 <- rnorm(100)
>> f1 <- factor(sample(letters[1:3], 100, replace = TRUE))
>> trt <- sample(c(-1,1), 100, replace = TRUE)
>> df <- data.frame(x1=x1, f1=f1, trt=trt)
>>
>> dim(model.matrix( ~ x1:trt + f1:trt, data = df))
>> [1] 100 4
>>
>> dim(model.matrix(~ f1:trt + x1:trt, data = df))
>> [1] 100 5
>
> By `x1:trt' I guess you mean the same thing as `I(x1*trt)'.
>
> If you use the latter form, the issue you raise goes away.
>
> Note that `I(some.expr)' gives you the ability to force the behavior of
> model.matrix to be exactly what you want by suitably crafting `some.expr',
> heuristics notwithstanding.
>
> HTH,
>
> Chuck
[R] Order of formula terms in model.matrix
I’d appreciate your help on understanding the following.

It is not very clear to me from the model.matrix documentation why simply changing the order of terms in the formula may change the number of resulting columns. Please note I’m purposely not including main effects in the model formula in this case.

    set.seed(1)
    x1 <- rnorm(100)
    f1 <- factor(sample(letters[1:3], 100, replace = TRUE))
    trt <- sample(c(-1,1), 100, replace = TRUE)
    df <- data.frame(x1=x1, f1=f1, trt=trt)

    dim(model.matrix( ~ x1:trt + f1:trt, data = df))
    [1] 100 4

    dim(model.matrix(~ f1:trt + x1:trt, data = df))
    [1] 100 5

Thanks,
Lars.
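To see where the extra column comes from, it can help to look at the column names rather than only the dimensions. The following is a minimal sketch that reuses the example data from the post; the object names `mm_a` and `mm_b` are made up for illustration, and the exact column labels may vary with the R version.

```r
set.seed(1)
x1  <- rnorm(100)
f1  <- factor(sample(letters[1:3], 100, replace = TRUE))
trt <- sample(c(-1, 1), 100, replace = TRUE)
df  <- data.frame(x1 = x1, f1 = f1, trt = trt)

# Same two terms, written in a different order
mm_a <- model.matrix(~ x1:trt + f1:trt, data = df)
mm_b <- model.matrix(~ f1:trt + x1:trt, data = df)

# The names reveal whether f1 was expanded with treatment contrasts
# (first level dropped) or with one indicator column per level
colnames(mm_a)
colnames(mm_b)

# Column counts for a quick comparison
c(ncol(mm_a), ncol(mm_b))
```

Comparing the two sets of names should show which levels of f1 received their own f1:trt column under each ordering.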
[R] Median expected survival
Hi All,

Apologies for the simple question, but I could not find a straightforward answer based on my limited knowledge of survival analysis.

I'm trying to obtain the predicted median survival time for each subject on a new dataset from a fitted coxph{survival} or cph{rms} object. Would the quantile.survfit function (as used below) return the expected median survival? Why does this function return NAs in this case, when all predictors have non-missing values?

As an alternative, I've tried to use the Quantile{rms} function as in my second chunk of code, but in this case I get an error message (most likely due to my lack of understanding as well).

    library(MASS)
    library(survival)
    library(rms)
    data(gehan)
    leuk.cox <- coxph(Surv(time, cens) ~ treat + factor(pair), data = gehan)
    leuk_new <- gehan[1:10, ] # take first 10 patients
    pred_leuk <- survfit(leuk.cox, newdata=leuk_new)
    quantile(pred_leuk, 0.5)$quantile

    ### alternative using rms
    leuk.cox.rms <- cph(Surv(time, cens) ~ treat + factor(pair), data = gehan, surv = T)
    med <- Quantile(leuk.cox.rms)
    Predict(leuk.cox.rms, data = leuk_new, fun=function(x)med(lp=x))

    Error in Predict(leuk.cox.rms, data = leuk_new, fun = function(x) med(lp = x)) :
      predictors(s) not in model: data

Thank you for your help.

Best,
Lars.
[R] Math symbols in ggplot facets
Hello,

I would like to show in my facet labels the equivalent in LaTeX of $\sigma_{0}= \sqrt{2}$. I think I'm close below, but not there yet, as it shows $(\sigma_{0}, \sqrt{2})$.

    m <- mpg
    levels(m$drv) <- c("sigma[0]=sqrt(2)", "sigma[0]=2 * sqrt(2)", "sigma[0]= 3 * sqrt(2)")
    ggplot(m, aes(x = displ, y = cty)) + geom_point() +
      facet_grid(. ~ drv, labeller = label_parsed)

Thanks,
Lars.
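One likely cause is that plotmath writes equality as `==`, so a single `=` inside the label is parsed as something else. The sketch below is not from the thread: it assumes the labels are meant to be plotmath strings, the name `labs` is made up, and the plotting part only runs if ggplot2 is installed.

```r
# Facet labels written as plotmath expressions; '==' renders as an equals sign
labs <- c("sigma[0] == sqrt(2)",
          "sigma[0] == 2 * sqrt(2)",
          "sigma[0] == 3 * sqrt(2)")

# Each label must parse as one R expression for label_parsed to render it
parsed <- lapply(labs, function(l) parse(text = l)[[1]])

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  m <- mpg
  m$drv <- factor(m$drv)   # drv may be a character column, so convert first
  levels(m$drv) <- labs    # relabel the three drive types
  p <- ggplot(m, aes(x = displ, y = cty)) +
    geom_point() +
    facet_grid(. ~ drv, labeller = label_parsed)
  print(p)
}
```

Converting `drv` to a factor before assigning levels matters, because assigning levels to a character vector does not relabel its values.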
[R] Help with {tables} package
Dear list,

I'm most likely doing something wrong, but I'm getting an error message in tab2 below (tab1 is fine). Any hint is much appreciated.

    library(tables)
    set.seed(1)
    dd <- data.frame(x = rnorm(100),
                     f1 = gl(2, 50, labels = c("A", "B")),
                     f2 = gl(4, 25, labels = c("a", "b", "c", "d")),
                     f3 = as.factor(sample(1:10, 100, replace = T)))

    tab1 <- tabular((n=1) + Format(digits=1) * ((x) * (Avg. = mean)) +
                    Format(digits=4) * ((Factor(f1) + Factor(f2)) * (Pctn. = Percent('col')))
                    ~ Justify(c) * (f3 + 1), data = dd)
    tab1

    tab2 <- tabular((n=1) + Format(digits=1) * ((x) * (Avg. = mean)) +
                    Format(digits=4) * ((Factor(f1)) * (Pctn. = Percent('col')))
                    ~ Justify(c) * (f3 + 1), data = dd)

    Error in justification[j, ] <- rightjustification :
      number of items to replace is not a multiple of replacement length

Best,
Lars.
[R] Order of factors with facets in ggplot2
Hello,

I'd like to produce a ggplot where the order of factors within facets is based on the average of another variable. Here's a reproducible example. My problem is that the factors are ordered similarly in both facets. I would like to have, within each facet of `f1', boxplots for 'x' within each factor `f2', where the boxplots are ordered based on the average of x within each facet. So in this case, for facet A the order should be M4, M1, M3, M2; and for facet B: M4, M2, M1, M3.

    library(ggplot2)
    library(plyr)
    set.seed(1)
    f1 <- sample(c("A", "B"), 100, replace= T)
    f2 <- gl(4, 25, labels = paste("M", 1:4, sep=""))
    x <- runif(100)
    df <- data.frame(f1, f2, x)
    #df <- df[order(df[,1]), ]
    df <- ddply(df, ~ f1 + f2, transform, Avg_x = mean(x))
    myplot <- ggplot(df, aes(x=reorder(f2, Avg_x), x)) + geom_boxplot() +
      facet_wrap(~ f1, scale = "free", ncol = 1) +
      stat_summary(fun.y=mean, geom="point", col = "red")
    myplot

Thanks in advance for any help!
Lars.
[R] Permutation tests in {coin}
Hello,

I'm trying to get familiar with the coin package for doing permutation tests. I'm not sure I understand the documentation regarding the difference between distribution = "asymptotic" and "approximate" in the function independence_test. Are permutations of the test statistic actually computed in the asymptotic case, or only when the distribution is specified as approximate? When should I use each option?

Thanks,
Lars.
[R] Help with ordering values
Hi,

Is it possible to use x and y below to produce the vector shown in desired.result (which represents the values in y ordered according to the order specified by x)? Ideally the solution must be efficient in terms of timing as I'm dealing with *very* long vectors.

    x <- c(7, 6, 5, 9, 4, 14, 8, 12, 10, 11, 2, 13, 3, 1)
    y <- c(1, 2, 9, 8, 11)
    desired.result <- c(9, 8, 11, 2, 1)

Thanks,
Lars.
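Two vectorised possibilities, sketched under the assumption that every value of y occurs in x and that neither vector contains duplicates (the names `res1` and `res2` are made up for illustration):

```r
x <- c(7, 6, 5, 9, 4, 14, 8, 12, 10, 11, 2, 13, 3, 1)
y <- c(1, 2, 9, 8, 11)

# Option 1: keep the elements of x that also occur in y, in x's order
res1 <- x[x %in% y]

# Option 2: find each y's position in x, then reorder y by that position
res2 <- y[order(match(y, x))]

res1   # 9 8 11 2 1
res2   # 9 8 11 2 1
```

Both avoid explicit loops, so they should scale reasonably to long vectors; if y may hold values absent from x, the two options behave differently and would need checking.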
[R] Issue with xxx-package.Rd
Dear list,

I have an issue with placing the information stored in the mypackage-package.Rd file (as created by promptPackage("mypackage")) in the package manual pdf file. The information in that .Rd file is not placed at the head of the manual as I would expect. I think a possible reason is that the package name is also the name of one of the functions within the package. But that is the case for many packages (if not most packages), so I'm not sure. Any hint what might be going on?

Thanks,
Lars.
[R] Problem with effects package
Dear List,

Often when I use this package I get the error message shown below. When I work out simple examples, it turns out to be fine, but when working with real and moderate size data sets I always get the same error. Do you know what could be the cause of the problem?

    Error in apply(mod.matrix[, components], 1, prod) :
      subscript out of bounds
    Error in plot(effect("myvariable", glm.sev1)) :
      error in evaluating the argument 'x' in selecting a method for function 'plot'

Thanks,
Lars.
Re: [R] Problem with effects package
ok. Here's an example:

R version 2.11.1
effects_2.0-10

    var1 <- c(25631.9392, 2521.2590, 6656.6516, 1362.5997, 6369.9818,
              27253.4223, 2073.1909, 9959.3792, 3318.2500, 15323.8103,
              11583.8717, 3054.5558, 625.6597, 2500., 11996.2271)
    var2 <- as.factor(c("B:=500", "B:=500", "B:=500", "B:=500", "B:=500",
                        "B:=500", "B:=500", "B:=500", "B:=500", "C:750-1000",
                        "C:750-1000", "B:=500", "B:=500", "B:=500", "B:=500"))
    glm1 <- glm(var1 ~ var2, family = Gamma(link = "log"))
    summary(glm1)
    library(effects)
    plot(effect("var2", glm1))

    Error in apply(mod.matrix[, components], 1, prod) :
      subscript out of bounds
    Error in plot(effect("var2", glm1)) :
      error in evaluating the argument 'x' in selecting a method for function 'plot'

On Sat, Jul 30, 2011 at 3:10 PM, jim holtman <jholt...@gmail.com> wrote:
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> We can not help unless we at least have the data that you are using.
>
> On Sat, Jul 30, 2011 at 3:01 PM, Lars Bishop <lars...@gmail.com> wrote:
>> Dear List,
>>
>> Several times I use this package I get the error message shown below. When
>> I work out simple examples, it turns out to be fine, but when working with
>> real and moderate size data sets I always get the same error. Do you know
>> what could be the cause of the problem?
>>
>> Error in apply(mod.matrix[, components], 1, prod) :
>>   subscript out of bounds
>> Error in plot(effect(myvariable, glm.sev1)) :
>>   error in evaluating the argument 'x' in selecting a method for function 'plot'
>>
>> Thanks,
>> Lars/
>
> --
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
Re: [R] Problem with effects package
Thanks for your quick response on this issue John! Yes, in my real dataset I have many predictors and more observations than in this example. I've started to remove one predictor at a time until this one was found to be causing the problem.

Thanks again!
Lars.

On Sat, Jul 30, 2011 at 5:33 PM, John Fox <j...@mcmaster.ca> wrote:
> Dear Lars,
>
> The problem is the : in the levels of var2, which confuses effect() about
> the structure of the model, since a colon indicates interaction. Try, e.g.,
> removing the colons:
>
> var2 <- as.factor(c("B=500", "B=500", "B=500", "B=500", "B=500", "B=500",
>                     "B=500", "B=500", "B=500", "C750-1000", "C750-1000",
>                     "B=500", "B=500", "B=500", "B=500"))
>
> Then,
>
> effect("var2", glm1)
>
>  var2 effect
> var2
>     B=500 C750-1000
>  7947.932 13453.841
>
> (See how helpful it can be to furnish a reproducible example?)
>
> I agree, BTW, that this is a deficiency in effect(); I'll look into it when
> I have a chance. Finally, there really is no advantage to making an effect
> display for a model with a single predictor, though perhaps this isn't what
> you were doing in your actual application.
>
> Best,
> John
>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
>> On Behalf Of Lars Bishop
>> Sent: July-30-11 4:30 PM
>> To: jim holtman; r-help@r-project.org
>> Subject: Re: [R] Problem with effects package
>>
>> ok. Here's an example:
>>
>> R version 2.11.1
>> effects_2.0-10
>>
>> var1 <- c(25631.9392, 2521.2590, 6656.6516, 1362.5997, 6369.9818,
>>           27253.4223, 2073.1909, 9959.3792, 3318.2500, 15323.8103,
>>           11583.8717, 3054.5558, 625.6597, 2500., 11996.2271)
>> var2 <- as.factor(c("B:=500", "B:=500", "B:=500", "B:=500", "B:=500",
>>                     "B:=500", "B:=500", "B:=500", "B:=500", "C:750-1000",
>>                     "C:750-1000", "B:=500", "B:=500", "B:=500", "B:=500"))
>> glm1 <- glm(var1 ~ var2, family = Gamma(link = "log"))
>> summary(glm1)
>> library(effects)
>> plot(effect("var2", glm1))
>>
>> Error in apply(mod.matrix[, components], 1, prod) :
>>   subscript out of bounds
>> Error in plot(effect("var2", glm1)) :
>>   error in evaluating the argument 'x' in selecting a method for function 'plot'
[R] (unclassified?) Help Question
Dear List,

I'd appreciate your guidance for obtaining the desired result shown below, by combining tapply(x, g, mean) and g in the example. Basically, I'm trying to create a vector whose values are based on the result from tapply(x, g, mean) but that follow the pattern and length given by the factor g. Of course I'm looking for a generic solution (i.e., not something that just works for this particular case).

    set.seed(1)
    f1 <- gl(2, 1, 10, labels=c("M", "F"))
    f2 <- gl(2, 2, 10, labels=c("H", "L"))
    x <- rnorm(10)
    d <- data.frame(f1, f2, x)
    g <- interaction(f1, f2)

    tapply(x, g, mean)
           M.H        F.H        M.L        F.L
     0.0929451 -0.3140711 -0.1740998  1.1668028

    g
     [1] M.H F.H M.L F.L M.H F.H M.L F.L M.H F.H
    Levels: M.H F.H M.L F.L

Desired result:

    x
    [1]  0.0929451 -0.3140711 -0.1740998  1.1668028  0.0929451 -0.3140711 -0.1740998
    [8]  1.1668028

Thanks for any help!
Lars.
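Two generic ways to expand group means back to the length of the grouping factor, sketched here with the data-generating code from the post (the names `means`, `by_lookup` and `by_ave` are made up for illustration):

```r
set.seed(1)
f1 <- gl(2, 1, 10, labels = c("M", "F"))
f2 <- gl(2, 2, 10, labels = c("H", "L"))
x  <- rnorm(10)
g  <- interaction(f1, f2)

# Option 1: compute the group means once, then look each observation's
# group up by name
means     <- tapply(x, g, mean)
by_lookup <- as.vector(means[as.character(g)])

# Option 2: ave() returns the group mean at every position directly
by_ave <- ave(x, g)

all.equal(by_lookup, by_ave)
```

Both results have the same length as g, with each element equal to the mean of its group.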
[R] Simple R Question...
Let's say I have the data frame 'dd' below. I'd like to select one column from this data frame (say 'a') and keep its name in the resulting data frame. That can be done as in #2. However, what if I want to make my selection based on a vector of names (and again keep those names in the resulting data frame)? My attempt is #4, but it doesn't work.

    dd <- data.frame(a = gl(3,4), b = gl(4,1,12), c=rnorm(12))  #1
    data.frame(a = dd[,"a"])                                    #2
    mynames <- "a"                                              #3
    data.frame(eval(mynames) = dd[, mynames])                   #4

thanks,
Lars.
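A minimal sketch of selecting by a character vector while keeping the result a data frame with its names intact. The object names `sub1`, `sub2`, `sub3` and `mynames2` are made up for illustration.

```r
dd <- data.frame(a = gl(3, 4), b = gl(4, 1, 12), c = rnorm(12))
mynames <- "a"

# drop = FALSE stops a single column from collapsing to a bare vector,
# so the column name survives
sub1 <- dd[, mynames, drop = FALSE]

# List-style indexing always returns a data frame
sub2 <- dd[mynames]

# The same calls work unchanged for several names
mynames2 <- c("a", "c")
sub3 <- dd[mynames2]

names(sub1)
names(sub3)
```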
[R] Contrasts in Penalized Package
Hi,

The penalized documentation says that "Unordered factors are turned into as many dummy variables as the factor has levels". This is done by a function in the package called contr.none. I'm trying to figure out how exactly a model matrix is created with this contrast option when the user calls the function with a formula. I typed library(penalized) ; penalized but couldn't pinpoint the part of the code where this is done. I'd appreciate any help on this.

Lars.
[R] Help with contrasts
Hi,

I need to build a function to generate one column for each level of a factor in the model matrix created on an arbitrary formula (instead of using the available contrasts options such as contr.treatment, contr.SAS, etc).

My approach to this was first to use the built-in function for contr.treatment but changing the default value of the contrasts argument to FALSE (I named this function contr.identity and it is shown at the bottom of the email for reference). So this function works fine:

    contr.identity(4)
      1 2 3 4
    1 1 0 0 0
    2 0 1 0 0
    3 0 0 1 0
    4 0 0 0 1

    contr.treatment(4)
      2 3 4
    1 0 0 0
    2 1 0 0
    3 0 1 0
    4 0 0 1

However, when I try to create a model matrix using contr.identity specified in options(), it actually uses the contr.treatment option. Why is that? Any hint on how I can do this?

    options(contrasts = c('contr.identity', 'contr.poly'))
    options("contrasts")
    dd <- data.frame(a = gl(3,4), b = gl(4,1,12))
    model.matrix(~ a + b, dd) # creates 2 columns for a and 3 for b instead of 3 and 4, respectively

    contr.identity <- function(n, base = 1, contrasts = FALSE, sparse = FALSE)
    {
        if(is.numeric(n) && length(n) == 1L) {
            if(n > 1L) levels <- as.character(seq_len(n))
            else stop("not enough degrees of freedom to define contrasts")
        } else {
            levels <- as.character(n)
            n <- length(n)
        }
        contr <- .Diag(levels, sparse=sparse)
        if(contrasts) {
            if(n < 2L)
                stop(gettextf("contrasts not defined for %d degrees of freedom", n - 1L), domain = NA)
            if (base < 1L | base > n)
                stop("baseline group number out of range")
            contr <- contr[, -base, drop = FALSE]
        }
        contr
    }

Thanks for any help,
Lars.
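One possible explanation, offered as an assumption rather than something stated in the thread, is that model.matrix calls the contrast function with contrasts = TRUE explicitly, so changing only the default value of that argument has no effect. A sketch of a workaround is to pass full indicator matrices through the `contrasts.arg` argument; the names `full_contrasts` and `mm` are made up for illustration.

```r
dd <- data.frame(a = gl(3, 4), b = gl(4, 1, 12))

# contrasts(f, contrasts = FALSE) returns an identity matrix with one
# column per level; build one for every factor column in dd
full_contrasts <- lapply(dd[sapply(dd, is.factor)], contrasts, contrasts = FALSE)

# Supplying the matrices directly overrides the usual contrast coding
mm <- model.matrix(~ a + b, data = dd, contrasts.arg = full_contrasts)

colnames(mm)
ncol(mm)   # intercept + 3 columns for a + 4 columns for b
```

Note that with an intercept present this matrix is not of full column rank, which is usually acceptable for penalized fits but not for an ordinary unpenalized regression.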
[R] Second largest element from each matrix row
Hi,

I need to extract the second largest element from each row of a matrix. Below is my solution, but I think there should be a more efficient way to accomplish the same, or not?

    set.seed(1)
    a <- matrix(rnorm(9), 3 ,3)
    sec.large <- as.vector(apply(a, 1, order, decreasing=T)[2,])
    ans <- sapply(1:length(sec.large), function(i) a[i, sec.large[i]])
    ans

Thanks in advance for your help,
Lars.
Re: [R] Second largest element from each matrix row
Thanks a lot to all of you!

On Tue, Apr 26, 2011 at 10:34 AM, Peter Ehlers <ehl...@ucalgary.ca> wrote:
> On 2011-04-26 05:26, Dimitris Rizopoulos wrote:
>> one way is the following:
>>
>> a <- matrix(rnorm(9), 3 ,3)
>> aa <- a[order(row(a), -a)]
>> matrix(aa, nrow(a), byrow = TRUE)[, 2]
>
> That's clever, Dimitris. And very fast!
>
> Lars: my apologies; I didn't read your request carefully. You asked for
> more _efficient_ and I just thought 'shorter code'. I should know that
> whenever I think 'apply', I should think 'matrix'.
>
> Peter Ehlers
>
>> I hope it helps.
>>
>> Best,
>> Dimitris
>>
>> On 4/26/2011 2:01 PM, Lars Bishop wrote:
>>> Hi,
>>>
>>> I need to extract the second largest element from each row of a matrix.
>>> Below is my solution, but I think there should be a more efficient way to
>>> accomplish the same, or not?
>>>
>>> set.seed(1)
>>> a <- matrix(rnorm(9), 3 ,3)
>>> sec.large <- as.vector(apply(a, 1, order, decreasing=T)[2,])
>>> ans <- sapply(1:length(sec.large), function(i) a[i, sec.large[i]])
>>> ans
>>>
>>> Thanks in advance for your help,
>>> Lars.
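As a quick sanity check, the vectorised approach quoted above can be compared against a plain apply()-based version. This is only a sketch; the names `by_apply` and `by_order` are made up for illustration.

```r
set.seed(1)
a <- matrix(rnorm(9), 3, 3)

# Row-by-row: sort each row in decreasing order and take the second value
by_apply <- apply(a, 1, function(r) sort(r, decreasing = TRUE)[2])

# Vectorised: order all elements by row index, then by decreasing value,
# refill a matrix row by row and read off its second column
aa <- a[order(row(a), -a)]
by_order <- matrix(aa, nrow(a), byrow = TRUE)[, 2]

all.equal(by_apply, by_order)
```

The vectorised version makes a single call to order() instead of one sort per row, which is presumably why it is faster on matrices with many rows.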
[R] Help in splitting a list
Dear R users,

Let's say I have a list with components being 'm' matrices (as exemplified in the mylist object below). Now, I'd like to subset this list based on an index vector, which will partition each matrix 'm' in 2 sub-matrices. My questions are:

1. Is there an elegant way to have the results shown in mylist2 for an arbitrary number of matrices in mylist?
2. The column names are 'lost' for mylist2[[2]] and mylist2[[4]] (but not for mylist2[[1]] and mylist2[[3]]). Is there a way to keep the column names in the results of mylist2?

    mylist <- list(matrix(1:9,3,3), matrix(10:18,3,3))
    colnames(mylist[[1]])=c('x1','x2','x3')
    colnames(mylist[[2]])=c('x4','x5','x6')
    index <- list(2)
    index[[1]] <- c(TRUE,FALSE,TRUE)
    index[[2]] <- c(FALSE,TRUE,TRUE)
    mylist2 <- list(as.matrix(mylist[[1]][,index[[1]]]),
                    as.matrix(mylist[[1]][,!index[[1]]]),
                    as.matrix(mylist[[2]][,index[[2]]]),
                    as.matrix(mylist[[2]][,!index[[2]]]))

Thanks for any help,
Lars.
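Both questions can be addressed together, at least as a sketch: Map() pairs each matrix with its index vector for any number of matrices, and drop = FALSE keeps single-column results as matrices so the column names are not lost. The helper name `split_one` is made up for illustration.

```r
mylist <- list(matrix(1:9, 3, 3), matrix(10:18, 3, 3))
colnames(mylist[[1]]) <- c("x1", "x2", "x3")
colnames(mylist[[2]]) <- c("x4", "x5", "x6")
index <- list(c(TRUE, FALSE, TRUE), c(FALSE, TRUE, TRUE))

# Split one matrix into selected and non-selected columns; drop = FALSE
# prevents a single column from collapsing into an unnamed vector
split_one <- function(m, i) list(m[, i, drop = FALSE], m[, !i, drop = FALSE])

# Apply to every (matrix, index) pair, then flatten one level so the
# result is a plain list of matrices in the same order as mylist2
mylist2 <- unlist(Map(split_one, mylist, index), recursive = FALSE)

lapply(mylist2, colnames)
```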
[R] Help in Compiling R for AIX 5.3 servers
Dear List,

I've used R for a while now, but I'm totally unfamiliar with the compiling process. In particular, I'd like to compile R for AIX 5.3 servers. I'd appreciate your guidance on the best place to get started on this.

Thanks in advance,
Lars.
[R] Problem with RODBC sqlSave
Hi,

I'm able to establish a successful odbc connection using RODBC 1.3-2 on Win 7 and R 2.12.0. But I'm getting the following error message when I try to save a data frame into the database as shown below.

    library(RODBC)
    bbdb <- odbcConnect("bbdb")
    odbcGetInfo(bbdb) # returns ok
    sqlSave(bbdb, USArrests, rownames = "state", addPK=TRUE) # example from the RODBC manual

    Error in sqlSave(bbdb, USArrests, rownames = "state", addPK = TRUE) :
      [RODBC] ERROR: Could not SQLExecDirect 'CREATE TABLE "USArrests" ("state" varchar(255) NOT NULL PRIMARY KEY, "Murder" double, "Assault" integer, "UrbanPop" integer, "Rape" double)'

Thanks in advance,
Lars.
[R] Warning with predict.glm method
Dear list,

When I use the predict.glm method on a glm fitted object, I get the following warning message:

    In addition: Warning message:
    In predict.lm(object, x) :
      prediction from a rank-deficient fit may be misleading

As the documentation says, this happens "if the fit is rank-deficient, some of the columns of the design matrix will have been dropped". It is my understanding that this means that at least one predictor variable in the design matrix can be replicated by a linear combination of the other predictors. But this doesn't seem to be the case in my data set.

I noticed that the problem goes away after removing higher order interaction terms from the model. Some of the coefficients from those terms were NA as shown in the summary(glm). I'm wondering if my interpretation of the message is correct, and/or in what type of situations I would get this warning message.

Thanks in advance for your help.
Lars.
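A small simulated illustration of one common way this arises (it is an assumption that the real data behave similarly): when some combination of factor levels never occurs, the corresponding interaction column is all zeros, its coefficient is reported as NA, and the fit is rank-deficient even though no single predictor is redundant. All object names below are made up.

```r
set.seed(1)

# Two factors crossed, then one cell (B, b) removed entirely
d <- expand.grid(f1 = c("A", "B"), f2 = c("a", "b"), rep = 1:5)
d <- d[!(d$f1 == "B" & d$f2 == "b"), ]
d$y <- rnorm(nrow(d))

# Gaussian glm with the interaction included
fit <- glm(y ~ f1 * f2, data = d)

# The interaction coefficient cannot be estimated and is reported as NA
coef(fit)
aliased <- names(coef(fit))[is.na(coef(fit))]
aliased

# The rank of the fit is smaller than the number of design-matrix columns
c(rank = fit$rank, columns = ncol(model.matrix(fit)))
```

Calling predict() on such a fit is what triggers the rank-deficiency warning; dropping the interaction, or collapsing sparse levels, removes the NA coefficients.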
[R] Error Message - effects package
Hi,

I'm running R 2.11.1 on Windows 32-bit. I fitted a glm to a large data set with no problems (i.e., nothing is wrong with the created glm object). However, when I try to use the effects package to plot the effects I get the error message shown below. I get the same message if I try to plot a single effect. Any idea what could be wrong here?

    glm.eff <- allEffects(glm1)
    plot(glm.eff)

    Error in apply(mod.matrix[, components], 1, prod) :
      subscript out of bounds
    In addition: Warning messages:
    1: In `[<-.factor`(`*tmp*`, ri, value = c(1, 1, 1, 1, 1, 1, 1, 1, 1, :
      invalid factor level, NAs generated
    2: In matrix(apply(as.matrix(X.mod[, facs]), 2, mean), nrow = nrow(mod.matrix), :
      data length exceeds size of matrix
    3: In matrix(apply(as.matrix(X.mod[, covs]), 2, typical), nrow = nrow(mod.matrix), :
      data length exceeds size of matrix

Thanks,
Lars.
[R] Sweave for big data analysis
Hi,

Maybe I'm missing the point here, but let's suppose you are working with large data sets and using functions that take a significant amount of time to run in R. I wouldn't like to run these functions every time I call Sweave("myfile.Rnw") within R. What is the common practice for using Sweave in these situations? I would just run the function once, save the results and only load them each time I run Sweave on the .Rnw file. Does that make sense? Sorry, the question seems silly, but I'd appreciate your thoughts.

Thanks,
Lars.
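The run-once-then-load idea described above could be sketched roughly as follows inside a code chunk. The file location `cache_file` and the function `slow_fit` are hypothetical stand-ins for the real cache path and the expensive computation.

```r
# Hypothetical cache location; a real document would use a stable path
cache_file <- file.path(tempdir(), "slow_result.rds")

# Stand-in for a long-running computation
slow_fit <- function() {
  Sys.sleep(0.1)
  lm(dist ~ speed, data = cars)
}

if (file.exists(cache_file)) {
  # Later runs: load the stored result instead of recomputing it
  fit <- readRDS(cache_file)
} else {
  # First run: compute once and store the result
  fit <- slow_fit()
  saveRDS(fit, cache_file)
}

coef(fit)
```

The cache file has to be deleted by hand whenever the data or the model change, otherwise the document keeps showing stale results.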
[R] varclus in Hmisc vs SAS PROC VARCLUS
Hi,

I'd appreciate your guidance on how I can re-create the output from SAS PROC VARCLUS in R. I've found the varclus function in Hmisc. However, is it possible to use that function to compute for each variable the 1-R**2 ratio (this is the ratio of one minus the R-squared with Own Cluster to one minus the R-squared with the Next Closest cluster)?

Thanks in advance for any help,
Lars.
[R] Lasso Logistic Regression - Query
Hi,

Is it possible to include class (factor) inputs in a logistic regression model with the glmnet package? It only seems to allow for numeric inputs. It looks like the penalized package does this? Is there any other package you would recommend for this purpose?

Thanks in advance for any help.
Lars.
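One common approach, sketched here on simulated data, is to expand the factors into dummy columns with model.matrix() and pass the resulting numeric matrix to glmnet. The data frame `d` and its columns are made up for illustration, and the glmnet call only runs if that package is installed.

```r
set.seed(1)
d <- data.frame(
  y   = rbinom(200, 1, 0.5),
  num = rnorm(200),
  grp = factor(sample(c("a", "b", "c"), 200, replace = TRUE))
)

# Expand the factor into dummy columns; drop the intercept column because
# glmnet fits its own intercept
x <- model.matrix(y ~ num + grp, data = d)[, -1]
colnames(x)

if (requireNamespace("glmnet", quietly = TRUE)) {
  # alpha = 1 requests the lasso penalty
  fit <- glmnet::glmnet(x, d$y, family = "binomial", alpha = 1)
}
```

With the default treatment contrasts the penalty treats the reference level differently from the others, so some users prefer one indicator column per level instead.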
[R] GAM Predictions (mgcv)
Hi,

Is there any way I can see how exactly the prediction equation is constructed in the example below? I'd like to be able to replicate the predictions from the fitted gam object as given by the predict.gam method in the package mgcv.

    library(mgcv)
    data(trees)
    ct1 <- gam(Volume ~ s(Height) + s(Girth), family=Gamma(link="log"), data=trees)
    summary(ct1)
    newdata <- cbind(Girth=mean(trees$Girth), Height=mean(trees$Height),
                     Volume=mean(mean(trees$Volume)))
    pred <- predict(ct1, as.data.frame(newdata), type="response")
    pred

Thanks in advance,
Lars.
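A sketch of one way to rebuild the prediction by hand, using the `type = "lpmatrix"` option of predict.gam. It returns the matrix that maps the model coefficients to the linear predictor at the new data; applying the inverse link (exp, for the log link used here) then gives the response-scale prediction. The names `Xp`, `eta`, `by_hand` and `by_predict` are made up for illustration.

```r
library(mgcv)
data(trees)

ct1 <- gam(Volume ~ s(Height) + s(Girth),
           family = Gamma(link = "log"), data = trees)

newdata <- data.frame(Girth = mean(trees$Girth), Height = mean(trees$Height))

# Matrix of basis-function values (plus intercept column) at the new data
Xp <- predict(ct1, newdata, type = "lpmatrix")

# Linear predictor = lpmatrix %*% coefficients; the inverse log link is exp()
eta     <- drop(Xp %*% coef(ct1))
by_hand <- exp(eta)

# Compare with the built-in prediction on the response scale
by_predict <- predict(ct1, newdata, type = "response")
c(by_hand = unname(by_hand), by_predict = as.numeric(by_predict))
```

The columns of `Xp` are the evaluated spline basis functions, so reproducing the prediction entirely outside R would also require reconstructing those bases.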
[R] R 64-bit and Revolution
Dear users,

The company where I work is considering getting a license for Revolution Enterprise - Windows 64-bit. I'd appreciate it if those familiar with the product could share their experiences with it. In particular, how does it compare to the free version of R 64-bit?

Thanks in advance.

Regards,
Lars.
[R] Penalized Gamma GLM
Hi,

I couldn't find a package to fit a penalized (lasso/ridge) Gamma regression model. Does anybody know any?

Thanks in advance,
Lars.
[R] Lattice Wireframe Plot into LaTex
Dear R/LaTeX user,

I'm simply trying to include a plot created with the lattice wireframe function into LaTeX. I have no problems including other R plots into LaTeX by exporting as PostScript and then including the graph in LaTeX using

    \begin{figure}
    % Requires \usepackage{graphicx}
    \includegraphics[width=]{mygraph.eps}\\
    \caption{}\label{}
    \end{figure}

However, for some reason when I do this with a wireframe plot I can see the plot area in LaTeX but not the plot itself. I've searched for this exhaustively on the web but couldn't find an answer. I'd appreciate your help.

Thanks,
Lars.
[R] Gradient Boosting Trees with correlated predictors in gbm
Dear R users,

I'm trying to understand how correlated predictors impact the relative importance measure in stochastic boosting trees (J. Friedman). As Friedman described for single decision trees (referring to Breiman's CART algorithm), the relative importance measure is augmented by a strategy involving surrogate splits intended to uncover the masking of influential variables by others highly associated with them. This strategy is most helpful with single decision trees, where the opportunity for variables to participate in splitting is limited by the size of the tree. In the context of boosting, however, the number of splitting opportunities is vastly increased, and surrogate unmasking is less essential.

Based on the results from the simulated example below, if I have, say, two variables which are highly correlated, then the relative importance measure derived from boosting will tend to be high for one of the predictors and low for the other. I'm trying to reconcile this observation with Friedman's description above, according to which, as I understand it, these two variables should have about the same measure of importance. I'd appreciate your comments.

    require(gbm)
    require(MASS)
    # Generate multivariate random data such that X1 is moderately correlated with X2,
    # strongly correlated with X3, and not correlated with X4 or X5.
    cov.m <- matrix(c(1,0.5,0.9,0,0,0.5,1,0.2,0,0,0.9,0.2,1,0,0,0,0,0,1,0,0,0,0,0,1), 5, 5, byrow=TRUE)
    n <- 2000 # obs
    X <- mvrnorm(n, rep(0, 5), cov.m)
    Y <- apply(X, 1, sum)
    SNR <- 10 # signal-to-noise ratio
    sigma <- sqrt(var(Y)/SNR)
    Y <- Y + rnorm(n, 0, sigma)
    mydata <- data.frame(X, Y)
    # Fit model (should take less than 20 seconds on an average modern computer)
    gbm1 <- gbm(formula = Y ~ X1 + X2 + X3 + X4 + X5, data=mydata, distribution = "gaussian", n.trees = 500, interaction.depth = 2, n.minobsinnode = 10, shrinkage = 0.1, bag.fraction = 0.5, train.fraction = 1, cv.folds=5, keep.data = TRUE, verbose = TRUE)
    ## Plot variable influence
    best.iter <- gbm.perf(gbm1, plot.it = TRUE, method="cv")
    print(best.iter)
    summary(gbm1, n.trees=best.iter) # based on the estimated best number of trees
[R] R Graphics into Latex
Hi,

I'm new to LaTeX and I'm trying to include an R chart in a LaTeX document. This is what I'm doing:

1) In R: save the chart as PostScript in a folder, C:/xxx/Density.eps

2) In LaTeX (using TeXworks on Windows XP), starting from the preamble:

    \documentclass[11pt]{article}
    \usepackage{graphicx}
    \begin{document}
    blah..blah blah
    \begin{figure}
    \centering
    \includegraphics{C:/xxx/Density.eps}
    \label{fig:Density}
    \end{figure}

This is the error message I'm getting:

    LaTeX Warning: File `R:/MarsTH/Studies/Misc/LIA QA/R/Density.eps' not found on input line 26.
    ! LaTeX Error: Unknown graphics extension: .eps.
    See the LaTeX manual or LaTeX Companion for explanation.
    Type H <return> for immediate help.

I'd appreciate your help.

Thanks in advance,
Lars.
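The "Unknown graphics extension: .eps" error is typical of compiling with pdfLaTeX, which reads PDF, PNG and JPEG figures but not EPS. Assuming that is the engine in use here, one workaround is to write the chart from R as a PDF instead. The data, object names and file location in this sketch are placeholders, not taken from the post.

```r
# Hypothetical data and file name, used only for illustration
set.seed(1)
x <- rnorm(200)
out_file <- file.path(tempdir(), "Density.pdf")

pdf(out_file, width = 6, height = 4)   # vector PDF device, size in inches
plot(density(x), main = "Density")
dev.off()                              # close the device so the file is complete

file.exists(out_file)
```

On the LaTeX side the figure can then be included with \includegraphics{Density}, leaving the extension off so that graphicx picks the format the engine supports.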
[R] R on Large Data Sets (again)
Dear R users,

I've searched the R site for help on this topic, but it is hard to find a precise answer to my questions. What are the best options for overcoming RAM limitations when using R on large data sets (such as 2 or 3 million records)?

- Is the freely available version of R (as opposed to the one provided by REvolution Computing) compatible with a 64-bit Windows machine? And if I increase the RAM enough on win-64, would this virtually solve my memory limitation problems?
- Is a Unix-like platform a better option than win-64? Again, would this solve my memory limitation problems?
- Is there any better option?

Thanks in advance for your help,
Lars.
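A quick way to judge whether RAM is the bottleneck is to estimate the in-memory size of the data before loading it. The sketch below assumes 3 million records and a hypothetical 40 numeric columns; the column count and object names are illustrative only.

```r
n_rows <- 3e6   # upper end of the record count mentioned above
n_cols <- 40    # hypothetical number of numeric columns

# A double takes 8 bytes, so one numeric column of n_rows values needs about 24 MB
one_column <- numeric(n_rows)
bytes_per_column <- as.numeric(object.size(one_column))

# Rough size of the whole table if every column is numeric
estimated_gb <- bytes_per_column * n_cols / 1024^3
round(estimated_gb, 2)
```

Under these assumptions the raw data come to a little under 1 GB. Model-fitting functions often make several working copies, so the practical requirement is usually a multiple of that figure.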
[R] Design Package - Penalized Logistic Reg. - Query
Dear R experts,

The lrm function in the Design package can perform penalized (ridge) logistic regression. It is my understanding that ridge solutions are not equivalent under scaling of the inputs, so one normally standardizes the inputs. Do you know whether input standardization is done internally in lrm, or whether I have to do it before applying this function?

Also, as I'm new to R (coming from SAS), I don't know how well R will handle relatively large data sets (e.g. half a million observations on 40 variables). I'd appreciate your comments.

Many thanks in advance.
Lars.
[R] Best Subset Selection - Leaps Package
Dear Experts,

I'm a new R user and I'd appreciate your help regarding the following. I'm trying to run an exhaustive search over all candidate models in a linear regression and select the one with the lowest CV error (or, alternatively, the lowest error on a test set, if I have lots of data). The leaps package can run this exhaustive search, but all models are evaluated on the training data (without cross-validation). How can I implement what I'm trying to achieve? Any guidance will help.

    library(ElemStatLearn)
    # Follow the example on page 58 of the Elements of Statistical Learning book
    train <- subset(prostate, train==TRUE)[,1:9]
    test <- subset(prostate, train==FALSE)[,1:9]
    # Best subset selection
    library(leaps)
    prostate.leaps <- regsubsets(lpsa ~ ., data=train, nbest=100, nvmax=8, method="exhaustive", really.big=TRUE)
    prostate.leaps.sum <- summary(prostate.leaps)
    prostate.models <- prostate.leaps.sum$which
    prostate.models
    prostate.models.rss <- prostate.leaps.sum$rss
    prostate.models.rss
    prostate.models.size <- as.numeric(attr(prostate.models, "dimnames")[[1]])
    prostate.models.best.rss <- tapply(prostate.models.rss, prostate.models.size, min)
    prostate.models.best.rss

Thanks a lot!
Lars.
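For a small number of predictors, the search described above can be written directly in base R by enumerating every subset with combn() and scoring each one by k-fold cross-validation. This is only a sketch on simulated stand-in data, not the prostate data from the post; the names dat, fold, cv_mse and best are illustrative.

```r
set.seed(42)

# Simulated stand-in data (hypothetical): 5 candidate predictors, 2 of them relevant
n <- 120
dat <- data.frame(matrix(rnorm(n * 5), n, 5))
names(dat) <- paste0("x", 1:5)
dat$y <- 2 * dat$x1 - 1.5 * dat$x3 + rnorm(n)

predictors <- setdiff(names(dat), "y")
k <- 5
fold <- sample(rep(seq_len(k), length.out = n))   # random fold assignment

# Every non-empty subset of the predictors
subsets <- unlist(
  lapply(seq_along(predictors), function(m) combn(predictors, m, simplify = FALSE)),
  recursive = FALSE
)

# k-fold cross-validated mean squared error for one subset
cv_mse <- function(vars) {
  form <- reformulate(vars, response = "y")
  sq_err <- numeric(n)
  for (j in seq_len(k)) {
    fit <- lm(form, data = dat[fold != j, ])
    held_out <- dat[fold == j, ]
    sq_err[fold == j] <- (held_out$y - predict(fit, newdata = held_out))^2
  }
  mean(sq_err)
}

cv_errors <- vapply(subsets, cv_mse, numeric(1))
best <- subsets[[which.min(cv_errors)]]
best
```

The number of subsets doubles with every added predictor, so this brute-force loop is only practical for a handful of variables. The CV error of the winning subset is also optimistic as an estimate of future error, because the same folds were used to choose it.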
[R] Help with GLM please!
Dear experts,

I have a few quick questions related to GLMs:

1) Suppose my response is of the type Yes/No. How can I control which response I'm modelling?

2) How can I perform Type III tests? Is it with drop1(mymodel, test="Chisq")?

3) I have a numerical variable which I converted to an ordered factor as shown below, but in the summary of the logistic regression I get labels for this variable which are not the ones I specified (I don't even know where they are coming from).

    set.seed(12345)
    dv <- gl(n=2, k=1, length=20, labels=c("Yes", "No"))
    x1 <- rnorm(n=20, mean=100, sd=10)
    x2 <- gl(n=3, k=1, length=20, labels=c("x2_1", "x2_2", "x2_3"))
    mydata <- data.frame(dv, x1, x2)
    mydata$f.x1 <- cut(x1, breaks=c(80,90,100,120))
    mydata$f.x1 <- ordered(mydata$f.x1, labels=c("L1", "L2", "L3"), levels(mydata$f.x1)) # Specify desired labels
    mydata$f.x1
    attach(mydata)
    mymodel <- glm(dv ~ f.x1 + x2, family=binomial)
    summary(mymodel)
    drop1(mymodel, test="Chisq") # Type 3 test?

Lastly,

4) How can I test for statistically significant differences between levels of a factor?

Thanks a lot for your help!
Lars.
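Questions 1 and 3 can be explored with a few lines of base R. With a factor response, glm() treats the first level as failure and models the remaining level, so reordering the levels switches the modelled outcome. Ordered factors are coded with polynomial contrasts by default, which is where suffixes such as .L and .Q come from. The sketch reuses the simulated dv and x1 from the post; the cut points and the names dv_yes, fit_no, fit_yes and f_ord are illustrative.

```r
set.seed(12345)
dv <- gl(n = 2, k = 1, length = 20, labels = c("Yes", "No"))
x1 <- rnorm(20, mean = 100, sd = 10)

# glm() treats the first factor level as "failure" and models the other level,
# so here the fitted probability is P(dv == "No")
levels(dv)

# Put "No" first so that "Yes" becomes the modelled outcome
dv_yes <- relevel(dv, ref = "No")

fit_no  <- glm(dv ~ x1, family = binomial)
fit_yes <- glm(dv_yes ~ x1, family = binomial)
# Same fit with the sign of every coefficient flipped
all.equal(unname(coef(fit_no)), -unname(coef(fit_yes)), tolerance = 1e-6)

# Ordered factors use polynomial contrasts, hence the .L / .Q suffixes
f_ord <- cut(x1, breaks = c(-Inf, 95, 105, Inf), labels = c("L1", "L2", "L3"),
             ordered_result = TRUE)
colnames(contrasts(f_ord))

# The same factor without the ordering gets treatment contrasts named after levels
colnames(contrasts(factor(f_ord, ordered = FALSE)))
```

If level-by-level coefficients are wanted in the summary, keeping the binned variable as an unordered factor is the simplest route.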
[R] Help with Loop!
Dear experts,

I'm new to R and trying to learn by writing a version of the Perceptron algorithm. In the code below, how can I stop the iteration once the condition inside the for loop is no longer satisfied by any training example?

Thanks in advance for your help!

    ## Generate a linearly separable data set in R2
    sample <- as.data.frame(cbind(runif(n=100), runif(n=100)))
    sample$Y <- ifelse(sample$V1 < sample$V2, 1, -1)
    ## Plot data
    attach(sample)
    plot(V1, V2, type="p", pch=Y, main="Sample Data")
    ## Perceptron algorithm
    sample_m <- as.matrix(sample)
    w <- c(0,0); b <- 0; k <- 0; nu <- 1e-3
    R <- max(sqrt(V1^2+V2^2))
    repeat {
      for (i in 1:nrow(sample_m)) {
        if (sample_m[i,3]*(t(w)%*%sample_m[i,1:2] + b) <= 0) {
          w <- w + nu*sample_m[i,3]*sample_m[i,1:2]
          b <- b + nu*sample_m[i,3]*R^2
          k <- k+1
          cat(k, w, b, "\n")
        }
      }
    }
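A common stopping rule is to count the updates made during each full pass over the data and leave the loop once a pass makes none. The sketch below follows the update rule from the post but adds two assumptions of its own: points very close to the separating line are dropped so that the data have a clear margin, and a cap on the number of passes guards against non-separable input.

```r
set.seed(1)

# Linearly separable data in R^2, keeping only points with a clear margin
# (the 0.05 margin filter is an added assumption, not part of the original post)
V1 <- runif(100); V2 <- runif(100)
keep <- abs(V1 - V2) > 0.05
X <- cbind(V1 = V1[keep], V2 = V2[keep])
y <- ifelse(X[, "V1"] < X[, "V2"], 1, -1)

w <- c(0, 0); b <- 0
nu <- 1e-3                     # learning rate, as in the post
R <- max(sqrt(rowSums(X^2)))   # radius of the data
max_epochs <- 10000            # safety cap (an added assumption)

for (epoch in seq_len(max_epochs)) {
  mistakes <- 0
  for (i in seq_len(nrow(X))) {
    if (y[i] * (sum(w * X[i, ]) + b) <= 0) {   # misclassified or on the boundary
      w <- w + nu * y[i] * X[i, ]
      b <- b + nu * y[i] * R^2
      mistakes <- mistakes + 1
    }
  }
  if (mistakes == 0) break     # a full pass with no updates: converged
}

all(y * drop(X %*% w + b) > 0)
```

The same idea works inside a repeat loop: reset a counter before the for loop and call break when it is still zero afterwards.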
[R] nnet function - help!
Dear experts,

I'm new to R. I'd like to know whether I need to standardize the input variables before using the nnet function, or whether the function standardizes them internally. In the first case, is there a fast way to standardize continuous and categorical inputs?

Thanks,
Lars.
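Assuming the inputs do need to be prepared by hand, base R covers both cases: scale() centres and scales numeric columns, and model.matrix() expands factors into 0/1 indicator columns. The data frame and its column names below are hypothetical.

```r
# Hypothetical data frame with one continuous and one categorical input
df <- data.frame(
  income = c(32, 45, 51, 28, 60, 39),
  region = factor(c("north", "south", "east", "south", "north", "east"))
)

is_num <- vapply(df, is.numeric, logical(1))

# Centre and scale numeric columns to mean 0 and standard deviation 1
df[is_num] <- lapply(df[is_num], function(col) as.numeric(scale(col)))

# Expand factors into 0/1 indicator columns (dropping the intercept column)
X <- model.matrix(~ ., data = df)[, -1, drop = FALSE]
colnames(X)
```

When scoring new data, the means and standard deviations from the training set should be reused rather than recomputed.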
[R] Error using the Rdonlp2 Package
Dear experts,

I'm attempting to solve a constrained optimization problem using the Rdonlp2 package. I created a Lagrange function (L=f(x)-lambda(g(x)-c)), where x is a vector of 16 parameters. This is what I'm using as the objective function in the code below. In addition, I set bounds on these parameters (par.u and par.l). When I run the code, I get the error message shown below. Any idea why, or what it means?

Thanks in advance for your help!

    ans <- donlp2(par=rate0, fn=L, par.u=par.u, par.l=par.l)
    1 fx= 2.869234e+08 upsi= 0.0e+00 b2n= 5.0e+08 umi= 0.0e+00 nr 0 si-1
    2 fx= -2.111002e+09 upsi= 0.0e+00 b2n= 5.1e+08 umi= 0.0e+00 nr 14 si-1
    3 fx= -5.961701e+09 upsi= 0.0e+00 b2n= 4.1e+08 umi= 0.0e+00 nr 16 si-1

Error message:

    Error in tryCatchList(expr, classes, parentenv, handlers) : SET_VECTOR_ELT() can only be applied to a 'list', not a 'character'
[R] Package for Clustering - Query
Dear R users,

Is there any package for Latent Class Analysis (to be used in a clustering application) which supports mixed indicator variables (categorical and continuous)? Alternatively, is there any other clustering algorithm available that supports this type of data?

Thanks in advance for your help.
Regards,
Lars.
[R] Neural Networks in R - Query
Dear R users,

I'd like to ask for your guidance regarding the following two questions:

(i) I just finished reading Chris Bishop's book Neural Networks for Pattern Recognition. Although the book gave me a good theoretical foundation in NNs, I'm now looking for something more practical regarding architecture selection strategies. Is there any good reference on best practices for architecture selection?

(ii) Which R package provides a good implementation of NNs?

Many thanks in advance for your help!
Lars.
[R] Offset Variable in Statistical Models - Query
Hi,

I'd appreciate your thoughts regarding the following. I'm working with an insurance data set with the objective of predicting a binary outcome (claim or not). Policyholders in the sample are not observed for equivalent time periods. I have an exposure variable that reflects the amount of time each individual has been observed. In traditional GLMs, the usual way to handle this is to use the exposure as an offset variable (i.e. the coefficient for this variable is fixed at 1).

I would like to extend the class of models I can use on this data to more recent techniques such as Support Vector Machines, Neural Nets, etc. My question is how I can include the exposure information in such a model in the way I do with a GLM. I'm especially interested in SVMs in R. Any thoughts are much appreciated.

Regards,
Lars.
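For reference, the GLM baseline described above can be sketched as follows. The example assumes that claims arrive at a constant rate per unit of exposure, so that the probability of at least one claim is 1 - exp(-rate * exposure). Under that assumption a complementary log-log link with log(exposure) as offset recovers the rate model. All data are simulated and the variable names are hypothetical.

```r
set.seed(7)

# Simulated policy data (hypothetical): exposure is the observed fraction of a year
n <- 5000
exposure <- runif(n, 0.1, 1)
age_group <- rbinom(n, 1, 0.4)
rate <- exp(-2 + 0.5 * age_group)                  # claim rate per unit of exposure
claim <- rbinom(n, 1, 1 - exp(-rate * exposure))   # at least one claim while observed

# With a complementary log-log link, log(exposure) enters with coefficient fixed at 1
fit <- glm(claim ~ age_group + offset(log(exposure)),
           family = binomial(link = "cloglog"))
coef(fit)
```

The estimates should land near the simulated values of -2 and 0.5. This covers only the GLM side of the question; it does not show how to carry the offset over to an SVM.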
[R] Non-Linear Optimization - Query
Dear All,

A couple of weeks ago, I asked for a package recommendation for nonlinear optimization. In my problem I have a fairly complicated non-linear objective function subject to one non-linear equality constraint. It was suggested that I use the *Rdonlp2* package, but I did not get any results after running the program for 5 hours. Is it normal for this type of program to run for hours?

Also, I'd like to ask the experts whether there is any alternative I could use to solve this. For example, can I define a Lagrange function (adding lambda as a parameter) and use optim() or another optimization function?

Many thanks in advance for your help.
Lars.
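One caveat on the Lagrange idea: the constrained optimum is generally a saddle point of the Lagrangian in (x, lambda), not a minimum, so handing L directly to optim() tends not to work. A simple alternative that does fit optim() is a quadratic penalty on the constraint, tightened in stages. The toy objective, constraint and penalty schedule below are assumptions made for illustration and are unrelated to the 16-parameter problem.

```r
# Toy problem (hypothetical): minimise f subject to the nonlinear equality g(x) = 1
f <- function(x) (x[1] - 2)^2 + (x[2] - 1)^2
g <- function(x) x[1]^2 + x[2]^2

# Quadratic penalty and its analytic gradient
penalised <- function(x, mu) f(x) + mu * (g(x) - 1)^2
penalised_grad <- function(x, mu) 2 * (x - c(2, 1)) + 4 * mu * (g(x) - 1) * x

x <- c(0.5, 0.5)                       # starting point
for (mu in 10^(0:4)) {                 # tighten the penalty gradually
  # Warm start each stage from the previous solution
  x <- optim(x, penalised, gr = penalised_grad, mu = mu, method = "BFGS")$par
}

x      # close to c(2, 1) / sqrt(5), the point of the unit circle nearest (2, 1)
g(x)   # close to 1
```

The constraint is only satisfied approximately, with the violation shrinking as mu grows. Very large mu makes the problem ill-conditioned, which is why the penalty is raised in steps.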
[R] Help with Function!
Dear All,

I need to write 'n' functions of 'm' variables. The functions should be constructed according to the values of an (n x m) matrix of 1/0 values, as follows. For example:

    if row 1 is equal to, say, [1 0 ... 0 0]       then f1 <- (1+x1)
    if row 2 is equal to, say, [1 1 1 0 ... 0 1]   then f2 <- (1+x1)*(1+x2)*(1+x3)*(1+xm)
    if row n is equal to       [0 1 0 1 1 0 ... 0] then fn <- (1+x2)*(1+x4)*(1+x5)

Is there an efficient way to write these functions? Note that each f(x) is a function of all m variables, regardless of whether the function explicitly includes a given variable or not.

Many thanks for your help!
Regards,
Lars.
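Because every function is a product of (1 + x_j) over the columns flagged with 1, the functions can be generated from the rows of the matrix instead of being written by hand. This sketch uses a small hypothetical 3 x 4 matrix; M, make_f and f_list are illustrative names.

```r
# Hypothetical 3 x 4 indicator matrix: row i lists the variables used by f_i
M <- rbind(c(1, 0, 0, 0),
           c(1, 1, 1, 1),
           c(0, 1, 0, 1))

# Build one closure per row; each takes the full vector x of length m
make_f <- function(row) {
  force(row)   # fix the row now, not when the function is first called
  function(x) prod((1 + x)[row == 1])
}
f_list <- lapply(seq_len(nrow(M)), function(i) make_f(M[i, ]))

x <- c(1, 2, 3, 4)
sapply(f_list, function(f) f(x))   # 2 120 15
```

If only the values are needed and not the function objects, apply(M, 1, function(r) prod((1 + x)^r)) computes all n results in one call.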
[R] Matrices in R - Simple question?
Hi,

I'm a new R user and would appreciate your help regarding the following: can I create a matrix whose elements are n functions of a vector x? In my problem I have 3 vectors (a, b, c) with elements

    a = [f1(x) f2(x) ... fn(x)]
    b = [g1(x) g2(x) ... gn(x)]
    c = [h1(x) h2(x) ... hn(x)]

I need to create a final function that looks like

    f(x) = f1(x)*g1(x)*h1(x) + ... + fn(x)*gn(x)*hn(x)

(f(x) is the objective function of an optimization problem). Is it possible to create f(x) in the way I described?

Many thanks in advance for your help!
Lars.
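Functions are ordinary objects in R, and a list is the natural container for them. A minimal sketch, with made-up component functions and n = 2, evaluates each list at the same x and sums the element-wise products:

```r
# Hypothetical component functions of a vector x, n = 2 of each
f_list <- list(function(x) sum(x),  function(x) x[1]^2)
g_list <- list(function(x) prod(x), function(x) x[2] + 1)
h_list <- list(function(x) max(x),  function(x) exp(-x[1]))

# Objective: f1*g1*h1 + ... + fn*gn*hn, all evaluated at the same x
objective <- function(x) {
  eval_all <- function(fs) vapply(fs, function(fn) fn(x), numeric(1))
  sum(eval_all(f_list) * eval_all(g_list) * eval_all(h_list))
}

objective(c(1, 2))
```

The resulting objective takes a single numeric vector, so it can be passed as the fn argument of optim() like any other function.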
[R] NonLinear Programming in R - QUERY
Hi All,

I'd appreciate your help on this. Do you know of any package that can be used to solve optimization problems subject to general *non-linear* equality constraints?

Thanks!
Lars.
[R] EM for Density Estimation
Hi,

Which package would you recommend for an implementation of EM for density estimation (e.g. a mixture of Gaussians)?

Thanks!
Lars.
[R] General query regarding scoring new observations
Hi,

I was wondering if I could get some advice on the following problem. Let's say that I want to predict a binary outcome and I use logistic regression for that purpose. In addition, suppose that my model includes predictors that will not be available when scoring new observations but must be used during model training to absorb certain effects that could otherwise bias the parameter estimates of the other variables.

Because one needs the same predictors in model development and scoring, how is this problem usually handled in practice? I could exclude the variables that will not be available during scoring, but that would bias the estimates for the other variables.

Many thanks for your help.
Lars.