[R] plotly: ability to drag points on x axis only and prevent change of y axis value
Hello,

I wonder if you could provide input on the following. Please see the toy example below: I wanted to see if there is a way to restrict how the points are dragged on the plot. More specifically, I would like the points to be draggable horizontally ONLY and have their y axis value remain fixed, i.e. vertical movement would be restricted and no change allowed in that direction for any point.

library(plotly)
library(purrr)

# creates a list of 32 circle shapes (one for each row/car)
circles <- map2(
  mtcars$mpg,
  mtcars$wt,
  ~list(
    type = "circle",
    # anchor circles at (mpg, wt)
    xanchor = .x,
    yanchor = .y,
    # give each circle a 10 pixel diameter
    x0 = -5, x1 = 5,
    y0 = -5, y1 = 5,
    xsizemode = "pixel",
    ysizemode = "pixel",
    # other visual properties
    fillcolor = "blue",
    line = list(color = "transparent")
  )
)

plot_ly() %>%
  layout(shapes = circles) %>%
  config(edits = list(shapePosition = TRUE))

Much appreciate any input you may have.

Thanks,
Andras

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] (almost) rolling or fill function?
Thanks Bert, here it is in plain text.

Hello,

please see if you have a thought on how to achieve the following. We have:

df<-data.frame(a=Sys.Date()+1:10,
               b=Sys.Date()+c(NA,NA,NA,rep(3,4),NA,NA,3),
               c=Sys.Date()+c(NA,NA,NA,rep(9,4),NA,NA,9))

The idea I have difficulty wrapping my head around is the following. I need the system to look at df$a by row (let's call it the index row), look at df$b and df$c one row before the given row in df$a (let's call it index row -1), and evaluate whether the index row value in df$a falls into the range (>= and <=) of the index row -1 values in df$b and df$c. If it does, copy the index row -1 values in df$b and df$c into the index row of df$b and df$c; if not, place an NA in both cells of the index row in df$b and df$c.

Examples:

1. The date value in df$a[8] is between df$b[7] and df$c[7], so we can copy the values in df$b[7] and df$c[7] into df$b[8] and df$c[8].
2. The date value in df$a[9] is between df$b[8] and df$c[8] (as we copied them in step 1), so we can copy the values in df$b[8] and df$c[8] into df$b[9] and df$c[9].
3. The date value in df$a[10] is NOT between df$b[9] and df$c[9] (as we copied them in step 2), so we place NA in df$b[10] and df$c[10].

I would also like to do this going up, similar to fill(...,"downup"). In the end we would want to have this:

dfwanted<-data.frame(a=Sys.Date()+1:10,
                     b=Sys.Date()+c(NA,NA,rep(3,7),NA),
                     c=Sys.Date()+c(NA,NA,rep(9,7),NA))

Much appreciate any help you could provide.

Thanks,
Andras
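A minimal sketch of one way to read the rule described above, not a tested general solution. It assumes that a row is only rewritten when the neighbouring row already holds a non-NA range (otherwise rows 4 and 10 of the example would not come out as in dfwanted). It also uses a fixed start date d0 in place of Sys.Date() so the result is reproducible. The helper name fill_in_range is hypothetical.

```r
# Fixed reference date instead of Sys.Date(), so the example is reproducible
d0 <- as.Date("2020-01-01")

df <- data.frame(a = d0 + 1:10,
                 b = d0 + c(NA, NA, NA, rep(3, 4), NA, NA, 3),
                 c = d0 + c(NA, NA, NA, rep(9, 4), NA, NA, 9))

# Walk through the rows in the order given by idx; each row is compared
# with the row visited just before it (hypothetical helper).
fill_in_range <- function(df, idx) {
  for (k in seq_along(idx)[-1]) {
    i <- idx[k]       # index row
    j <- idx[k - 1]   # neighbouring row, already processed
    # Assumption: leave the index row alone if the neighbour has no range
    if (is.na(df$b[j]) || is.na(df$c[j])) next
    if (df$a[i] >= df$b[j] && df$a[i] <= df$c[j]) {
      df$b[i] <- df$b[j]
      df$c[i] <- df$c[j]
    } else {
      df$b[i] <- NA
      df$c[i] <- NA
    }
  }
  df
}

n <- nrow(df)
res <- fill_in_range(df, seq_len(n))         # "down" pass
res <- fill_in_range(res, rev(seq_len(n)))   # "up" pass

dfwanted <- data.frame(a = d0 + 1:10,
                       b = d0 + c(NA, NA, rep(3, 7), NA),
                       c = d0 + c(NA, NA, rep(9, 7), NA))
all.equal(res, dfwanted)
```

An explicit loop is used because each row depends on the result already written to its neighbour, which a plain vectorised comparison does not capture.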
[R] MuMIn package with gamlss error
Hello,

could you please provide your thoughts on what I may be missing? gamlss models are supposedly supported by MuMIn, yet this one fails:

library(MuMIn)
library(gamlss)

#this lm runs
linearMod <- lm(Sepal.Length ~ ., data=iris)
options(na.action = "na.fail")
res <-dredge(linearMod,beta = T, evaluate = T)
confset.95p<-get.models(res, subset = cumsum(weight) <= .95)
avgm <- model.avg(confset.95p)
predict(avgm,se.fit = TRUE, type="response")

#this gamlss fails on dredge(), and writing out the formula does not solve the initial error...
gamlssMod <- gamlss(Sepal.Length ~ ., data=iris)
res <-dredge(gamlssMod,beta = T, evaluate = T)
confset.95p<-get.models(res, subset = cumsum(weight) <= .95)
avgm <- model.avg(confset.95p)
predict(avgm,se.fit = TRUE, type="response")
options(na.action = "na.omit")

Appreciate any thoughts you may have,

Andras
[R] data frame solution
Hello All,

I wonder if you have thoughts on a clever solution for this code:

df <- data.frame(a = c(6,1), b = c(1000,1200), c =c(-1,3))
#the caveat here is that the number of rows for df can be anything from 1 row to in the hundreds. I kept it to 2 to have minimal reproducible
t<-seq(-5,24,0.1) #min(t) will always be <=df$c[1], which is the value that is always going to equal to min(df$c)
times1 <- c(rbind(df$c[1],df$c[1]+df$a[1]),max(t)) #length of times1 will always be 3, see times2 is of length 4
input1 <- c(rbind(df$b[1]/df$a[1],rep(0,length(df$b[1]))),0) #length of input1 will always be 3, see input2 is of length 4
out1 <-data.frame(t,ifelse(t>=times1[1]=times1[2]=times2[1]=times2[2]=times2[3]
Re: [R] reduce and intersect question (maybe)?
Works well! Thanks!

Andras

On Thursday, February 21, 2019, 8:47:51 AM EST, Jeff Newmiller wrote:

Use ?merge instead of intersect.

--
Sent from my phone. Please excuse my brevity.
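A rough sketch of how the merge suggestion could be applied across several data frames at once. The names dfa to dfd, day and common_rows are hypothetical stand-ins, and Date columns are used instead of POSIXct to keep the example short. Treat it as one possible reading of the advice, not as the method Jeff had in mind.

```r
set.seed(1)
day <- function(from, to) seq(as.Date(from), as.Date(to), by = 1)

# Stand-ins for the four data frames in the question
dfa <- data.frame(ID = 1:7, date = day("2011-01-01", "2011-01-07"),
                  z = rnorm(7, 1, 1))
dfb <- data.frame(ID = c(1, 2, 3, 11:15),
                  date = day("2011-01-01", "2011-01-08"),
                  z = rnorm(8, 1, 1))
dfc <- data.frame(ID = 1:10,
                  date = c(day("2011-01-01", "2011-01-05"),
                           day("2011-01-11", "2011-01-15")),
                  z = rnorm(10, 1, 1))
dfd <- data.frame(ID = c(1, 2, 3, 21:28),
                  date = c(as.Date(c("2011-01-01", "2011-11-01", "2011-01-03")),
                           day("2011-01-01", "2011-01-08")),
                  z = rnorm(11, 1, 1))

# Keep only the key columns, then merge them pairwise with Reduce();
# what survives is the set of (ID, date) pairs present in every data frame.
common_rows <- function(dfs, by = c("ID", "date")) {
  keys <- lapply(dfs, function(x) unique(x[by]))
  Reduce(function(x, y) merge(x, y, by = by), keys)
}

dfs <- list(a = dfa, b = dfb, c = dfc, d = dfd)
common <- common_rows(dfs)

# Subset each original data frame to the common (ID, date) pairs
subsets <- lapply(dfs, function(x) merge(x, common, by = c("ID", "date")))
common
```

Merging only the key columns keeps z out of the comparison, so each data frame retains its own z values in the final subsets.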
[R] reduce and intersect question (maybe)?
Hello All,

I wonder if you have a suggestion for the following. We have:

a<-data.frame(ID=c(1,2,3,4,5,6,7),
              date=as.POSIXct(seq(as.Date('2011-01-01'),as.Date('2011-01-07'),by = 1),format='%m/%d/%Y %H:%M'),
              z=rnorm(7,1,1))
b<-data.frame(ID=c(1,2,3,11,12,13,14,15),
              date=as.POSIXct(seq(as.Date('2011-01-01'),as.Date('2011-01-08'),by = 1),format='%m/%d/%Y %H:%M'),
              z=rnorm(8,1,1))
c<-data.frame(ID=c(1,2,3,4,5,6,7,8,9,10),
              date=as.POSIXct(c(seq(as.Date('2011-01-01'),as.Date('2011-01-05'),by = 1),seq(as.Date('2011-01-11'),as.Date('2011-01-15'),by = 1)),format='%m/%d/%Y %H:%M'),
              z=rnorm(10,1,1))
d<-data.frame(ID=c(1,2,3,21,22,23,24,25,26,27,28),
              date=as.POSIXct(c(as.Date('2011-01-01'),as.Date('2011-11-01'),as.Date('2011-01-03'),seq(as.Date('2011-01-01'),as.Date('2011-01-08'),by = 1)),format='%m/%d/%Y %H:%M'),
              z=rnorm(11,1,1))

#this function will do the obvious and give the IDs that are in all of the data frames based on the ID column
intersect_all <- function(a,b,...){
  Reduce(intersect, list(a,b,...))
}

intersect_all(a$ID,b$ID,c$ID,d$ID)

I would like to extend this (or use another function) so that the function gives all the rows (i.e. using both the ID and date columns as the condition) that are in all of the data frames. The result should be as below, since these 2 rows are in all of the data frames. (That the common rows are rows 1 and 3 of every data frame is something I set up only for convenience; in reality their row numbers may differ between data frames.) The value of z is of no particular importance, but once the common rows are identified I would want to subset the data frames to get these results:

a[c(1,3),]
b[c(1,3),]
c[c(1,3),]
d[c(1,3),]

Much appreciate your input,

thanks

Andras
Re: [R] list with list function
Thanks Rui and Ivan, works perfectly...

Andras

On Monday, February 4, 2019, 4:18:39 PM EST, Rui Barradas wrote:

Hello,

Like this?

Map('[', listA, lapply(listB, '*', -1))

Hope this helps,

Rui Barradas
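For readers who find the one-liner terse, here is a spelled-out sketch of the same idea: Map() walks listA and listB in parallel and negative indices drop positions. The helper name drop_positions and the guard for an empty index vector are additions of mine, not part of Rui's answer. The guard is there because x[-integer(0)] returns an empty vector and not x.

```r
set.seed(42)
days <- seq(as.Date("1999/01/01"), as.Date("2000/01/01"), by = "day")

listA <- list(sample(days, 5), sample(days, 4), sample(days, 3))
listB <- list(c(1, 3, 5), c(1, 4), c(1, 2))

# Drop from x the positions listed in i (hypothetical helper);
# an empty i leaves x untouched.
drop_positions <- function(x, i) if (length(i)) x[-i] else x

listfinal <- Map(drop_positions, listA, listB)
lengths(listfinal)
```

Since both lists are columns of a data frame in the real use case, the same call should work with df$colA and df$colB in place of listA and listB, assuming they are list columns of equal length.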
[R] list with list function
Hello everyone,

I wonder if you would have a thought on a function for the following. We have:

a<-sample(seq(as.Date('1999/01/01'), as.Date('2000/01/01'), by="day"),5)
b<-sample(seq(as.Date('1999/01/01'), as.Date('2000/01/01'), by="day"), 4)
c<-sample(seq(as.Date('1999/01/01'), as.Date('2000/01/01'), by="day"), 3)

d<-c(1,3,5)
e<-c(1,4)
f<-c(1,2)

listA<-list(a,b,c)
listB<-list(d,e,f)

What I would like to do with a function, as opposed to manually, is to derive the following answer (my real listA and listB can be of any length but are always of equal length, while their components like a, b and c can be of unequal length):

listfinal<-list(a[-d],b[-e],c[-f])
listfinal

Essentially, the elements in listB identify the positions in the corresponding list element of listA that should be removed from listA.

In practice these lists listA and listB are columns of a data frame that I am trying to work with, and were generated with a function using lapply...

Appreciate any thoughts you may have to make this functional...

thanks,

Andras
Re: [R] data frame transformation
Thanks Bert, this will do...

Andras

On Sun, Jan 6, 2019 at 1:09 PM, Bert Gunter wrote:

... and my reordering of column indices was unnecessary:

merge(dat, d, all.y = TRUE)

will do.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
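The objects dat and d in Bert's merge() call come from an earlier message that is not part of this thread, so the sketch below does not try to reproduce that approach. It shows a different, loop-free way to get one column per letter with sapply() and ifelse(), on a small made-up data set. The values and the name letters_seen are illustrative assumptions.

```r
# Small made-up data set with the same three columns as in the question
data <- data.frame(id     = c(1, 1, 1, 2, 2),
                   letter = c("A", "C", "E", "B", "C"),
                   weight = c(10, 4, 7, 22, 15))

# as.character() guards against letter being stored as a factor
letters_seen <- sort(unique(as.character(data$letter)))

# One column per letter: the row's weight where the row carries that
# letter, NA otherwise. sapply() names the columns after the letters.
wide <- sapply(letters_seen,
               function(l) ifelse(data$letter == l, data$weight, NA))

datatransfer <- data.frame(data, wide)
datatransfer
```

With a few thousand distinct letters this builds a large, mostly NA matrix, so for the real data a sparse or long format may be the more practical choice.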
[R] data frame transformation
Hello Everyone,

would you be able to assist with some expertise on how to get the following done in a way that can be applied to a data set with different dimensions and without all the line items here? We have:

id<-c(1,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5) #length of unique IDs may differ of course in real data set, usually in magnitude of 1
letter<-c(sample(c("A","B","C","D","E"),3),sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),2),
          sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),4)) #number of unique "letters" is less than 4000 in real data set and there are no duplicates within same ID
weight<-c(sample(c(1:30),3),sample(c(1:30),4),sample(c(1:30),2),
          sample(c(1:30),4),sample(c(1:30),4)) #number of unique weights is below 50 in real data set and there are no duplicates within same ID

data<-data.frame(id=id,letter=letter,weight=weight)

#goal is to get the following transformation where a column is added for each unique letter and the weight is pulled into the column if the letter exists within the ID, otherwise NA
#so we would get datatransfer like below but without the many steps described here

datatransfer<-data.frame(data,apply(data[2],2,function(x) ifelse(x=="A",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) ifelse(x=="B",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) ifelse(x=="C",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) ifelse(x=="D",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) ifelse(x=="E",data$weight,NA)))
colnames(datatransfer)<-c("id","letter","weight","A","B","C","D","E")

Much appreciate the help,

thanks

Andras
Re: [R] ddply (or other suitable solution) question
thank you all, Bert's idea will get it done... good question also re what if 1 row: have a separate plan for that... Anyhow, finishing up Bert's lines with

z<-lapply(ix, function(i) df[i,])
lapply(z, function(x) split(x, rep(1:ceiling(nrow(x)/2), each=2)[1:nrow(x)]))

seems to do what I need, thanks again...

Andras

On Thursday, September 13, 2018, 5:16:54 PM EDT, Bert Gunter wrote:

Mod my earlier question, it seems that you just want to replicate all rows within an id if there are more than 2 rows. If this is incorrect, ignore the rest of this post. Otherwise... (I assume the data frame is listed in ID order, whatever that is)

set.seed(123.456)
df <-data.frame(ID=c(1,1,2,2,2,3,3,3,3,4,4,5,5),
                read=c(1,1,0,1,1,1,0,0,0,1,0,0,0),
                int=c(1,1,0,0,0,1,1,0,0,1,1,1,1),
                z=rnorm(13,1,5),
                y=rnorm(13,1,5))

yielded on my Mac and R version 3.5.1

> df
   ID read int          z           y
1   1    1   1 -1.8023782  1.55341358
2   1    1   1 -0.1508874 -1.77920567
3   2    0   0  8.7935416  9.93456568
4   2    1   0  1.3525420  3.48925239
5   2    1   0  1.6464387 -8.83308578
6   3    1   1  9.5753249  4.50677951
7   3    0   1  3.3045810 -1.36395704
8   3    0   0 -5.3253062 -4.33911853
9   3    0   0 -2.4342643 -0.08987457
10  4    1   1 -1.2283099 -4.13002224
11  4    0   1  7.1204090 -2.64445615
12  5    0   1  2.7990691 -2.12519634
13  5    0   1  3.0038573 -7.43346655

## The following doubles up the rows by ID
> ix <- tapply(seq_len(nrow(df)),df$ID,
+    function(x){
+       lenx <- length(x)
+       if(lenx > 2)
+          c(x[1],rep(x[2]:x[lenx-1],e=2),x[lenx])
+       else x
+    }
+ )
> ix
$`1`
[1] 1 2

$`2`
[1] 3 4 4 5

$`3`
[1] 6 7 7 8 8 9

$`4`
[1] 10 11

$`5`
[1] 12 13

## now use the ix list to break up df:
> lapply(ix, function(i)df[i,])
$`1`
  ID read int          z         y
1  1    1   1 -1.8023782  1.553414
2  1    1   1 -0.1508874 -1.779206

$`2`
    ID read int        z         y
3    2    0   0 8.793542  9.934566
4    2    1   0 1.352542  3.489252
4.1  2    1   0 1.352542  3.489252
5    2    1   0 1.646439 -8.833086

$`3`
    ID read int         z           y
6    3    1   1  9.575325  4.50677951
7    3    0   1  3.304581 -1.36395704
7.1  3    0   1  3.304581 -1.36395704
8    3    0   0 -5.325306 -4.33911853
8.1  3    0   0 -5.325306 -4.33911853
9    3    0   0 -2.434264 -0.08987457

$`4`
   ID read int         z         y
10  4    1   1 -1.228310 -4.130022
11  4    0   1  7.120409 -2.644456

$`5`
   ID read int        z         y
12  5    0   1 2.799069 -2.125196
13  5    0   1 3.003857 -7.433467

I leave it to you to modify the lapply() function to break up each id data frame into sublists of pairs if that is what you wish to do. Assuming again that this is actually what you want.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Thu, Sep 13, 2018 at 1:40 PM Bert Gunter wrote:
> What if there is only one read in the id?
[R] ddply (or other suitable solution) question
Dear All,

I have this data frame:

set.seed(123.456)
df <-data.frame(ID=c(1,1,2,2,2,3,3,3,3,4,4,5,5),
                read=c(1,1,0,1,1,1,0,0,0,1,0,0,0),
                int=c(1,1,0,0,0,1,1,0,0,1,1,1,1),
                z=rnorm(13,1,5),
                y=rnorm(13,1,5))

What I would like to achieve (as best as I see it now) is to create multiple lists (and lists within lists, using the data in df) based on the groups in the ID column ("top level of list"), where each line item within the group is "joined together" with the next line item ("bottom level list"). So it would look like this:

[[ID=1]]
[[1]][[1]]
ID read int        z        y
 1    1   1 5.188935 5.107905
 1    1   1 1.766866 4.443201

[[ID=2]]
[[2]][[1]]
ID read int         z         y
 2    0   0 -4.690685 3.7695883
 2    1   0  7.269075 0.6904414

[[ID=2]]
[[2]][[2]]
ID read int        z          y
 2    1   0 7.269075  0.6904414
 2    1   0 3.132321 -0.5298133

[[ID=3]]
[[3]][[1]]
ID read int          z         y
 3    1   1 -0.4753574 -0.902355
 3    0   1  5.4756283 -2.473535

[[ID=3]]
[[3]][[2]]
 3    0   1 5.475628 -2.47353489
 3    0   0 5.390667 -0.03958639

Hoping the example is clear enough... all your help is appreciated,

thanks,

Andras
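A small sketch of the nested structure described above, built directly with split() and lapply(): the top level is one entry per ID, and the bottom level holds each row paired with the row that follows it. The z and y values are simple placeholders replacing the rnorm() draws, the name pairs_by_id is hypothetical, and keeping a lone row as a single-element list is an assumption about what should happen for one-row groups.

```r
df <- data.frame(ID   = c(1,1,2,2,2,3,3,3,3,4,4,5,5),
                 read = c(1,1,0,1,1,1,0,0,0,1,0,0,0),
                 int  = c(1,1,0,0,0,1,1,0,0,1,1,1,1),
                 z    = seq_len(13),    # placeholder values, not rnorm()
                 y    = -seq_len(13))

pairs_by_id <- lapply(split(df, df$ID), function(g) {
  # Assumption: a group with a single row is returned unpaired
  if (nrow(g) < 2) return(list(g))
  # Rows (1,2), (2,3), ..., (n-1,n) of the group
  lapply(seq_len(nrow(g) - 1), function(i) g[c(i, i + 1), ])
})

lengths(pairs_by_id)      # number of pairs per ID
pairs_by_id[["3"]][[2]]   # second pair within ID 3
```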
[R] kSamples ad.test question
Dear All,

once we run the following code, the results of the test give us the expected, obvious answer: the samples are from a common distribution...

library(kSamples)
u1 <- sample(rnorm(500,10,1),20,replace = TRUE)
u2 <- sample(rnorm(500,10,1),20,replace = TRUE)
u3 <- sample(rnorm(500,10,1),20,replace = TRUE)
u4 <- sample(rnorm(500,10,1),20,replace = TRUE)
u5 <- sample(rnorm(500,10,1),20,replace = TRUE)
ad.test(u1, u2, u3,u4,u5, method = "exact", dist = FALSE, Nsim = 1000)

Next, if I change "u5" to:

u5 <- sample(rnorm(500,20,1),20,replace = TRUE)

the results of the test again give us what we expect, i.e. the samples are not from a common distribution.

My question is: would you know of a way to automatically select out or identify "u5", the sample that is "responsible" for the result showing that the samples are not from a common distribution?

Much appreciate your help,

Andras
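One heuristic that may help here, sketched with base R only: compare each sample against the pool of all the others and see which comparison stands out. Note that this swaps in the two-sample Kolmogorov-Smirnov test (stats::ks.test) for the k-sample Anderson-Darling test, so it is not the kSamples procedure and its p-values are not comparable with those of ad.test(). The names samples, p_loo and suspect are illustrative.

```r
set.seed(1)
samples <- list(u1 = rnorm(20, 10, 1),
                u2 = rnorm(20, 10, 1),
                u3 = rnorm(20, 10, 1),
                u4 = rnorm(20, 10, 1),
                u5 = rnorm(20, 20, 1))   # the shifted sample

# Leave-one-out comparison: sample i against all remaining samples pooled
p_loo <- sapply(seq_along(samples), function(i) {
  ks.test(samples[[i]], unlist(samples[-i]))$p.value
})
names(p_loo) <- names(samples)

# Adjust for the five comparisons and pick the most extreme sample
p_adj   <- p.adjust(p_loo, method = "holm")
suspect <- names(which.min(p_adj))
suspect
```

Because the pooled reference still contains the odd sample when the other samples are tested, this is a screening device and not a formal test. A follow-up could rerun ad.test() without the suspect sample to check whether the remaining ones look homogeneous.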
[R] MICE data analysis with glmulti
Dear All,

I wonder if you have some thoughts on running the with() function (and perhaps including the pool() function to get the results?) in glmulti. In other words, how can glmulti be run with a data set that is produced by mice()?

Publicly available code:

data <- airquality
data[4:10,3] <- rep(NA,7)
data[1:5,4] <- NA
data <- data[-c(5,6)]
library(mice)
library(glmulti)

The following line will impute the missing data:

tempData <- mice(data,m=5,maxit=50,meth='pmm',seed=500)

and the following 2 lines will run the regression on the mice output and pool the results to establish the final result of interest for the model specified...

modelFit1 <- with(tempData,glm(Temp~ Ozone+Solar.R+Wind))
summary(pool(modelFit1))

With glmulti I am trying to establish the "best" model by evaluating combinations of all predictors and interactions in different models, and would like to force the variable "Ozone" into all models with the following code:

glm.redefined = function(formula, data, always="", ...) {
  glm(as.formula(paste(deparse(formula), always)), data=data, ...)
}

then run glmulti:

output<-glmulti(with(tempData,Temp~Solar.R+Wind), fitfunc=glm.redefined, level=1, crit=aic, method="h", always= "+Ozone")

which will obviously fail once you give it a try... Any thoughts on how to identify the best model using glmulti in this fashion, fitting the different combinations of predictors with interactions on the mice() output of tempData?

Much appreciate the help...

Andras
[R] summary.rms help
Dear All,

using the example from the help of summary.rms:

library(rms)
n <- 1000                     # define sample size
set.seed(17)                  # so can reproduce the results
age            <- rnorm(n, 50, 10)
blood.pressure <- rnorm(n, 120, 15)
cholesterol    <- rnorm(n, 200, 25)
sex            <- factor(sample(c('female','male'), n,TRUE))
label(age)            <- 'Age'                      # label is in Hmisc
label(cholesterol)    <- 'Total Cholesterol'
label(blood.pressure) <- 'Systolic Blood Pressure'
label(sex)            <- 'Sex'
units(cholesterol)    <- 'mg/dl'                    # uses units.default in Hmisc
units(blood.pressure) <- 'mmHg'

# Specify population model for log odds that Y=1
L <- .4*(sex=='male') + .045*(age-50) +
  (log(cholesterol - 10)-5.2)*(-2*(sex=='female') + 2*(sex=='male'))
# Simulate binary y to have Prob(y=1) = 1/[1+exp(-L)]
y <- ifelse(runif(n) < plogis(L), 1, 0)

ddist <- datadist(age, blood.pressure, cholesterol, sex)
options(datadist='ddist')

fit <- lrm(y ~ blood.pressure + sex * (age + rcs(cholesterol,4)))
s <- summary(fit)
plot(s)

As you will see, the plot by default includes the low and high values from the summary, printed to the right of the variable name... Any thoughts on how printing these low and high values can be suppressed?

Appreciate your help,

Andras
[R] effects package x axis labels
Dear All,

probably a simple enough solution, but I don't seem to be able to get my head around it... Example based on a publicly available data set:

mydata <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")
mylogit <- glm(admit ~ gre + gpa + rank, data = mydata, family = "binomial")
library(effects)
plot(allEffects(mylogit),
     axes=list(y=list(lab="Prob(xyz)")))

axes=list(y=list(lab="Prob(xyz)")) changes the y axis labels for all 3 plots... Any thoughts on how I could change the x axis labels to, let's say, 'black' (plot 1), 'white' (plot 2) and 'green' (plot 3) for the 3 respective plots produced?

Appreciate the help...

Andras
[R] bquote question
Dear All,

could you please provide input on the following:

plot(1:10,main=paste("\n ","\nABCD","\n","\n","\n"),cex.main=1.3)
a<-500
b<-12
mtext(bquote(bold(.(formatC(1.2*a,decimal.mark=",",digits=2,format="f")))~ " words "~bold(.(b))~" words"~"\n"~"\n"))

As you will see from the sub-title, only the result of formatC(1.2*a,decimal.mark=",",digits=2,format="f") gets bolded, while the part bold(.(b)) does not seem to bold the value of 'b'... In addition, the spacing ("\n") at the end of the mtext line also does not seem to get recognized.

Any thoughts on what I may be doing wrong?

Much appreciate your help...

Andras
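A possible explanation, offered as my understanding and not as a definitive answer: formatC() returns a character string, which plotmath's bold() affects, whereas .(b) substitutes a bare number, which bold() appears to leave in the plain face. Converting b with as.character() is a common workaround. As for the spacing, the plotmath help page notes that control characters such as \n are not interpreted inside plotmath expressions, so the sketch below positions the text with the line argument of mtext() instead. The line value 0.5 is an arbitrary choice.

```r
a <- 500
b <- 12

# b is converted to character so that bold() has a string to work on
lab <- bquote(bold(.(formatC(1.2 * a, decimal.mark = ",",
                             digits = 2, format = "f"))) ~
              " words " ~ bold(.(as.character(b))) ~ " words")

plot(1:10, main = "ABCD", cex.main = 1.3)
# Vertical placement is controlled with 'line' instead of "\n"
mtext(lab, side = 3, line = 0.5)
```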
Re: [R] data frame question
thank you both... assumption is in fact that a and b are always the same length... these work for me well... much appreciate it... Andras On Sunday, August 6, 2017 12:14 PM, Ulrik Stervbo <ulrik.ster...@gmail.com> wrote: Hi Andreas, assuming that the increment is always indicated by the same value (in your example 0), this could work: df$a <- cumsum(seq_along(df$b) %in% which(df$b == 0)) df HTH, Ulrik On Sun, 6 Aug 2017 at 18:06 Bert Gunter <bgunter.4...@gmail.com> wrote: Your specification is a bit unclear to me, so I'm not sure the below >is really what you want. For example, your example seems to imply that >a and b must be of the same length, but I do not see that your >description requires this. So the following may not be what you want >exactly, but one way to do this(there may be cleverer ones!) is to >make use of ?rep. Everything else is just fussy detail. (Your example >suggests that you should also learn about ?seq. Both of these should >be covered in any good R tutorial, which you should probably spend >time with if you haven't already). > >Anyway... > >## WARNING: Not thoroughly tested! May (probably :-( ) contain bugs. > >f <- function(x,y,switch_val =0) >{ > wh <- which(y == switch_val) > len <- length(wh) > len_x <- length(x) > if(!len) x > else if(wh[1] == 1){ > if(len ==1) return(rep(x[1],len_x)) > else { > wh <- wh[-1] > len <- len -1 > } > } > count <- c(wh[1]-1,diff(wh)) > if(wh[len] == len_x) count<- c(count,1) > else count <- c(count, len_x - wh[len] +1) > rep(x[seq_along(count)],times = count) >} > >> a <- c(1:5,1:8) >> b <- c(0:4,0:7) >> f(a,b) > [1] 1 1 1 1 1 2 2 2 2 2 2 2 2 > > > >Bert Gunter > >"The trouble with having an open mind is that people keep coming along >and sticking things into it." 
>-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > > >On Sun, Aug 6, 2017 at 4:10 AM, Andras Farkas via R-help ><r-help@r-project.org> wrote: >> Dear All, >> >> wonder if you have thoughts on the following: >> >> let us say we have: >> >> df<-data.frame(a=c(1,2,3,4,5,1,2,3,4,5,6,7,8),b=c(0,1,2,3,4,0,1,2,3,4,5,6,7)) >> >> >> I would like to rewrite values in column name "a" based on values in column >> name "b", where based on a certain value of column "b" the next value of >> column 'a' is prompted, in other words would like to have this as a result: >> >> df<-data.frame(a=c(1,1,1,1,1,2,2,2,2,2,2,2,2),b=c(0,1,2,3,4,0,1,2,3,4,5,6,7)) >> >> >> where at the value of 0 in column 'b' the number in column a changes from 1 >> to 2. From the first zero value of column 'b' and until the next zero in >> column 'b' the numbers would not change in 'a', ie: they are all 1 in my >> example... then from 2 it would change to 3 again as 'b' will have zero >> again in a row, and so on.. Would be grateful for a solution that would >> allow me to set the values (from 'b') that determine how the values get >> established in 'a' (ie: lets say instead of 0 I would want 3 being the value >> where 1 changes to 2 in 'a') and that would be flexible to take into account >> that the number of rows and the number of time 0 shows up in a row in column >> 'b' may vary... >> >> much appreciate your thoughts.. >> >> Andras >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > >__ >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. 
[R] data frame question
Dear All, I wonder if you have thoughts on the following. Let us say we have: df<-data.frame(a=c(1,2,3,4,5,1,2,3,4,5,6,7,8),b=c(0,1,2,3,4,0,1,2,3,4,5,6,7)) I would like to rewrite the values in column 'a' based on the values in column 'b', so that a certain value in 'b' triggers the next value in 'a'. In other words, I would like to have this as a result: df<-data.frame(a=c(1,1,1,1,1,2,2,2,2,2,2,2,2),b=c(0,1,2,3,4,0,1,2,3,4,5,6,7)) Here the number in 'a' changes from 1 to 2 at the second 0 in 'b': from the first zero in 'b' until the next zero the numbers in 'a' do not change (they are all 1 in my example), then 'a' would change from 2 to 3 when 'b' is zero again, and so on. I would be grateful for a solution that lets me set the value in 'b' that triggers the change in 'a' (e.g. 3 instead of 0 as the value where 1 changes to 2), and that is flexible enough to handle a varying number of rows and a varying number of times that value shows up in 'b'. Much appreciate your thoughts, Andras
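The cumsum() idea from Ulrik's reply in this thread can be wrapped so that the trigger value is a parameter. This is only a minimal sketch: renumber_by_trigger and switch_val are hypothetical names, and it assumes 'b' contains no NA values.

```r
# Sketch: number the runs in 'b' that start at a chosen trigger value.
renumber_by_trigger <- function(b, switch_val = 0) {
  # TRUE at each trigger position; cumsum() turns that into a run counter
  cumsum(b == switch_val)
}

df <- data.frame(a = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6, 7, 8),
                 b = c(0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 6, 7))
df$a <- renumber_by_trigger(df$b)
df$a
```

Rows that come before the first trigger value get 0, which matters when switch_val is something other than the first element of 'b' (for example switch_val = 3 here).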
Re: [R] select from data frame
thank you David and Bert, these solutions will work for me... Andras On Saturday, July 15, 2017 6:05 PM, Bert Gunter <bgunter.4...@gmail.com> wrote: ... and here is a slightly cleaner and more transparent way of doing the same thing (setdiff() does the matching) > with(df, setdiff(ID,ID[samples %in% c("B","C") ])) [1] 3 -- Bert Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Sat, Jul 15, 2017 at 9:23 AM, Bert Gunter <bgunter.4...@gmail.com> wrote: > If I understand correctly, no looping (ave(), for()) or type casting > (as.character()) is needed -- indexing and matching suffice: > >> with(df, ID[!ID %in% unique(ID[samples %in% c("B","C") ])]) > [1] 3 3 > > > > Cheers, > > Bert > > > Bert Gunter > > "The trouble with having an open mind is that people keep coming along > and sticking things into it." > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > > > On Sat, Jul 15, 2017 at 8:54 AM, David Winsemius <dwinsem...@comcast.net> > wrote: >> >>> On Jul 15, 2017, at 4:01 AM, Andras Farkas via R-help >>> <r-help@r-project.org> wrote: >>> >>> Dear All, >>> >>> wonder if you could please assist with the following >>> >>> df<-data.frame(ID=c(1,1,1,2,2,3,3,4,4,5,5),samples=c("A","B","C","A","C","A","D","C","B","A","C")) >>> >>> from this data frame the goal is to extract the value of 3 from the ID >>> column based on the logic that the ID=3 in the data frame has NO row that >>> would pair 3 with either "B", AND/OR "C" in the samples column... >>> >> >> This returns a vector that determines if either of those characters are in >> the character values of that factor column you created. Coercing to >> character is needed because leaving samples as a factor generated an invalid >> factor level warning and gave useless results. 
>> >> with( df, ave( as.character(samples), ID, FUN=function(x) {!any(x %in% >>c("B","C"))})) >> [1] "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "TRUE" "TRUE" "FALSE" "FALSE" >> [10] "FALSE" "FALSE" >> >> You can then use it to extract and consolidate to a single value (although >> wrapping with as.logical was needed because `ave` returned character class >> values): >> >> unique( df$ID[ as.logical( # fails without this since "FALSE" != FALSE >> with( df, >> ave( as.character(samples), ID, FUN=function(x) >>{!any(x %in% c("B","C"))}))) >> ] ) >> #[1] 3 >> >> The same sort of logic could also be constructed with a for-loop: >> >>> for (x in unique(df$ID) ) { if ( !any( df$samples[df$ID==x] %in% >>> c("B","C")) ) print(x) } >> [1] 3 >> >> Although you are warned that for-loops do not return values and you might >> need to make an assignment rather than just printing. >> >> -- >> >> David Winsemius >> Alameda, CA, USA
[R] select from data frame
Dear All, I wonder if you could please assist with the following: df<-data.frame(ID=c(1,1,1,2,2,3,3,4,4,5,5),samples=c("A","B","C","A","C","A","D","C","B","A","C")) From this data frame the goal is to extract the value 3 from the ID column, based on the logic that ID=3 has NO row that pairs it with either "B" or "C" in the samples column. Much appreciate your help, thanks, Andras
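For reference, the setdiff() approach from Bert's reply in this thread can be written out as a self-contained sketch. The names excluded, ids_with_excluded and ids_without are hypothetical, chosen only for readability.

```r
# Sketch: IDs that never appear together with an excluded sample label.
df <- data.frame(ID = c(1, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
                 samples = c("A", "B", "C", "A", "C", "A", "D", "C", "B", "A", "C"))

excluded <- c("B", "C")

# IDs that have at least one row with an excluded label
ids_with_excluded <- df$ID[df$samples %in% excluded]

# All IDs minus those; setdiff() also removes duplicates
ids_without <- setdiff(df$ID, ids_with_excluded)
ids_without
```

Because %in% compares values as character, the same code should work whether samples is stored as character or as a factor.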
Re: [R] "reverse" quantile function
Peter, thanks, very nice, this will work for me... could you also help with setting up the code to run the one-liner "approx(sort(x), seq(0,1,,length(x)), q)$y" on the rows of a data frame using my example above? So if I cbind z and res, df<-cbind(z,res) the "x" in your one-liner would be the first 4 column values of each row and "q" is the last (fifth) column value of each row.. thanks again for all the help, Andras Farkas On Friday, June 16, 2017 4:58 AM, peter dalgaard <pda...@gmail.com> wrote: It would depend on which one of the 9 quantile definitions you are using. The discontinuous ones aren't invertible, and the continuous ones won't be either, if there are ties in the data. This said, it should just be a matter of setting up the inverse of a piecewise linear function. To set ideas, try x <- rnorm(5) curve(quantile(x,p), xname="p") The breakpoints for the default quantiles are n points evenly spread on [0,1], including the endpoints; i.e., for n=5, (0, .25, .5, .75, 1) So: x <- rnorm(5) br <- seq(0, 1, ,5) qq <- quantile(x, br) ## actually == sort(x) pfun <- approxfun(qq, br) (q <- quantile(x, .1234)) pfun(q) There are variations, e.g. the one-liner approx(sort(x), seq(0,1,,length(x)), q)$y -pd > On 16 Jun 2017, at 01:56 , Andras Farkas via R-help <r-help@r-project.org> > wrote: > > David, > > thanks for the response. In your response the quantile function (if I see > correctly) runs on the columns versus I need to run it on the rows, which is > an easy fix, but that is not exactly what I had in mind... 
essentially we can > remove t() from my original code to make "res" look like this: > > res<-apply(z, 1, quantile, probs=c(0.3)) > > but after all maybe I did not explain myself clear enough so let me try > again: the known variables to us in what I am trying to do are the data frame > "z' : > > t<-seq(0,24,1) > a<-10*exp(-0.05*t) > b<-10*exp(-0.07*t) > c<-10*exp(-0.1*t) > d<-10*exp(-0.03*t) > > z<-data.frame(a,b,c,d) > > and the vector "res": > > res<-c(10.00, 9.296382, 8.642955, 8.036076 ,7.472374, 6.948723, > 6.462233, 6.010223 ,5.590211 > > ,5.199896 ,4.837147, 4.499989 ,4.186589, 3.895250 ,3.624397, 3.372570, > 3.138415, 2.920675 > , 2.718185 ,2.529864 ,2.354708, 2.191786, 2.040233, 1.899247, 1.768084) > > and I need to find the probability (probs) , the unknown value, which would > result in creating "res", ie: the probs=c(0.3), from: > res<-apply(z, 1, quantile, probs=c(0.3))... > > > a more simplified example assuming : > > k<-c(1:100) > f<-30 > ecdf(k)(f) > > would give us the value of 0.3... so same idea as this, but instead of "k" we > have data frame "z", and instead of "f" we have "res", and need to find the > value of 0.3... Does that make sense? > > much appreciate the help... > > Andras Farkas, > > > On Thursday, June 15, 2017 6:46 PM, David Winsemius <dwinsem...@comcast.net> > wrote: > > > > >> On Jun 15, 2017, at 12:37 PM, Andras Farkas via R-help >> <r-help@r-project.org> wrote: >> >> Dear All, >> >> we have: >> >> t<-seq(0,24,1) >> a<-10*exp(-0.05*t) >> b<-10*exp(-0.07*t) >> c<-10*exp(-0.1*t) >> d<-10*exp(-0.03*t) >> z<-data.frame(a,b,c,d) >> >> res<-t(apply(z, 1, quantile, probs=c(0.3))) >> >> >> >> my goal is to do a 'reverse" of the function here that produces "res" on a >> data frame, ie: to get the answer 0.3 back for the percentile location when >> I have "res" available to me... 
For a single vector this would be done using >> ecdf something like this: >> >> x <- rnorm(100) >> #then I know this value: >> quantile(x,0.33) >> #so do this step >> ecdf(x)(quantile(x,0.33)) >> #to get 0.33 back... >> >> any suggestions on how I could to that for a data frame? > > Can't you just used ecdf and quantile ecdf? > > # See ?ecdf page for both functions > >> lapply( lapply(z, ecdf), quantile, 0.33) > $a > 33% > 4.475758 > > $b > 33% > 3.245151 > > $c > 33% > 2.003595 > > > $d > 33% > 6.173204 > -- > > David Winsemius > Alameda, CA, USA > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd@cbs.dk Priv: pda...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] "reverse" quantile function
Never mind, I think I figured it out: z<-df apply(df,1,function(x) approx(sort(x[1:4]), seq(0,1,,length(x[1:4])), x[5])$y) Thanks again for the help, Andras Farkas On Friday, June 16, 2017 5:34 AM, Andras Farkas via R-help <r-help@r-project.org> wrote: Peter, thanks, very nice, this will work for me... could you also help with setting up the code to run the one-liner "approx(sort(x), seq(0,1,,length(x)), q)$y" on the rows of a data frame using my example above? So if I cbind z and res, df<-cbind(z,res) the "x" in your one-liner would be the first 4 column values of each row and "q" is the last (fifth) column value of each row.. thanks again for all the help, Andras Farkas
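A self-contained sketch of the row-wise inverse, built on Peter's one-liner and on the z and res definitions from this thread. One caveat, following Peter's remark about ties: at t = 0 all four columns equal 10, and as far as I can tell approx() cannot interpolate over a single distinct x value, so the hypothetical helper inv_quantile below returns NA for any row with tied values rather than calling approx() on it. It also assumes the default quantile definition (type 7).

```r
# Sketch: recover the probability that produced each row-wise quantile.
t <- seq(0, 24, 1)
z <- data.frame(a = 10 * exp(-0.05 * t),
                b = 10 * exp(-0.07 * t),
                c = 10 * exp(-0.10 * t),
                d = 10 * exp(-0.03 * t))
res <- apply(z, 1, quantile, probs = 0.3)

inv_quantile <- function(x, q) {
  x <- sort(x)
  if (anyDuplicated(x)) return(NA_real_)  # ties: the quantile is not invertible
  # inverse of the piecewise-linear default quantile function
  approx(x, seq(0, 1, length.out = length(x)), xout = q)$y
}

zm <- as.matrix(z)
p <- vapply(seq_len(nrow(zm)),
            function(i) inv_quantile(zm[i, ], res[i]),
            numeric(1))
p
```

Every row except the first should come back as approximately 0.3; the first row is NA because its values are all tied.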
Re: [R] "reverse" quantile function
Peter, thanks, very nice, this will work for me... could you also help with setting up the code to run the one-liner "approx(sort(x), seq(0,1,,length(x)), q)$y" on the rows of a data frame using my example above? So if I cbind z and res, df<-cbind(z,res) the "x" in your one-liner would be the first 4 column values of each row and "q" is the last (fifth) column value of each row.. thanks again for all the help, Andras Farkas
Re: [R] "reverse" quantile function
David, thanks for the response. In your response the quantile function (if I see correctly) runs on the columns versus I need to run it on the rows, which is an easy fix, but that is not exactly what I had in mind... essentially we can remove t() from my original code to make "res" look like this: res<-apply(z, 1, quantile, probs=c(0.3)) but after all maybe I did not explain myself clear enough so let me try again: the known variables to us in what I am trying to do are the data frame "z' : t<-seq(0,24,1) a<-10*exp(-0.05*t) b<-10*exp(-0.07*t) c<-10*exp(-0.1*t) d<-10*exp(-0.03*t) z<-data.frame(a,b,c,d) and the vector "res": res<-c(10.00, 9.296382, 8.642955, 8.036076 ,7.472374, 6.948723, 6.462233, 6.010223 ,5.590211 ,5.199896 ,4.837147, 4.499989 ,4.186589, 3.895250 ,3.624397, 3.372570, 3.138415, 2.920675 , 2.718185 ,2.529864 ,2.354708, 2.191786, 2.040233, 1.899247, 1.768084) and I need to find the probability (probs) , the unknown value, which would result in creating "res", ie: the probs=c(0.3), from: res<-apply(z, 1, quantile, probs=c(0.3))... a more simplified example assuming : k<-c(1:100) f<-30 ecdf(k)(f) would give us the value of 0.3... so same idea as this, but instead of "k" we have data frame "z", and instead of "f" we have "res", and need to find the value of 0.3... Does that make sense? much appreciate the help... Andras Farkas, On Thursday, June 15, 2017 6:46 PM, David Winsemius <dwinsem...@comcast.net> wrote: > On Jun 15, 2017, at 12:37 PM, Andras Farkas via R-help <r-help@r-project.org> > wrote: > > Dear All, > > we have: > > t<-seq(0,24,1) > a<-10*exp(-0.05*t) > b<-10*exp(-0.07*t) > c<-10*exp(-0.1*t) > d<-10*exp(-0.03*t) > z<-data.frame(a,b,c,d) > > res<-t(apply(z, 1, quantile, probs=c(0.3))) > > > > my goal is to do a 'reverse" of the function here that produces "res" on a > data frame, ie: to get the answer 0.3 back for the percentile location when I > have "res" available to me... 
For a single vector this would be done using > ecdf something like this: > > x <- rnorm(100) > #then I know this value: > quantile(x,0.33) > #so do this step > ecdf(x)(quantile(x,0.33)) > #to get 0.33 back... > > any suggestions on how I could to that for a data frame? Can't you just used ecdf and quantile ecdf? # See ?ecdf page for both functions > lapply( lapply(z, ecdf), quantile, 0.33) $a 33% 4.475758 $b 33% 3.245151 $c 33% 2.003595 $d 33% 6.173204 -- David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] "reverse" quantile function
Dear All, we have: t<-seq(0,24,1) a<-10*exp(-0.05*t) b<-10*exp(-0.07*t) c<-10*exp(-0.1*t) d<-10*exp(-0.03*t) z<-data.frame(a,b,c,d) res<-t(apply(z, 1, quantile, probs=c(0.3))) My goal is to do a "reverse" of the function that produces "res" on a data frame, i.e. to get the answer 0.3 back for the percentile location when I have "res" available to me. For a single vector this would be done using ecdf, something like this: x <- rnorm(100) #then I know this value: quantile(x,0.33) #so do this step ecdf(x)(quantile(x,0.33)) #to get 0.33 back... Any suggestions on how I could do that for a data frame? Thank you, Andras Farkas
[R] Prophet package
Dear All, I wonder if you could assist with the following. We have: library(prophet) library(dplyr) abc<-c(0.3684693,0.4938679, 0.4429201,0.452598,0.4301452,0.4315169, 0.447026,0.496179,0.4045693,0.398533, 0.355,0.431079,0.4063136,0.4120126,0.5210375,0.402897,0.4466131,0.5005669,0.5014164,0.5042271,0.5498575,0.6014215,0.4415863,0.4377443,0.4316092,0.4156757,0.3517915,0.3669508,0.3899471,0.3964143,0.4001074,0.3851003,0.4222451,0.375324,0.3652045,0.3376978 ,0.383012,0.3763665,0.3550609,0.2958678,0.3726571,0.3442298 #,0.3403275,0.2973978 #, 0.4,0.4,0.4, 0.4,0.4,0.4, 0.4,0.4,0.4 ) df<-data.frame(ds = seq(as.Date('2013-08-01'), as.Date('2017-01-01'), by = 'm'),abc) names(df)<-c("ds","y") m<-prophet(df,yearly.seasonality = TRUE) future <- make_future_dataframe(m, periods = 730) forecast <- predict(m, future) plot(m, forecast) points(x=as.Date('2017-02-01'),y=0.5) This results in the error message: Error in plot.xy(xy.coords(x, y), type = type, ...) : plot.new has not been called yet Would you have a solution to plot the point on the plot? Appreciate the help, Andras Farkas
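The error is consistent with plot(m, forecast) not drawing on a base graphics device: as far as I know, prophet's plot method builds and returns a ggplot2 object, so base points() finds no open plot to add to. Assuming that is the case, the extra point can be added as a ggplot2 layer. The sketch below uses a small stand-in data frame (toy, with made-up values) instead of the prophet fit so that it runs without prophet installed; in the thread's code the object p would be the result of plot(m, forecast). Depending on the prophet version, the ds axis may be POSIXct rather than Date, in which case the x value would need as.POSIXct() instead of as.Date().

```r
# Sketch with a stand-in ggplot object; the data in 'toy' are illustrative only.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  toy <- data.frame(
    ds = seq(as.Date("2016-09-01"), as.Date("2017-01-01"), by = "month"),
    y  = c(0.38, 0.36, 0.30, 0.37, 0.34)
  )
  p <- ggplot(toy, aes(ds, y)) + geom_line()

  # Add the extra point as a layer instead of calling base points()
  p2 <- p + annotate("point", x = as.Date("2017-02-01"), y = 0.5,
                     colour = "red", size = 3)
  print(p2)
}
```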
[R] extend lines of prediction interval ggplot2
Dear All, would you have some thoughts on how to extend the prediction interval lines beyond the range of the data? Example: y <-c(0.4316092,0.4156757,0.3517915,0.3669508,0.3899471,0.3964143, 0.4001074,0.3851003,0.4222451,0.375324,0.3652045,0.3376978,0.383012, 0.3763665,0.3550609,0.2958678,0.3726571,0.3442298 #,0.3403275,0.2973978 )*100 x <-seq(1,length(y),1) z<-c("07/01/2015","08/01/2015","09/01/2015","10/01/2015","11/01/2015", "12/01/2015","01/01/2016","02/01/2016","03/01/2016","04/01/2016","05/01/2016", "06/01/2016","07/01/2016","08/01/2016","09/01/2016","10/01/2016","11/01/2016", "12/01/2016","01/01/2017","02/01/2017") fit <-lm(y~x) temp_var <- predict(fit, interval="prediction") new_df <- data.frame(cbind(x,y, temp_var)) #new_df$x<-factor(new_df$x, ordered = T) library(ggplot2) ggplot(new_df, aes(x,y))+ geom_point() + theme(panel.background = element_rect(fill = 'white', colour = 'black'))+ geom_line(aes(y=lwr), color = "black", linetype = "dashed",size=0.75)+ geom_line(aes(y=upr), color = "black", linetype = "dashed",size=0.75)+ scale_x_discrete(limits=z)+ theme(axis.text.x = element_text(angle = 45, hjust = 1))+ theme(panel.grid.major=element_line(colour = "grey"))+ lims(y=c(0,50))+ geom_smooth(method=lm, se=TRUE,fullrange=TRUE,fill="darkgrey",col="black")+labs(title = paste("Adj R2 = ",signif(summary(fit)$adj.r.squared, 4), "Intercept =",signif(fit$coef[[1]],4 ), " Slope =",signif(fit$coef[[2]], 4) # " P =",signif(summary(fit)$coef[2,4], 3) ))+ ggtitle("Consumption Over Time") + theme(plot.title = element_text(hjust = 0.5))+ labs(y="y",x="x")+ geom_point(shape=15,aes(x=c(7),y=new_df[,2][7]), color="black",cex=4)+ geom_point(shape=15,aes(x=c(8),y=new_df[,2][8]), color="black",cex=4)+ geom_point(shape=17,aes(x=c(19),y=0.3403275*100), color="black",cex=4)+ geom_point(shape=17,aes(x=c(20),y=0.2973978*100), color="black",cex=4) As you will see, the regression line and confidence interval are extended, but I would also like to extend the prediction interval lines to the same length. I wonder if you have any insights into this question. Appreciate the help, Andras Farkas
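One way this could be approached, as a sketch only: the dashed lines stop at x = 18 because predict() without newdata returns intervals for the fitted rows only. Passing a newdata frame that covers the full axis range (1 to 20 here, matching the 20 labels in z) gives interval values for the extra positions as well. The names new_x and pred are hypothetical, and the plot is reduced to the layers relevant to the question.

```r
# Sketch: compute prediction intervals over x = 1..20, not just the data range.
y <- c(0.4316092, 0.4156757, 0.3517915, 0.3669508, 0.3899471, 0.3964143,
       0.4001074, 0.3851003, 0.4222451, 0.375324, 0.3652045, 0.3376978,
       0.383012, 0.3763665, 0.3550609, 0.2958678, 0.3726571, 0.3442298) * 100
x <- seq_along(y)
fit <- lm(y ~ x)

new_x <- data.frame(x = 1:20)  # full axis range, beyond the 18 observations
pred <- cbind(new_x, predict(fit, newdata = new_x, interval = "prediction"))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data.frame(x, y), aes(x, y)) +
    geom_point() +
    geom_smooth(method = lm, se = TRUE, fullrange = TRUE, colour = "black") +
    # interval lines drawn from 'pred', so they span the whole axis
    geom_line(data = pred, aes(y = lwr), linetype = "dashed") +
    geom_line(data = pred, aes(y = upr), linetype = "dashed") +
    xlim(1, 20)
  print(p)
}
```

The remaining theme, title and highlighted-point layers from the original call can be added back unchanged.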