[R] plotly: ability to drag points on x axis only and prevent change of y axis value

2020-12-10 Thread Andras Farkas via R-help
Hello,

wonder if you could provide input on the following: please see toy example 
below, wanted to see if there is a way to have restrictions on how the points 
are dragged on the plot. More specifically I would like the points draggable 
horizontally ONLY and have their y axis value remain fixed, ie the movement 
vertically would be restricted and no change allowed to that direction for each 
point? much appreciate any input you may have,

library(plotly)
library(purrr)

# creates a list of 32 circle shapes (one for each row/car)
circles <- map2(
  mtcars$mpg, 
  mtcars$wt, 
  ~list(
    type = "circle",
    # anchor circles at (mpg, wt)
    xanchor = .x,
    yanchor = .y,
    # give each circle a 10 pixel diameter (x0/x1 and y0/y1 span -5 to 5)
    x0 = -5, x1 = 5,
    y0 = -5, y1 = 5,
    xsizemode = "pixel", 
    ysizemode = "pixel",
    # other visual properties
    fillcolor = "blue",
    line = list(color = "transparent")
  )
)


plot_ly() %>%
  layout(shapes = circles) %>%
  config(edits = list(shapePosition = TRUE))


appreciate the help,


thanks,

Andras 

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] (almost) rolling or fill function?

2020-06-08 Thread Andras Farkas via R-help
Thanks Bert, here it is in plain.



Hello,

please see if you have a thought on how to achieve the following:

we have:

df<-data.frame(a=Sys.Date()+1:10,
               b=Sys.Date()+c(NA,NA,NA,rep(3,4),NA,NA,3),
               c=Sys.Date()+c(NA,NA,NA,rep(9,4),NA,NA,9))



the idea I have difficulty wrapping my head around is to do the following: I 
need the system to look at df$a by row (lets call it the index row) and look at 
df$b and df$c 1 row before the given row in df$a  (lets call it index row -1) 
and evaluate if the index row value in df$a falls into the range (>= and <=) of 
the index row -1 values in df$b and df$c. If it does, then copy over the index 
row -1 values in df$b and df$c into the index row in df$b and df$c, if not 
place an NA in both cells of the index row in df$b and df$c. 

 examples:

1. the date value in df$a[8] is between df$b[7] and df$c[7] so we can copy the 
values in df$b[7] and df$c[7] into df$b[8] and df$c[8]
2.  the date value in df$a[9] is between df$b[8] and df$c[8] (as we copied it 
in in step 1)  so we can copy the values in df$b[8] and df$c[8] into df$b[9] 
and df$c[9]
3.  the date value in df$a[10] is NOT between df$b[9] and df$c[9] (as we copied 
it in in step 2)  so we can place NA in df$b[10] and df$c[10] 


also would like to do this going up, too, similar to fill(...,"downup"). In the 
end we would want to have this:

dfwanted<-data.frame(a=Sys.Date()+1:10,
               b=Sys.Date()+c(NA,NA,rep(3,7),NA),
               c=Sys.Date()+c(NA,NA,rep(9,7),NA))



much appreciate any help you could provide.

thanks,


Andras 
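
A minimal sketch of one possible reading of the rule described above, not a definitive answer. The helper names carry_range() and fill_downup() are hypothetical, and the sketch assumes that a row is only re-evaluated when its neighbouring row (the previous row going down, the next row going up) actually holds a complete b/c range; rows whose neighbour has no range are left untouched. Under that assumption the toy data reproduces dfwanted.

```r
# Hypothetical helper: one pass over `rows`, taking the range from row i - step.
# Assumed rule: if the neighbour holds a complete [b, c] range, row i inherits it
# when a[i] lies inside the range and is set to NA otherwise; if the neighbour
# has no range, row i is left as it is.
carry_range <- function(df, rows, step) {
  for (i in rows) {
    j <- i - step                       # neighbour the range is taken from
    if (!is.na(df$b[j]) && !is.na(df$c[j])) {
      inside <- df$a[i] >= df$b[j] && df$a[i] <= df$c[j]
      df$b[i] <- if (inside) df$b[j] else NA
      df$c[i] <- if (inside) df$c[j] else NA
    }
  }
  df
}

# Down pass first (compare with row i - 1), then up pass (compare with row i + 1)
fill_downup <- function(df) {
  n <- nrow(df)
  if (n < 2) return(df)
  df <- carry_range(df, 2:n, 1)
  carry_range(df, (n - 1):1, -1)
}

d0 <- Sys.Date()
df <- data.frame(a = d0 + 1:10,
                 b = d0 + c(NA, NA, NA, rep(3, 4), NA, NA, 3),
                 c = d0 + c(NA, NA, NA, rep(9, 4), NA, NA, 9))
res <- fill_downup(df)
res
```

The loop is deliberately explicit because each row depends on the already updated neighbour, which is what makes a plain vectorised fill() awkward here.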




[R] (almost) rolling function or fill?

2020-06-08 Thread Andras Farkas via R-help
Hello,
please see if you have a thought on how to achieve the following:
we have:
df<-data.frame(a=Sys.Date()+1:10,
               b=Sys.Date()+c(NA,NA,NA,rep(3,4),NA,NA,3),
               c=Sys.Date()+c(NA,NA,NA,rep(9,4),NA,NA,9))


the idea I have difficulty wrapping my head around is to do the following: I 
need the system to look at df$a by row (lets call it the index row) and look at 
df$b and df$c 1 row before the given row in df$a  (lets call it index row -1) 
and evaluate if the index row value in df$a falls into the range (>= and <=) of 
the index row -1 values in df$b and df$c. If it does, then copy over the index 
row -1 values in df$b and df$c into the index row in df$b and df$c, if not 
place an NA in both cells of the index row in df$b and df$c. 
 examples:
1. the date value in df$a[8] is between df$b[7] and df$c[7] so we can copy the 
values in df$b[7] and df$c[7] into df$b[8] and df$c[8]
2. the date value in df$a[9] is between df$b[8] and df$c[8] (as we copied it 
in in step 1) so we can copy the values in df$b[8] and df$c[8] into df$b[9] 
and df$c[9]
3. the date value in df$a[10] is NOT between df$b[9] and df$c[9] (as we copied 
it in in step 2) so we can place NA in df$b[10] and df$c[10] 

also would like to do this going up, too, similar to fill(...,"downup"). In the 
end we would want to have this:
dfwanted<-data.frame(a=Sys.Date()+1:10,
                     b=Sys.Date()+c(NA,NA,rep(3,7),NA),
                     c=Sys.Date()+c(NA,NA,rep(9,7),NA))


much appreciate any help you could provide.
thanks,

Andras 


[R] MuMIn package with gamlss error

2019-04-13 Thread Andras Farkas via R-help
Hello,

could you please provide your thoughts on what I may be missing? gamlss models 
are supposedly supported by MuMIn yet this one fails:

library(MuMIn)
library(gamlss)

#this lm runs
linearMod <- lm(Sepal.Length ~ ., data=iris) 
options(na.action = "na.fail")
res <-dredge(linearMod,beta = T, evaluate = T)
confset.95p<-get.models(res, subset = cumsum(weight) <= .95)
avgm <- model.avg(confset.95p)
predict(avgm,se.fit = TRUE, type="response")

#this gamlss fails on dredge(), and writing out the formula does not solve the 
#initial error...
gamlssMod <- gamlss(Sepal.Length ~ ., data=iris,) 
res <-dredge(gamlssMod,beta = T, evaluate = T)
confset.95p<-get.models(res, subset = cumsum(weight) <= .95)
avgm <- model.avg(confset.95p)
predict(avgm,se.fit = TRUE, type="response")
options(na.action = "na.omit") 


appreciate any thoughts you may have,

Andras 



[R] data frame solution

2019-03-19 Thread Andras Farkas via R-help
Hello All,

wonder if you have thoughts on a clever solution for this code:



df       <- data.frame(a = c(6,1), b = c(1000,1200), c =c(-1,3)) 

#the caveat here is that the number of rows for df can be anything from 1 row 
#to in the hundreds. I kept it to 2 to have minimal reproducible

t<-seq(-5,24,0.1) #min(t) will always be <=df$c[1], which is the value that is 
#always going to equal to min(df$c)

times1 <- c(rbind(df$c[1],df$c[1]+df$a[1]),max(t)) #length of times1 will 
#always be 3, see times2 is of length 4

input1   <- c(rbind(df$b[1]/df$a[1],rep(0,length(df$b[1]))),0) #length of 
#input1 will always be 3, see input2 is of length 4

out1 
<-data.frame(t,ifelse(t>=times1[1]=times1[2]=times2[1]=times2[2]=times2[3]


Re: [R] reduce and intersect question (maybe)?

2019-02-21 Thread Andras Farkas via R-help
Works well! Thanks!
Andras  

On Thursday, February 21, 2019, 8:47:51 AM EST, Jeff Newmiller 
 wrote:  
 
 Use ?merge instead of intersect.
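
A minimal sketch of how the merge suggestion might be applied, assuming the rows are matched on the ID and date columns only. The toy frames below are hypothetical and smaller than the ones in the original post; key_cols, frames, common and subsets are illustrative names.

```r
# Toy data (hypothetical): ID 2 has a different date in d, so only
# the (ID, date) pairs for ID 1 and ID 3 are shared by all frames
a <- data.frame(ID = c(1, 2, 3, 4),
                date = as.Date(c("2011-01-01", "2011-01-02", "2011-01-03", "2011-01-04")),
                z = rnorm(4))
b <- data.frame(ID = c(1, 2, 3, 11),
                date = as.Date(c("2011-01-01", "2011-01-02", "2011-01-03", "2011-01-04")),
                z = rnorm(4))
d <- data.frame(ID = c(1, 2, 3, 21),
                date = as.Date(c("2011-01-01", "2011-11-01", "2011-01-03", "2011-01-01")),
                z = rnorm(4))

key_cols <- c("ID", "date")
frames <- list(a = a, b = b, d = d)

# Inner-join the key columns of all frames; only key combinations
# present in every frame survive the chain of merges
common <- Reduce(function(x, y) merge(x, y, by = key_cols),
                 lapply(frames, function(df) unique(df[key_cols])))

# Subset every frame to the common rows (z is carried along, it is not a key)
subsets <- lapply(frames, function(df) merge(df, common, by = key_cols))
subsets
```

Merging only the key columns first keeps z out of the comparison, which matches the remark in the question that the value of z is of no particular importance.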

On February 21, 2019 5:22:46 AM PST, Andras Farkas via R-help 
 wrote:
>Hello All,
>
>wonder if you have a suggestion for the following:
>
>we have
>a<-data.frame(ID=c(1,2,3,4,5,6,7),date=as.POSIXct(seq(as.Date('2011-01-01'),as.Date('2011-01-07'),by
>= 1),format='%m/%d/%Y %H:%M'),z=rnorm(7,1,1))
>b<-data.frame(ID=c(1,2,3,11,12,13,14,15),date=as.POSIXct(seq(as.Date('2011-01-01'),as.Date('2011-01-08'),by
>= 1),format='%m/%d/%Y %H:%M'),z=rnorm(8,1,1))
>c<-data.frame(ID=c(1,2,3,4,5,6,7,8,9,10),date=as.POSIXct(c(seq(as.Date('2011-01-01'),as.Date('2011-01-05'),by
>= 1),seq(as.Date('2011-01-11'),as.Date('2011-01-15'),by =
>1)),format='%m/%d/%Y %H:%M'),z=rnorm(10,1,1))
>d<-data.frame(ID=c(1,2,3,21,22,23,24,25,26,27,28),date=as.POSIXct(c(as.Date('2011-01-01'),as.Date('2011-11-01'),as.Date('2011-01-03'),seq(as.Date('2011-01-01'),as.Date('2011-01-08'),by
>= 1)),format='%m/%d/%Y %H:%M'),z=rnorm(11,1,1))
>
>
>#this function will do the obvious and give the IDs that are in all of
>the data frames based on the ID column
>
>intersect_all <- function(a,b,...){
>  Reduce(intersect, list(a,b,...))
>}
>
>intersect_all(a$ID,b$ID,c$ID,d$ID)
>
>
>#I would like to extend this (or use another function) where the
>function would give all the rows (ie based on both columns as a
>condition) that are in all of the data frames, so the result should be
>as below as these 2 rows are in all of the data frames (the fact that
>the rows that are common in all data frames ie 1 and 3 in my example
>are I only set up for the sake of convenience, in reality their row
>number in each of the data frames may be different) . The value of z is
>of no particular importance, but once the common rows are identified I
>would want to subset the data frames to get these results:
>
>a[c(1,3),]
>b[c(1,3),]
>c[c(1,3),]
>d[c(1,3),]
>
>much appreciate your input,
>
>thanks
>
>Andras 
>

-- 
Sent from my phone. Please excuse my brevity.  


[R] reduce and intersect question (maybe)?

2019-02-21 Thread Andras Farkas via R-help
Hello All,

wonder if you have a suggestion for the following:

we have
a<-data.frame(ID=c(1,2,3,4,5,6,7),
              date=as.POSIXct(seq(as.Date('2011-01-01'),as.Date('2011-01-07'),by = 1),format='%m/%d/%Y %H:%M'),
              z=rnorm(7,1,1))
b<-data.frame(ID=c(1,2,3,11,12,13,14,15),
              date=as.POSIXct(seq(as.Date('2011-01-01'),as.Date('2011-01-08'),by = 1),format='%m/%d/%Y %H:%M'),
              z=rnorm(8,1,1))
c<-data.frame(ID=c(1,2,3,4,5,6,7,8,9,10),
              date=as.POSIXct(c(seq(as.Date('2011-01-01'),as.Date('2011-01-05'),by = 1),
                                seq(as.Date('2011-01-11'),as.Date('2011-01-15'),by = 1)),format='%m/%d/%Y %H:%M'),
              z=rnorm(10,1,1))
d<-data.frame(ID=c(1,2,3,21,22,23,24,25,26,27,28),
              date=as.POSIXct(c(as.Date('2011-01-01'),as.Date('2011-11-01'),as.Date('2011-01-03'),
                                seq(as.Date('2011-01-01'),as.Date('2011-01-08'),by = 1)),format='%m/%d/%Y %H:%M'),
              z=rnorm(11,1,1))


#this function will do the obvious and give the IDs that are in all of the data 
#frames based on the ID column

intersect_all <- function(a,b,...){
  Reduce(intersect, list(a,b,...))
}

intersect_all(a$ID,b$ID,c$ID,d$ID)


#I would like to extend this (or use another function) so that the function 
#gives all the rows (ie based on both the ID and date columns as a condition) 
#that are in all of the data frames. The result should be as below, as these 2 
#rows are in all of the data frames (the common rows sit at positions 1 and 3 
#in every data frame only for the sake of convenience in my example, in reality 
#their row numbers may differ between the data frames). The value of z is of 
#no particular importance, but once the common rows are identified I would 
#want to subset the data frames to get these results:

a[c(1,3),]
b[c(1,3),]
c[c(1,3),]
d[c(1,3),]

much appreciate your input,

thanks

Andras 



Re: [R] list with list function

2019-02-05 Thread Andras Farkas via R-help
Thanks Rui and Ivan, works perfectly...
Andras
On Monday, February 4, 2019, 4:18:39 PM EST, Rui Barradas 
 wrote:  
 
 Hello,

Like this?


Map('[', listA, lapply(listB, '*', -1))
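
A small self-contained check of the one-liner above, using hypothetical numeric vectors in place of the sampled dates from the question. It only illustrates that negating each index vector and passing it to `[` drops those positions element by element.

```r
# Hypothetical stand-ins for listA and listB from the question
listA <- list(c(10, 20, 30, 40, 50), c(1, 2, 3, 4), c(7, 8, 9))
listB <- list(c(1, 3, 5), c(1, 4), c(1, 2))

# Negate every index vector, then subset each element of listA with it
listfinal <- Map(`[`, listA, lapply(listB, `*`, -1))

# The same result written out by hand, as in the question
by_hand <- list(listA[[1]][-listB[[1]]],
                listA[[2]][-listB[[2]]],
                listA[[3]][-listB[[3]]])
identical(listfinal, by_hand)
```

One caveat worth keeping in mind: if an element of listB is empty, x[-numeric(0)] returns an empty vector rather than x unchanged, so such elements may need separate handling.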


Hope this helps,

Rui Barradas

At 21:01 on 04/02/2019, Andras Farkas via R-help wrote:
> Hello everyone,
> 
> wonder if you would have a thought on a function for the following:
> 
> 
> we have
> 
> a<-sample(seq(as.Date('1999/01/01'), as.Date('2000/01/01'), by="day"),5)
> b<-sample(seq(as.Date('1999/01/01'), as.Date('2000/01/01'), by="day"), 4)
> c<-sample(seq(as.Date('1999/01/01'), as.Date('2000/01/01'), by="day"), 3)
> 
> d<-c(1,3,5)
> e<-c(1,4)
> f<-c(1,2)
> 
> listA<-list(a,b,c)
> listB<-list(d,e,f)
> 
> 
> what I would like to do with a function (my real listA and listB can be of 
> any length but always equal length, but their components like a,b,and c those 
> can be unequal) as opposed to manually is to derive the following answer
> 
> listfinal<-list(a[-d],b[-e],c[-f])
> listfinal
> 
> 
> essentially the elements in listB serve as identifying the position of 
> corresponding list element in listA and removing it from listA.
> 
> these lists listA and listB in practice are columns of a data frame that I am 
> trying to work with and were generated with a function using lapply...
> 
> appreciate any thoughts you may have to make this functional...
> 
> thanks,
> 
> Andras
> 
> 
  


[R] list with list function

2019-02-04 Thread Andras Farkas via R-help
Hello everyone,

wonder if you would have a thought on a function for the following:


we have

a<-sample(seq(as.Date('1999/01/01'), as.Date('2000/01/01'), by="day"),5)
b<-sample(seq(as.Date('1999/01/01'), as.Date('2000/01/01'), by="day"), 4)
c<-sample(seq(as.Date('1999/01/01'), as.Date('2000/01/01'), by="day"), 3)

d<-c(1,3,5)
e<-c(1,4)
f<-c(1,2)

listA<-list(a,b,c)
listB<-list(d,e,f)


what I would like to do with a function (my real listA and listB can be of any 
length but always equal length, but their components like a,b,and c those can 
be unequal) as opposed to manually is to derive the following answer

listfinal<-list(a[-d],b[-e],c[-f])
listfinal


essentially the elements in listB serve as identifying the position of 
corresponding list element in listA and removing it from listA. 

these lists listA and listB in practice are columns of a data frame that I am 
trying to work with and were generated with a function using lapply...

appreciate any thoughts you may have to make this functional...

thanks,

Andras 



Re: [R] data frame transformation

2019-01-07 Thread Andras Farkas via R-help
Thanks Bert this will do...
Andras

Sent from Yahoo Mail on Android 
 
  On Sun, Jan 6, 2019 at 1:09 PM, Bert Gunter wrote:   
... and my reordering of column indices was unnecessary:
    merge(dat, d, all.y = TRUE)
will do.
Bert Gunter

"The trouble with having an open mind is that people keep coming along and 
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Jan 6, 2019 at 5:16 AM Andras Farkas via R-help  
wrote:

Hello Everyone,

would you be able to assist with some expertise on how to get the following 
done in a way that can be applied to a data set with different dimensions and 
without all the line items here?

we have:

id<-c(1,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5)#length of unique IDs may differ of 
#course in real data set, usually in magnitude of 1
letter<-c(sample(c("A","B","C","D","E"),3),sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),2),
          sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),4))#number of 
#unique "letters" is less than 4000 in real data set and there are no duplicates 
#within same ID
weight<-c(sample(c(1:30),3),sample(c(1:30),4),sample(c(1:30),2),
          sample(c(1:30),4),sample(c(1:30),4))#number of unique weights is 
#below 50 in real data set and there are no duplicates within same ID


data<-data.frame(id=id,letter=letter,weight=weight)

#goal is to get the following transformation where a column is added for each 
#unique letter and the weight is pulled into the column if the letter exists 
#within the ID, otherwise NA
#so we would get datatransfer like below but without the many steps described 
#here

datatransfer<-data.frame(data,apply(data[2],2,function(x) 
ifelse(x=="A",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) 
ifelse(x=="B",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) 
ifelse(x=="C",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) 
ifelse(x=="D",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) 
ifelse(x=="E",data$weight,NA)))

colnames(datatransfer)<-c("id","letter","weight","A","B","C","D","E")
much appreciate the help,

thanks

Andras 



[R] data frame transformation

2019-01-06 Thread Andras Farkas via R-help
Hello Everyone,

would you be able to assist with some expertise on how to get the following 
done in a way that can be applied to a data set with different dimensions and 
without all the line items here?

we have:

id<-c(1,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5)#length of unique IDs may differ of 
#course in real data set, usually in magnitude of 1
letter<-c(sample(c("A","B","C","D","E"),3),sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),2),
          sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),4))#number of 
#unique "letters" is less than 4000 in real data set and there are no duplicates 
#within same ID
weight<-c(sample(c(1:30),3),sample(c(1:30),4),sample(c(1:30),2),
          sample(c(1:30),4),sample(c(1:30),4))#number of unique weights is 
#below 50 in real data set and there are no duplicates within same ID


data<-data.frame(id=id,letter=letter,weight=weight)

#goal is to get the following transformation where a column is added for each 
#unique letter and the weight is pulled into the column if the letter exists 
#within the ID, otherwise NA
#so we would get datatransfer like below but without the many steps described 
#here

datatransfer<-data.frame(data,apply(data[2],2,function(x) 
ifelse(x=="A",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) 
ifelse(x=="B",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) 
ifelse(x=="C",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) 
ifelse(x=="D",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) 
ifelse(x=="E",data$weight,NA)))

colnames(datatransfer)<-c("id","letter","weight","A","B","C","D","E")
much appreciate the help,

thanks

Andras 
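
A minimal base R sketch of the transformation described above, written so that it does not depend on how many distinct letters there are. The toy data and the names lv and wide are hypothetical; this is one possible approach, not necessarily the one suggested on the list.

```r
# Small hypothetical data set with the same three columns as in the question
data <- data.frame(id     = c(1, 1, 1, 2, 2),
                   letter = c("A", "C", "E", "B", "C"),
                   weight = c(12, 5, 30, 7, 21))

# One column per distinct letter: the weight where the row carries that
# letter, NA everywhere else
lv <- sort(unique(as.character(data$letter)))
wide <- sapply(lv, function(l) ifelse(data$letter == l, data$weight, NA))

# Bind the new columns to the original data; columns are named after the letters
datatransfer <- data.frame(data, wide)
datatransfer
```

Because the column names come from lv, no manual colnames() step is needed, and letters that never occur in the data simply get no column.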



Re: [R] ddply (or other suitable solution) question

2018-09-14 Thread Andras Farkas via R-help
thank you all, Bert's idea will get it done... good question also re what if 1 
row: have a separate plan for that... Anyhow, finishing up Bert's lines with 
z<-lapply(ix, function(i)   df[i,])
lapply(z, function(x) split(x, rep(1:ceiling(nrow(x)/2), each=2)[1:nrow(x)]))


seems to do what I need,
thanks again...

Andras  

On Thursday, September 13, 2018, 5:16:54 PM EDT, Bert Gunter 
 wrote:  
 
 Modulo my earlier question, it seems that you just want to replicate all
interior rows (all but the first and last) within an id if there are more
than 2 rows. If this is incorrect, ignore the rest of this post.

Otherwise...

(I assume the data frame is listed in ID order, whatever that is)

set.seed(123.456)
df <-data.frame(ID=c(1,1,2,2,2,3,3,3,3,4,4,5,5),
                read=c(1,1,0,1,1,1,0,0,0,1,0,0,0),
                int=c(1,1,0,0,0,1,1,0,0,1,1,1,1),
                z=rnorm(13,1,5),
                y=rnorm(13,1,5))

yielded on my Mac and R version 3.5.1

> df
   ID read int          z           y
1   1    1   1 -1.8023782  1.55341358
2   1    1   1 -0.1508874 -1.77920567
3   2    0   0  8.7935416  9.93456568
4   2    1   0  1.3525420  3.48925239
5   2    1   0  1.6464387 -8.83308578
6   3    1   1  9.5753249  4.50677951
7   3    0   1  3.3045810 -1.36395704
8   3    0   0 -5.3253062 -4.33911853
9   3    0   0 -2.4342643 -0.08987457
10  4    1   1 -1.2283099 -4.13002224
11  4    0   1  7.1204090 -2.64445615
12  5    0   1  2.7990691 -2.12519634
13  5    0   1  3.0038573 -7.43346655

## The following doubles up the rows by ID
> ix <- tapply(seq_len(nrow(df)),df$ID,
+              function(x){
+                lenx <- length(x)
+                if(lenx > 2)
+                    c(x[1],rep(x[2]:x[lenx-1],e=2),x[lenx])
+                else x
+              }
+    )
> ix
$`1`
[1] 1 2

$`2`
[1] 3 4 4 5

$`3`
[1] 6 7 7 8 8 9

$`4`
[1] 10 11

$`5`
[1] 12 13

## now use the ix list to break up df:

> lapply(ix, function(i)df[i,])
$`1`
  ID read int          z         y
1  1    1   1 -1.8023782  1.553414
2  1    1   1 -0.1508874 -1.779206

$`2`
    ID read int        z         y
3    2    0   0 8.793542  9.934566
4    2    1   0 1.352542  3.489252
4.1  2    1   0 1.352542  3.489252
5    2    1   0 1.646439 -8.833086

$`3`
    ID read int         z           y
6    3    1   1  9.575325  4.50677951
7    3    0   1  3.304581 -1.36395704
7.1  3    0   1  3.304581 -1.36395704
8    3    0   0 -5.325306 -4.33911853
8.1  3    0   0 -5.325306 -4.33911853
9    3    0   0 -2.434264 -0.08987457

$`4`
   ID read int         z         y
10  4    1   1 -1.228310 -4.130022
11  4    0   1  7.120409 -2.644456

$`5`
   ID read int        z         y
12  5    0   1 2.799069 -2.125196
13  5    0   1 3.003857 -7.433467

I leave it to you to modify the lapply() function to break up each id
data frame into sublists of pairs if that is what you wish to do.
Assuming again that this is actually what you want.
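
The pairing step left open above can be sketched as follows. This is a self-contained toy version, assuming consecutive rows within an id should be grouped two at a time after the interior rows have been doubled; the column v and the name pairs are hypothetical.

```r
# Toy data (hypothetical): ids with 2, 3 and 4 rows
df <- data.frame(ID = c(1, 1, 2, 2, 2, 3, 3, 3, 3), v = 1:9)

# Double the interior row indices within each id, as in the code above
ix <- tapply(seq_len(nrow(df)), df$ID, function(x) {
  n <- length(x)
  if (n > 2) c(x[1], rep(x[2]:x[n - 1], each = 2), x[n]) else x
})

# For each id, cut the doubled rows into consecutive pairs
pairs <- lapply(ix, function(i) {
  block <- df[i, ]
  grp <- rep(seq_len(ceiling(nrow(block) / 2)), each = 2)[seq_len(nrow(block))]
  split(block, grp)
})
pairs[["3"]]
```

With this layout an id with k rows (k >= 2) yields k - 1 overlapping pairs; an id with a single row would come back as one unpaired block and needs its own handling.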

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Thu, Sep 13, 2018 at 1:40 PM Bert Gunter  wrote:
>
> What if there is only one read in the id?
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Thu, Sep 13, 2018 at 12:11 PM Andras Farkas via R-help
>  wrote:
> >
> > Dear All,
> >
> > I have data frame:
> > set.seed(123.456)
> > df <-data.frame(ID=c(1,1,2,2,2,3,3,3,3,4,4,5,5),
> >                read=c(1,1,0,1,1,1,0,0,0,1,0,0,0),
> >                int=c(1,1,0,0,0,1,1,0,0,1,1,1,1),
> >                z=rnorm(13,1,5),
> >                y=rnorm(13,1,5))
> >
> > what I would like to achieve (as best as I see it now) is to create 
> > multiple lists (and lists within lists using the data in df) that would be 
> > based on the groups in the ID column ("top level of list") and "join 
> > together" each line item within the group followed by the next line item 
> > ("bottom level list"), so would look like this for
> >
> > [[ID=1]]
> > [[1]][[1]]
> >  ID read int        z        y
> >  1    1  1 5.188935 5.107905
> >  1    1  1 1.766866 4.443201
> > [[ID=2]]
> > [[2]][[1]]
> >  ID read int        z        y
> >  2    0  0 -4.690685 3.7695883
> >  2    1  0  7.269075 0.6904414
> > [[ID=2]]
> > [[2]][[2]]
> >  ID read int        z          y
> >  2    1  0 7.269075  0.6904414
> >  2    1  0 3.132321 -0.5298133
> > [[ID=3]]
> > [[3]][[1]]
> >  ID read int          z        y
> >  3    1  1 -0.4753574 -0.902355
> >  3    0  1  5.4756283 -2.473535
> > [[ID=3]]
> > [[3]][[2]]
> >  3    0  1 5.475628 -2.47353489
> >  3    0  0 5.3

[R] ddply (or other suitable solution) question

2018-09-13 Thread Andras Farkas via R-help
Dear All,

I have data frame:
set.seed(123.456)
df <-data.frame(ID=c(1,1,2,2,2,3,3,3,3,4,4,5,5),
                read=c(1,1,0,1,1,1,0,0,0,1,0,0,0),
                int=c(1,1,0,0,0,1,1,0,0,1,1,1,1),
                z=rnorm(13,1,5),
                y=rnorm(13,1,5))

what I would like to achieve (as best as I see it now) is to create multiple 
lists (and lists within lists using the data in df) that would be based on the 
groups in the ID column ("top level of list") and "join together" each line 
item within the group followed by the next line item ("bottom level list"), so 
would look like this for 

[[ID=1]]
[[1]][[1]]
  ID read int        z        y
  1    1   1 5.188935 5.107905
  1    1   1 1.766866 4.443201
[[ID=2]]
[[2]][[1]]
  ID read int         z         y
  2    0   0 -4.690685 3.7695883
  2    1   0  7.269075 0.6904414
[[ID=2]]
[[2]][[2]]
  ID read int        z          y
  2    1   0 7.269075  0.6904414
  2    1   0 3.132321 -0.5298133
[[ID=3]]
[[3]][[1]]
  ID read int          z         y
  3    1   1 -0.4753574 -0.902355
  3    0   1  5.4756283 -2.473535
[[ID=3]]
[[3]][[2]]
  ID read int        z           y
  3    0   1 5.475628 -2.47353489
  3    0   0 5.390667 -0.03958639


hoping the example is clear enough... all your help is appreciated,

thanks,



Andras 



[R] kSamples ad.test question

2018-08-02 Thread Andras Farkas via R-help
Dear All,

once we run the following code, the results of the test will give us the 
expected, obvious result: the samples are from a common distribution...


library(kSamples)

u1 <- sample(rnorm(500,10,1),20,replace = TRUE)
u2 <- sample(rnorm(500,10,1),20,replace = TRUE)
u3 <- sample(rnorm(500,10,1),20,replace = TRUE)
u4 <- sample(rnorm(500,10,1),20,replace = TRUE)
u5 <- sample(rnorm(500,10,1),20,replace = TRUE)

ad.test(u1, u2, u3,u4,u5, method = "exact", dist = FALSE, Nsim = 1000)

next, if I change "u5" to:

u5 <- sample(rnorm(500,20,1),20,replace = TRUE)

the results of the test again give us what we expect, ie the samples are not 
from a common distribution. My question is: would you know of a way to 
automatically select out or identify "u5", the sample that is "responsible" 
for the results showing that the samples are not from a common distribution? 

much appreciate your help,


Andras 



[R] MICE data analysis with glmulti

2018-01-30 Thread Andras Farkas via R-help
Dear All,

wonder if you have some thoughts on running the with() function (and perhaps 
including the pool() function to get the results?) in glmulti? In other words, 
how to run glmulti with a data set that is produced by mice()?

publicly available code:

data <- airquality
data[4:10,3] <- rep(NA,7)
data[1:5,4] <- NA
data <- data[-c(5,6)]
library(mice)
library(glmulti)

the following line will impute the missing data:
tempData <- mice(data,m=5,maxit=50,meth='pmm',seed=500)

and the following 2 lines will run the regression on the mice output and pool 
the results to establish the final result of interest for the model specified...
modelFit1 <- with(tempData,glm(Temp~ Ozone+Solar.R+Wind))
summary(pool(modelFit1))


with glmulti I am trying to establish the "best" model by evaluating 
combinations of all predictors and interactions in different models and would 
like to force the variable "Ozone" into all models with the following code:

glm.redefined = function(formula, data, always="", ...) 
{glm(as.formula(paste(deparse(formula), always)), data=data, ...)}

then run glmulti:


output<-glmulti(with(tempData,Temp~Solar.R+Wind), 
                fitfunc=glm.redefined, 
                level=1, 
                crit=aic, 
                method="h", 
                always= "+Ozone")


which will obviously fail once you give it a try... any thoughts on how to 
identify the best model using glmulti in this fashion  that would fit the 
different combination of predictors with interactions on the mice() output of 
tempData?

much appreciate the help...

Andras 


[R] summary.rms help

2018-01-03 Thread Andras Farkas via R-help
Dear All,
using the example from the help of summary.rms

library(rms)
n <- 1000# define sample size 
set.seed(17) # so can reproduce the results 
age<- rnorm(n, 50, 10) 
blood.pressure <- rnorm(n, 120, 15) 
cholesterol<- rnorm(n, 200, 25) 
sex<- factor(sample(c('female','male'), n,TRUE)) 
label(age)<- 'Age'  # label is in Hmisc 
label(cholesterol)<- 'Total Cholesterol' 
label(blood.pressure) <- 'Systolic Blood Pressure' 
label(sex)<- 'Sex' 
units(cholesterol)<- 'mg/dl'   # uses units.default in Hmisc 
units(blood.pressure) <- 'mmHg' 
# Specify population model for log odds that Y=1 
L <- .4*(sex=='male') + .045*(age-50) + 
(log(cholesterol - 10)-5.2)*(-2*(sex=='female') + 2*(sex=='male')) 
# Simulate binary y to have Prob(y=1) = 1/[1+exp(-L)] 
y <- ifelse(runif(n) < plogis(L), 1, 0) 
ddist <- datadist(age, blood.pressure, cholesterol, sex) 
options(datadist='ddist') 
fit <- lrm(y ~ blood.pressure + sex * (age + rcs(cholesterol,4)))
s <- summary(fit) 
plot(s)
as you will see the plot will by default include the low and high values from 
the summary printed on the plot to the right of the variable name... Any 
thoughts on how printing these low and high values can be suppressed, ie: 
prevent them from being printed?

 
appreciate your help,
Andras



[R] effects package x axis labels

2017-11-10 Thread Andras Farkas via R-help
Dear All,

probably a simple enough solution but I don't seem to be able to get my head 
around it... example based on a publicly available data set:

mydata <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv;)
mylogit <- glm(admit ~ gre + gpa + rank, data = mydata, family = "binomial")
library(effects)
plot(allEffects(mylogit)
     ,axes=list(y=list(lab="Prob(xyz)"))
)

axes=list(y=list(lab="Prob(xyz)")) changes the y axis labels for all 3 plots... 
Any thoughts on how I could change the x axis labels to, let's say, 'black' (plot 
1), 'white' (plot 2) and 'green' (plot 3) for the 3 respective plots produced? 


appreciate the help...

Andras 
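
A possible approach, offered as a sketch rather than a confirmed answer from this thread: as far as I know, recent versions of the effects package let the x element of the axes argument hold a list named by predictor, each entry with its own lab. The data below are simulated stand-ins (hypothetical values, same column names as binary.csv) so the example does not depend on the download.

```r
library(effects)

# Simulated stand-in for binary.csv (hypothetical values, same column names)
set.seed(1)
mydata <- data.frame(
  gre  = round(rnorm(200, 580, 115)),
  gpa  = round(runif(200, 2.3, 4), 2),
  rank = sample(1:4, 200, replace = TRUE)
)
mydata$admit <- rbinom(200, 1,
  plogis(-3.5 + 0.002 * mydata$gre + 0.8 * mydata$gpa - 0.5 * mydata$rank))

mylogit <- glm(admit ~ gre + gpa + rank, data = mydata, family = "binomial")

# One entry per predictor; each entry carries the x axis label for that panel
x_axes <- list(
  gre  = list(lab = "black"),
  gpa  = list(lab = "white"),
  rank = list(lab = "green")
)

plot(allEffects(mylogit),
     axes = list(x = x_axes, y = list(lab = "Prob(xyz)")))
```

The names in x_axes have to match the predictor names in the model formula; the y label stays shared across the three panels.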


[R] bquote question

2017-08-18 Thread Andras Farkas via R-help
Dear All,

could you please provide input on the following:

plot(1:10,main=paste("\n   ","\nABCD","\n","\n","\n"),cex.main=1.3) 

a<-500 
b<-12 
mtext(bquote(bold(.(formatC(1.2*a,decimal.mark=",",digits=2,format="f")))~
" words "~bold(.(b))~" words"~"\n"~"\n")) 



as you will see from the sub-title, only the result of  
formatC(1.2*a,decimal.mark=",",digits=2,format="f") gets bolded, while the part 
bold(.(b)) does not seem to bold the value of 'b'... In addition, the spacing 
("\n") at the end of the mtext line also does not seem to get recognized. 
Any thoughts on what I may be doing wrong?

much appreciate your help...
 
Andras
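
A sketch of one likely explanation, not an answer taken from this thread: plotmath draws numeric constants in the plain font, so bold() only takes effect on character strings. formatC() returns a string, which is why the first value is bold while the numeric b is not. The plotmath help page also states that control characters such as "\n" are not interpreted. The version below assumes converting b with as.character() and positioning the text with the line argument of mtext() is acceptable.

```r
a <- 500
b <- 12

# as.character(b) turns the number into a string so that bold() applies to it
lab <- bquote(
  bold(.(formatC(1.2 * a, decimal.mark = ",", digits = 2, format = "f"))) ~
    " words " ~ bold(.(as.character(b))) ~ " words"
)

plot(1:10, main = "ABCD", cex.main = 1.3)
# 'line' controls the distance from the plot region instead of "\n" padding
mtext(lab, side = 3, line = 0.5)
```

If several lines of text are needed, separate mtext() calls with different line values are one option.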



Re: [R] data frame question

2017-08-06 Thread Andras Farkas via R-help
thank you both... assumption is in fact that a and b are always the same 
length... these work for me well...

much appreciate it... 
Andras 


On Sunday, August 6, 2017 12:14 PM, Ulrik Stervbo <ulrik.ster...@gmail.com> 
wrote:



Hi Andreas,

assuming that the increment is always indicated by the same value (in your 
example 0), this could work:

df$a <- cumsum(seq_along(df$b) %in% which(df$b == 0))

df

HTH,
Ulrik
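
The one-liner above is equivalent to a cumulative sum over a logical vector. A minimal sketch with the trigger value exposed as a parameter follows; the helper name relabel is hypothetical, and it assumes 'b' contains no NA values.

```r
df <- data.frame(a = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6, 7, 8),
                 b = c(0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 6, 7))

# Hypothetical helper: the counter goes up by one at every row where b equals switch_val
relabel <- function(b, switch_val = 0) cumsum(b == switch_val)

df$a <- relabel(df$b)                  # 0 is the trigger value
a_alt <- relabel(df$b, switch_val = 3) # 3 is the trigger value instead
```

Rows before the first occurrence of the trigger value get 0, so with switch_val = 3 the first three rows are labelled 0 rather than 1.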

On Sun, 6 Aug 2017 at 18:06 Bert Gunter <bgunter.4...@gmail.com> wrote:

>Your specification is a bit unclear to me, so I'm not sure the below
>is really what you want. For example, your example seems to imply that
>a and b must be of the same length, but I do not see that your
>description requires this. So the following may not be what you want
>exactly, but one way to do this(there may be cleverer ones!) is to
>make use of ?rep. Everything else is just fussy detail. (Your example
>suggests that you should also learn about ?seq. Both of these should
>be covered in any good R tutorial, which you should probably spend
>time with if you haven't already).
>
>Anyway...
>
>## WARNING: Not thoroughly tested! May (probably :-( ) contain bugs.
>
>f <- function(x, y, switch_val = 0)
>{
>   wh <- which(y == switch_val)
>   len <- length(wh)
>   len_x <- length(x)
>   if (!len) return(x)   # no switch value present: leave x unchanged
>   if (wh[1] == 1) {
>      if (len == 1) return(rep(x[1], len_x))
>      wh <- wh[-1]       # a switch value in row 1 does not start a new run
>      len <- len - 1
>   }
>   count <- c(wh[1] - 1, diff(wh))
>   if (wh[len] == len_x) count <- c(count, 1)
>   else count <- c(count, len_x - wh[len] + 1)
>   rep(x[seq_along(count)], times = count)
>}
>
>> a <- c(1:5,1:8)
>> b <- c(0:4,0:7)
>> f(a,b)
> [1] 1 1 1 1 1 2 2 2 2 2 2 2 2
>
>
>
>Bert Gunter
>
>"The trouble with having an open mind is that people keep coming along
>and sticking things into it."
>-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
>On Sun, Aug 6, 2017 at 4:10 AM, Andras Farkas via R-help
><r-help@r-project.org> wrote:
>> Dear All,
>>
>> wonder if you have thoughts on the following:
>>
>> let us say we have:
>>
>> df<-data.frame(a=c(1,2,3,4,5,1,2,3,4,5,6,7,8),b=c(0,1,2,3,4,0,1,2,3,4,5,6,7))
>>
>>
>>  I would like to rewrite values in column name "a" based on values in column 
>> name "b", where based on a certain value of column "b" the next value of 
>> column 'a' is prompted, in other words would like to have this as a result:
>>
>> df<-data.frame(a=c(1,1,1,1,1,2,2,2,2,2,2,2,2),b=c(0,1,2,3,4,0,1,2,3,4,5,6,7))
>>
>>
>> where at the value of 0 in column 'b' the number in column a changes from 1 
>> to 2. From the first zero value of column 'b' and until the next zero in 
>> column 'b' the numbers would not change in 'a', ie: they are all 1 in my 
>> example... then from 2 it would change to 3 again as 'b' will have zero 
>> again in a row, and so on.. Would be grateful for a solution that would 
>> allow me to set the values (from 'b') that determine how the values get 
>> established in 'a' (ie: lets say instead of 0 I would want 3 being the value 
>> where 1 changes to 2 in 'a') and that would be flexible to take into account 
>> that the number of rows and the number of time 0 shows up in a row in column 
>> 'b' may vary...
>>
>> much appreciate your thoughts..
>>
>> Andras
>>
>
>
>



[R] data frame question

2017-08-06 Thread Andras Farkas via R-help
Dear All,

wonder if you have thoughts on the following: 

let us say we have:

df<-data.frame(a=c(1,2,3,4,5,1,2,3,4,5,6,7,8),b=c(0,1,2,3,4,0,1,2,3,4,5,6,7))


 I would like to rewrite values in column name "a" based on values in column 
name "b", where based on a certain value of column "b" the next value of column 
'a' is prompted, in other words would like to have this as a result:

df<-data.frame(a=c(1,1,1,1,1,2,2,2,2,2,2,2,2),b=c(0,1,2,3,4,0,1,2,3,4,5,6,7)) 


where at each 0 in column 'b' the number in column 'a' increases by one. From the 
first zero in column 'b' until the next zero in column 'b' the numbers in 'a' 
do not change, i.e. they are all 1 in my example; at the second zero 'a' changes 
from 1 to 2, at a third zero it would change from 2 to 3, and so on. I would be 
grateful for a solution that lets me set the value in 'b' that triggers the 
change in 'a' (e.g. let's say 3 instead of 0 being the value where 1 changes 
to 2 in 'a') and that is flexible enough to handle a varying number of rows 
and a varying number of times that value shows up in column 'b'.

much appreciate your thoughts..

Andras



Re: [R] select from data frame

2017-07-16 Thread Andras Farkas via R-help
thank you David and Bert, these solutions will work for me... Andras  

On Saturday, July 15, 2017 6:05 PM, Bert Gunter <bgunter.4...@gmail.com> 
wrote:
 

 ...
and here is a slightly cleaner and more transparent way of doing the
same thing (setdiff() does the matching)

> with(df, setdiff(ID,ID[samples %in% c("B","C") ]))
[1] 3

-- Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Jul 15, 2017 at 9:23 AM, Bert Gunter <bgunter.4...@gmail.com> wrote:
> If I understand correctly, no looping (ave(), for()) or type casting
> (as.character()) is needed -- indexing and matching suffice:
>
>> with(df, ID[!ID %in% unique(ID[samples %in% c("B","C") ])])
> [1] 3 3
>
>
>
> Cheers,
>
> Bert
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Sat, Jul 15, 2017 at 8:54 AM, David Winsemius <dwinsem...@comcast.net> 
> wrote:
>>
>>> On Jul 15, 2017, at 4:01 AM, Andras Farkas via R-help 
>>> <r-help@r-project.org> wrote:
>>>
>>> Dear All,
>>>
>>> wonder if you could please assist with the following
>>>
>>> df<-data.frame(ID=c(1,1,1,2,2,3,3,4,4,5,5),samples=c("A","B","C","A","C","A","D","C","B","A","C"))
>>>
>>> from this data frame the goal is to extract the value of 3 from the ID 
>>> column based on the logic that the ID=3 in the data frame has NO row that 
>>> would pair 3 with either "B", AND/OR "C" in the samples column...
>>>
>>
>> This returns a vector that determines if either of those characters are in 
>> the character values of that factor column you created. Coercing to 
>> character is needed because leaving samples as a factor generated an invalid 
>> factor level warning and gave useless results.
>>
>>  with( df, ave( as.character(samples), ID, FUN=function(x) {!any(x %in% 
>>c("B","C"))}))
>>  [1] "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "TRUE"  "TRUE"  "FALSE" "FALSE"
>> [10] "FALSE" "FALSE"
>>
>> You can then use it to extract and consolidate to a single value (although 
>> wrapping with as.logical was needed because `ave` returned character class 
>> values):
>>
>>  unique( df$ID[ as.logical(  # fails without this since "FALSE" != FALSE
>>                    with( df,
>>                        ave( as.character(samples), ID, FUN=function(x) 
>>{!any(x %in% c("B","C"))})))
>>              ] )
>> #[1] 3
>>
>> The same sort of logic could also be constructed with a for-loop:
>>
>>> for (x in unique(df$ID) ) { if ( !any( df$samples[df$ID==x] %in% 
>>> c("B","C")) ) print(x) }
>> [1] 3
>>
>> Although you are warned that for-loops do not return values and you might 
>> need to make an assignment rather than just printing.
>>
>> --
>>
>> David Winsemius
>> Alameda, CA, USA
>>


[R] select from data frame

2017-07-15 Thread Andras Farkas via R-help
Dear All,

wonder if you could please assist with the following 

df<-data.frame(ID=c(1,1,1,2,2,3,3,4,4,5,5),samples=c("A","B","C","A","C","A","D","C","B","A","C"))

from this data frame the goal is to extract the value of 3 from the ID column 
based on the logic that the ID=3 in the data frame has NO row that would pair 3 
with either "B", AND/OR "C" in the samples column...


much appreciate your help...

thanks,
 Andras
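
The selection logic described above ("keep an ID only if none of its rows has B or C") can be sketched with a grouped any(). This is one possible formulation, not the method settled on in the thread; the names exclude, has_excluded and keep_ids are illustrative.

```r
df <- data.frame(ID = c(1, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
                 samples = c("A", "B", "C", "A", "C", "A", "D", "C", "B", "A", "C"))

exclude <- c("B", "C")

# For each ID: does any of its rows carry an excluded sample?
has_excluded <- tapply(df$samples %in% exclude, df$ID, any)

# IDs with no excluded sample at all
keep_ids <- as.numeric(names(has_excluded)[!has_excluded])
keep_ids
```

%in% compares values, so this works whether samples is stored as character or as a factor.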



Re: [R] "reverse" quantile function

2017-06-16 Thread Andras Farkas via R-help
Peter,
thanks, very nice, this will work for me... could you also help with setting up 
the code to run the one-liner "approx(sort(x), seq(0,1,,length(x)), q)$y" on the 
rows of a data frame using my example above? So if I cbind z and res, 
df<-cbind(z,res)

the "x" in your one-liner would be the first 4 column values of each row and 
"q" is the last (fifth) column value of each row..
thanks again for all the help, Andras Farkas 

On Friday, June 16, 2017 4:58 AM, peter dalgaard <pda...@gmail.com> wrote:
 

 It would depend on which one of the 9 quantile definitions you are using. The 
discontinuous ones aren't invertible, and the continuous ones won't be either, 
if there are ties in the data. 

This said, it should just be a matter of setting up the inverse of a piecewise 
linear function. To set ideas, try 

x <- rnorm(5)
curve(quantile(x,p), xname="p")

The breakpoints for the default quantiles are n points evenly spread on [0,1], 
including the endpoints; i.e., for n=5, (0, .25, .5, .75, 1) 

So:

x <- rnorm(5)
br <- seq(0, 1, ,5)
qq <- quantile(x, br) ## actually == sort(x)

pfun <- approxfun(qq, br)
(q <- quantile(x, .1234))
pfun(q)


There are variations, e.g. the one-liner

approx(sort(x), seq(0,1,,length(x)), q)$y

-pd
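
To make the point about quantile definitions concrete, here is a small sketch with a hypothetical four-value vector. It compares ecdf() with the interpolation one-liner when the quantile was computed with the default definition (type 7).

```r
x <- c(2, 4, 6, 8)   # hypothetical data without ties

# Default (type 7) quantile: interpolates linearly between order statistics
q <- unname(quantile(x, 0.3))

# ecdf() is a step function, so it does not undo the interpolation
p_ecdf <- ecdf(x)(q)

# Inverting the piecewise linear function recovers the probability
p_interp <- approx(sort(x), seq(0, 1, length.out = length(x)), q)$y

c(q = q, p_ecdf = p_ecdf, p_interp = p_interp)
```

Here q is about 3.8, ecdf() returns 0.25 (one of four values is at or below q), and the interpolation returns about 0.3, the probability that was used.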


> On 16 Jun 2017, at 01:56 , Andras Farkas via R-help <r-help@r-project.org> 
> wrote:
> 
> David,
> 
> thanks for the response. In your response the quantile function (if I see 
> correctly)  runs on the columns versus I need to run it on the rows, which is 
> an easy fix, but that is not exactly what I had in mind... essentially we can 
> remove t() from my original code to make "res" look like this:
> 
> res<-apply(z, 1, quantile, probs=c(0.3))
> 
> but after all maybe I did not explain myself clearly enough so let me try 
> again: the known variables to us in what I am trying to do are the data frame 
> "z":
> 
> t<-seq(0,24,1) 
> a<-10*exp(-0.05*t) 
> b<-10*exp(-0.07*t) 
> c<-10*exp(-0.1*t) 
> d<-10*exp(-0.03*t) 
> 
> z<-data.frame(a,b,c,d)
> 
> and the vector "res":
> 
> res<-c(10.00,  9.296382,  8.642955,  8.036076 ,7.472374,  6.948723,  
> 6.462233,  6.010223 ,5.590211 
> 
> ,5.199896 ,4.837147,  4.499989 ,4.186589,  3.895250 ,3.624397,  3.372570,  
> 3.138415,  2.920675 
> , 2.718185 ,2.529864 ,2.354708,  2.191786,  2.040233,  1.899247,  1.768084)
> 
> and I need to find the probability (probs) , the unknown value, which would 
> result in creating "res", ie: the probs=c(0.3), from: 
> res<-apply(z, 1, quantile, probs=c(0.3))... 
> 
> 
> a more simplified example assuming :
> 
> k<-c(1:100)
> f<-30
> ecdf(k)(f)
> 
> would give us the value of 0.3... so same idea as this, but instead of "k" we 
> have data frame "z", and instead of "f" we have "res", and need to find the 
> value of 0.3... Does that make sense?
> 
> much appreciate the help...
> 
> Andras Farkas, 
> 
> 
> On Thursday, June 15, 2017 6:46 PM, David Winsemius <dwinsem...@comcast.net> 
> wrote:
> 
> 
> 
> 
>> On Jun 15, 2017, at 12:37 PM, Andras Farkas via R-help 
>> <r-help@r-project.org> wrote:
>> 
>> Dear All,
>> 
>> we have:
>> 
>> t<-seq(0,24,1) 
>> a<-10*exp(-0.05*t) 
>> b<-10*exp(-0.07*t) 
>> c<-10*exp(-0.1*t) 
>> d<-10*exp(-0.03*t) 
>> z<-data.frame(a,b,c,d) 
>> 
>> res<-t(apply(z, 1, quantile, probs=c(0.3))) 
>> 
>> 
>> 
>> my goal is to do a "reverse" of the function here that produces "res" on a 
>> data frame, ie: to get the answer 0.3 back for the percentile location when 
>> I have "res" available to me... For a single vector this would be done using 
>> ecdf something like this:
>> 
>> x <- rnorm(100) 
>> #then I know this value:  
>> quantile(x,0.33) 
>> #so do this step
>> ecdf(x)(quantile(x,0.33)) 
>> #to get 0.33 back...
>> 
>> any suggestions on how I could do that for a data frame?
> 
> Can't you just use ecdf and quantile ecdf?
> 
> # See ?ecdf page for both functions
> 
>> lapply( lapply(z, ecdf), quantile, 0.33)
> $a
>    33% 
> 4.475758 
> 
> $b
>    33% 
> 3.245151 
> 
> $c
>    33% 
> 2.003595 
> 
> 
> $d
>    33% 
> 6.173204 
> -- 
> 
> David Winsemius
> Alameda, CA, USA
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] "reverse" quantile function

2017-06-16 Thread Andras Farkas via R-help
Never mind, I think I figured it out:

df<-cbind(z,res)

apply(df,1,function(x) approx(sort(x[1:4]), seq(0,1,,length(x[1:4])), x[5])$y) 
thanks again for the help
 
Andras Farkas, 
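
A self-contained sketch of the row-wise version, with one caveat handled explicitly: at t = 0 all four columns equal 10, and, as noted earlier in the thread, the function is not invertible when there are ties. The helper inv_quantile is hypothetical and returns NA for such rows instead of calling approx().

```r
t <- seq(0, 24, 1)
z <- data.frame(a = 10 * exp(-0.05 * t), b = 10 * exp(-0.07 * t),
                c = 10 * exp(-0.10 * t), d = 10 * exp(-0.03 * t))
res <- apply(z, 1, quantile, probs = 0.3)   # default (type 7) quantiles per row

# Hypothetical helper: invert the type 7 quantile for one row
inv_quantile <- function(x, q) {
  if (anyDuplicated(x) > 0) return(NA_real_)   # ties: no unique inverse
  approx(sort(x), seq(0, 1, length.out = length(x)), q)$y
}

zm <- as.matrix(z)
p <- vapply(seq_len(nrow(zm)),
            function(i) inv_quantile(zm[i, ], res[[i]]),
            numeric(1))
round(p, 4)
```

The first element is NA because of the tie at t = 0; the remaining rows should come back as approximately 0.3.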


On Friday, June 16, 2017 5:34 AM, Andras Farkas via R-help 
<r-help@r-project.org> wrote:




Peter, 

thanks, very nice, this will work for me... could you also help with setting up 
the code to run the one-liner "approx(sort(x), seq(0,1,,length(x)), q)$y" on the 
rows of a data frame using my example above? So if I cbind z and res, 

df<-cbind(z,res) 

the "x" in your one-liner would be the first 4 column values of each row and 
"q" is the last (fifth) column value of each row.. 

thanks again for all the help, 

Andras Farkas



Re: [R] "reverse" quantile function

2017-06-16 Thread Andras Farkas via R-help

Peter, 

thanks, very nice, this will work for me... could you also help with setting up 
the code to run the one-liner "approx(sort(x), seq(0,1,,length(x)), q)$y" on the 
rows of a data frame using my example above? So if I cbind z and res, 

df<-cbind(z,res) 

the "x" in your one-liner would be the first 4 column values of each row and 
"q" is the last (fifth) column value of each row.. 

thanks again for all the help, 

Andras Farkas



Re: [R] "reverse" quantile function

2017-06-15 Thread Andras Farkas via R-help
David,

thanks for the response. In your response the quantile function (if I see 
correctly)  runs on the columns versus I need to run it on the rows, which is 
an easy fix, but that is not exactly what I had in mind... essentially we can 
remove t() from my original code to make "res" look like this:

 res<-apply(z, 1, quantile, probs=c(0.3))

but after all maybe I did not explain myself clearly enough so let me try again: 
the known variables to us in what I am trying to do are the data frame "z":

 t<-seq(0,24,1) 
 a<-10*exp(-0.05*t) 
 b<-10*exp(-0.07*t) 
 c<-10*exp(-0.1*t) 
d<-10*exp(-0.03*t) 

z<-data.frame(a,b,c,d)

and the vector "res":

res<-c(10.00,  9.296382,  8.642955,  8.036076 ,7.472374,  6.948723,  
6.462233,  6.010223 ,5.590211 

,5.199896 ,4.837147,  4.499989 ,4.186589,  3.895250 ,3.624397,  3.372570,  
3.138415,  2.920675 
, 2.718185 ,2.529864 ,2.354708,  2.191786,  2.040233,  1.899247,  1.768084)

and I need to find the probability (probs) , the unknown value, which would 
result in creating "res", ie: the probs=c(0.3), from: 
res<-apply(z, 1, quantile, probs=c(0.3))... 


a more simplified example assuming :

k<-c(1:100)
f<-30
ecdf(k)(f)

would give us the value of 0.3... so same idea as this, but instead of "k" we 
have data frame "z", and instead of "f" we have "res", and need to find the 
value of 0.3... Does that make sense?

much appreciate the help...
  
Andras Farkas, 


On Thursday, June 15, 2017 6:46 PM, David Winsemius <dwinsem...@comcast.net> 
wrote:




> On Jun 15, 2017, at 12:37 PM, Andras Farkas via R-help <r-help@r-project.org> 
> wrote:
> 
> Dear All,
> 
> we have:
> 
> t<-seq(0,24,1) 
> a<-10*exp(-0.05*t) 
> b<-10*exp(-0.07*t) 
> c<-10*exp(-0.1*t) 
> d<-10*exp(-0.03*t) 
> z<-data.frame(a,b,c,d) 
> 
> res<-t(apply(z, 1, quantile, probs=c(0.3))) 
> 
> 
> 
> my goal is to do a "reverse" of the function here that produces "res" on a 
> data frame, ie: to get the answer 0.3 back for the percentile location when I 
> have "res" available to me... For a single vector this would be done using 
> ecdf something like this:
> 
> x <- rnorm(100) 
> #then I know this value:  
> quantile(x,0.33) 
> #so do this step
> ecdf(x)(quantile(x,0.33)) 
> #to get 0.33 back...
> 
> any suggestions on how I could do that for a data frame?

Can't you just use ecdf and quantile ecdf?

# See ?ecdf page for both functions

> lapply( lapply(z, ecdf), quantile, 0.33)
$a
 33% 
4.475758 

$b
 33% 
3.245151 

$c
 33% 
2.003595 


$d
 33% 
6.173204 
-- 

David Winsemius
Alameda, CA, USA



[R] "reverse" quantile function

2017-06-15 Thread Andras Farkas via R-help
Dear All,

we have:

t<-seq(0,24,1) 
a<-10*exp(-0.05*t) 
b<-10*exp(-0.07*t) 
c<-10*exp(-0.1*t) 
d<-10*exp(-0.03*t) 
z<-data.frame(a,b,c,d) 

res<-t(apply(z, 1, quantile, probs=c(0.3))) 



my goal is to do a "reverse" of the function here that produces "res" on a data 
frame, ie: to get the answer 0.3 back for the percentile location when I have 
"res" available to me... For a single vector this would be done using ecdf 
something like this:

x <- rnorm(100) 
#then I know this value:  
quantile(x,0.33) 
#so do this step
ecdf(x)(quantile(x,0.33)) 
#to get 0.33 back...

any suggestions on how I could do that for a data frame?

thank you, Andras Farkas



[R] Prophet package

2017-03-19 Thread Andras Farkas via R-help
Dear All


wonder if you could assist with the following

we have:

library(prophet) 

library(dplyr) 


abc<-c(0.3684693,0.4938679, 0.4429201,0.452598,0.4301452,0.4315169, 
0.447026,0.496179,0.4045693,0.398533, 
0.355,0.431079,0.4063136,0.4120126,0.5210375,0.402897,0.4466131,0.5005669,0.5014164,0.5042271,0.5498575,0.6014215,0.4415863,0.4377443,0.4316092,0.4156757,0.3517915,0.3669508,0.3899471,0.3964143,0.4001074,0.3851003,0.4222451,0.375324,0.3652045,0.3376978
 
,0.383012,0.3763665,0.3550609,0.2958678,0.3726571,0.3442298 
#,0.3403275,0.2973978 
#, 0.4,0.4,0.4, 0.4,0.4,0.4, 0.4,0.4,0.4 

) 

df<-data.frame(ds = seq(as.Date('2013-08-01'), as.Date('2017-01-01'), by = 
'm'),abc) 
names(df)<-c("ds","y") 


m<-prophet(df,yearly.seasonality = TRUE) 
future <- make_future_dataframe(m, periods = 730) 
forecast <- predict(m, future) 
plot(m, forecast)
points(x=as.Date('2017-02-01'),y=0.5)

results in error message :


Error in plot.xy(xy.coords(x, y), type = type, ...) : 
plot.new has not been called yet

would you have a solution to plot the point on the plot?

appreciate the help,

Andras Farkas,
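
A likely cause, stated as an assumption rather than a confirmed diagnosis: the plot method for prophet objects appears to return a ggplot2 object, and ggplot2 draws with grid, so base graphics calls such as points() find no open base plot. If that holds, the extra point can be added as another layer. The sketch below uses a small stand-in ggplot (hypothetical data) in place of plot(m, forecast) so that it runs without prophet installed.

```r
library(ggplot2)

# Stand-in for the object returned by plot(m, forecast); values are hypothetical
demo <- data.frame(
  ds = seq(as.Date("2016-09-01"), as.Date("2017-01-01"), by = "month"),
  y  = c(0.383, 0.376, 0.355, 0.296, 0.373)
)
p <- ggplot(demo, aes(ds, y)) + geom_line()

# The extra point lives in its own data frame and is added as a new layer
extra <- data.frame(ds = as.Date("2017-02-01"), y = 0.5)
p2 <- p + geom_point(data = extra, aes(ds, y), inherit.aes = FALSE,
                     colour = "red", size = 3)
print(p2)
```

With a real prophet plot, p would be plot(m, forecast). Depending on the prophet version the ds axis may be a date-time rather than a Date, in which case extra$ds would need as.POSIXct() instead of as.Date().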



[R] extend lines of prediction interval ggplot2

2017-03-14 Thread Andras Farkas via R-help
Dear All, 

 would you have some thoughts on how to extend the prediction interval lines to 
beyond the "range of data"? 


example: 

y <-c(0.4316092,0.4156757,0.3517915,0.3669508,0.3899471,0.3964143, 
0.4001074,0.3851003,0.4222451,0.375324,0.3652045,0.3376978,0.383012, 
0.3763665,0.3550609,0.2958678,0.3726571,0.3442298 
#,0.3403275,0.2973978 
)*100 
x <-seq(1,length(y),1) 

z<-c("07/01/2015","08/01/2015","09/01/2015","10/01/2015","11/01/2015", 
"12/01/2015","01/01/2016","02/01/2016","03/01/2016","04/01/2016","05/01/2016", 
"06/01/2016","07/01/2016","08/01/2016","09/01/2016","10/01/2016","11/01/2016", 
"12/01/2016","01/01/2017","02/01/2017") 

fit <-lm(y~x) 

temp_var <- predict(fit, interval="prediction") 

new_df <- data.frame(cbind(x,y, temp_var)) 
#new_df$x<-factor(new_df$x, ordered = T) 

library(ggplot2) 
ggplot(new_df, aes(x,y))+ 
geom_point() + 
theme(panel.background = element_rect(fill = 'white', colour = 'black'))+ 
geom_line(aes(y=lwr), color = "black", linetype = "dashed",size=0.75)+ 
geom_line(aes(y=upr), color = "black", linetype = "dashed",size=0.75)+ 
scale_x_discrete(limits=z)+ 
theme(axis.text.x = element_text(angle = 45, hjust = 1))+ 
theme(panel.grid.major=element_line(colour = "grey"))+ 
lims(y=c(0,50))+ 
geom_smooth(method=lm, se=TRUE,fullrange=TRUE,fill="darkgrey",col="black")+
labs(title = paste("Adj R2 = ",signif(summary(fit)$adj.r.squared, 4), 
"Intercept =",signif(fit$coef[[1]],4 ), 
" Slope =",signif(fit$coef[[2]], 4) 
# " P =",signif(summary(fit)$coef[2,4], 3) 
))+ 
ggtitle("Consumption Over Time") + 
theme(plot.title = element_text(hjust = 0.5))+ 
labs(y="y",x="x")+ 
geom_point(shape=15,aes(x=c(7),y=new_df[,2][7]), color="black",cex=4)+ 
geom_point(shape=15,aes(x=c(8),y=new_df[,2][8]), color="black",cex=4)+ 
geom_point(shape=17,aes(x=c(19),y=0.3403275*100), color="black",cex=4)+ 
geom_point(shape=17,aes(x=c(20),y=0.2973978*100), color="black",cex=4) 


as you will see, the regression line and confidence interval are extended, but I 
would also want to extend the prediction interval lines to the "same length"... 
Wonder if you have any insights into this question... 


appreciate the help, 


Andras Farkas
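
One way to get there, sketched under the assumption that the interval should cover x = 1 to 20: predict() without newdata only returns intervals for the observed x values, so the dashed lines stop at x = 18. Supplying newdata over the full range and drawing the lines from that data frame extends them. The names new_x and pred are illustrative, and the plot is reduced to the layers that matter here.

```r
library(ggplot2)

y <- c(0.4316092, 0.4156757, 0.3517915, 0.3669508, 0.3899471, 0.3964143,
       0.4001074, 0.3851003, 0.4222451, 0.375324, 0.3652045, 0.3376978,
       0.383012, 0.3763665, 0.3550609, 0.2958678, 0.3726571, 0.3442298) * 100
x <- seq_along(y)
fit <- lm(y ~ x)

# Prediction interval over the full axis range, not only the observed x values
new_x <- data.frame(x = 1:20)
pred <- cbind(new_x, predict(fit, newdata = new_x, interval = "prediction"))

p <- ggplot(data.frame(x, y), aes(x, y)) +
  geom_point() +
  geom_smooth(method = lm, formula = y ~ x, se = TRUE, fullrange = TRUE,
              fill = "darkgrey", col = "black") +
  geom_line(data = pred, aes(y = lwr), linetype = "dashed") +
  geom_line(data = pred, aes(y = upr), linetype = "dashed") +
  xlim(1, 20)
print(p)
```

The xlim(1, 20) call sets the range that fullrange = TRUE extends the regression line to, so both sets of lines end at the same x value.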
