Re: [R] dataframe calculations based on certain values of a column

2014-03-27 Thread johannesradin...@gmail.com
Thanks, your solution using ave() works perfectly.
/johannes

-Original Message-
From: Bert Gunter 
To: Johannes Radinger 
Cc: R help 
Sent: Wednesday, 26 March 2014 16:45:43 GMT+00:00
Subject: Re: [R] dataframe calculations based on certain values of a column

I believe this will generalize. But check carefully!

Using your example (Excellent!), use ave():

with(df,ave(seq_along(var1),var2,FUN=function(i)
var3[i]/var3[i][var1[i]=="c"]))

[1] 0.500 1.000 1.000 0.833 0.333 1.000 1.750
[8] 1.000 1.000

This is kind of a low level brute force approach. Others may have more
elegant approaches.

-- Bert


Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
H. Gilbert Welch




On Wed, Mar 26, 2014 at 9:09 AM, Johannes Radinger wrote:
> Hi,
>
> I have data in a dataframe with the following structure:
> var1 <- c("a","b","c","a","b","c","a","b","c")
> var2 <- c("X","X","X","Y","Y","Y","Z","Z","Z")
> var3 <- c(1,2,2,5,2,6,7,4,4)
> df <- data.frame(var1,var2,var3)
>
> Now I'd like to calculate relative values of var3. These values
> should be relative to the base value (where var1 == "c"), which is
> given for each group (var2).
>
> To illustrate what my result column should look like, I divide
> the column var3 by the vector c(2,2,2,6,6,6,4,4,4) (= for each group
> of var2, the value at "c").
>
> Of course this can also be done like this:
> df$div <- rep(df$var3[df$var1=="c"],each=length(unique(df$var1)))
> df$result_calc <- df$var3/df$div
>
>
> However, what if the dataframe is not as simple and not as well ordered
> as in the example here? For example, there is always a value "c" for each
> group, but all the "c"s are clumped in the last rows of the dataframe or
> scattered in a random manner. Is there a simple way to still calculate such
> relative values? Probably with an approach using apply, but maybe someone
> can give me a hint. Or do I need to sort my dataframe in order to do such
> calculations?
>
> best,
>
> /Johannes
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] dataframe calculations based on certain values of a column

2014-03-26 Thread Noah Marconi

dplyr's group_by and mutate can create those columns for you:

var1 <- c("a","b","c","a","b","c","a","b","c")
var2 <- c("X","X","X","Y","Y","Y","Z","Z","Z")
var3 <- c(1,2,2,5,2,6,7,4,4)
df <- data.frame(var1,var2,var3)


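library(dplyr)

dt <- as_tibble(df)    # written as tbl_df(df) in the 2014 post

# %>% replaces the %.% chain operator used by early dplyr releases
dt %>%
  group_by(var2) %>%
  mutate(
    div = var3[var1 == "c"],     # base value of the group
    result_calc = var3 / div
  )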
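
For readers without dplyr, the same per-group lookup can be sketched in base R. This is an alternative that nobody proposed in the thread: it uses match() to find each row's base value, and the shuffled copy of the example (`shuffled`, with a fixed, made-up row order) is only there to show that no sorting is needed. It assumes exactly one "c" row per group.

```r
var1 <- c("a","b","c","a","b","c","a","b","c")
var2 <- c("X","X","X","Y","Y","Y","Z","Z","Z")
var3 <- c(1,2,2,5,2,6,7,4,4)
df <- data.frame(var1, var2, var3, stringsAsFactors = FALSE)

# Hypothetical unordered version of the example (fixed permutation of rows)
shuffled <- df[c(9, 1, 5, 3, 7, 2, 6, 4, 8), ]

# One base row per group: the rows where var1 is "c"
base <- shuffled[shuffled$var1 == "c", ]

# For every row, look up the base value of its own group
shuffled$div <- base$var3[match(shuffled$var2, base$var2)]
shuffled$result_calc <- shuffled$var3 / shuffled$div
```

Because match() works on the group labels rather than on row positions, the result does not depend on where the "c" rows sit in the data frame.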


On 2014-03-26 12:09, Johannes Radinger wrote:

Hi,

I have data in a dataframe with the following structure:
var1 <- c("a","b","c","a","b","c","a","b","c")
var2 <- c("X","X","X","Y","Y","Y","Z","Z","Z")
var3 <- c(1,2,2,5,2,6,7,4,4)
df <- data.frame(var1,var2,var3)

Now I'd like to calculate relative values of var3. These values
should be relative to the base value (where var1 == "c"), which is
given for each group (var2).

To illustrate what my result column should look like, I divide
the column var3 by the vector c(2,2,2,6,6,6,4,4,4) (= for each group
of var2, the value at "c").

Of course this can also be done like this:
df$div <- rep(df$var3[df$var1=="c"],each=length(unique(df$var1)))
df$result_calc <- df$var3/df$div


However, what if the dataframe is not as simple and not as well ordered
as in the example here? For example, there is always a value "c" for each
group, but all the "c"s are clumped in the last rows of the dataframe or
scattered in a random manner. Is there a simple way to still calculate such
relative values? Probably with an approach using apply, but maybe someone
can give me a hint. Or do I need to sort my dataframe in order to do such
calculations?

best,

/Johannes



Re: [R] dataframe calculations based on certain values of a column

2014-03-26 Thread Berend Hasselman

On 26-03-2014, at 17:09, Johannes Radinger  wrote:

> Hi,
> 
> I have data in a dataframe with the following structure:
> var1 <- c("a","b","c","a","b","c","a","b","c")
> var2 <- c("X","X","X","Y","Y","Y","Z","Z","Z")
> var3 <- c(1,2,2,5,2,6,7,4,4)
> df <- data.frame(var1,var2,var3)
>
> Now I'd like to calculate relative values of var3. These values
> should be relative to the base value (where var1 == "c"), which is
> given for each group (var2).
>
> To illustrate what my result column should look like, I divide
> the column var3 by the vector c(2,2,2,6,6,6,4,4,4) (= for each group
> of var2, the value at "c").
>
> Of course this can also be done like this:
> df$div <- rep(df$var3[df$var1=="c"],each=length(unique(df$var1)))
> df$result_calc <- df$var3/df$div
>
>
> However, what if the dataframe is not as simple and not as well ordered
> as in the example here? For example, there is always a value "c" for each
> group, but all the "c"s are clumped in the last rows of the dataframe or
> scattered in a random manner. Is there a simple way to still calculate such
> relative values? Probably with an approach using apply, but maybe someone
> can give me a hint. Or do I need to sort my dataframe in order to do such
> calculations?


Split the data.frame into a list of groups defined by column var2 and
perform the calculation you need on each group, like this:

df <- data.frame(var1,var2,var3, stringsAsFactors=FALSE)
L <- by(df, list(df$var2), FUN=function(x) {
  k <- which(x$var1=="c")      # row of the base value within this group
  x$rel <- x$var3/x$var3[k]    # scale the group by its base value
  x
})


And then convert the list L back to a data.frame.

See the following two stackoverflow pages for the various ways this can be done.

http://stackoverflow.com/questions/4227223/r-list-to-data-frame
http://stackoverflow.com/questions/4512465/what-is-the-most-efficient-way-to-cast-a-list-as-a-data-frame?rq=1

Two methods from the first page:

data.frame(Reduce(rbind,L))

library (plyr)
ldply (L, data.frame)

and one method from the second page:

do.call(rbind,L)
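
The split, calculate and recombine steps described above can be sketched end to end as follows. Note that this sketch swaps by() for split() plus lapply(), and adds unsplit() as one way to get the rows back in their original order; the shuffled input `shuffled` is a made-up reordering of the example, and exactly one "c" row per group is assumed.

```r
var1 <- c("a","b","c","a","b","c","a","b","c")
var2 <- c("X","X","X","Y","Y","Y","Z","Z","Z")
var3 <- c(1,2,2,5,2,6,7,4,4)
df <- data.frame(var1, var2, var3, stringsAsFactors = FALSE)

# Hypothetical unordered version of the example (fixed permutation of rows)
shuffled <- df[c(9, 1, 5, 3, 7, 2, 6, 4, 8), ]

# 1. split into one data frame per group
pieces <- split(shuffled, shuffled$var2)

# 2. add the relative value inside each group
pieces <- lapply(pieces, function(x) {
  x$rel <- x$var3 / x$var3[x$var1 == "c"]
  x
})

# 3a. stack the groups: rows come out grouped by var2
combined <- do.call(rbind, pieces)

# 3b. or put every row back where it was in the input
restored <- unsplit(pieces, shuffled$var2)
```

do.call(rbind, ...) is enough when the row order does not matter; unsplit() is the counterpart of split() for cases where the result has to line up with the original data frame.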

Berend


Re: [R] dataframe calculations based on certain values of a column

2014-03-26 Thread Bert Gunter
I believe this will generalize. But check carefully!

Using your example (Excellent!), use ave():

with(df,ave(seq_along(var1),var2,FUN=function(i)
var3[i]/var3[i][var1[i]=="c"]))

[1] 0.500 1.000 1.000 0.833 0.333 1.000 1.750
[8] 1.000 1.000

This is kind of a low level brute force approach. Others may have more
elegant approaches.
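
One way to act on the "check carefully" advice is to run the same ave() call on a reordered copy of the example. In the sketch below, `df2` is a made-up version of the data with all "c" rows moved to the end; everything else is the call from this message. It relies on each group having exactly one "c" row.

```r
var1 <- c("a","b","c","a","b","c","a","b","c")
var2 <- c("X","X","X","Y","Y","Y","Z","Z","Z")
var3 <- c(1,2,2,5,2,6,7,4,4)
df <- data.frame(var1, var2, var3)

# Hypothetical reordering: all "c" rows clumped in the last rows
df2 <- df[c(1, 2, 4, 5, 7, 8, 3, 6, 9), ]

# ave() hands FUN the row indices of one var2 group at a time;
# FUN divides that group's var3 by the var3 of its "c" row
rel <- with(df2, ave(seq_along(var1), var2, FUN = function(i)
  var3[i] / var3[i][var1[i] == "c"]))

# Divisors worked out by hand for the reordered rows
expected <- df2$var3 / c(2, 2, 6, 6, 4, 4, 2, 6, 4)
all.equal(rel, expected)
```

Since ave() returns its result in the row order of the input, no sorting is needed before or after the call.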

-- Bert


Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
H. Gilbert Welch




On Wed, Mar 26, 2014 at 9:09 AM, Johannes Radinger wrote:
> Hi,
>
> I have data in a dataframe with the following structure:
> var1 <- c("a","b","c","a","b","c","a","b","c")
> var2 <- c("X","X","X","Y","Y","Y","Z","Z","Z")
> var3 <- c(1,2,2,5,2,6,7,4,4)
> df <- data.frame(var1,var2,var3)
>
> Now I'd like to calculate relative values of var3. These values
> should be relative to the base value (where var1 == "c"), which is
> given for each group (var2).
>
> To illustrate what my result column should look like, I divide
> the column var3 by the vector c(2,2,2,6,6,6,4,4,4) (= for each group
> of var2, the value at "c").
>
> Of course this can also be done like this:
> df$div <- rep(df$var3[df$var1=="c"],each=length(unique(df$var1)))
> df$result_calc <- df$var3/df$div
>
>
> However, what if the dataframe is not as simple and not as well ordered
> as in the example here? For example, there is always a value "c" for each
> group, but all the "c"s are clumped in the last rows of the dataframe or
> scattered in a random manner. Is there a simple way to still calculate such
> relative values? Probably with an approach using apply, but maybe someone
> can give me a hint. Or do I need to sort my dataframe in order to do such
> calculations?
>
> best,
>
> /Johannes
>


[R] dataframe calculations based on certain values of a column

2014-03-26 Thread Johannes Radinger
Hi,

I have data in a dataframe with the following structure:
var1 <- c("a","b","c","a","b","c","a","b","c")
var2 <- c("X","X","X","Y","Y","Y","Z","Z","Z")
var3 <- c(1,2,2,5,2,6,7,4,4)
df <- data.frame(var1,var2,var3)

Now I'd like to calculate relative values of var3. These values
should be relative to the base value (where var1 == "c"), which is
given for each group (var2).

To illustrate what my result column should look like, I divide
the column var3 by the vector c(2,2,2,6,6,6,4,4,4) (= for each group
of var2, the value at "c").

Of course this can also be done like this:
df$div <- rep(df$var3[df$var1=="c"],each=length(unique(df$var1)))
df$result_calc <- df$var3/df$div


However, what if the dataframe is not as simple and not as well ordered
as in the example here? For example, there is always a value "c" for each
group, but all the "c"s are clumped in the last rows of the dataframe or
scattered in a random manner. Is there a simple way to still calculate such
relative values? Probably with an approach using apply, but maybe someone
can give me a hint. Or do I need to sort my dataframe in order to do such
calculations?

best,

/Johannes



Re: [R] Dataframe calculations

2010-03-22 Thread Petr PIKAL
Hi

If I understand correctly, you want to add the wait and travel times to the
first arrival time for each block of data within one day.

test<-SCHEDULE2

test$ARRIVE[test$ARRIVE==0]<-NA
library(zoo)
test$ARRIVE<-na.locf(test$ARRIVE)
datumA<-paste(paste(test$MM, test$DD, test$YEAR, sep="."), test$ARRIVE, 
sep=" ")
datumA<-strptime(datumA, format="%m.%d.%Y %H:%M:%S")


w<-cumsum(test$WAIT[1:4]*60)
tr<-cumsum(test$TRAVEL[1:4]*60)
arrivals <- datumA[1:4]+w+tr
departures <- datumA[1:4]+w+c(0,tr[1:3])

Now you can either write a loop that picks the appropriate values
from your data frame, or look at a split/lapply/sapply solution. I
would try a loop with an index like this:

idx<-seq(1,316,4)

# create the result columns as date-times before filling them
test$ARRIVALS <- test$DEPARTURES <- as.POSIXct(datumA)

for (i in idx) {

rows <- i:(i+3)   # four rows per day; i:i+4 would parse as (i:i)+4
wi <- cumsum(test$WAIT[rows]*60)
tri <- cumsum(test$TRAVEL[rows]*60)
arrivals <- datumA[rows]+wi+tri
departures <- datumA[rows]+wi+c(0,tri[1:3])
test$ARRIVALS[(i+1):(i+3)] <- arrivals[1:3]
test$DEPARTURES[rows] <- departures
}

untested

Regards
Petr



r-help-boun...@r-project.org napsal dne 19.03.2010 18:58:09:

> Unfortunately, that did not correct the problem. Times for 'ARRIVE' need
> to be either 07:00:00 or 14:30:00 for the first case of each unique 'MM'
> by 'DD' subgroup (the others will be calculated), and the code produces
> calculations that I can't interpret from the fixed numbers. Also, 'ARRIVE'
> and 'DEPART' incorrectly have the same value for the first case of each
> unique 'MM' by 'DD' subgroup. 'DEPART' should equal 'ARRIVE' plus the
> 'WAIT' time in minutes of the same line.
> 
> Thank you,
> 
> Mike
> 
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Erich Neuwirth
> Sent: Friday, March 19, 2010 1:33 PM
> To: r-help@r-project.org
> Subject: Re: [R] Dataframe calculations
> 
> Sorry,
> Oddly I got the use of odds and evens the wrong way round.
> 
> addDelays <- function(arriveTime,waitVec,travelVec){
>   start<-as.POSIXct(arriveTime,format="%H:%M:%S")
>   delays<-as.vector(t(cbind(waitVec,travelVec)))
>   newtimes<-format(start+cumsum(delays)*60,format="%H:%M:%S")
>   list(departs=c(arriveTime,(evens(newtimes))[-1]),
>arrives=odds(newtimes))
> }
> 
> Using the new definition of addDelays above should do the trick.
> 
> 
> 
> On 3/19/2010 5:30 PM, Hosack, Michael wrote:
> > Erich,
> >
> > Thank you so much for the effort you put into writing this code.
> >  I ran it and then assigned the two variables you created to the
> > 'ARRIVE' and 'DEPART' variables of my dataframe as you directed and
> > the resultant calculations were incorrect. I am not sure why it did
> > not work, I do not yet grasp the coding, I am still a novice.
> > Perhaps you or someone else could rerun your code on my original
> > dataframe and see why it did not yield the correct results.
> >
> > Thank you,
> >
> > Mike
> >
> > -Original Message-
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Erich Neuwirth
> > Sent: Friday, March 19, 2010 11:38 AM
> > To: r-help@r-project.org
> > Subject: Re: [R] Dataframe calculations
> >
> > with the following code
> >
> > newvars()$ARRIVALS and newvars()$DEPARTURES
> > will give you the new variables you need.
> >
> >
> > -=-=-=
> >
> >
> > addDelays <- function(arriveTime,waitVec,travelVec){
> >   start<-as.POSIXct(arriveTime,format="%H:%M:%S")
> >   delays<-as.vector(t(cbind(waitVec,travelVec)))
> >   newtimes<-format(start+cumsum(delays)*60,format="%H:%M:%S")
> >   list(departs=c(arriveTime,(odds(newtimes))[-1]),
> >arrives=evens(newtimes))
> > }
> >
> > odds <- function(inVec){
> >   indvec<-0:(floor((length(inVec)-1)/2))
> >   inVec[2*indvec+1]
> > }
> >
> > evens <- function(inVec){
> >   odds(inVec[-1])
> > }
> >
> >
> > newvars <- function(){
> >   DATE<-with(SCHEDULE2,paste(YEAR,MM,DD,sep=""))
> >   starts<-as.list(with(SCHEDULE2,tapply(ARRIVE,DATE,function(x)x[1])))
> >   waits<-with(SCHEDULE2,tapply(WAIT,DATE,function(x)x))
> >   travels<-with(SCHEDULE2,tapply(TRAVEL,DATE,function(x)x))
> >   list(DEPARTURES=
> >
> > 
as.vector(mapply(function(...)addDelays(...)$departs,starts,waits,travels)),
> > ARRIVALS=
> >
> > 
as.vector(mapply(function(...)addDelays(...)$arrives,start

Re: [R] Dataframe calculations

2010-03-19 Thread jim holtman
try this:

# add 'date' to separate the data
SCHEDULE2 <- within(SCHEDULE2, {
    date <- paste(YEAR, '-', MM, '-', DD, sep='')
    ARRIVE <- as.POSIXct(paste(date, ARRIVE))
    DEPART <- as.POSIXct(paste(date, DEPART))
})
# process each day
result <- lapply(split(SCHEDULE2, SCHEDULE2$date), function(.day){
    # assume first line is complete; convert to POSIXct
    for (i in 2:nrow(.day)){
        .day$ARRIVE[i] <- .day$DEPART[i - 1L] + (.day$TRAVEL[i - 1L] * 60)
        .day$DEPART[i] <- .day$ARRIVE[i] + (.day$WAIT[i] * 60)
    }
    # return the changes
    .day
})
SCHEDULE2 <- do.call(rbind, result)
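
The same recurrence can also be written without the inner loop, since each ARRIVE is the day's first arrival plus all earlier WAIT and TRAVEL minutes. The following is only a sketch on made-up data: `toy`, `first_arrive`, `fill_day` and `as_clock` are hypothetical names, and times are held as minutes after midnight so that no time-zone handling is involved.

```r
# Hypothetical toy schedule (not the thread's SCHEDULE2): two days, three stops each
toy <- data.frame(
  date   = rep(c("2010-05-02", "2010-05-03"), each = 3),
  WAIT   = c(30, 20, 10, 15, 25, 5),    # minutes spent at each stop
  TRAVEL = c(10, 15, 0, 20, 10, 0),     # minutes to the next stop
  stringsAsFactors = FALSE
)
# Assumed first arrival of each day, in minutes after midnight (07:00 and 14:30)
first_arrive <- c("2010-05-02" = 7 * 60, "2010-05-03" = 14 * 60 + 30)

fill_day <- function(day, start) {
  n <- nrow(day)
  # minutes elapsed before reaching row i: all earlier waits and travel legs
  elapsed <- c(0, cumsum(day$WAIT + day$TRAVEL)[-n])
  day$ARRIVE <- start + elapsed
  day$DEPART <- day$ARRIVE + day$WAIT
  day
}

pieces <- lapply(split(toy, toy$date),
                 function(day) fill_day(day, first_arrive[[day$date[1]]]))
out <- do.call(rbind, pieces)

# format minutes after midnight as HH:MM:SS
as_clock <- function(m) {
  sprintf("%02d:%02d:00", as.integer(m %/% 60), as.integer(m %% 60))
}
out$ARRIVE <- as_clock(out$ARRIVE)
out$DEPART <- as_clock(out$DEPART)
```

On the first toy day this gives arrivals 07:00:00, 07:40:00, 08:15:00 and departures 07:30:00, 08:00:00, 08:25:00, which matches stepping through the loop by hand.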


On Fri, Mar 19, 2010 at 9:05 AM, Hosack, Michael wrote:

> Hi everyone,
>
> My question will probably seem simple to most of you, but I
> have spent many hours trying to solve it. I need to perform
> a series of sequential calculations on my dataframe that move
> across rows and down columns, and then repeat themselves at
> each unique 'MM' by 'DD' grouping. Specifically, I want to add
> 'DEPART' time (24 hr time) to 'TRAVEL'(minutes) in line 1 and
> put the result in 'ARRIVE' (24 hr time) of line 2, then I want
> to add 'WAIT' (minutes) to that 'ARRIVE' time of line 2 to
> create 'DEPART', which will then be combined with 'TRAVEL'
> (minutes) to yield the 'ARRIVE' time of line 3, etc. This
> series of calc's will start anew beginning at each unique 'MM'
> by 'DD' grouping. Any advice would be greatly appreciated.
>
> Thank you,
>
> Mike
>
> SCHEDULE2 <-
> structure(list(MM = c("05", "05", "05", "05", "05", "05", "05",
> "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05",
> "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05",
> "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05",
> "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05",
> "05", "05", "05", "05", "05", "06", "06", "06", "06", "06", "06",
> "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06",
> "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06",
> "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06",
> "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06",
> "06", "06", "07", "07", "07", "07", "07", "07", "07", "07", "07",
> "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07",
> "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07",
> "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07",
> "07", "07", "07", "07", "07", "07", "08", "08", "08", "08", "08",
> "08", "08", "08", "08", "08", "08", "08", "08", "08", "08", "08",
> "08", "08", "08", "08", "08", "08", "08", "08", "08", "08", "08",
> "08", "08", "08", "08", "08", "08", "08", "08", "08", "08", "08",
> "08", "08", "08", "08", "08", "08", "08", "08", "08", "08", "08",
> "08", "08", "08", "08", "08", "08", "08", "09", "09", "09", "09",
> "09", "09", "09", "09", "09", "09", "09", "09", "09", "09", "09",
> "09", "09", "09", "09", "09", "09", "09", "09", "09", "09", "09",
> "09", "09", "09", "09", "09", "09", "09", "09", "09", "09", "09",
> "09", "09", "09", "09", "09", "09", "09", "09", "09", "09", "09",
> "09", "09", "09", "09", "10", "10", "10", "10", "10", "10", "10",
> "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10",
> "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10",
> "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10",
> "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10",
> "10"), DD = c("02", "02", "02", "02", "03", "03", "03", "03",
> "06", "06", "06", "06", "09", "09", "09", "09", "10", "10", "10",
> "10", "14", "14", "14", "14", "16", "16", "16", "16", "17", "17",
> "17", "17", "19", "19", "19", "19", "22", "22", "22", "22", "24",
> "24", "24", "24", "27", "27", "27", "27", "29", "29", "29", "29",
> "31", "31", "31", "31", "04", "04", "04", "04", "06", "06", "06",
> "06", "07", "07", "07", "07", "10", "10", "10", "10", "12", "12",
> "12", "12", "16", "16", "16", "16", "17", "17", "17", "17", "19",
> "19", "19", "19", "22", "22", "22", "22", "23", "23", "23", "23",
> "27", "27", "27", "27", "28", "28", "28", "28", "29", "29", "29",
> "29", "03", "03", "03", "03", "05", "05", "05", "05", "09", "09",
> "09", "09", "10", "10", "10", "10", "13", "13", "13", "13", "14",
> "14", "14", "14", "18", "18", "18", "18", "22", "22", "22", "22",
> "23", "23", "23", "23", "24", "24", "24", "24", "27", "27", "27",
> "27", "28", "28", "28", "28", "01", "01", "01", "01", "04", "04",
> "04", "04", "06", "06", "06", "06", "07", "07", "07", "07", "12",
> "12", "12", "12", "13", "13", "13", "13", "14", "14", "14", "14",
> "16", "16", "16", "16", "19", "19", "19", "19", "21", "21", "21",
> "21", "23", "23", "23", "23", "24", "24", "24", "24", "28", "28",
> "28", "28", "31", "31", "31", "31", "02", "02", "02", "02", "04",
> "04", "04", "04", "08", "08", "08", "08", "09", "09", "09", "09",
> "11", "11", "11", "11", "14", "14", "14", "14", "16", "16", "16",
> "16", "19", "19", "19", "19", "20", "20", "20", "20", "

Re: [R] Dataframe calculations

2010-03-19 Thread Hosack, Michael
Unfortunately, that did not correct the problem. Times for 'ARRIVE' need to be 
either 07:00:00 or 14:30:00 for the first case of each unique 'MM' by 'DD' 
subgroup (the others will be calculated), and the code produces calculations 
that I can't interpret from the fixed numbers. Also, 'ARRIVE' and 'DEPART' 
incorrectly have the same value for the first case of each unique 'MM' by 'DD' 
subgroup. 'DEPART' should equal 'ARRIVE' plus the 'WAIT' time in minutes of the 
same line.
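
For illustration, here is a sketch of the interleaving idea with the two outputs assigned according to the rule stated above (the first 'ARRIVE' of a day is fixed, and 'DEPART' is 'ARRIVE' plus 'WAIT' on the same line). `add_delays` is a hypothetical helper written for this example, not code posted in the thread, and the wait and travel values are made up. It works in UTC to avoid daylight-saving surprises.

```r
# Hypothetical helper: one day's first arrival plus per-row WAIT and TRAVEL minutes
add_delays <- function(arrive_time, wait, travel) {
  n <- length(wait)
  start <- as.POSIXct(arrive_time, format = "%H:%M:%S", tz = "UTC")
  # interleave the delays: wait1, travel1, wait2, travel2, ...
  delays <- as.vector(rbind(wait, travel))
  times <- format(start + cumsum(delays) * 60, format = "%H:%M:%S", tz = "UTC")
  departs <- times[seq(1, 2 * n, by = 2)]        # clock after each wait
  next_arrivals <- times[seq(2, 2 * n, by = 2)]  # clock after each travel leg
  # row 1 keeps the fixed arrival; the last travel leg leads past the final row
  list(arrives = c(arrive_time, next_arrivals[-n]), departs = departs)
}

res <- add_delays("07:00:00", wait = c(30, 20, 10), travel = c(10, 15, 0))
res$arrives   # "07:00:00" "07:40:00" "08:15:00"
res$departs   # "07:30:00" "08:00:00" "08:25:00"
```

Each departure is the arrival on the same line plus that line's wait, and each later arrival is the previous departure plus the previous travel time.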

Thank you,

Mike

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Erich Neuwirth
Sent: Friday, March 19, 2010 1:33 PM
To: r-help@r-project.org
Subject: Re: [R] Dataframe calculations

Sorry,
Oddly I got the use of odds and evens the wrong way round.

addDelays <- function(arriveTime,waitVec,travelVec){
  start<-as.POSIXct(arriveTime,format="%H:%M:%S")
  delays<-as.vector(t(cbind(waitVec,travelVec)))
  newtimes<-format(start+cumsum(delays)*60,format="%H:%M:%S")
  list(departs=c(arriveTime,(evens(newtimes))[-1]),
   arrives=odds(newtimes))
}

Using the new definition of addDelays above should do the trick.



On 3/19/2010 5:30 PM, Hosack, Michael wrote:
> Erich,
>
> Thank you so much for the effort you put into writing this code.
>  I ran it and then assigned the two variables you created to the
> 'ARRIVE' and 'DEPART' variables of my dataframe as you directed and
> the resultant calculations were incorrect. I am not sure why it did
> not work, I do not yet grasp the coding, I am still a novice.
> Perhaps you or someone else could rerun your code on my original
> dataframe and see why it did not yield the correct results.
>
> Thank you,
>
> Mike
>
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of Erich Neuwirth
> Sent: Friday, March 19, 2010 11:38 AM
> To: r-help@r-project.org
> Subject: Re: [R] Dataframe calculations
>
> with the following code
>
> newvars()$ARRIVALS and newvars()$DEPARTURES
> will give you the new variables you need.
>
>
> -=-=-=
>
>
> addDelays <- function(arriveTime,waitVec,travelVec){
>   start<-as.POSIXct(arriveTime,format="%H:%M:%S")
>   delays<-as.vector(t(cbind(waitVec,travelVec)))
>   newtimes<-format(start+cumsum(delays)*60,format="%H:%M:%S")
>   list(departs=c(arriveTime,(odds(newtimes))[-1]),
>arrives=evens(newtimes))
> }
>
> odds <- function(inVec){
>   indvec<-0:(floor((length(inVec)-1)/2))
>   inVec[2*indvec+1]
> }
>
> evens <- function(inVec){
>   odds(inVec[-1])
> }
>
>
> newvars <- function(){
>   DATE<-with(SCHEDULE2,paste(YEAR,MM,DD,sep=""))
>   starts<-as.list(with(SCHEDULE2,tapply(ARRIVE,DATE,function(x)x[1])))
>   waits<-with(SCHEDULE2,tapply(WAIT,DATE,function(x)x))
>   travels<-with(SCHEDULE2,tapply(TRAVEL,DATE,function(x)x))
>   list(DEPARTURES=
>
> as.vector(mapply(function(...)addDelays(...)$departs,starts,waits,travels)),
> ARRIVALS=
>
> as.vector(mapply(function(...)addDelays(...)$arrives,starts,waits,travels)))
> }
>
>
>
> SCHEDULE2 <-
> structure(list(MM = c("05", "05", "05", "05", "05", "05", "05", "05", "05", 
> "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", 
> "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", 
> "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", 
> "05", "05", "05", "05", "05", "05", "05", "05", "06", "06", "06", "06", "06", 
> "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", 
> "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", 
> "06", "06", "06", "06"

Re: [R] Dataframe calculations

2010-03-19 Thread Erich Neuwirth
Sorry,
Oddly I got the use of odds and evens the wrong way round.

addDelays <- function(arriveTime,waitVec,travelVec){
  start<-as.POSIXct(arriveTime,format="%H:%M:%S")
  delays<-as.vector(t(cbind(waitVec,travelVec)))
  newtimes<-format(start+cumsum(delays)*60,format="%H:%M:%S")
  list(departs=c(arriveTime,(evens(newtimes))[-1]),
   arrives=odds(newtimes))
}

Using the new definition of addDelays above should do the trick.



On 3/19/2010 5:30 PM, Hosack, Michael wrote:
> Erich,
> 
> Thank you so much for the effort you put into writing this code.
>  I ran it and then assigned the two variables you created to the
> 'ARRIVE' and 'DEPART' variables of my dataframe as you directed and
> the resultant calculations were incorrect. I am not sure why it did
> not work, I do not yet grasp the coding, I am still a novice.
> Perhaps you or someone else could rerun your code on my original
> dataframe and see why it did not yield the correct results.
> 
> Thank you,
> 
> Mike
> 
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of Erich Neuwirth
> Sent: Friday, March 19, 2010 11:38 AM
> To: r-help@r-project.org
> Subject: Re: [R] Dataframe calculations
> 
> with the following code
> 
> newvars()$ARRIVALS and newvars()$DEPARTURES
> will give you the new variables you need.
> 
> 
> -=-=-=
> 
> 
> addDelays <- function(arriveTime,waitVec,travelVec){
>   start<-as.POSIXct(arriveTime,format="%H:%M:%S")
>   delays<-as.vector(t(cbind(waitVec,travelVec)))
>   newtimes<-format(start+cumsum(delays)*60,format="%H:%M:%S")
>   list(departs=c(arriveTime,(odds(newtimes))[-1]),
>arrives=evens(newtimes))
> }
> 
> odds <- function(inVec){
>   indvec<-0:(floor((length(inVec)-1)/2))
>   inVec[2*indvec+1]
> }
> 
> evens <- function(inVec){
>   odds(inVec[-1])
> }
> 
> 
> newvars <- function(){
>   DATE<-with(SCHEDULE2,paste(YEAR,MM,DD,sep=""))
>   starts<-as.list(with(SCHEDULE2,tapply(ARRIVE,DATE,function(x)x[1])))
>   waits<-with(SCHEDULE2,tapply(WAIT,DATE,function(x)x))
>   travels<-with(SCHEDULE2,tapply(TRAVEL,DATE,function(x)x))
>   list(DEPARTURES=
> 
> as.vector(mapply(function(...)addDelays(...)$departs,starts,waits,travels)),
> ARRIVALS=
> 
> as.vector(mapply(function(...)addDelays(...)$arrives,starts,waits,travels)))
> }
> 
> 
> 
> SCHEDULE2 <-
> structure(list(MM = c("05", "05", "05", "05", "05", "05", "05", "05", "05", 
> "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", 
> "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", 
> "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", 
> "05", "05", "05", "05", "05", "05", "05", "05", "06", "06", "06", "06", "06", 
> "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", 
> "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", 
> "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", 
> "06", "06", "06", "06", "06", "06", "06", "06", "07", "07", "07", "07", "07", 
> "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", 
> "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", 
> "07", "

Re: [R] Dataframe calculations

2010-03-19 Thread Hosack, Michael
Erich,

Thank you so much for the effort you put into writing this code.
I ran it and then assigned the two variables you created to the
'ARRIVE' and 'DEPART' variables of my dataframe as you directed, and
the resultant calculations were incorrect. I am not sure why it did
not work; I do not yet grasp the coding, as I am still a novice.
Perhaps you or someone else could rerun your code on my original
dataframe and see why it did not yield the correct results.

Thank you,

Mike

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Erich Neuwirth
Sent: Friday, March 19, 2010 11:38 AM
To: r-help@r-project.org
Subject: Re: [R] Dataframe calculations

with the following code

newvars()$ARRIVALS and newvars()$DEPARTURES
will give you the new variables you need.


-=-=-=


addDelays <- function(arriveTime,waitVec,travelVec){
  start<-as.POSIXct(arriveTime,format="%H:%M:%S")
  delays<-as.vector(t(cbind(waitVec,travelVec)))
  newtimes<-format(start+cumsum(delays)*60,format="%H:%M:%S")
  list(departs=c(arriveTime,(odds(newtimes))[-1]),
   arrives=evens(newtimes))
}

odds <- function(inVec){
  indvec<-0:(floor((length(inVec)-1)/2))
  inVec[2*indvec+1]
}

evens <- function(inVec){
  odds(inVec[-1])
}


newvars <- function(){
  DATE<-with(SCHEDULE2,paste(YEAR,MM,DD,sep=""))
  starts<-as.list(with(SCHEDULE2,tapply(ARRIVE,DATE,function(x)x[1])))
  waits<-with(SCHEDULE2,tapply(WAIT,DATE,function(x)x))
  travels<-with(SCHEDULE2,tapply(TRAVEL,DATE,function(x)x))
  list(DEPARTURES=

as.vector(mapply(function(...)addDelays(...)$departs,starts,waits,travels)),
ARRIVALS=

as.vector(mapply(function(...)addDelays(...)$arrives,starts,waits,travels)))
}



SCHEDULE2 <-
structure(list(MM = c("05", "05", "05", "05", "05", "05", "05", "05", "05", 
"05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", 
"05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", 
"05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", 
"05", "05", "05", "05", "05", "05", "05", "05", "06", "06", "06", "06", "06", 
"06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", 
"06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", 
"06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", 
"06", "06", "06", "06", "06", "06", "06", "06", "07", "07", "07", "07", "07", 
"07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", 
"07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", 
"07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07", 
"07", "07", "07", "07", "08", "08", "08", "08", "08", "!
 08", "08", "08", "08", "08", "08", "08", "08", "08", "08", "08", "08", "08", 
"08", "08", "08", "08", "08", "08", "08", "08", "08", "08", "08", "08", "08", 
"08", "08", "08", "08", "08", "08", "08", "08", "0

Re: [R] Dataframe calculations

2010-03-19 Thread Erich Neuwirth
With the following code,

newvars()$ARRIVALS and newvars()$DEPARTURES
will give you the new variables you need.


-=-=-=


addDelays <- function(arriveTime,waitVec,travelVec){
  start<-as.POSIXct(arriveTime,format="%H:%M:%S")
  delays<-as.vector(t(cbind(waitVec,travelVec)))
  newtimes<-format(start+cumsum(delays)*60,format="%H:%M:%S")
  list(departs=c(arriveTime,(odds(newtimes))[-1]),
   arrives=evens(newtimes))
}

odds <- function(inVec){
  indvec<-0:(floor((length(inVec)-1)/2))
  inVec[2*indvec+1]
}

evens <- function(inVec){
  odds(inVec[-1])
}


newvars <- function(){
  DATE<-with(SCHEDULE2,paste(YEAR,MM,DD,sep=""))
  starts<-as.list(with(SCHEDULE2,tapply(ARRIVE,DATE,function(x)x[1])))
  waits<-with(SCHEDULE2,tapply(WAIT,DATE,function(x)x))
  travels<-with(SCHEDULE2,tapply(TRAVEL,DATE,function(x)x))
  list(DEPARTURES=
         as.vector(mapply(function(...)addDelays(...)$departs,starts,waits,travels)),
       ARRIVALS=
         as.vector(mapply(function(...)addDelays(...)$arrives,starts,waits,travels)))
}



-- 
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459



[R] Dataframe calculations

2010-03-19 Thread Hosack, Michael
Hi everyone,

My question will probably seem simple to most of you, but I
have spent many hours trying to solve it. I need to perform
a series of sequential calculations on my dataframe that move
across rows and down columns, and then repeat themselves at
each unique 'MM' by 'DD' grouping. Specifically, I want to add
'DEPART' time (24 hr time) to 'TRAVEL' (minutes) in line 1 and
put the result in 'ARRIVE' (24 hr time) of line 2. Then I want
to add 'WAIT' (minutes) to that 'ARRIVE' time of line 2 to
create 'DEPART', which will then be combined with 'TRAVEL'
(minutes) to yield the 'ARRIVE' time of line 3, etc. This
series of calculations will start anew at each unique 'MM'
by 'DD' grouping. Any advice would be greatly appreciated.

Thank you,

Mike
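
The "start anew at each grouping" part of the question can be sketched
with a grouped running sum. This is a minimal illustration under
stated assumptions, not a solution fitted to SCHEDULE2: the toy data
frame, its DATE column and the two *_MIN result columns are
hypothetical, and times are kept as minutes elapsed since each day's
first arrival rather than as clock times.

```r
# Hypothetical toy frame: two days with two stops each (not SCHEDULE2).
toy <- data.frame(
  DATE   = c("20100502", "20100502", "20100503", "20100503"),
  WAIT   = c(10, 20, 5, 15),    # minutes spent at the stop
  TRAVEL = c(30, 25, 10, 20)    # minutes to reach the next stop
)

# Running total of wait + travel, restarted within each DATE by ave().
elapsed <- ave(toy$WAIT + toy$TRAVEL, toy$DATE, FUN = cumsum)

# Minutes after the day's first arrival at which the next stop is reached
toy$NEXT_ARRIVE_MIN <- elapsed
# Minutes after the day's first arrival at which this stop is left
toy$DEPART_MIN <- elapsed - toy$TRAVEL
```

Adding these minute offsets to each day's first arrival time would
then give clock times.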

SCHEDULE2 <-
structure(list(MM = c("05", "05", "05", "05", "05", "05", "05",
"05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05",
"05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05",
"05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05",
"05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05",
"05", "05", "05", "05", "05", "06", "06", "06", "06", "06", "06",
"06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06",
"06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06",
"06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06",
"06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "06",
"06", "06", "07", "07", "07", "07", "07", "07", "07", "07", "07",
"07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07",
"07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07",
"07", "07", "07", "07", "07", "07", "07", "07", "07", "07", "07",
"07", "07", "07", "07", "07", "07", "08", "08", "08", "08", "08",
"08", "08", "08", "08", "08", "08", "08", "08", "08", "08", "08",
"08", "08", "08", "08", "08", "08", "08", "08", "08", "08", "08",
"08", "08", "08", "08", "08", "08", "08", "08", "08", "08", "08",
"08", "08", "08", "08", "08", "08", "08", "08", "08", "08", "08",
"08", "08", "08", "08", "08", "08", "08", "09", "09", "09", "09",
"09", "09", "09", "09", "09", "09", "09", "09", "09", "09", "09",
"09", "09", "09", "09", "09", "09", "09", "09", "09", "09", "09",
"09", "09", "09", "09", "09", "09", "09", "09", "09", "09", "09",
"09", "09", "09", "09", "09", "09", "09", "09", "09", "09", "09",
"09", "09", "09", "09", "10", "10", "10", "10", "10", "10", "10",
"10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10",
"10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10",
"10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10",
"10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10",
"10"), DD = c("02", "02", "02", "02", "03", "03", "03", "03",
"06", "06", "06", "06", "09", "09", "09", "09", "10", "10", "10",
"10", "14", "14", "14", "14", "16", "16", "16", "16", "17", "17",
"17", "17", "19", "19", "19", "19", "22", "22", "22", "22", "24",
"24", "24", "24", "27", "27", "27", "27", "29", "29", "29", "29",
"31", "31", "31", "31", "04", "04", "04", "04", "06", "06", "06",
"06", "07", "07", "07", "07", "10", "10", "10", "10", "12", "12",
"12", "12", "16", "16", "16", "16", "17", "17", "17", "17", "19",
"19", "19", "19", "22", "22", "22", "22", "23", "23", "23", "23",
"27", "27", "27", "27", "28", "28", "28", "28", "29", "29", "29",
"29", "03", "03", "03", "03", "05", "05", "05", "05", "09", "09",
"09", "09", "10", "10", "10", "10", "13", "13", "13", "13", "14",
"14", "14", "14", "18", "18", "18", "18", "22", "22", "22", "22",
"23", "23", "23", "23", "24", "24", "24", "24", "27", "27", "27",
"27", "28", "28", "28", "28", "01", "01", "01", "01", "04", "04",
"04", "04", "06", "06", "06", "06", "07", "07", "07", "07", "12",
"12", "12", "12", "13", "13", "13", "13", "14", "14", "14", "14",
"16", "16", "16", "16", "19", "19", "19", "19", "21", "21", "21",
"21", "23", "23", "23", "23", "24", "24", "24", "24", "28", "28",
"28", "28", "31", "31", "31", "31", "02", "02", "02", "02", "04",
"04", "04", "04", "08", "08", "08", "08", "09", "09", "09", "09",
"11", "11", "11", "11", "14", "14", "14", "14", "16", "16", "16",
"16", "19", "19", "19", "19", "20", "20", "20", "20", "21", "21",
"21", "21", "26", "26", "26", "26", "27", "27", "27", "27", "29",
"29", "29", "29", "03", "03", "03", "03", "05", "05", "05", "05",
"08", "08", "08", "08", "10", "10", "10", "10", "14", "14", "14",
"14", "15", "15", "15", "15", "16", "16", "16", "16", "20", "20",
"20", "20", "21", "21", "21", "21", "24", "24", "24", "24", "26",
"26", "26", "26", "29", "29", "29", "29", "30", "30", "30", "30"
), YEAR = c("2010", "2010", "2010", "2010", "2010", "2010", "2010",
"2010", "2010", "2010", "2010", "2010", "2010", "2010", "2010",
"2010", "2010", "2010", "2010", "2010", "2010", "2010", "2010",
"2010", "2010", "2010", "2010", "2010", "2010", "2010", "2010",
"2010", "2010", "2010", "2010", "2010", "2010", "2010", "2010",
"2010", "2010", "2010", "2010", "2010", "2010", "2010", "2010",
"2010", "2010",