Re: [R] create a index.date column

jose Bartolomei Thu, 28 Jul 2011 08:52:23 -0700


Dear Dennis,
I appreciate your time.
I studied and implemented your recommendation, and it is exactly what I
needed.

I was stuck in an unproductive loop on this.

Thank you very much,
Jose
> Date: Wed, 27 Jul 2011 14:28:51 -0700
> Subject: Re: [R] create a index.date column
> From: djmu...@gmail.com
> To: surfpr...@hotmail.com
> CC: r-help@r-project.org
> 
> Hi:
> 
> I prefer to use one of the summarization packages for this sort of
> thing, but aggregate() works, too. Here are two versions of the same
> idea:
> 
> # Uses ddply() in the plyr package:
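> index.date <- function(d) {
>   library(plyr)  # stops with an error if plyr is not installed
>   # Maximum time difference per id and rcat
>   out1 <- ddply(d, .(id, rcat), summarise, index = max(tdiff))
>   # Numeric value of the reference date minus the maximum difference
>   ndate <- as.numeric(as.Date('2002-09-01')) - out1[['index']]
>   out1$index.date <- as.Date(ndate, origin = '1970-01-01')
>   # Drop the helper column 'index' (column 3)
>   out1 <- out1[, -3]
>   out1
> }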
> 
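> # Uses aggregate() from the stats package (shipped with base R):
> index.date2 <- function(d) {
>   # Maximum time difference per id and rcat
>   out <- aggregate(tdiff ~ id + rcat, data = d, FUN = max)
>   # Numeric value of the reference date minus the maximum difference
>   ndate <- as.numeric(as.Date('2002-09-01')) - out[['tdiff']]
>   out$index.date <- as.Date(ndate, origin = '1970-01-01')
>   # Drop the helper column 'tdiff' (column 3)
>   out <- out[, -3]
>   out
> }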
> 
> index.date(test)
> index.date2(test)
> 
> In each function, I did the following:
> 
> (1) Found the maximum time difference from the reference date
> 2002-09-01 within each combination of id and rcat.
> (2) Determined the numeric value of the date associated with the max
> time difference (ndate).
> (3) Determined the date associated with the maximum time difference
> and assigned it the variable name index.date in the output data frame.
> (4) Removed the variable computed in (1) from the output data frame.
> (5) Returned the output data frame.
> 
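The five steps can be traced on a handful of rows. What follows is a minimal sketch in base R, not a replacement for either function. It assumes a small data frame called `small` (a hypothetical name) built from the rows for ids 1 and 80 of the test data in this thread, and it lets R do the date arithmetic directly rather than going through as.numeric().

```r
# Hypothetical extract: the rows for ids 1 and 80 from the thread's test data
small <- data.frame(
  id    = c(1L, 1L, 1L, 80L, 80L, 80L, 80L),
  rcat  = c("ICS", "ICS", "ICS", "ICS", "LABA", "ICS", "LABA"),
  ftime = as.Date(c(11761, 11824, 11925, 11814, 11814, 11929, 11929),
                  origin = "1970-01-01")
)
ref <- as.Date("2002-09-01")

# Days from each record back to the reference date
small$tdiff <- as.numeric(difftime(ref, small$ftime, units = "days"))

# Step (1): maximum time difference per id and rcat
out <- aggregate(tdiff ~ id + rcat, data = small, FUN = max)

# Steps (2)-(3): a Date minus a number of days is again a Date
out$index.date <- ref - out$tdiff

# Step (4): drop the helper column
out$tdiff <- NULL

# Step (5): one row per id/rcat combination, with its index date
out
```

Because tdiff counts days back from the reference date, the largest tdiff in a group belongs to the earliest ftime, so index.date here is the first recorded date for each id and rcat.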
> HTH,
> Dennis
> 
> On Wed, Jul 27, 2011 at 6:38 AM, jose Bartolomei <surfpr...@hotmail.com> wrote:
> >
> > Dear R users,
> >
> > I created a matrix that tells me the first day of use of a category by
> > id.
> >
> > # Calculate time difference
> > test$tdiff <- as.numeric(difftime(as.Date("2002-09-01"), test$ftime,
> >                                   units = "days"))
> >
> > # Obtain the index date per person and category
> > index.date.test <- tapply(test$tdiff, list(test$id, test$rcat), max)
> >
> > Nonetheless, at the moment I think it will be more useful to create a
> > column in my data that tells me which row is the index date.
> >
> > Something like:
> >
> > ti <- function(x) {
> >   ifelse(x == max(x), "i", "n")  # x = test$tdiff
> > }
> >
> > tapply(test$tdiff, list(test$rcat, test$id), FUN = ti)
> >
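A note on why the tapply() call above does not produce a column: tapply() returns one cell per group, so when FUN returns a whole vector the result is a list arranged as a matrix, not something aligned with the rows of the data. One possible alternative, sketched here under assumptions, is ave(), which returns a vector of the same length as its input. The data frame `small` is a hypothetical seven-row extract of the test data in this thread, and the column name `index` is only an illustrative choice.

```r
# Hypothetical extract: the rows for ids 1 and 80 from the thread's test data
small <- data.frame(
  id    = c(1L, 1L, 1L, 80L, 80L, 80L, 80L),
  rcat  = c("ICS", "ICS", "ICS", "ICS", "LABA", "ICS", "LABA"),
  tdiff = c(170, 107, 6, 117, 117, 2, 2)
)

# One grouping key per id/rcat combination that actually occurs in the data
grp <- paste(small$id, small$rcat, sep = ".")

# ave() repeats the group maximum on every row of that group
grp.max <- ave(small$tdiff, grp, FUN = max)

# "i" marks the row(s) holding the group maximum, "n" all other rows
small$index <- ifelse(small$tdiff == grp.max, "i", "n")
small
```

If two rows in the same group share the maximum tdiff, both are marked "i"; whether that is acceptable depends on how duplicates should be handled.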
> >
> > I have been testing different things for a few days, but I am in a loop
> > and I do not see my mistake.
> >
> > It should be simple, but I can't get it.
> >
> > Below is a test data set.
> >
> > Thanks in advance for your time,
> > Jose
> >
> > Background info: I want to use the index.date to obtain information from
> > other df for every id for the six months before the index date.
> > Then I should normalize the ftime to a common time frame and look for
> > patterns in that time frame.
> > (I do not know yet how I will do it.)
> >
> > ### test data ####
> >
> > test <- structure(list(id = c(1L, 1L, 1L, 46L, 80L, 80L, 80L, 80L, 88L,
> > 160L, 179L, 179L, 179L, 179L, 179L, 179L, 192L, 192L, 192L, 204L,
> > 204L, 204L, 204L, 205L, 211L, 233L, 233L, 272L, 272L, 272L, 272L,
> > 309L, 309L, 309L, 310L, 310L, 314L, 314L, 315L, 316L, 320L, 320L,
> > 320L, 320L, 324L, 324L, 324L, 329L, 329L, 339L, 354L, 354L, 354L,
> > 357L, 358L, 359L, 364L, 366L, 377L, 377L, 377L, 377L, 377L, 377L,
> > 377L, 377L, 377L, 377L, 377L, 377L, 379L, 383L, 383L, 387L, 387L,
> > 391L, 395L, 398L, 401L, 401L, 401L, 401L, 401L, 407L, 407L, 407L,
> > 409L, 414L, 414L, 414L, 434L, 434L, 434L, 437L, 437L, 437L, 437L,
> > 437L, 439L, 439L, 439L, 439L, 442L, 443L, 450L, 452L, 452L, 459L,
> > 459L, 468L, 472L, 472L, 472L, 478L, 478L, 484L, 484L, 484L, 484L,
> > 484L, 486L, 486L, 486L, 487L, 487L, 487L, 487L, 487L), ftime = 
> > structure(c(11761,
> > 11824, 11925, 11852, 11814, 11814, 11929, 11929, 11902, 11857,
> > 11779, 11779, 11807, 11841, 11871, 11899, 11831, 11894, 11925,
> > 11761, 11801, 11843, 11905, 11832, 11877, 11838, 11901, 11783,
> > 11783, 11818, 11850, 11750, 11782, 11905, 11852, 11877, 11852,
> > 11922, 11855, 11838, 11845, 11878, 11901, 11927, 11795, 11817,
> > 11837, 11901, 11928, 11853, 11751, 11751, 11877, 11922, 11760,
> > 11914, 11857, 11912, 11752, 11752, 11785, 11785, 11825, 11825,
> > 11862, 11862, 11891, 11891, 11926, 11926, 11919, 11907, 11907,
> > 11842, 11873, 11842, 11922, 11865, 11782, 11829, 11858, 11888,
> > 11912, 11750, 11803, 11897, 11871, 11787, 11787, 11787, 11764,
> > 11817, 11882, 11778, 11808, 11863, 11894, 11918, 11771, 11817,
> > 11851, 11907, 11799, 11766, 11794, 11765, 11828, 11788, 11884,
> > 11897, 11810, 11852, 11922, 11810, 11846, 11801, 11835, 11859,
> > 11891, 11922, 11771, 11884, 11925, 11765, 11765, 11801, 11843,
> > 11892), class = "Date"), rcat = structure(c(1L, 1L, 1L, 1L, 1L,
> > 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> > 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> > 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L,
> > 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
> > 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> > 1L, 1L, 1L, 2L, 3L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> > 1L, 1L, 1L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> > 1L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L), .Label = c("ICS",
> > "LABA", "MCSs"), class = "factor"), tdiff = c(170, 107, 6, 79,
> > 117, 117, 2, 2, 29, 74, 152, 152, 124, 90, 60, 32, 100, 37, 6,
> > 170, 130, 88, 26, 99, 54, 93, 30, 148, 148, 113, 81, 181, 149,
> > 26, 79, 54, 79, 9, 76, 93, 86, 53, 30, 4, 136, 114, 94, 30, 3,
> > 78, 180, 180, 54, 9, 171, 17, 74, 19, 179, 179, 146, 146, 106,
> > 106, 69, 69, 40, 40, 5, 5, 12, 24, 24, 89, 58, 89, 9, 66, 149,
> > 102, 73, 43, 19, 181, 128, 34, 60, 144, 144, 144, 167, 114, 49,
> > 153, 123, 68, 37, 13, 160, 114, 80, 24, 132, 165, 137, 166, 103,
> > 143, 47, 34, 121, 79, 9, 121, 85, 130, 96, 72, 40, 9, 160, 47,
> > 6, 166, 166, 130, 88, 39)), .Names = c("id", "ftime", "rcat",
> > "tdiff"), row.names = c(11L, 4L, 13L, 25L, 39L, 41L, 35L, 44L,
> > 54L, 57L, 96L, 98L, 88L, 107L, 80L, 77L, 118L, 136L, 124L, 146L,
> > 150L, 157L, 153L, 169L, 196L, 210L, 214L, 225L, 230L, 221L, 222L,
> > 258L, 266L, 281L, 311L, 324L, 333L, 334L, 358L, 372L, 400L, 419L,
> > 423L, 434L, 439L, 437L, 443L, 479L, 465L, 496L, 517L, 516L, 519L,
> > 525L, 539L, 598L, 606L, 634L, 658L, 649L, 637L, 655L, 640L, 644L,
> > 645L, 636L, 647L, 646L, 639L, 654L, 665L, 673L, 680L, 701L, 688L,
> > 712L, 737L, 738L, 784L, 766L, 785L, 753L, 773L, 799L, 791L, 808L,
> > 818L, 826L, 821L, 820L, 838L, 830L, 837L, 841L, 840L, 844L, 850L,
> > 845L, 886L, 875L, 887L, 868L, 899L, 912L, 915L, 931L, 929L, 934L,
> > 939L, 957L, 988L, 975L, 981L, 1015L, 1003L, 1043L, 1051L, 1056L,
> > 1034L, 1031L, 1073L, 1068L, 1065L, 1079L, 1101L, 1092L, 1089L,
> > 1096L), class = "data.frame")
> >
> >
> >
> >        [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >