On Dec 3, 2014, at 2:10 PM, Muhuri, Pradip (SAMHSA/CBHSQ) wrote:

> Hello,
> 
> Two alternative approaches - mutate() vs. sapply() - were used to get the 
> desired results (i.e., creating a new column of the most recent date from 4 
> dates) with help from Arun and Mark on this forum.  I now find that the two 
> data objects (created using the two different approaches) are not identical, 
> although the printed results are exactly the same.
> 
> identical(new1, new2) 
> [1] FALSE
> 

You should have examined the output from dput() on both objects. I think you 
will find that dplyr is adding new attributes.

Notice that the "mutate()-ed" object now has this class:

class = c("rowwise_df", "tbl_df", "tbl", "data.frame")

Moral: Never rely on the print representation.
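
A minimal sketch of how the difference can be made visible without relying on print(). The data frame below is a small hypothetical stand-in for example.data, and the extra classes are assigned by hand (an assumption, so that the example runs without dplyr installed) to mimic what rowwise() and mutate() leave behind:

```r
# Hypothetical stand-in data, not the poster's example.data.
df1 <- data.frame(id = c("1", "2", "3"),
                  mrjdate = as.Date(c("2004-11-04", NA, "2009-10-24")))

# Mimic the dplyr result by hand: same columns, extra classes.
df2 <- df1
class(df2) <- c("rowwise_df", "tbl_df", "tbl", "data.frame")

# identical() compares attributes as well as values.
same_as_is <- identical(df1, df2)

# Ways to see what differs instead of trusting the printed output.
class(df1)
class(df2)
all.equal(df1, df2)   # describes the differences it finds
dput(df2)             # shows the full structure, class attribute included

# Dropping the extra classes makes the two objects comparable again.
same_after_strip <- identical(df1, as.data.frame(df2))
```

With the real objects, something like identical(as.data.frame(new1), new2) would be the analogous check, although any other difference (column names, for instance) would still make it FALSE.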

-- 
David.


> Please see the reproducible example below.
> 
> I don't understand why the code returns FALSE here.  Any hints/comments will 
> be appreciated.
> 
> Thanks,
> 
> Pradip
> 
> ####################  reproducible example  ####################
> library(dplyr)
> # data object - description 
> 
> temp <- "id  mrjdate cocdate inhdate haldate
> 1     2004-11-04 2008-07-18 2005-07-07 2007-11-07
> 2             NA         NA         NA         NA     
> 3     2009-10-24         NA 2011-10-13         NA
> 4     2007-10-10         NA         NA         NA
> 5     2006-09-01 2005-08-10         NA         NA
> 6     2007-09-04 2011-10-05         NA         NA
> 7     2005-10-25         NA         NA 2011-11-04"
> 
> # read the data object
> 
> example.data <- read.table(textConnection(temp), 
>                    colClasses=c("character", "Date", "Date", "Date", "Date"), 
>                    header=TRUE, as.is=TRUE
>                    )
> 
> 
> # create a new column - dplyr solution (Acknowledgement: Arun)
> 
> new1 <- example.data %>% 
>     rowwise() %>%
>     mutate(oldflag = as.Date(max(mrjdate, cocdate, inhdate, haldate,
>                                  na.rm = TRUE),
>                              origin = '1970-01-01'))
> 
> # create a new column - Base R solution (Acknowledgement: Mark Sharp)
> 
> new2 <- example.data
> new2$oiddate <- as.Date(sapply(seq_along(new2$id), function(row) {
>   if (all(is.na(unlist(example.data[row, c('mrjdate', 'cocdate', 'inhdate',
>                                            'haldate')])))) {
>     max_d <- NA
>   } else {
>     max_d <- max(unlist(example.data[row, c('mrjdate', 'cocdate', 'inhdate',
>                                             'haldate')]), na.rm = TRUE)
>   }
>   max_d}),
>   origin = "1970-01-01")
> 
> identical(new1, new2) 
> 
> # print records
> 
> print (new1); print(new2)
> 
> Pradip K. Muhuri
> SAMHSA/CBHSQ
> 1 Choke Cherry Road, Room 2-1071
> Rockville, MD 20857
> Tel: 240-276-1070
> Fax: 240-276-1260
> 
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of Muhuri, Pradip (SAMHSA/CBHSQ)
> Sent: Sunday, November 09, 2014 6:11 AM
> To: 'Mark Sharp'
> Cc: r-help@r-project.org
> Subject: Re: [R] Getting the most recent dates in a new column from dates in 
> four columns using the dplyr package (mutate verb)
> 
> Hi Mark,
> 
> Your code has also given me the results I expected.  Thank you so much for 
> your help.
> 
> Regards,
> 
> Pradip
> 
> 
> 
> -----Original Message-----
> From: Mark Sharp [mailto:msh...@txbiomed.org] 
> Sent: Sunday, November 09, 2014 3:01 AM
> To: Muhuri, Pradip (SAMHSA/CBHSQ)
> Cc: r-help@r-project.org
> Subject: Re: [R] Getting the most recent dates in a new column from dates in 
> four columns using the dplyr package (mutate verb)
> 
> Pradip,
> 
> mutate() works on the entire column as a vector so that you find the maximum 
> of the entire data set.
> 
> I am almost certain there is some nice way to handle this, but the sapply() 
> function is a standard approach.
> 
> max() does not want a dataframe thus the use of unlist().
> 
> Using your definition of data1:
> 
> data3 <- data1
> data3$oidflag <- as.Date(sapply(seq_along(data3$id), function(row) {
>  if (all(is.na(unlist(data1[row, -1])))) {
>    max_d <- NA
>  } else {
>    max_d <- max(unlist(data1[row, -1]), na.rm = TRUE)
>  }
>  max_d}),
>  origin = "1970-01-01")
> 
> data3
>   id    mrjdate    cocdate    inhdate    haldate    oidflag
> 1  1 2004-11-04 2008-07-18 2005-07-07 2007-11-07 2008-07-18
> 2  2       <NA>       <NA>       <NA>       <NA>       <NA>
> 3  3 2009-10-24       <NA> 2011-10-13       <NA> 2011-10-13
> 4  4 2007-10-10       <NA>       <NA>       <NA> 2007-10-10
> 5  5 2006-09-01 2005-08-10       <NA>       <NA> 2006-09-01
> 6  6 2007-09-04 2011-10-05       <NA>       <NA> 2011-10-05
> 7  7 2005-10-25       <NA>       <NA> 2011-11-04 2011-11-04
> 
> 
> 
> R. Mark Sharp, Ph.D.
> Director of Primate Records Database
> Southwest National Primate Research Center Texas Biomedical Research 
> Institute P.O. Box 760549 San Antonio, TX 78245-0549
> Telephone: (210)258-9476
> e-mail: msh...@txbiomed.org
> 
> 
> 
> 
> 
> 
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

David Winsemius
Alameda, CA, USA
