Yes, that's got it! (20 years from now I'll have it all figured out UGH!), lol!
Thank you David

        Min.      1st Qu.       Median         Mean      3rd Qu.         Max.
"1977-07-16" "1984-03-13" "1990-08-16" "1990-12-28" "1997-07-29" "2002-12-31"

WHP

From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Thursday, July 12, 2018 11:29 AM
To: Bill Poling <bill.pol...@zelis.com>
Cc: r-help (r-help@r-project.org) <r-help@r-project.org>
Subject: Re: [R] Help with replace()

> On Jul 12, 2018, at 8:17 AM, Bill Poling <bill.pol...@zelis.com> wrote:
>
> R version 3.5.1 (2018-07-02) -- "Feather Spray"
> Copyright (C) 2018 The R Foundation for Statistical Computing
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
> Hi.
>
> I have data set with day month year integers. I am creating a date column
> from those using lubridate.
>
> a hundred or so rows failed to parse.
>
> The problem is April and September have day = 31.
>
> paste(df1$year, df1$month, df1$day, sep = "-")
>
> ymd(paste(df1$year, df1$month, df1$day, sep = "-")) #Warning message: 129 failed to parse. As expected in tutorial
>
> #The resulting Date vector can be added to df1 as a new column called date:
> df1$date <- ymd(paste(df1$year, df1$month, df1$day, sep = "-")) #Same warning
>
> head(df1)
> sapply(df1$date,class) #"date"
> summary(df1$date)
> #         Min.      1st Qu.       Median         Mean      3rd Qu.         Max.  NA's
> # "1977-07-16" "1984-03-12" "1990-07-22" "1990-12-15" "1997-07-29" "2002-12-31" "129"
>
> is_missing_date <- is.na(df1$date)
> View(is_missing_date)
>
> date_columns <- c("year", "month", "day")
> missing_dates <- df1[is_missing_date, date_columns]
>
> head(missing_dates)
> #      year month day
> # 3144 2000     9  31
> # 3817 2000     4  31
> # 3818 2000     4  31
> # 3819 2000     4  31
> # 3820 2000     4  31
> # 3856 2000     9  31
>
> I am trying to replace those with 30.

Seems like a fairly straightforward application of "[<-" with a conditional argument. (No need for tidyverse.)
missing_dates$day[ missing_dates$day==31 & ( missing_dates$month %in% c(4,9) )] <- 30

> missing_dates
     year month day
3144 2000     9  30
3817 2000     4  30
3818 2000     4  30
3819 2000     4  30
3820 2000     4  30
3856 2000     9  30

Best;
David.

> I am all over the map in Google looking for a fix, but haven't found one. I
> am sure I have over complicated my attempts with ideas(below) from these and
> other sites.
>
> https://stackoverflow.com/questions/14737773/replacing-occurrences-of-a-number-in-multiple-columns-of-data-frame-with-another?noredirect=1&lq=1
> https://www.rdocumentation.org/packages/base/versions/3.5.1/topics/replace
> https://stackoverflow.com/questions/48714625/error-in-data-frame-unused-argument
>
> The following are screwy attempts at this simple repair,
>
> ??mutate_if
>
> ??replace
>
> is_missing_date <- is.na(df1$date)
> View(is_missing_date)
>
> date_columns <- c("year", "month", "day")
> missing_dates <- df1[is_missing_date, date_columns]
>
> head(missing_dates)
> #      year month day
> # 3144 2000     9  31
> # 3817 2000     4  31
> # 3818 2000     4  31
> # 3819 2000     4  31
> # 3820 2000     4  31
> # 3856 2000     9  31
>
> #So need those months with 30 days that are 31 to be 30
> View(missing_dates)
>
> install.packages("dplyr")
> library(dplyr)
>
> View(missing_dates)
> # ..those were the values you're going to replace
>
> I thought this function from stackover would work, but get error when I try
> to add filter
>
> #https://stackoverflow.com/questions/14737773/replacing-occurrences-of-a-number-in-multiple-columns-of-data-frame-with-another?noredirect=1&lq=1
> df.Rep <- function(.data_Frame, .search_Columns, .search_Value, .sub_Value){
>   .data_Frame[, .search_Columns] <- ifelse(.data_Frame[, .search_Columns]==.search_Value,.sub_Value/.search_Value,1) * .data_Frame[, .search_Columns]
>   return(.data_Frame)
> }
>
> df.Rep(missing_dates, 3, 31, 30)
>
> #--So I should be able to apply this to the complete df1 data somehow?
> head(df1)
> df.Rep(df1, filter(month == c(4,9)), 31, 30)
> #Error in month == c(4, 9) : comparison (1) is possible only for atomic and list types
>
> Other screwy attempts:
>
> select(df1, month, day, year)
> str(df1)
> #'data.frame': 34786 obs. of 14 variables:
> #To choose rows, use filter():
>
> #mutate_if(df1, month =4,9), day = 30)
>
> filter(df1, month == c(4,9), day == 31)
>
> df1 %>%
>   group_by(month == c(4,9), day == 31) %>%
>   tally()
> # 1 FALSE FALSE 31161
> # 2 FALSE TRUE 576
> # 3 TRUE FALSE 2981
> # 4 TRUE TRUE 68
>
> df1 %>%
>   mutate(day=replace(day, month == c(4,9), 30)) %>%
>   as.data.frame()
> View(as.list(df1, month == 4))
> View(df1, month == c(4,9), day == 31)
>
> df1 %>%
>   group_by(month == c(4,9), day == 31) %>%
>   tally()
> View(df1, month == c(4,9))
>
> # df1 %>%
> # group_by(month == c(4,9), day == 30) %>%
>
> I know there is a simple solution and it is driving me mad that it eludes me,
> despite being new to R.
>
> Thank you for any advice.
>
> WHP
>
> Confidentiality Notice This message is sent from Zelis.
David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.' -Gehm's Corollary to Clarke's Third Law

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
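
For anyone landing on this thread later, here is a minimal sketch of how David's conditional "[<-" assignment might be applied to the full df1 rather than to the missing_dates subset, followed by rebuilding the date column. The toy df1 below is a hypothetical stand-in for the real 34786-row data, and base as.Date() with an explicit format is swapped in for lubridate's ymd() only so the example runs without extra packages. It also illustrates a likely reason the month == c(4,9) attempts in the thread gave odd counts: == recycles c(4, 9) along the column element by element, whereas %in% tests membership.

```r
# Hypothetical stand-in for df1 (assumed columns: year, month, day)
df1 <- data.frame(
  year  = c(2000, 2000, 2000, 1999),
  month = c(9, 4, 1, 4),
  day   = c(31, 31, 31, 30)
)

# Pitfall: == recycles c(4, 9), so row 1 is compared with 4, row 2 with 9,
# row 3 with 4, row 4 with 9. None of the April/September rows match here.
recycled <- df1$month == c(4, 9)

# %in% asks "is this month one of 4 or 9?" for every row
bad <- df1$day == 31 & df1$month %in% c(4, 9)

# Conditional "[<-" assignment on the full data frame, as in David's reply
df1$day[bad] <- 30

# Rebuild the date column. An explicit format makes as.Date() return NA
# for any remaining invalid date instead of stopping with an error.
df1$date <- as.Date(paste(df1$year, df1$month, df1$day, sep = "-"),
                    format = "%Y-%m-%d")

# Number of rows that still fail to parse
n_missing <- sum(is.na(df1$date))
```

With the real data, the same two lines (building bad, then assigning 30) would go before the original ymd(paste(...)) call, and summary(df1$date) should then no longer report NA's for the April and September rows.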