> On 10 Aug 2017, at 06:54, Courtney Benjamin <cbenj...@btboces.org> wrote:
>
> Hello R Help List,
>
> I am an R novice and trying to use the ifelse function to create a new binary
> variable based off of the responses of two other binary variables; NAs are
> involved. I pulled it off almost successfully, but when I checked the counts
> of my new variable for accuracy, I found that a small portion of the NA cases
> were not being passed through as NAs, but as "0" counts in my new variable.
> My many attempts at creating a nested ifelse statement that would pass the
> NAs through properly have not been successful. Any help is greatly
> appreciated.
>
> Here is a MRE:
>
> library(RCurl)
> data <- getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq2wbl.csv")
> elsq2wbl <- read.csv(text = data)
>
> ##Recoding Negative Responses to NA
> elsq2wbl [elsq2wbl[, "EVERRELJOB"] < -3, "EVERRELJOB"] <- NA
> elsq2wbl [elsq2wbl[, "PSWBL"] < -2, "PSWBL"] <- NA
>
> #Labeling categorical variable levels
> elsq2wbl$EVERRELJOB <- factor(elsq2wbl$EVERRELJOB, levels = c(0,1), labels = c("No","Yes"))
> elsq2wbl$PSWBL <- factor(elsq2wbl$PSWBL, levels = c(0,1), labels = c("No","Yes"))
>
> ##Trying to create a new variable to indicate if the student had a job
> #related to the college studies that was NOT a WBL experience
> elsq2wbl$NONWBLRELJOB <- ifelse(elsq2wbl$PSWBL=="No" & elsq2wbl$EVERRELJOB=="Yes",1,0)
>
> #Cross tab to check counts of two variables that new variable is based upon
> xtabs(~PSWBL+EVERRELJOB,subset(elsq2wbl,BYSCTRL==1&G10COHRT==1),addNA=TRUE)
>
> #Checking count of newly created variable
> Q2sub <- subset(elsq2wbl,BYSCTRL==1&G10COHRT==1)
> library(plyr)
> count(Q2sub,'NONWBLRELJOB')
>
> #The new variable has the correct count of "1", but 88 cases too many for "0"
> #The cross tab shows 20 and 68 NA cases that are being incorrectly counted as "0" in the new variable
>
> #My other approach at trying to handle the NAs properly-returns an error
> elsq2wbl$NONWBLRELJOB <- ifelse(elsq2wbl$PSWBL=="No" & elsq2wbl$EVERRELJOB=="Yes",1,
>     ifelse(is.na(elsq2wbl$PSWBL)&is.na(elsq2wbl$EVERRELJOB),NA,
>     ifelse(elsq2wbl$PSWBL!="No" & elsq2wbl$EVERRELJOB!="Yes",0)))
>
> Courtney Benjamin

I could not follow the question completely, but one thing I noticed is that
elsq2wbl$EVERRELJOB contains the following values:

summary(factor(elsq2wbl$EVERRELJOB))
  -9   -8   -7   -4   -3    0    1
 139  459  946 2488 1948 4619 5598

and in fact you want to set the negative values to NA.

> ##Recoding Negative Responses to NA
> elsq2wbl [elsq2wbl[, "EVERRELJOB"] < -3, "EVERRELJOB"] <- NA

But after that command you still have 1948 values of '-3' in the variable:

summary(factor(elsq2wbl$EVERRELJOB))
  -3    0    1 NA's
1948 4619 5598 4032

So I think you need to change the line as follows:

##Recoding Negative Responses to NA
elsq2wbl [elsq2wbl[, "EVERRELJOB"] <= -3, "EVERRELJOB"] <- NA

Instead of using '-2' and '-3' as separate thresholds for the two variables,
why not use a "less than zero" condition for both?

elsq2wbl [elsq2wbl[, "EVERRELJOB"] < 0, "EVERRELJOB"] <- NA
elsq2wbl [elsq2wbl[, "PSWBL"] < 0, "PSWBL"] <- NA

That way, every value below zero becomes NA in both columns, and each variable
holds only 0, 1 and NA, which matches what you call "binary".

_ifelse_ part:

You have NAs in both variables. In these circumstances, consider the following
ifelse samples (both sides of '&' can be exchanged):

ifelse(TRUE & TRUE, 1, 0)    # 1
ifelse(TRUE & FALSE, 1, 0)   # 0
ifelse(FALSE & FALSE, 1, 0)  # 0
ifelse(TRUE & NA, 1, 0)      # NA
ifelse(FALSE & NA, 1, 0)     # 0

Note the last case: FALSE & NA is FALSE, not NA, because the result is FALSE
whatever the missing value would have been. That is why some of your NA cases
end up as 0. Based on the above, try to build new logic that achieves what you
need.

In your last nested ifelse, you forgot to define a value for the case where
the deepest ifelse test fails:

elsq2wbl$NONWBLRELJOB <- ifelse(elsq2wbl$PSWBL=="No" & elsq2wbl$EVERRELJOB=="Yes", 1,
    ifelse(is.na(elsq2wbl$PSWBL) & is.na(elsq2wbl$EVERRELJOB), NA,
    ifelse(elsq2wbl$PSWBL != "No" & elsq2wbl$EVERRELJOB != "Yes", 0, "Forgotten value")))

Also, please try to create a _minimal reproducible example_ instead of making
us download a big csv file (219 columns x 16197 rows) and work out what you
are trying to do.
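
To make the NA behaviour concrete, here is a small sketch on made-up data. The
data frame `toy` and the column names `original` and `strict` are hypothetical
and are not part of the posted file. It assumes the rule you are after is "NA
whenever either input is NA", which is only my reading of the question.

```r
# Toy stand-in for the two survey columns; all values are invented.
toy <- data.frame(
  PSWBL      = c(0, 0, 1, 1, -9,  0, -4,  1, -8),
  EVERRELJOB = c(1, 0, 1, 0,  1, -3, -8, -7,  0)
)

# Recode every negative code to NA in both columns.
toy$PSWBL[toy$PSWBL < 0] <- NA
toy$EVERRELJOB[toy$EVERRELJOB < 0] <- NA

# Label the remaining 0/1 values as in the original post.
toy$PSWBL      <- factor(toy$PSWBL,      levels = c(0, 1), labels = c("No", "Yes"))
toy$EVERRELJOB <- factor(toy$EVERRELJOB, levels = c(0, 1), labels = c("No", "Yes"))

# Approach from the post: FALSE & NA is FALSE, so rows where one side is
# known to fail the test are coded 0 even though the other side is missing.
toy$original <- ifelse(toy$PSWBL == "No" & toy$EVERRELJOB == "Yes", 1, 0)

# Assumed rule: return NA as soon as either input is NA, and only
# otherwise apply the 1/0 test.
toy$strict <- ifelse(is.na(toy$PSWBL) | is.na(toy$EVERRELJOB), NA,
                     ifelse(toy$PSWBL == "No" & toy$EVERRELJOB == "Yes", 1, 0))

print(toy)
```

On this toy data the last two rows (Yes/NA and NA/No) come out as 0 in
`original` but as NA in `strict`. If you would rather keep 0 for rows where one
known answer already rules out a 1, the original single ifelse is the version
to keep.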
:)

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.