Hello All, Can anyone tell help me understand why the function below doesn't work and how I can fix it? Below are some sample data, some code that works on individual rows of the data, and my attempt to translate that code into a function. My hope is to get the function working and then to apply it to the larger data frame using ddply() from the plyr package or possibly some other approach.
As yet, I don't have much experience writing anonymous functions. I imagine I'm doing something that is obviously wrong, but I don't know what it is. Thanks, Paul #### Read in test data #### testData <- structure(list(profile_key = structure(c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 7L, 7L), .Label = c("001-001 ", "001-002 ", "001-003 ", "001-004 ", "001-005 ", "001-006 ", "001-007 " ), class = "factor"), encounter_date = structure(c(9L, 10L, 11L, 12L, 13L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 4L, 7L, 7L), .Label = c(" 2009-03-01 ", " 2009-03-22 ", " 2009-04-01 ", " 2010-03-01 ", " 2010-10-15 ", " 2010-11-15 ", " 2011-03-01 ", " 2011-03-14 ", " 2011-10-10 ", " 2011-10-24 ", " 2012-09-15 ", " 2012-10-05 ", " 2012-10-17 " ), class = "factor"), raw = c(" ordered kras testing on 10102010 results not yet available if patient has a mutation will start erbitux ", " received kras results on 10202010 test results indicate tumor is wild type ua protein positve erpr positive her2neu positve ", " will conduct kras mutation testing prior to initiation of therapy with erbitux ", " still need to order kras mutation testing ", " ordered kras testing waiting for results ", " kras test results pending note that patient was negative for lynch mutation ", " kras results still pending note that patient was negative for lynch mutation ", " kras mutated will not prescribe erbitux due to mutation ", " kras mutated therefore did not prescribe erbitux ", " kras wild ", " tumor is negative for mutation ", " tumor is wild type patient is eligible to receive eribtux ", " if patient kras result is wild type they will start erbitux several lines of material ordered kras mutation test 11112011 results are still not available ", " kras results are in patient has the mutation ", " ordered kras mutation testing on 02152011 results came back negative several lines of material patient kras mutation test is negative will start erbitux ", " patient is kras negative started erbitux on 03012011 ")), .Names = c("profile_key", "encounter_date", "raw"), row.names = c(NA, -16L), class = "data.frame") #### Convert text record to lowercase #### testData$raw <- tolower(testData$raw) #### Remove punctuation and any multiple spaces #### testData$raw <- gsub("[[:punct:]]", "", testData$raw) testData$raw <- gsub(" +", " ", testData$raw) #### Select test row #### testRow <- testData[13,] testRow #### Select terms +/- a specified number of words from "kras" #### Text <- unlist(strsplit(testRow$raw, " ")) Target <- grep("kras", Text) if (length(Target) == 0) {testRow$reduced <- ""} else{ Length <- length(Text) Keep <- rep(NA, Length) Lower <- ifelse(Target - 6 > 0, Target - 6, 1) Upper <- ifelse(Target + 6 < Length, Target + 6, Length) for(i in 1:length(Keep)){ for(j in 1:length(Lower)){ Keep[i][i %in% seq(Lower[j], Upper[j])] <- i }} testRow$reduced <- paste(Text[!is.na(Keep)], collapse=" ") } testRow length(Text) length(Text[!is.na(Keep)]) #### Function for selecting words within specified range of a target term #### nearTerms <- function(df, text, target, before, after, outvar){ Text <- with(df, strsplit(text, " ")) Target <- grep(target, Text) if (length(Target) == 0) {df$reduced <- ""} else{ Length <- length(Text) Keep <- rep(NA, Length) Lower <- ifelse(Target - before > 0, Target - before, 1) Upper <- ifelse(Target + after < Length, Target + after, Length) for(i in 1:length(Keep)){ for(j in 1:length(Lower)){ Keep[i][i %in% seq(Lower[j], Upper[j])] <- i }} df <- transform(df, outvar = paste(Text[!is.na(Keep)], collapse=" ")) } } nearTerms(testRow, raw, "kras", 6, 6) nearTerms(df = testRow, text = raw, target = "kras", before = 6, after = 6) ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.