Hi all,

My code looks like the following:

inname = read.csv("ID_error_checker.csv", as.is=TRUE)
outname = read.csv("output.csv", as.is=TRUE)
#My algorithm is the following:
#for line in inname
#if first string up to whitespace in row in inname$name = first string up to whitespace in row + 1 in inname$name
#AND ID in inname$ID for the top row NOT EQUAL ID in inname$ID for the row below it
#copy these two lines to a new file

In other words: if the name (up to the first whitespace) in one row equals the name (up to the first whitespace) in the row below it, and the IDs of those two rows are not equal, copy both rows in full to a new file. This comparison should be repeated for every pair of adjacent rows in the file.

The only caveat is that I want a regular expression that does not take the full names, but just the first string up to the first whitespace in the inname$name column. For example, if row 1 has a name of "New York Mets" and row 2 has a name of "New York Yankees", I would want both of these rows to be copied in full, since "New" is the same in both.

Here is some example data:

ID  NAME                  YEAR  SOURCE       NOTES
1   New York Mets         1900  ESPN
2   New York Yankees      1920  Cooperstown
3   Boston Redsox         1918  ESPN
4   Washington Nationals  2010  ESPN
5   Detroit Tigers        1990  ESPN

The desired output would be:

ID  NAME              YEAR  SOURCE
1   New York Mets     1900  ESPN
2   New York Yankees  1920  Cooperstown

Thanks so much!
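
One possible way to do the adjacent-row comparison described above, as a minimal sketch rather than a definitive solution. It assumes the columns are named ID and NAME as in the example data (R is case-sensitive, so inname$name would need adjusting), that names do not begin with whitespace, and that only directly adjacent rows are compared. The example data is built inline instead of being read from ID_error_checker.csv, and the output file name "flagged_rows.csv" is hypothetical.

```r
# Example data from the post, built inline so the sketch is self-contained
inname <- data.frame(
  ID     = 1:5,
  NAME   = c("New York Mets", "New York Yankees", "Boston Redsox",
             "Washington Nationals", "Detroit Tigers"),
  YEAR   = c(1900, 1920, 1918, 2010, 1990),
  SOURCE = c("ESPN", "Cooperstown", "ESPN", "ESPN", "ESPN"),
  stringsAsFactors = FALSE
)

# First word of each name: delete everything from the first whitespace onward
first_word <- sub("[[:space:]].*$", "", inname$NAME)

n <- nrow(inname)

# For each adjacent pair (row i, row i + 1):
# TRUE when the first words match AND the IDs differ
pair_hit <- first_word[-n] == first_word[-1] & inname$ID[-n] != inname$ID[-1]

# Keep both rows of every matching pair (the upper row and the lower row)
keep <- c(pair_hit, FALSE) | c(FALSE, pair_hit)
flagged <- inname[keep, ]

# Write the flagged rows to a new file ("flagged_rows.csv" is a hypothetical name)
write.csv(flagged, "flagged_rows.csv", row.names = FALSE)
```

The comparison is vectorised: dropping the last element of a vector and dropping the first element lines up each row with the row below it, so no explicit loop over rows is needed. A row that belongs to two matching pairs is still written only once, because keep is a logical mask over rows.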