It sounds like what you want is a rudimentary spell-checker whose "word" is the input name and whose "dictionary" is an array of your database names. Spell-checking rules are designed to catch missing repeated letters, transposed letters, extra letters... precisely the reasons your input names fail to match the names in your database.
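To make the "word against dictionary" idea concrete, here is a minimal sketch in plain R. It assumes that closeness is measured by Levenshtein edit distance (the number of single-character insertions, deletions and substitutions), scaled by the length of the longer string; that is only one possible definition, not something the question fixes. The function names edit_distance, similarity and best_match are made up for illustration, and the dictionary is written first-name-first so that word order does not get in the way.

```r
# Levenshtein edit distance between two strings (dynamic programming).
edit_distance <- function(a, b) {
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  n <- length(x)
  m <- length(y)
  # d[i + 1, j + 1] holds the distance between the first i characters
  # of a and the first j characters of b.
  d <- matrix(0, n + 1, m + 1)
  d[, 1] <- 0:n
  d[1, ] <- 0:m
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      cost <- if (x[i] == y[j]) 0 else 1
      d[i + 1, j + 1] <- min(d[i, j + 1] + 1,  # deletion
                             d[i + 1, j] + 1,  # insertion
                             d[i, j] + cost)   # substitution or match
    }
  }
  d[n + 1, m + 1]
}

# Similarity in [0, 1]: 1 means identical. Assumption: scale the
# distance by the length of the longer string.
similarity <- function(a, b) {
  longest <- max(nchar(a), nchar(b))
  if (longest == 0) return(1)
  1 - edit_distance(a, b) / longest
}

# Return the closest dictionary entry, or NA if none reaches the threshold.
best_match <- function(name, dictionary, threshold = 0.9) {
  scores <- vapply(dictionary,
                   function(entry) similarity(name, entry),
                   numeric(1))
  best <- which.max(scores)
  if (scores[best] >= threshold) dictionary[best] else NA_character_
}

# Hypothetical dictionary, first name first.
dictionary <- c("Byron Andrew", "Bob Friedman", "Harry Harrington")

similarity("Harry Harington", "Harry Harrington")  # 0.9375
best_match("Harry Harington", dictionary)          # "Harry Harrington"
edit_distance("Robert", "Robret")                  # 2
```

Note that under this definition a transposition such as "Robert" / "Robret" counts as two edits, so short names with swapped letters score lower than you might expect. Names stored as "Harrington Harry" would also need to be reordered before comparison.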
Anyway, as I don't believe R has something like this, what I would do is simply adapt one of the dozens of Perl or C spell checkers (such as Aspell / Ispell) to fit your needs, then invoke it as a script from R using the "system" call, passing in the student name and your database of names. And as R can use Perl-like regular expressions (see ?regexpr), you could (if you really wanted to!) rewrite this in R after the fact, although this would likely be a waste of time, since expression matching is what Perl is so good at.

You'll also need to think about how this percentage argument should be defined. It's not obvious to me what percentage of closeness "Robert" and "Robret" have vs. "Robert" and "RobQQto".

ex:
http://tomacorp.com/perl/lingua/style.html
http://aspell.sourceforge.net/

Robert

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 05, 2005 12:36 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Tuning string matching

Dear list,

I spent about two hours searching the message archive, to no avail.

I have a list of people that have to pass an on-line test, but only a fraction of them do it. Moreover, as they input their names, the resulting strings do not always match the names I have in my database.

I would like to do two things:

1. Match any strings that are 90% the same

Example:
  name1 <- "Harry Harrington"
  name2 <- "Harry Harington"

I need a function that would declare those strings as a match (ideally having an argument that would allow introducing 80% instead of 90%)

2. Arrange a final table that would take me from:

Table1 (the complete list of people from my database)
No  Name
1   Byron C. Andrew
2   Friedman Bob
3   Harrington Harry

Table2 (the people having been tested)
No  Name             Score
1   Harry Harington  13
2   Byron Andrew     28

to:

No  Name1             Name2            Score
1   Byron C. Andrew   Byron Andrew     28
2   Friedman Bob
3   Harrington Harry  Harry Harington  13

Thank you in advance, any help is highly appreciated.
Adrian

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html