On 20/04/2015 9:59 AM, Dimitri Liakhovitski wrote:
> Hello!
> 
> Please point me in the right direction.
> I need to match 2 strings, but focusing ONLY on characters, ignoring
> all special characters and punctuation marks, including (), "", etc.
> 
> For example:
> I want the following to return: TRUE
> 
> "What a nice day today! - Story of happiness: Part 2." ==
>    "What a nice day today: Story of happiness (Part 2)"
> 
> 

I would transform both strings using gsub(), then compare.

e.g.

# Remove every punctuation character and every space or tab
clean <- function(s)
  gsub("[[:punct:][:blank:]]", "", s)

clean("What a nice day today! - Story of happiness: Part 2.") ==
clean("What a nice day today: Story of happiness (Part 2)")

This completely ignores spaces; you might want something more
sophisticated if you consider "today" and "to day" to be different, e.g.

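clean <- function(s) {
  s <- gsub("[[:punct:]]", "", s)   # remove punctuation
  gsub("[[:blank:]]+", " ", s)      # collapse runs of blanks
}

which removes punctuation and then converts each run of blanks into a
single space, so word boundaries are preserved.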
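
As a rough check of the second version on the strings from the question,
here is a minimal sketch. The name clean2, the ignore_case argument, and
the trimws()/tolower() steps are assumptions added for illustration, not
part of the functions above; trimming may be wanted because removing
punctuation at the start or end of a string can leave a stray space
behind. trimws() needs R 3.2.0 or later.

```r
# Sketch only: clean2 is a hypothetical name; the trimws() and tolower()
# steps are additions that are not in the functions shown above.
clean2 <- function(s, ignore_case = FALSE) {
  s <- gsub("[[:punct:]]", "", s)      # remove punctuation
  s <- gsub("[[:blank:]]+", " ", s)    # collapse runs of spaces/tabs
  s <- trimws(s)                       # drop leading/trailing whitespace
  if (ignore_case) tolower(s) else s   # optional case folding
}

a <- "What a nice day today! - Story of happiness: Part 2."
b <- "What a nice day today: Story of happiness (Part 2)"

clean2(a) == clean2(b)               # TRUE
clean2("to day") == clean2("today")  # FALSE: word boundaries are kept
```

Whether to fold case or trim is a judgment call that depends on what
should count as "the same" string.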

Duncan Murdoch
