Hi,
we asked students to construct a regular expression that selects
certain texts. One sent us back a program which gives different results
with stringr::str_view and grep.
The problem is the "[^[A-Z]]" / "[^[A-Z]" at the end of the regular
expression. I would have expected all four calls to give the same
result, with [ and ] inside [...] interpreted as the literal characters
`[` and `]`. Obviously this is not the case, and moreover
stringr::str_view and grep interpret the regular expressions differently.
Any ideas?
Thanks, Sigbert
---
aff <- c("affgfking", "fgok", "rafgkahe","a fgk", "bafghk", "affgm",
"baffgkit", "afffhk", "affgfking", "fgok", "rafgkahe", "afg.K",
"bafghk", "aff gm", "baffg kit", "afffhgk")
correct_brackets <- "af+g[^m$][^[A-Z]]"
missing_brackets <- "af+g[^m$][^[A-Z]"
library("stringr")
grep(correct_brackets, aff, value = TRUE) ### result: character(0)
grep(missing_brackets, aff, value = TRUE) ### correct result
str_view(aff, correct_brackets) ### correct result
str_view(aff, missing_brackets) ### error: missing closing bracket
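
A reduced base-R sketch of what grep seems to be doing, in case it helps narrow things down. It assumes R's default regex engine (TRE, POSIX-style), and the probe strings are made up for illustration; they are not part of the exercise. My reading, stated as an assumption rather than confirmed behaviour: inside a POSIX bracket expression an unescaped `[` is an ordinary character, so "[^[A-Z]" is already a complete class (anything except `[` and A-Z), and the final `]` in "[^[A-Z]]" is then matched as a literal `]`. stringr uses ICU, which allows nested sets, and that would explain why it accepts "[^[A-Z]]" and reports a missing bracket for "[^[A-Z]".

```r
# Hypothetical probe strings, chosen only to test the bracket handling.
probe <- c("a", "A", "[", "]", "a]", "A]")

# "[^[A-Z]": one character that is neither "[" nor an upper-case letter.
one_char <- grepl("[^[A-Z]", probe)

# "[^[A-Z]]": the same class, followed by a literal "]".
with_bracket <- grepl("[^[A-Z]]", probe)

# Side-by-side view of which probes each pattern matches.
print(data.frame(probe, one_char, with_bracket))
```

If the intent is simply "not an upper-case letter", writing [^A-Z] should sidestep the difference between the two engines.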
--
https://hu.berlin/sk
https://hu.berlin/mmstat3