OK, so you want parentheses, not "brackets", and I think I misinterpreted your specification, which I think is actually incomplete. Based on what I think you meant, how does this work:

gsub("((\\\\|/)[[:alnum:]]+)|(\\([[:alnum:]-]+\\))", "", tmp$Text)
[1] "Я досяг того, чого хотів"        "Мені вдалося\nзробити бажане"
[3] "Я досяг того, чого хотів "       "Я\nдосяг речей, яких хотілося досягти"
[5] "Я досяг того, чого\nхотів"       "Я досяг того, чого прагнув"
[7] "Я\nдосягнув того, чого хотів"

If you want it without the \n's, cat the above to get:

cat(gsub("((\\\\|/)[[:alnum:]]+)|(\\([[:alnum:]-]+\\))", "", tmp$Text))
Я досяг того, чого хотів Мені вдалося зробити бажане Я досяг того, чого хотів Я досяг речей, яких хотілося досягти Я досяг того, чого хотів Я досяг того, чого прагнув Я досягнув того, чого хотів

Cheers,
Bert

On Tue, Jun 27, 2023 at 11:09 AM Bert Gunter <bgunter.4...@gmail.com> wrote:

> Does this do it for you (or get you closer):
>
> gsub("\\[.*\\]|[\\\\] |/ ","",tmp$Text)
> [1] "Я досяг того, чого хотів"
> [2] "Мені вдалося\nзробити бажане"
> [3] "Я досяг (досягла) того, чого хотів (хотіла)"
> [4] "Я\nдосяг(-ла) речей, яких хотілося досягти"
> [5] "Я досяг/ла того, чого\nхотів/ла"
> [6] "Я досяг\\досягла того, чого прагнув\\прагнула"
> [7] "Я\nдосягнув(ла) того, чого хотів(ла)"
>
> On Tue, Jun 27, 2023 at 10:16 AM Chris Evans via R-help <r-help@r-project.org> wrote:
>
>> I am sure this is easy for people who are good at regexps but I'm
>> failing with it. The situation is that I have hundreds of lines of
>> Ukrainian translations of some English. They contain things like this:
>>
>> 1 "Я досяг того, чого хотів"
>> 2 "Мені вдалося зробити бажане"
>> 3 "Я досяг (досягла) того, чого хотів (хотіла)"
>> 4 "Я досяг(-ла) речей, яких хотілося досягти"
>> 5 "Я досяг/ла того, чого хотів/ла"
>> 6 "Я досяг\\досягла того, чого прагнув\\прагнула."
>> 7 "Я досягнув(ла) того, чого хотів(ла)"
>>
>> Using dput():
>>
>> tmp <- structure(list(Text = c("Я досяг того, чого хотів", "Мені вдалося
>> зробити бажане", "Я досяг (досягла) того, чого хотів (хотіла)", "Я
>> досяг(-ла) речей, яких хотілося досягти", "Я досяг/ла того, чого
>> хотів/ла", "Я досяг\\досягла того, чого прагнув\\прагнула", "Я
>> досягнув(ла) того, чого хотів(ла)" )), row.names = c(NA, -7L), class =
>> c("tbl_df", "tbl", "data.frame" ))
>>
>> Those show four different ways translators have handled gendered words:
>>
>> 1) Ignore them and (I'm guessing) only give the masculine
>> 2) Give the feminine form of the word (or just the feminine suffix)
>>    in brackets
>> 3) Give the feminine form/suffix prefixed by a forward slash
>> 4) Give the feminine form/suffix prefixed by backslash (here a double
>>    backslash)
>>
>> I would like just to drop all these feminine gendered options. (Don't
>> worry, they'll get back in later.) So I would like to replace
>>
>> 1) anything between brackets with nothing!
>> 2) anything between a forward slash and the next space with nothing
>> 3) anything between a backslash and the next space with nothing
>>
>> but preserving the rest of the text.
>>
>> I have been trying to achieve this using str_replace_all() but I am
>> failing utterly. Here's a silly little example of my failures. This was
>> just trying to get the text I wanted to replace (as I was trying to
>> simplify the issues for my tired wetware):
>>
>> > tmp %>%
>> +   as_tibble() %>%
>> +   rename(Text = value) %>%
>> +   mutate(Text = str_replace_all(Text, fixed("."), "")) %>%
>> +   filter(row_number() < 4) %>%
>> +   mutate(Text2 = str_replace(Text, "\\(.*\\)", "\\1"))
>> Error in `mutate()`:
>> ℹ In argument: `Text2 = str_replace(Text, "\\(.*\\)", "\\1")`.
>> Caused by error in `stri_replace_first_regex()`:
>> ! Trying to access the index that is out of bounds. (U_INDEX_OUTOFBOUNDS_ERROR)
>> Run `rlang::last_trace()` to see where the error occurred.
>>
>> I have tried gurgling around the internet but am striking out so
>> throwing myself on the list. Apologies if this is trivial but I'd hate
>> to have to clean these hundreds of lines by hand though it's starting
>> to look as if I'd achieve that faster by hand than I will by banging my
>> ignorance of R regexp syntax on the problem.
>>
>> TIA,
>>
>> Chris
>>
>> --
>> Chris Evans (he/him)
>> Visiting Professor, UDLA, Quito, Ecuador & Honorary Professor,
>> University of Roehampton, London, UK.
>> Work web site: https://www.psyctc.org/psyctc/
>> CORE site: http://www.coresystemtrust.org.uk/
>> Personal site: https://www.psyctc.org/pelerinage2016/
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
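
For anyone reading this thread later, here is a minimal base-R sketch of the gsub() approach above. The vector `txt` is a hypothetical stand-in for tmp$Text: it holds the posted strings that contain a feminine alternative, typed without the stray line breaks. The final whitespace tidy-up is an addition that nobody in the thread proposed. It also assumes a UTF-8 locale, so that [[:alnum:]] matches Cyrillic letters. As a side note on the quoted stringr attempt: in "\\(.*\\)" the parentheses are escaped, so they match literal parentheses and the pattern defines no capture group, which is most likely why the "\\1" replacement triggers the out-of-bounds error.

```r
# Hypothetical stand-in for tmp$Text: the posted strings that carry a
# feminine alternative, without the line breaks introduced by the email.
txt <- c("Я досяг (досягла) того, чого хотів (хотіла)",
         "Я досяг(-ла) речей, яких хотілося досягти",
         "Я досяг/ла того, чого хотів/ла",
         "Я досяг\\досягла того, чого прагнув\\прагнула",
         "Я досягнув(ла) того, чого хотів(ла)")

# Pattern from the thread, two alternatives joined by "|":
#   (\\\\|/)[[:alnum:]]+     a backslash or forward slash, then letters/digits
#   \\([[:alnum:]-]+\\)      a literal "(", letters/digits/hyphens, a literal ")"
pat <- "((\\\\|/)[[:alnum:]]+)|(\\([[:alnum:]-]+\\))"

out <- gsub(pat, "", txt)

# Assumption (not in the thread): collapse the doubled spaces that remain
# where a parenthesised word was removed, then trim leading/trailing spaces.
out <- trimws(gsub(" {2,}", " ", out))

print(out)
```

If the strings can contain punctuation inside the parentheses other than a hyphen, the second alternative would need a wider character class; that case does not occur in the posted data.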