The answer is, of course, to use regular expressions and/or packages built around them. However, I do not think you have defined your problem sufficiently. Some questions I have:

1. Do possible patterns to be matched always appear at the beginning of your strings?
2. Always together between specified separators ("_" in your example); or one of several specified separators; or otherwise?
3. Do spaces or other nonprinting characters occur in your strings?

e.g. would abc_something this.is_a long stringwithabcinthemiddle be considered matching?

There are undoubtedly other possibilities that I've missed.

You may also find it useful to check this "task view" out for possibilities:
https://cran.r-project.org/web/views/NaturalLanguageProcessing.html

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Fri, May 4, 2018 at 3:25 PM, Jeff Reichman <reichm...@sbcglobal.net> wrote:
> R Help Forum
>
> Is there a R library (or a way) that I can extract unique character strings,
> or repeating patterns in textual strings. Say for example I have the
> following records:
>
> Abc_1234_kjhksh_276
> Abc
> Abc_1234_lakdofyo_324
> Bce_876_skdhk_*&^%*&
> Bce
> Bce_454
>
> And I would like to see the following results
>
> Abc
> Abc_1234
> Bce
>
> Jeff Reichman
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
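
For what it is worth, here is a minimal sketch in base R of one possible reading of the request. It rests on assumptions that the original post does not confirm: patterns always start at the beginning of a string, "_" is the only separator, and a "repeating pattern" means an underscore-delimited prefix shared by at least two records. The helper name prefixes_of and the threshold of two are hypothetical choices made for illustration only.

```r
# Example records copied from the original question.
x <- c("Abc_1234_kjhksh_276", "Abc", "Abc_1234_lakdofyo_324",
       "Bce_876_skdhk_*&^%*&", "Bce", "Bce_454")

# Hypothetical helper: return every "_"-delimited prefix of one string,
# e.g. "a_b_c" gives "a", "a_b", "a_b_c".
# Assumption: "_" is the only separator and is matched literally.
prefixes_of <- function(s, sep = "_") {
  parts <- strsplit(s, sep, fixed = TRUE)[[1]]
  vapply(seq_along(parts),
         function(i) paste(parts[seq_len(i)], collapse = sep),
         character(1))
}

# Each record contributes each of its prefixes once, so the table counts
# how many records begin with a given prefix.
counts <- table(unlist(lapply(x, prefixes_of)))

# Assumption: "repeating" means shared by at least two records.
shared <- names(counts)[counts >= 2]
print(shared)
```

With the six records above this should keep Abc, Abc_1234 and Bce, which matches the desired output in the question. If the answers to the three questions differ from the assumptions stated here (several separators, patterns in the middle of a string, embedded spaces), the splitting step would need to change accordingly.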