What's the expected output for this sample? How do _you_ define what should be counted?
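To make the question concrete, here is a minimal sketch of one possible definition, offered only as an illustration: count every run of digits that follows the start of the string or whitespace and is immediately followed by ")" or ".". The function name count_items and the two sample strings are made up for this example; they are not taken from the data quoted below.

```r
# Hypothetical sample strings (NOT the poster's data).
sample_text <- c("Intro.\n1) first\n2) second\n10) tenth",
                 "Intro. 1. first 2. second. Ends with 3.")

# Assumed rule: an item marker is digits preceded by start-of-string or
# whitespace and followed directly by ")" or ".".
count_items <- function(x) {
  hits <- gregexpr("(^|\\s)[0-9]+[.)]", x, perl = TRUE)
  # gregexpr() returns -1 when nothing matches, so count positive
  # positions instead of taking length().
  vapply(hits, function(h) sum(h > 0), integer(1))
}

count_items(sample_text)
# [1] 3 3
```

Under this rule the second string also counts the sentence-final "3.", which is exactly the ambiguity that has to be settled by a definition before any pattern can be called correct.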
> On Apr 26, 2017, at 8:33 AM, Dan Abner <dan.abne...@gmail.com> wrote:
>
> Hi all,
>
> I was not clear enough in my example code. Please see below where "blah
> blah blah" can be ANY text or numbers: No predictable pattern at all to
> what may or may not be written in place of "blah blah blah".
>
> text1<-c("blah blah blah.
> blah blah blah
> 1) blah blah blah 1
> 2) blah blah blah
> 10) blah 10 blah blah
> blah blah blah
> 1) blah blah blah
> 2) blah blah blah 2
> blah blah blah.","blah blah blah.
> blah blah blah
> 1. blah blah blah 1
> 2. blah blah blah
> 10.blah 10 blah blah
> blah blah blah
> 1. blah blah blah 1
> 2. blah blah blah
> blah blah blah.","blah blah blah. blah blah blah 1 1)blah blah blah 1. 2) blah
> blah blah 10) blah 10 blah blah blah blah blah 1) blah blah blah 1. 2) blah
> blah blah. blah blah blah."
> ,"blah blah blah. blah blah blah 1 1.blah blah blah 1. 2. blah blah blah.
> 10. blah 10 blah blah. blah blah blah 1. blah blah blah 1. 2. blah blah
> blah. blah blah blah.")
>
> text1
>
> Thank you in advance for your suggestions and/or guidance.
>
> Best,
>
> Dan
>
>
> On Wed, Apr 26, 2017 at 12:52 AM, Michael Hannon <jmhannon.ucda...@gmail.com> wrote:
>
>> Thanks, Ista. I thought there might be a "tidy" way to do this, but I
>> hadn't used stringr.
>>
>> -- Mike
>>
>>
>> On Tue, Apr 25, 2017 at 8:47 PM, Ista Zahn <istaz...@gmail.com> wrote:
>>> stringr::str_count (and stringi::stri_count that it wraps) interpret
>>> the pattern argument as a regular expression by default.
>>>
>>> Best,
>>> Ista
>>>
>>> On Tue, Apr 25, 2017 at 11:40 PM, Michael Hannon
>>> <jmhannon.ucda...@gmail.com> wrote:
>>>> I like Boris's "Hadley" solution. For the record, I've appended a
>>>> version that uses regular expressions, the only benefit of which is
>>>> that it could be generalized to find more-complicated patterns.
>>>>
>>>> -- Mike
>>>>
>>>> counts <- sapply(text1, function(next_string) {
>>>>     loc_example <- length(gregexpr("Example", next_string)[[1]])
>>>>     loc_example
>>>> }, USE.NAMES=FALSE)
>>>>
>>>> > counts
>>>> [1] 5 5 5 5
>>>> >
>>>>
>>>> On Tue, Apr 25, 2017 at 5:33 PM, Boris Steipe <boris.ste...@utoronto.ca> wrote:
>>>>> I should add: there's a str_count() function in the stringr package.
>>>>>
>>>>> library(stringr)
>>>>> str_count(text1, "Example")
>>>>> # [1] 5 5 5 5
>>>>>
>>>>> I guess that would be the neater solution.
>>>>>
>>>>> B.
>>>>>
>>>>>
>>>>>> On Apr 25, 2017, at 8:23 PM, Boris Steipe <boris.ste...@utoronto.ca> wrote:
>>>>>>
>>>>>> How about:
>>>>>>
>>>>>> unlist(lapply(strsplit(text1, "Example"), function(x) { length(x) - 1 } ))
>>>>>>
>>>>>>
>>>>>> Splitting your string on the five "Examples" in each gives six elements.
>>>>>> length(x) - 1 is the number of matches. You can use any regex instead of
>>>>>> "example" if you need to tweak what you are looking for.
>>>>>>
>>>>>>
>>>>>> B.
>>>>>>
>>>>>>
>>>>>>> On Apr 25, 2017, at 8:14 PM, Dan Abner <dan.abne...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I am looking for a streamlined way of counting the number of enumerated
>>>>>>> items in each element of a character vector. For example:
>>>>>>>
>>>>>>>
>>>>>>> text1<-c("This is an example.
>>>>>>> List 1
>>>>>>> 1) Example 1
>>>>>>> 2) Example 2
>>>>>>> 10) Example 10
>>>>>>> List 2
>>>>>>> 1) Example 1
>>>>>>> 2) Example 2
>>>>>>> These have been examples.","This is another example.
>>>>>>> List 1
>>>>>>> 1. Example 1
>>>>>>> 2. Example 2
>>>>>>> 10. Example 10
>>>>>>> List 2
>>>>>>> 1. Example 1
>>>>>>> 2. Example 2
>>>>>>> These have been examples.","This is a third example. List 1 1) Example 1.
>>>>>>> 2) Example 2. 10) Example 10. List 2 1) Example 1. 2) Example 2. These have
>>>>>>> been examples."
>>>>>>> ,"This is a fourth example. List 1 1. Example 1. 2. Example 2. 10.
>>>>>>> Example 10. List 2 Example 1. 2. Example 2. These have been examples.")
>>>>>>>
>>>>>>> text1
>>>>>>>
>>>>>>> ===
>>>>>>>
>>>>>>> I would like the result to be c(5,5,5,5). Notice that sometimes there are
>>>>>>> leading hard returns, other times not. Sometimes there are separate lists
>>>>>>> and the same numbers are used in the enumerated items multiple times within
>>>>>>> each character string. Sometimes the leading numbers for the enumerated
>>>>>>> items exceed single digits. Notice that the delimiter may be ) or a period
>>>>>>> (.). If the delimiter is a period and there are hard returns (example 2),
>>>>>>> then I expect that will be easy enough to differentiate sentences ending
>>>>>>> with a number from enumerated items. However, I imagine it would be much
>>>>>>> more difficult to differentiate the two for example 4.
>>>>>>>
>>>>>>> Any suggestions are appreciated.
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>> ______________________________________________
>>>>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>>>> and provide commented, minimal, self-contained, reproducible code.