Here's a function that can get the interval sizes for you.

getStringSegmentLengths <- function(s, delim, ...) {
  nchar(unlist(strsplit(s, delim, ...))) + 1L
}

It uses strsplit to split the string into the segments separated by delim, returned as a list. delim can be a regular expression, and through ... you can pass any extra arguments on to strsplit (for example fixed = TRUE) to control how the string is broken up. unlist then converts the list output of strsplit to a character vector, and nchar gives the length of each element of that vector. Finally, 1 is added to each length to count the delimiter character that ends the segment, which gives the interval sizes.
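As a quick check, here is a small sketch that applies the function to the example string from the original question and compares the result with the "positions and differences" approach described there. The variable names target_string, intervals and positions are just illustrative, and passing fixed = TRUE is my own choice to treat 'b' as a literal string rather than a regular expression.

```r
# Function from above, repeated so the example is self-contained.
getStringSegmentLengths <- function(s, delim, ...) {
  nchar(unlist(strsplit(s, delim, ...))) + 1L
}

# Example string from the original question.
target_string <- "abaaabbaaaaabaaab"

# fixed = TRUE is passed through ... to strsplit.
intervals <- getStringSegmentLengths(target_string, "b", fixed = TRUE)
print(intervals)  # 2 4 1 6 4

# Cross-check: positions of 'b', then differences between successive
# positions, starting the counter at 0 (the beginning of the string).
positions <- unlist(gregexpr("b", target_string, fixed = TRUE))
print(diff(c(0L, positions)))  # 2 4 1 6 4
```

Two caveats: adding 1 assumes the delimiter is a single character, and if the string does not end with the delimiter, the last value returned counts the trailing segment and is not a true interval (for example, "aba" gives 2 2).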

Hth,
Andrew.

On 2/12/2022 14:18, Evan Cooch wrote:
Was wondering if there is an 'efficient/elegant' way to do the following (without tidyverse). Take a string

abaaabbaaaaabaaab

It's easy enough to count the number of times the character 'b' shows up in the string, but...what I'm looking for is outputting the 'intervals' between occurrences of 'b' (starting the counter at the beginning of the string). So, for the preceding example, 'b' shows up in positions

2, 6, 7, 13, 17

So, the interval data would be: 2, 4, 1, 6, 4

My main approach has been to simply output positions (say, something like unlist(gregexpr('b', target_string))), and 'do the math' between successive positions. Can anyone suggest a more elegant approach?

Thanks in advance...

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

