Just started thinking about this. The name of regmatches() suggests that it will only extract the matches and not return anything for the non-matches. We might need another function that returns a value for non-matches. Perhaps the value should be the empty string for non-matches and NA where the input itself is NA. The rationale is that we delegate to regexpr() (at least conceptually), and it returns a "matching region", which would be empty when there is no match.

We could allow strcapture() to accept an atomic vector as a prototype, which would do what you want for regexec() (NA on no match, empty string on empty capture). Then we could call the regexpr()-based function strextract().
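To make the difference concrete, here is a rough sketch rather than a proposed implementation. The name strextract_sketch() and the sample vector are made up for illustration; only regexpr() and regmatches() are existing base functions, and the sketch assumes the semantics described above (empty string on no match, NA for NA input).

```r
x <- c("apple123", "banana", NA, "kiwi7")
m <- regexpr("[0-9]+", x)

# Current behaviour: non-matches and NA inputs are dropped,
# so the result no longer lines up with x.
regmatches(x, m)
# [1] "123" "7"

# Hypothetical helper (not part of base R): one result per input element.
strextract_sketch <- function(pattern, x, ...) {
    x <- as.character(x)
    m <- regexpr(pattern, x, ...)
    out <- rep(NA_character_, length(x))  # NA input stays NA
    out[!is.na(m)] <- ""                  # no match: empty "matching region"
    hit <- !is.na(m) & m > -1L            # elements with an actual match
    out[hit] <- regmatches(x, m)          # fill in the matched substrings
    out
}

strextract_sketch("[0-9]+", x)
# [1] "123" ""    NA    "7"
```

The result has the same length as x, so it could be assigned directly as a new column of a data.frame.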
What do you think?

Michael

On Thu, Aug 29, 2019 at 3:27 PM Cyclic Group Z_1 <cyclicgroup...@yahoo.com> wrote:
>
> Thank you! I greatly appreciate your consideration, though of course it is up
> to you. I think many people switch to stringr/stringi simply because
> functions in those packages have some consistent design choices, for example,
> they do not drop empty/missing matches, which facilitates array-based
> programming. For example, in the cases where one needs to make a new column
> in a data.frame (data.table, tibble, etc.) of regex extractions. Or in any
> other case where there needs to be an element-wise correspondence between
> input and output. I think insertion of NA_character_ to prevent dropping
> indices seems like the natural choice for an array language (which, I think,
> motivated the creation of stringr/stringi). While those are great packages
> and this behavior can be easily replicated with simple wrappers, string
> operations are normally easy to accomplish in base languages, so this seems
> like something that would be appropriate to have in base. For example, MATLAB
> and Pandas regex both allow non-dropping empty matches (though of course I
> acknowledge Pandas is not a base language).
>
> Best,
> CG

--
Michael Lawrence
Scientist, Bioinformatics and Computational Biology
Genentech, A Member of the Roche Group
Office +1 (650) 225-7760
micha...@gene.com

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel