On Thu, Sep 04, 2025 at 13:00, Ahmad Fatoum <[email protected]> wrote:
> Hello Tobias,
>
> On 8/28/25 5:05 PM, Tobias Waldekranz wrote:
>> Add an implementation of libc's standard strtok(3), which is useful
>> for tokenizing strings.
>
> strtok was previously removed in favor of strsep as it doesn't suffer
> from re-entrancy issues (poller and bthreads can run during delays). If
> you want to allow escapes, there's also strsep_unescaped.

Aha, my bad. I did not realize that there was more than one thread of
execution.

strsep() is not quite the same thing though; I am really after
strtok()'s behavior of skipping empty tokens. How would you feel about
adding strtok_r() instead?

>> Also, add a version that will collect all tokens from a string into an
>> array, which is useful in situations where you need to know how many
>> tokens there are, and when a token's relative position in the order is
>> significant.
>
> We have the inverse as strjoin, but not this. Maybe call it strsplit
> instead?

If you accept my strtok_r() suggestion, do you still think strsplit() is
a better name, or is there value in signaling the underlying strtok()
behavior?

> Cheers,
> Ahmad
>
>>
>> Signed-off-by: Tobias Waldekranz <[email protected]>
>> ---
>>  include/string.h |  2 ++
>>  lib/string.c     | 66 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 68 insertions(+)
>>
>> diff --git a/include/string.h b/include/string.h
>> index 71affe48b6..c8df8540d8 100644
>> --- a/include/string.h
>> +++ b/include/string.h
>> @@ -8,6 +8,8 @@
>>  void *mempcpy(void *dest, const void *src, size_t count);
>>  int strtobool(const char *str, int *val);
>>  char *strsep_unescaped(char **, const char *, char *);
>> +char *strtok(char *str, const char *delim);
>> +int strtokv(char *str, const char *delim, char ***vecp);
>>  char *stpcpy(char *dest, const char *src);
>>  bool strends(const char *str, const char *postfix);
>>
>> diff --git a/lib/string.c b/lib/string.c
>> index 73637cd971..be7e65eb45 100644
>> --- a/lib/string.c
>> +++ b/lib/string.c
>> @@ -593,6 +593,72 @@ char *strsep_unescaped(char **s, const char *ct, char
>> *delim)
>>  	return sbegin;
>>  }
>>
>> +/**
>> + * strtok - extract tokens from string
>> + * @str: string to split
>> + * @delim: set of delimiter characters
>> + *
>> + * The strtok() function breaks up a string into zero or more nonempty
>> + * tokens. On the first call, the string to be parsed should be
>> + * specified in @str. In each subsequent call that should parse the
>> + * same string, @str must be NULL.
>> + *
>> + * @delim specifies a set of bytes that delimit the tokens in the
>> + * string.
>> + *
>> + * Each call to strtok() returns a pointer to a string containing the
>> + * next token. This is done by replacing the first delimiter with a
>> + * NUL character, the operation is thus destructive to the string. If
>> + * no more tokens are found, strtok() returns NULL.
>> + */
>> +char *strtok(char *str, const char *delim)
>> +{
>> +	static char *cursor;
>> +
>> +	if (str)
>> +		cursor = str;
>> +
>> +	if (!cursor)
>> +		return NULL;
>> +
>> +	cursor += strspn(cursor, delim);
>> +	if (*cursor == '\0') {
>> +		cursor = NULL;
>> +		return NULL;
>> +	}
>> +
>> +	return strsep(&cursor, delim);
>> +}
>> +EXPORT_SYMBOL(strtok);
>> +
>> +/**
>> + * strtokv - split string into array of tokens based on a delimiter set
>> + * @str: string to split
>> + * @delim: set of delimiter characters
>> + * @vecp: array of tokens
>> + *
>> + * Split @str into tokens delimited by @delim, using strtok(), and
>> + * store the allocated token array in @vecp, which the caller is
>> + * responsible for freeing.
>> + *
>> + * Return: The number of tokens in the array.
>> + */
>> +int strtokv(char *str, const char *delim, char ***vecp)
>> +{
>> +	char *tok, **vec = NULL;
>> +	int cnt = 0;
>> +
>> +
>> +	for (tok = strtok(str, delim); tok; tok = strtok(NULL, delim)) {
>> +		vec = xrealloc(vec, (cnt + 1) * sizeof(*vec));
>> +		vec[cnt++] = tok;
>> +	}
>> +
>> +	*vecp = vec;
>> +	return cnt;
>> +}
>> +EXPORT_SYMBOL(strtokv);
>> +
>>  #ifndef __HAVE_ARCH_STRSWAB
>>  /**
>>   * strswab - swap adjacent even and odd bytes in %NUL-terminated string
>
> -- 
> Pengutronix e.K.                           |                             |
> Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
> 31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
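
To make the strtok_r() idea a bit more concrete, here is a rough
userspace sketch of what I have in mind. It is not a proposed patch:
tok_r() and tokv() are hypothetical names (picked so they do not clash
with libc when testing on a host), and plain realloc() stands in for
barebox's xrealloc(). The only point is that the cursor lives in a
caller-supplied pointer instead of a static variable, while empty
tokens are still skipped.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical re-entrant tokenizer: same contract as strtok(), but the
 * position is kept in *saveptr rather than in a static variable.
 */
char *tok_r(char *str, const char *delim, char **saveptr)
{
	char *tok;

	assert(delim && saveptr);

	if (!str)
		str = *saveptr;
	if (!str)
		return NULL;

	/* Skip leading delimiters, so an empty token is never returned */
	str += strspn(str, delim);
	if (*str == '\0') {
		*saveptr = NULL;
		return NULL;
	}

	tok = str;

	/* Terminate the token at the next delimiter, if there is one */
	str += strcspn(str, delim);
	if (*str != '\0') {
		*str++ = '\0';
		*saveptr = str;
	} else {
		*saveptr = NULL;
	}

	return tok;
}

/*
 * Hypothetical collector built on tok_r(). Returns the number of tokens,
 * or -1 if the allocation fails (barebox's xrealloc() would not return).
 */
int tokv(char *str, const char *delim, char ***vecp)
{
	char *tok, *save = NULL, **vec = NULL, **tmp;
	int cnt = 0;

	for (tok = tok_r(str, delim, &save); tok;
	     tok = tok_r(NULL, delim, &save)) {
		tmp = realloc(vec, (cnt + 1) * sizeof(*vec));
		if (!tmp) {
			free(vec);
			*vecp = NULL;
			return -1;
		}
		vec = tmp;
		vec[cnt++] = tok;
	}

	*vecp = vec;
	return cnt;
}
```

With the input ",,a,,b," and "," as the delimiter set, tokv() collects
just "a" and "b", whereas a plain strsep() loop over the same input
would also hand back the empty strings between adjacent delimiters.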
