Re: [PATCH] lib: fix buggy strcmp and strncmp

2022-10-27 Thread Tom Rini
On Wed, Oct 05, 2022 at 11:09:25AM +0200, Rasmus Villemoes wrote:

> There are two problems with both strcmp and strncmp:
> 
> (1) The C standard is clear that the contents should be compared as
> "unsigned char":
> 
>   The sign of a nonzero value returned by the comparison functions
>   memcmp, strcmp, and strncmp is determined by the sign of the
>   difference between the values of the first pair of characters (both
>   interpreted as unsigned char) that differ in the objects being
>   compared.
> 
> (2) The difference between two char (or unsigned char) values can
> range from -255 to +255; so that's (due to integer promotion) the
> range of values we could get in the *cs-*ct expressions, but when that
> is then shoe-horned into an 8-bit quantity the sign may of course
> change.
> 
> The impact is somewhat limited by the way these functions
> are used in practice:
> 
> - Most of the time, one is only interested in equality (or for
>   strncmp, "starts with"), and the existing functions do correctly
>   return 0 if and only if the strings are equal [for strncmp, up to
>   the given bound].
> 
> - Also most of the time, the strings being compared only consist of
>   ASCII characters, i.e. have values in the range [0, 127], and in
>   that case it doesn't matter if they are interpreted as signed or
>   unsigned char, and the possible difference range is bounded to
>   [-127, 127] which does fit the signed char.
> 
> For size, one could implement strcmp() in terms of strncmp() - just
> make it "return strncmp(a, b, (size_t)-1);". However, performance of
> strcmp() does matter somewhat, since it is used all over when parsing
> and matching DT nodes and properties, so let's find some other place
> to save those ~30 bytes.
> 
> Signed-off-by: Rasmus Villemoes 

Applied to u-boot/master, thanks!

-- 
Tom


signature.asc
Description: PGP signature


Re: [PATCH] lib: fix buggy strcmp and strncmp

2022-10-25 Thread Rasmus Villemoes
On 05/10/2022 11.09, Rasmus Villemoes wrote:
> There are two problems with both strcmp and strncmp:
> 

Hi Tom

Could you consider applying this now to give it ample time to soak
before release? I'm pretty confident it's correct.

Rasmus


[PATCH] lib: fix buggy strcmp and strncmp

2022-10-05 Thread Rasmus Villemoes
There are two problems with both strcmp and strncmp:

(1) The C standard is clear that the contents should be compared as
"unsigned char":

  The sign of a nonzero value returned by the comparison functions
  memcmp, strcmp, and strncmp is determined by the sign of the
  difference between the values of the first pair of characters (both
  interpreted as unsigned char) that differ in the objects being
  compared.

(2) The difference between two char (or unsigned char) values can
range from -255 to +255; so that's (due to integer promotion) the
range of values we could get in the *cs-*ct expressions, but when that
is then shoe-horned into an 8-bit quantity the sign may of course
change.

The impact is somewhat limited by the way these functions
are used in practice:

- Most of the time, one is only interested in equality (or for
  strncmp, "starts with"), and the existing functions do correctly
  return 0 if and only if the strings are equal [for strncmp, up to
  the given bound].

- Also most of the time, the strings being compared only consist of
  ASCII characters, i.e. have values in the range [0, 127], and in
  that case it doesn't matter if they are interpreted as signed or
  unsigned char, and the possible difference range is bounded to
  [-127, 127] which does fit the signed char.

For size, one could implement strcmp() in terms of strncmp() - just
make it "return strncmp(a, b, (size_t)-1);". However, performance of
strcmp() does matter somewhat, since it is used all over when parsing
and matching DT nodes and properties, so let's find some other place
to save those ~30 bytes.

Signed-off-by: Rasmus Villemoes 
---

Please double- and triple-check before applying. I've tested these
against my libc's strcmp() and strncmp() for thousands of random
strings, but I may very well have messed up when copy-pasting back to
lib/string.c.

 lib/string.c | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/lib/string.c b/lib/string.c
index 78bd65c413..ecea755f40 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -206,16 +206,20 @@ size_t strlcat(char *dest, const char *src, size_t size)
  * @cs: One string
  * @ct: Another string
  */
-int strcmp(const char * cs,const char * ct)
+int strcmp(const char *cs, const char *ct)
 {
-   register signed char __res;
+   int ret;
 
while (1) {
-   if ((__res = *cs - *ct++) != 0 || !*cs++)
+   unsigned char a = *cs++;
+   unsigned char b = *ct++;
+
+   ret = a - b;
+   if (ret || !b)
break;
}
 
-   return __res;
+   return ret;
 }
 #endif
 
@@ -226,17 +230,20 @@ int strcmp(const char * cs,const char * ct)
  * @ct: Another string
  * @count: The maximum number of bytes to compare
  */
-int strncmp(const char * cs,const char * ct,size_t count)
+int strncmp(const char *cs, const char *ct, size_t count)
 {
-   register signed char __res = 0;
+   int ret = 0;
+
+   while (count--) {
+   unsigned char a = *cs++;
+   unsigned char b = *ct++;
 
-   while (count) {
-   if ((__res = *cs - *ct++) != 0 || !*cs++)
+   ret = a - b;
+   if (ret || !b)
break;
-   count--;
}
 
-   return __res;
+   return ret;
 }
 #endif
 
-- 
2.37.2