Re: [PATCH v4 02/11] lib/charset: add u16_strlcat() function

2022-04-28 Thread Masahisa Kojima
Hi Heinrich,

On Mon, 18 Apr 2022 at 16:47, Masahisa Kojima wrote:
>
> On Sat, 16 Apr 2022 at 16:32, Heinrich Schuchardt  wrote:
> >
> > On 3/24/22 14:54, Masahisa Kojima wrote:
> > > Provide u16 string version of strlcat().
> > >
> > > Signed-off-by: Masahisa Kojima 
> > > Reviewed-by: Simon Glass 
> > > ---
> > > Changes in v4:
> > > - add blank line above the return statement
> > >
> > > Changes in v2:
> > > - implement u16_strlcat(with the destination buffer size in argument)
> > >instead of u16_strcat
> > >
> > >   include/charset.h | 15 +++
> > >   lib/charset.c | 21 +
> > >   2 files changed, 36 insertions(+)
> > >
> > > diff --git a/include/charset.h b/include/charset.h
> > > index b93d023092..dc5fc275ec 100644
> > > --- a/include/charset.h
> > > +++ b/include/charset.h
> > > @@ -259,6 +259,21 @@ u16 *u16_strcpy(u16 *dest, const u16 *src);
> > >*/
> > >   u16 *u16_strdup(const void *src);
> > >
> > > +/**
> > > > + * u16_strlcat() - Append a length-limited, %NUL-terminated string to another
> > > + *
> > > + * Append the src string to the dest string, overwriting the terminating
> > > + * null word at the end of dest, and then adds a terminating null word.
> > > > + * It will append at most size - u16_strlen(dst) - 1 bytes, NUL-terminating the result.
> >
> > Why "- 1"?
>
> It is my mistake, it should be 2.
>
> >
> > If size is even, we append up to size - u16_strlen(dst) - 2 bytes. The
> > two extra bytes used for 0x.
> > If size is odd, we append up to size - u16_strlen(dst) - 3 bytes leaving
> > one byte of the buffer unused.

To keep the behavior simple, I will change the meaning of the third
parameter from the destination buffer size in bytes to a count of u16
characters.
This matches the behavior of the other u16_strxxx functions in U-Boot.
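
A minimal sketch of what a count-based variant could look like, for
discussion only. It is not the code from any later revision of this patch.
The names sketch_u16_strnlen() and sketch_u16_strlcat() and the local u16
typedef are assumptions made so that the example builds outside U-Boot, and
"count" is assumed to be the capacity of dest in u16 units including the
terminating null word.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint16_t u16;	/* stand-in for U-Boot's u16 type (assumption) */

/* Length of s in u16 units, inspecting at most count units. */
static size_t sketch_u16_strnlen(const u16 *s, size_t count)
{
	size_t n = 0;

	while (n < count && s[n])
		n++;

	return n;
}

/*
 * Hypothetical count-based concatenation: count is the capacity of dest
 * in u16 units, including the terminating null word.
 * Returns the length the full string would have had, in u16 units,
 * not counting the terminating null word.
 */
static size_t sketch_u16_strlcat(u16 *dest, const u16 *src, size_t count)
{
	size_t destlen = sketch_u16_strnlen(dest, count);
	size_t srclen = 0;
	size_t ret;

	while (src[srclen])
		srclen++;
	ret = destlen + srclen;

	/* dest is not terminated within count units, nothing can be added */
	if (destlen >= count)
		return ret;

	/* destlen < count here, so this subtraction cannot wrap around */
	if (srclen > count - destlen - 1)
		srclen = count - destlen - 1;

	memcpy(dest + destlen, src, srclen * sizeof(u16));
	dest[destlen + srclen] = 0;

	return ret;
}
```

Because every quantity is counted in u16 units, an odd byte size can no
longer occur, and a return value >= count still signals truncation.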

Thanks,
Masahisa Kojima
>
> Thanks, it clearly explains the behavior.
>
> >
> > > + *
> > > + * @dest:destination buffer (null terminated)
> > > + * @src: source buffer (null terminated)
> > > + * @size:destination buffer size in bytes
> >
> > s/$/ including the trailing 0x/
>
> OK, I will update "(null terminated)" to the suggested one.
>
> >
> > > + * Return:   total size of the created string in bytes.
> > > + *   If return value >= size, truncation occurred.
> > > + */
> > > +size_t u16_strlcat(u16 *dest, const u16 *src, size_t size);
> > > +
> > >   /**
> > >* utf16_to_utf8() - Convert an utf16 string to utf8
> > >*
> > > diff --git a/lib/charset.c b/lib/charset.c
> > > index f44c58d9d8..47997eca7d 100644
> > > --- a/lib/charset.c
> > > +++ b/lib/charset.c
> > > @@ -428,6 +428,27 @@ u16 *u16_strdup(const void *src)
> > >   return new;
> > >   }
> > >
> > > +size_t u16_strlcat(u16 *dest, const u16 *src, size_t size)
> > > +{
> >
> > If you start the function with
> >
> >  size >>= 1;
> >
> > or
> >
> >  size /= sizeof(u16);
> >
> > this might simplify the code.
>
> In u16_strlcat(), there are two size definitions, u16 string size and
> buffer size.
> I will rename some of the variables to clearly identify the meaning.
>
> >
> > > + size_t dstrlen = u16_strnlen(dest, size >> 1);
> > > + size_t dlen = dstrlen * sizeof(u16);
> > > + size_t len = u16_strlen(src) * sizeof(u16);
> > > + size_t ret = dlen + len;
> >
> > This misses the  trailing 0x.
>
> strlcat() is not a standard C function, but the Linux implementation
> of strlcat() does not count the trailing 0x00 in its return value [1],
> and the same is true for OpenBSD.
> [1] https://github.com/torvalds/linux/blob/master/lib/string.c#L319
>
> The current U-Boot strlcat() counts the trailing 0x00; I think it needs
> to be updated.
>
> Thanks,
> Masahisa Kojima
>
> >
> > Best regards
> >
> > Heinrich
> >
> > > +
> > > + if (dlen >= size)
> > > + return ret;
> > > +
> > > + dest += dstrlen;
> > > + size -= dlen;
> > > + if (len >= size)
> > > + len = size - sizeof(u16);
> > > +
> > > + memcpy(dest, src, len);
> > > + dest[len >> 1] = u'\0';
> > > +
> > > + return ret;
> > > +}
> > > +
> > >   /* Convert UTF-16 to UTF-8.  */
> > >   uint8_t *utf16_to_utf8(uint8_t *dest, const uint16_t *src, size_t size)
> > >   {
> >


Re: [PATCH v4 02/11] lib/charset: add u16_strlcat() function

2022-04-18 Thread Masahisa Kojima
On Sat, 16 Apr 2022 at 16:32, Heinrich Schuchardt  wrote:
>
> On 3/24/22 14:54, Masahisa Kojima wrote:
> > Provide u16 string version of strlcat().
> >
> > Signed-off-by: Masahisa Kojima 
> > Reviewed-by: Simon Glass 
> > ---
> > Changes in v4:
> > - add blank line above the return statement
> >
> > Changes in v2:
> > - implement u16_strlcat(with the destination buffer size in argument)
> >instead of u16_strcat
> >
> >   include/charset.h | 15 +++
> >   lib/charset.c | 21 +
> >   2 files changed, 36 insertions(+)
> >
> > diff --git a/include/charset.h b/include/charset.h
> > index b93d023092..dc5fc275ec 100644
> > --- a/include/charset.h
> > +++ b/include/charset.h
> > @@ -259,6 +259,21 @@ u16 *u16_strcpy(u16 *dest, const u16 *src);
> >*/
> >   u16 *u16_strdup(const void *src);
> >
> > +/**
> > + * u16_strlcat() - Append a length-limited, %NUL-terminated string to another
> > + *
> > + * Append the src string to the dest string, overwriting the terminating
> > + * null word at the end of dest, and then adds a terminating null word.
> > + * It will append at most size - u16_strlen(dst) - 1 bytes, NUL-terminating the result.
>
> Why "- 1"?

It is my mistake, it should be 2.

>
> If size is even, we append up to size - u16_strlen(dst) - 2 bytes. The
> two extra bytes used for 0x.
> If size is odd, we append up to size - u16_strlen(dst) - 3 bytes leaving
> one byte of the buffer unused.

Thanks, it clearly explains the behavior.

>
> > + *
> > + * @dest:destination buffer (null terminated)
> > + * @src: source buffer (null terminated)
> > + * @size:destination buffer size in bytes
>
> s/$/ including the trailing 0x/

OK, I will update "(null terminated)" to the suggested one.

>
> > + * Return:   total size of the created string in bytes.
> > + *   If return value >= size, truncation occurred.
> > + */
> > +size_t u16_strlcat(u16 *dest, const u16 *src, size_t size);
> > +
> >   /**
> >* utf16_to_utf8() - Convert an utf16 string to utf8
> >*
> > diff --git a/lib/charset.c b/lib/charset.c
> > index f44c58d9d8..47997eca7d 100644
> > --- a/lib/charset.c
> > +++ b/lib/charset.c
> > @@ -428,6 +428,27 @@ u16 *u16_strdup(const void *src)
> >   return new;
> >   }
> >
> > +size_t u16_strlcat(u16 *dest, const u16 *src, size_t size)
> > +{
>
> If you start the function with
>
>  size >>= 1;
>
> or
>
>  size /= sizeof(u16);
>
> this might simplify the code.

In u16_strlcat(), there are two size definitions, u16 string size and
buffer size.
I will rename some of the variables to clearly identify the meaning.

>
> > + size_t dstrlen = u16_strnlen(dest, size >> 1);
> > + size_t dlen = dstrlen * sizeof(u16);
> > + size_t len = u16_strlen(src) * sizeof(u16);
> > + size_t ret = dlen + len;
>
> This misses the  trailing 0x.

strlcat() is not a standard C function, but the Linux implementation
of strlcat() does not count the trailing 0x00 in its return value [1],
and the same is true for OpenBSD.
[1] https://github.com/torvalds/linux/blob/master/lib/string.c#L319

The current U-Boot strlcat() counts the trailing 0x00; I think it needs
to be updated.
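
To illustrate the return-value convention I am referring to, here is a
small byte-string sketch. It is neither the Linux nor the OpenBSD source;
sketch_strlcat() is a hypothetical name, and the code only follows the
convention that the return value is the length of the string the function
tried to create, without the terminating 0x00.

```c
#include <stddef.h>
#include <string.h>

/*
 * Illustrative strlcat()-style helper (hypothetical, not a library call).
 * size is the full size of dest in bytes, including the terminator.
 * Returns strlen(initial dest) + strlen(src), so the terminating 0x00 is
 * not counted. A return value >= size means the result was truncated.
 */
static size_t sketch_strlcat(char *dest, const char *src, size_t size)
{
	const char *end = memchr(dest, '\0', size);
	size_t dlen = end ? (size_t)(end - dest) : size;
	size_t slen = strlen(src);
	size_t n;

	/* dest is not terminated within size bytes, append nothing */
	if (dlen == size)
		return size + slen;

	/* copy what fits, keeping one byte for the terminator */
	n = size - dlen - 1;
	if (slen < n)
		n = slen;

	memcpy(dest + dlen, src, n);
	dest[dlen + n] = '\0';

	return dlen + slen;
}
```

With this convention, appending "de" to "abc" returns 5 whether the buffer
holds 8 bytes or only 5. In the second case the return value equals the
buffer size, which is how the caller detects the truncation.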

Thanks,
Masahisa Kojima

>
> Best regards
>
> Heinrich
>
> > +
> > + if (dlen >= size)
> > + return ret;
> > +
> > + dest += dstrlen;
> > + size -= dlen;
> > + if (len >= size)
> > + len = size - sizeof(u16);
> > +
> > + memcpy(dest, src, len);
> > + dest[len >> 1] = u'\0';
> > +
> > + return ret;
> > +}
> > +
> >   /* Convert UTF-16 to UTF-8.  */
> >   uint8_t *utf16_to_utf8(uint8_t *dest, const uint16_t *src, size_t size)
> >   {
>


Re: [PATCH v4 02/11] lib/charset: add u16_strlcat() function

2022-04-16 Thread Heinrich Schuchardt

On 3/24/22 14:54, Masahisa Kojima wrote:

Provide u16 string version of strlcat().

Signed-off-by: Masahisa Kojima 
Reviewed-by: Simon Glass 
---
Changes in v4:
- add blank line above the return statement

Changes in v2:
- implement u16_strlcat(with the destination buffer size in argument)
   instead of u16_strcat

  include/charset.h | 15 +++
  lib/charset.c | 21 +
  2 files changed, 36 insertions(+)

diff --git a/include/charset.h b/include/charset.h
index b93d023092..dc5fc275ec 100644
--- a/include/charset.h
+++ b/include/charset.h
@@ -259,6 +259,21 @@ u16 *u16_strcpy(u16 *dest, const u16 *src);
   */
  u16 *u16_strdup(const void *src);

+/**
+ * u16_strlcat() - Append a length-limited, %NUL-terminated string to another
+ *
+ * Append the src string to the dest string, overwriting the terminating
+ * null word at the end of dest, and then adds a terminating null word.
+ * It will append at most size - u16_strlen(dst) - 1 bytes, NUL-terminating the result.


Why "- 1"?

If size is even, we append up to size - u16_strlen(dst) - 2 bytes. The
two extra bytes used for 0x.
If size is odd, we append up to size - u16_strlen(dst) - 3 bytes leaving
one byte of the buffer unused.
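
The arithmetic can be sketched as follows. This is only an illustration,
max_append_bytes() is a hypothetical helper and not part of the patch, and
dest_units is assumed to be the current length of dest in u16 units.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many bytes of src fit into a destination
 * buffer of size bytes that already holds dest_units u16 characters.
 */
static size_t max_append_bytes(size_t size, size_t dest_units)
{
	/* only whole u16 units are usable, an odd trailing byte is lost */
	size_t capacity = size / sizeof(uint16_t);

	/* no room left, not even for the terminating null word */
	if (dest_units >= capacity)
		return 0;

	/* one unit is reserved for the terminating null word */
	return (capacity - dest_units - 1) * sizeof(uint16_t);
}
```

For a destination holding 2 characters (4 bytes), size = 10 and size = 11
both allow 4 bytes to be appended: 10 - 4 - 2 in the even case and
11 - 4 - 3 in the odd case.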


+ *
+ * @dest:  destination buffer (null terminated)
+ * @src:   source buffer (null terminated)
+ * @size:  destination buffer size in bytes


s/$/ including the trailing 0x/


+ * Return: total size of the created string in bytes.
+ * If return value >= size, truncation occurred.
+ */
+size_t u16_strlcat(u16 *dest, const u16 *src, size_t size);
+
  /**
   * utf16_to_utf8() - Convert an utf16 string to utf8
   *
diff --git a/lib/charset.c b/lib/charset.c
index f44c58d9d8..47997eca7d 100644
--- a/lib/charset.c
+++ b/lib/charset.c
@@ -428,6 +428,27 @@ u16 *u16_strdup(const void *src)
return new;
  }

+size_t u16_strlcat(u16 *dest, const u16 *src, size_t size)
+{


If you start the function with

size >>= 1;

or

size /= sizeof(u16);

this might simplify the code.


+   size_t dstrlen = u16_strnlen(dest, size >> 1);
+   size_t dlen = dstrlen * sizeof(u16);
+   size_t len = u16_strlen(src) * sizeof(u16);
+   size_t ret = dlen + len;


This misses the  trailing 0x.

Best regards

Heinrich


+
+   if (dlen >= size)
+      return ret;
+
+   dest += dstrlen;
+   size -= dlen;
+   if (len >= size)
+      len = size - sizeof(u16);
+
+   memcpy(dest, src, len);
+   dest[len >> 1] = u'\0';
+
+   return ret;
+}
+
  /* Convert UTF-16 to UTF-8.  */
  uint8_t *utf16_to_utf8(uint8_t *dest, const uint16_t *src, size_t size)
  {




Re: [PATCH v4 02/11] lib/charset: add u16_strlcat() function

2022-04-04 Thread Masahisa Kojima
Hi Heinrich,

On Sat, 2 Apr 2022 at 16:19, Heinrich Schuchardt  wrote:
>
> On 3/24/22 14:54, Masahisa Kojima wrote:
> > Provide u16 string version of strlcat().
> >
> > Signed-off-by: Masahisa Kojima 
> > Reviewed-by: Simon Glass 
> > ---
> > Changes in v4:
> > - add blank line above the return statement
> >
> > Changes in v2:
> > - implement u16_strlcat(with the destination buffer size in argument)
> >instead of u16_strcat
> >
> >   include/charset.h | 15 +++
> >   lib/charset.c | 21 +
> >   2 files changed, 36 insertions(+)
> >
> > diff --git a/include/charset.h b/include/charset.h
> > index b93d023092..dc5fc275ec 100644
> > --- a/include/charset.h
> > +++ b/include/charset.h
> > @@ -259,6 +259,21 @@ u16 *u16_strcpy(u16 *dest, const u16 *src);
> >*/
> >   u16 *u16_strdup(const void *src);
> >
> > +/**
> > + * u16_strlcat() - Append a length-limited, %NUL-terminated string to another
>
> The function should be called u16_strncat() in reference to the
> strncat() function.

I intended to implement a string concatenation function that checks the
destination buffer size, which is what u16_strlcat() does.
strncat() is not safe: its size parameter limits the number of characters
copied from the source, not the size of the destination buffer.
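
As a short illustration of the difference, here is how a caller has to use
the standard strncat() to stay within a destination buffer. The helper
append_bounded() is a hypothetical name used only for this example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: append src to dest, where dest_size is the full
 * size of the destination buffer in bytes.
 */
static void append_bounded(char *dest, size_t dest_size, const char *src)
{
	size_t used = strlen(dest);

	/* no room for even one more character plus the terminator */
	if (used + 1 >= dest_size)
		return;

	/*
	 * The third argument of strncat() limits the characters taken
	 * from src, so the caller must compute the remaining room.
	 */
	strncat(dest, src, dest_size - used - 1);
}
```

With a strlcat()-style interface the caller passes the buffer size directly
and this computation moves into the function.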

>
> > + *
> > + * Append the src string to the dest string, overwriting the terminating
> > + * null word at the end of dest, and then adds a terminating null word.
> > + * It will append at most size - u16_strlen(dst) - 1 bytes, NUL-terminating the result.
> > + *
> > + * @dest:destination buffer (null terminated)
> > + * @src: source buffer (null terminated)
> > + * @size:destination buffer size in bytes
> > + * Return:   total size of the created string in bytes.
> > + *   If return value >= size, truncation occurred.
> > + */
> > +size_t u16_strlcat(u16 *dest, const u16 *src, size_t size);
> > +
> >   /**
> >* utf16_to_utf8() - Convert an utf16 string to utf8
> >*
> > diff --git a/lib/charset.c b/lib/charset.c
> > index f44c58d9d8..47997eca7d 100644
> > --- a/lib/charset.c
> > +++ b/lib/charset.c
> > @@ -428,6 +428,27 @@ u16 *u16_strdup(const void *src)
> >   return new;
> >   }
> >
> > +size_t u16_strlcat(u16 *dest, const u16 *src, size_t size)
> > +{
> > + size_t dstrlen = u16_strnlen(dest, size >> 1);
> > + size_t dlen = dstrlen * sizeof(u16);
> > + size_t len = u16_strlen(src) * sizeof(u16);
> > + size_t ret = dlen + len;
> > +
> > + if (dlen >= size)
> > + return ret;
> > +
> > + dest += dstrlen;
> > + size -= dlen;
> > + if (len >= size)
> > + len = size - sizeof(u16);
>
> For size = dlen + 1 this results in
>
> len = SIZE_MAX = 0x
>
> Something must be missing in your unit test.

Yes, you are correct.
I need to handle the case where size is an odd number.

Thanks,
Masahisa Kojima

>
> Best regards
>
> Heinrich
>
> > +
> > + memcpy(dest, src, len);
> > + dest[len >> 1] = u'\0';
> > +
> > + return ret;
> > +}
> > +
> >   /* Convert UTF-16 to UTF-8.  */
> >   uint8_t *utf16_to_utf8(uint8_t *dest, const uint16_t *src, size_t size)
> >   {
>


Re: [PATCH v4 02/11] lib/charset: add u16_strlcat() function

2022-04-02 Thread Heinrich Schuchardt

On 3/24/22 14:54, Masahisa Kojima wrote:

Provide u16 string version of strlcat().

Signed-off-by: Masahisa Kojima 
Reviewed-by: Simon Glass 
---
Changes in v4:
- add blank line above the return statement

Changes in v2:
- implement u16_strlcat(with the destination buffer size in argument)
   instead of u16_strcat

  include/charset.h | 15 +++
  lib/charset.c | 21 +
  2 files changed, 36 insertions(+)

diff --git a/include/charset.h b/include/charset.h
index b93d023092..dc5fc275ec 100644
--- a/include/charset.h
+++ b/include/charset.h
@@ -259,6 +259,21 @@ u16 *u16_strcpy(u16 *dest, const u16 *src);
   */
  u16 *u16_strdup(const void *src);

+/**
+ * u16_strlcat() - Append a length-limited, %NUL-terminated string to another


The function should be called u16_strncat() in reference to the
strncat() function.


+ *
+ * Append the src string to the dest string, overwriting the terminating
+ * null word at the end of dest, and then adds a terminating null word.
+ * It will append at most size - u16_strlen(dst) - 1 bytes, NUL-terminating the result.
+ *
+ * @dest:  destination buffer (null terminated)
+ * @src:   source buffer (null terminated)
+ * @size:  destination buffer size in bytes
+ * Return: total size of the created string in bytes.
+ * If return value >= size, truncation occurred.
+ */
+size_t u16_strlcat(u16 *dest, const u16 *src, size_t size);
+
  /**
   * utf16_to_utf8() - Convert an utf16 string to utf8
   *
diff --git a/lib/charset.c b/lib/charset.c
index f44c58d9d8..47997eca7d 100644
--- a/lib/charset.c
+++ b/lib/charset.c
@@ -428,6 +428,27 @@ u16 *u16_strdup(const void *src)
return new;
  }

+size_t u16_strlcat(u16 *dest, const u16 *src, size_t size)
+{
+   size_t dstrlen = u16_strnlen(dest, size >> 1);
+   size_t dlen = dstrlen * sizeof(u16);
+   size_t len = u16_strlen(src) * sizeof(u16);
+   size_t ret = dlen + len;
+
+   if (dlen >= size)
+      return ret;
+
+   dest += dstrlen;
+   size -= dlen;
+   if (len >= size)
+      len = size - sizeof(u16);


For size = dlen + 1 this results in

len = SIZE_MAX = 0x

Something must be missing in your unit test.
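
The wrap-around can be reproduced in isolation. In the sketch below,
clamp_len_v4() mirrors the clamping step of the patch with all sizes in
bytes, and clamp_len_guarded() shows one possible guard. Both function
names are hypothetical and only serve as an illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the clamp in the patch: size, dlen and len are in bytes. */
static size_t clamp_len_v4(size_t size, size_t dlen, size_t len)
{
	size -= dlen;
	if (len >= size)
		len = size - sizeof(uint16_t);	/* wraps when size == 1 */

	return len;
}

/* One possible guard: drop an odd trailing byte before subtracting. */
static size_t clamp_len_guarded(size_t size, size_t dlen, size_t len)
{
	size_t room;

	size -= size % sizeof(uint16_t);

	/* no room left, not even for the terminating null word */
	if (dlen + sizeof(uint16_t) > size)
		return 0;

	room = size - dlen - sizeof(uint16_t);

	return len < room ? len : room;
}
```

With size = 5, dlen = 4 and len = 6 the first function returns SIZE_MAX,
while the guarded one returns 0.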

Best regards

Heinrich


+
+   memcpy(dest, src, len);
+   dest[len >> 1] = u'\0';
+
+   return ret;
+}
+
  /* Convert UTF-16 to UTF-8.  */
  uint8_t *utf16_to_utf8(uint8_t *dest, const uint16_t *src, size_t size)
  {




[PATCH v4 02/11] lib/charset: add u16_strlcat() function

2022-03-24 Thread Masahisa Kojima
Provide u16 string version of strlcat().

Signed-off-by: Masahisa Kojima 
Reviewed-by: Simon Glass 
---
Changes in v4:
- add blank line above the return statement

Changes in v2:
- implement u16_strlcat(with the destination buffer size in argument)
  instead of u16_strcat

 include/charset.h | 15 +++
 lib/charset.c | 21 +
 2 files changed, 36 insertions(+)

diff --git a/include/charset.h b/include/charset.h
index b93d023092..dc5fc275ec 100644
--- a/include/charset.h
+++ b/include/charset.h
@@ -259,6 +259,21 @@ u16 *u16_strcpy(u16 *dest, const u16 *src);
  */
 u16 *u16_strdup(const void *src);
 
+/**
+ * u16_strlcat() - Append a length-limited, %NUL-terminated string to another
+ *
+ * Append the src string to the dest string, overwriting the terminating
+ * null word at the end of dest, and then adds a terminating null word.
+ * It will append at most size - u16_strlen(dst) - 1 bytes, NUL-terminating the result.
+ *
+ * @dest:  destination buffer (null terminated)
+ * @src:   source buffer (null terminated)
+ * @size:  destination buffer size in bytes
+ * Return: total size of the created string in bytes.
+ * If return value >= size, truncation occurred.
+ */
+size_t u16_strlcat(u16 *dest, const u16 *src, size_t size);
+
 /**
  * utf16_to_utf8() - Convert an utf16 string to utf8
  *
diff --git a/lib/charset.c b/lib/charset.c
index f44c58d9d8..47997eca7d 100644
--- a/lib/charset.c
+++ b/lib/charset.c
@@ -428,6 +428,27 @@ u16 *u16_strdup(const void *src)
return new;
 }
 
+size_t u16_strlcat(u16 *dest, const u16 *src, size_t size)
+{
+   size_t dstrlen = u16_strnlen(dest, size >> 1);
+   size_t dlen = dstrlen * sizeof(u16);
+   size_t len = u16_strlen(src) * sizeof(u16);
+   size_t ret = dlen + len;
+
+   if (dlen >= size)
+      return ret;
+
+   dest += dstrlen;
+   size -= dlen;
+   if (len >= size)
+      len = size - sizeof(u16);
+
+   memcpy(dest, src, len);
+   dest[len >> 1] = u'\0';
+
+   return ret;
+}
+
 /* Convert UTF-16 to UTF-8.  */
 uint8_t *utf16_to_utf8(uint8_t *dest, const uint16_t *src, size_t size)
 {
-- 
2.17.1