locale

Taylor R Campbell Fri, 16 Aug 2024 17:32:26 -0700

Module Name:    src
Committed By:   riastradh
Date:           Sat Aug 17 00:32:19 UTC 2024


Modified Files:
        src/lib/libc/locale: c8rtomb.3

Log Message:
c8rtomb(3): Clarify prose and fix example in caveat.

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb


To generate a diff of this commit:
cvs rdiff -u -r1.4 -r1.5 src/lib/libc/locale/c8rtomb.3

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

Modified files:

Index: src/lib/libc/locale/c8rtomb.3
diff -u src/lib/libc/locale/c8rtomb.3:1.4 src/lib/libc/locale/c8rtomb.3:1.5
--- src/lib/libc/locale/c8rtomb.3:1.4	Fri Aug 16 23:34:25 2024
+++ src/lib/libc/locale/c8rtomb.3	Sat Aug 17 00:32:19 2024
@@ -1,4 +1,4 @@
-.\"	$NetBSD: c8rtomb.3,v 1.4 2024/08/16 23:34:25 riastradh Exp $
+.\"	$NetBSD: c8rtomb.3,v 1.5 2024/08/17 00:32:19 riastradh Exp $
 .\"
 .\" Copyright (c) 2024 The NetBSD Foundation, Inc.
 .\" All rights reserved.
@@ -30,7 +30,7 @@
 .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
 .Sh NAME
 .Nm c8rtomb
-.Nd Restartable UTF-8 code unit to multibyte conversion
+.Nd Restartable UTF-8 to multibyte conversion
 .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
 .Sh LIBRARY
 .Lb libc
@@ -49,37 +49,52 @@
 .Sh DESCRIPTION
 The
 .Nm
-function attempts to encode Unicode input as a multibyte character
-sequence output at
-.Fa s
-in the current locale, writing anywhere between zero and
-.Dv MB_CUR_MAX
-bytes, inclusive, to
-.Fa s ,
-depending on the inputs and conversion state
-.Fa ps .
+function decodes UTF-8 and converts it to multibyte characters in the
+current locale, keeping state so it can restart after incremental
+progress.
 .Pp
-The input
-.Fa c8
-is a UTF-8 code unit.
-Successive calls to
+Each call to
 .Nm
-must provide well-formed UTF-8 code unit sequences.
-If
+updates the conversion state
+.Fa ps
+with a UTF-8 code unit
 .Fa c8 ,
-when appended to the sequence of code units passed in previous calls
+writes up to
+.Dv MB_CUR_MAX
+bytes to
+.Fa s
+(possibly none), and returns either the number of bytes written to
+.Fa s
+or
+.Li (size_t)-1
+to denote error.
+.Pp
+Over successive calls to
+.Nm
 with the same state
 .Fa ps ,
+the sequence of
+.Fa c8
+values must be a well-formed UTF-8 code unit sequence.
+If
+.Fa c8 ,
+when appended to the sequence of code units passed in previous calls,
 does not form a well-formed UTF-8 code unit sequence, then
 .Nm
-will return
+returns
 .Li (size_t)-1
-to denote failure with
+with
 .Xr errno 2
 set to
 .Er EILSEQ .
 .Pp
 If
+.Fa s
+is a null pointer, no output is stored, but the effects on
+.Fa ps
+and the return value are unchanged.
+.Pp
+If
 .Fa ps
 is a null pointer,
 .Nm
@@ -191,14 +206,14 @@ followed by a NUL:
 c8rtomb(s, 0xf0, ps);
 c8rtomb(s, 0x9f, ps);
 c8rtomb(s, 0x92, ps);
-c8rtomb(s, L'\e0', ps);
+c8rtomb(s, '\e0', ps);
 .Ed
 .Pp
 Currently this fails with
 .Er EILSEQ
 which matches other implementations, but this is at odds with language
 in the standard which suggests that passing
-.Li L'\e0'
+.Li '\e0'
 should unconditionally store a null byte and reset
 .Fa ps
 to the initial conversion state:
