On Mon, 27 Nov 2017 08:01:25 +0000, Otto Moerbeek wrote:
> On Mon, Nov 27, 2017 at 05:48:14AM +0000, kshe wrote:
> > On Mon, 27 Nov 2017 00:42:01 +0000, Theo de Raadt wrote:
> > > This needs worst-case performance measurements.
> >
> > The only instances where performance could be a problem are in
> > vfprintf.c and vfwprintf.c, where the calls happen inside a loop; but
> > for these, in fact, the best solution would be to use recallocarray
> > directly: rather than repeatedly freeing (and clearing) `convbuf' and
> > reallocating a fresh one every time it is needed, it should be kept
> > around and passed to __wcsconv() and __mbsconv(), along with some
> > accounting of its current size, so that these functions could then call
> > recallocarray appropriately.  However, this optimisation would be more
> > intrusive than the diff I submitted, and would therefore demand greater
> > familiarity with stdio's source, which I have not yet acquired.
> >
> > That being said, in all other cases, since the way stdio functions fill
> > their buffers is much more costly than simply writing a bunch of zeros
> > in sequence (or merely calling munmap(2)), no real slowdown is to be
> > expected.  I should also point out that recallocarray is currently used
> > inside several loops in fvwrite.c, fgetln.c and getdelim.c, but no one
> > complained that the affected functions became too slow, because doing
> > things, as stdio does, like reading from and writing to disk or decoding
> > and converting user input will always dominate the effect of a few calls
> > to discard temporary buffers.
> >
> > Regards,
> >
> > kshe
>
> I was thinking: only point against your statement is that there could
> be cases where a very large buffer is used, but only a small part of
> it is used. In that case, clearing the buffer is more costly. Luckily,
> in those cases malloc just calls munmap(2) without clearing.
>
> Still, I would like to see benchmarks to validate assumptions.

Well, I looked under regress/lib/libc/, but nothing here seems to test
for pathological instances where performance is known to be an issue.  I
would happily run benchmarks if someone could provide examples of such
cases.

In the mean time, here is a better diff for vfprintf.c and vfwprintf.c,
which should eliminate all potential performance regressions,
implementing the optimisation described in my previous message.  In
fact, can recallocarray be faster than plain free followed by calloc?
If so, in the case of vfwprintf.c, this patch may actually improve
performance.  At least, it shall not degrade it.

Nevertheless, this needs to be reviewed carefully since, as stated
above, I do not have much familiarity with stdio's intricate code.

Index: vfprintf.c
===================================================================
RCS file: /cvs/src/lib/libc/stdio/vfprintf.c,v
retrieving revision 1.77
diff -u -p -r1.77 vfprintf.c
--- vfprintf.c  29 Aug 2016 12:20:57 -0000      1.77
+++ vfprintf.c  28 Nov 2017 04:06:43 -0000
@@ -153,7 +153,7 @@ __sbprintf(FILE *fp, const char *fmt, va
  * string is null-terminated.
  */
 static char *
-__wcsconv(wchar_t *wcsarg, int prec)
+__wcsconv(wchar_t *wcsarg, int prec, char *oconvbuf, size_t *convbuf_size)
 {
        mbstate_t mbs;
        char buf[MB_LEN_MAX];
@@ -191,18 +191,19 @@ __wcsconv(wchar_t *wcsarg, int prec)
                                return (NULL);
                }
        }
-       if ((convbuf = malloc(nbytes + 1)) == NULL)
+       convbuf = recallocarray(oconvbuf, *convbuf_size, nbytes + 1, 1);
+       if (convbuf == NULL)
                return (NULL);
+       *convbuf_size = nbytes + 1;
 
        /* Fill the output buffer. */
        p = wcsarg;
        memset(&mbs, 0, sizeof(mbs));
-       if ((nbytes = wcsrtombs(convbuf, (const wchar_t **)&p,
-           nbytes, &mbs)) == (size_t)-1) {
-               free(convbuf);
+       nbytes = wcsrtombs(convbuf, (const wchar_t **)&p, nbytes, &mbs);
+       if (nbytes == (size_t)-1) {
+               freezero(convbuf, *convbuf_size);
                return (NULL);
        }
-       convbuf[nbytes] = '\0';
        return (convbuf);
 }
 #endif
@@ -330,6 +331,7 @@ __vfprintf(FILE *fp, const char *fmt0, _
        va_list orgap;          /* original argument pointer */
 #ifdef PRINTF_WIDE_CHAR
        char *convbuf;          /* buffer for wide to multi-byte conversion */
+       size_t convbuf_size;
 #endif
 
        /*
@@ -857,8 +859,6 @@ fp_common:
                        if (flags & LONGINT) {
                                wchar_t *wcp;
 
-                               free(convbuf);
-                               convbuf = NULL;
                                if ((wcp = GETARG(wchar_t *)) == NULL) {
                                        struct syslog_data sdata = 
SYSLOG_DATA_INIT;
                                        int save_errno = errno;
@@ -867,9 +867,13 @@ fp_common:
                                            "vfprintf %%ls NULL in \"%s\"", 
fmt0);
                                        errno = save_errno;
 
+                                       freezero(convbuf, convbuf_size);
+                                       convbuf = NULL;
+
                                        cp = "(null)";
                                } else {
-                                       convbuf = __wcsconv(wcp, prec);
+                                       convbuf = __wcsconv(wcp, prec, convbuf,
+                                           &convbuf_size);
                                        if (convbuf == NULL) {
                                                ret = -1;
                                                goto error;
@@ -1087,7 +1091,7 @@ overflow:
 
 finish:
 #ifdef PRINTF_WIDE_CHAR
-       free(convbuf);
+       freezero(convbuf, convbuf_size);
 #endif
 #ifdef FLOATING_POINT
        if (dtoaresult)
Index: vfwprintf.c
===================================================================
RCS file: /cvs/src/lib/libc/stdio/vfwprintf.c,v
retrieving revision 1.18
diff -u -p -r1.18 vfwprintf.c
--- vfwprintf.c 15 Aug 2017 00:20:39 -0000      1.18
+++ vfwprintf.c 28 Nov 2017 04:06:43 -0000
@@ -164,7 +164,7 @@ __xfputwc(wchar_t wc, FILE *fp)
  * To find out what happened check errno for ENOMEM, EILSEQ and EINVAL.
  */
 static wchar_t *
-__mbsconv(char *mbsarg, int prec)
+__mbsconv(char *mbsarg, int prec, wchar_t *oconvbuf, size_t *convbuf_size)
 {
        mbstate_t mbs;
        wchar_t *convbuf, *wcp;
@@ -205,9 +205,11 @@ __mbsconv(char *mbsarg, int prec)
         * converting at most `size' bytes of the input multibyte string to
         * wide characters for printing.
         */
-       convbuf = calloc(insize + 1, sizeof(*convbuf));
+       convbuf = recallocarray(oconvbuf, *convbuf_size, insize + 1,
+           sizeof(*convbuf));
        if (convbuf == NULL)
                return (NULL);
+       *convbuf_size = insize + 1;
        wcp = convbuf;
        p = mbsarg;
        bzero(&mbs, sizeof(mbs));
@@ -221,7 +223,7 @@ __mbsconv(char *mbsarg, int prec)
                insize -= nconv;
        }
        if (nconv == (size_t)-1 || nconv == (size_t)-2) {
-               free(convbuf);
+               freezero(convbuf, *convbuf_size * sizeof(*convbuf));
                return (NULL);
        }
        *wcp = '\0';
@@ -333,6 +335,7 @@ __vfwprintf(FILE * __restrict fp, const 
        int nextarg;            /* 1-based argument index */
        va_list orgap;          /* original argument pointer */
        wchar_t *convbuf;       /* buffer for multibyte to wide conversion */
+       size_t convbuf_size;
 
        /*
         * Choose PADSIZE to trade efficiency vs. size.  If larger printf
@@ -668,8 +671,8 @@ reswitch:   switch (ch) {
                                prec = dtoaend - dtoaresult;
                        if (expt == INT_MAX)
                                ox[1] = '\0';
-                       free(convbuf);
-                       cp = convbuf = __mbsconv(dtoaresult, -1);
+                       cp = convbuf = __mbsconv(dtoaresult, -1, convbuf,
+                           &convbuf_size);
                        if (cp == NULL)
                                goto error;
                        ndig = dtoaend - dtoaresult;
@@ -717,8 +720,8 @@ fp_begin:
                                if (expt == 9999)
                                        expt = INT_MAX;
                        }
-                       free(convbuf);
-                       cp = convbuf = __mbsconv(dtoaresult, -1);
+                       cp = convbuf = __mbsconv(dtoaresult, -1, convbuf,
+                           &convbuf_size);
                        if (cp == NULL)
                                goto error;
                        ndig = dtoaend - dtoaresult;
@@ -839,8 +842,8 @@ fp_common:
 
                                        mbsarg = "(null)";
                                }
-                               free(convbuf);
-                               convbuf = __mbsconv(mbsarg, prec);
+                               convbuf = __mbsconv(mbsarg, prec, convbuf,
+                                   &convbuf_size);
                                if (convbuf == NULL) {
                                        fp->_flags |= __SERR;
                                        goto error;
@@ -1055,7 +1058,7 @@ overflow:
        ret = -1;
 
 finish:
-       free(convbuf);
+       freezero(convbuf, convbuf_size * sizeof(*convbuf));
 #ifdef FLOATING_POINT
        if (dtoaresult)
                __freedtoa(dtoaresult);

Regards,

kshe

Reply via email to