Pádraig Brady <[email protected]> writes:

> On 24/08/2025 20:15, Collin Funk wrote:
>> Pádraig Brady <[email protected]> writes:
>> 
>>> I don't think getline() is a generic enough interface,
>>> especially for a utility used to introduce new lines.
>>>
>>> Consider for example you want 80x10 lines of "y",
>>> you could do this previously with:
>>>
>>>    $ yes | tr -d '\n' | fold -w 80 | head -n 10
>>>
>>> While now fold will just consume all of memory.
>>> Instead you might fread() to an IO_BUFSIZE buffer?
>> 
>> Good point.
>> 
>> Using fread is also faster since it doesn't check for the delimiter upon
>> reading a character, and doesn't need to lock the stream each call
>> (since we use fread_unlocked).
>> 
>> Will push the attached in a bit.
>
> Cool, that was quick!
> You might include something like the following test.

Actually, I realized that it is pointless to malloc line_out now. Since
the input buffer is a fixed size and we do not do case conversion or
anything that may change the number of bytes needed to represent the
output.

Will push this v2 patch in a bit instead.

> diff --git a/tests/fold/fold-characters.sh 
> b/tests/fold/fold-characters.shindex 0b22aad6b..a6cc39b73 100755
> --- a/tests/fold/fold-characters.sh
> +++ b/tests/fold/fold-characters.sh
> @@ -58,4 +58,10 @@ compare column-exp2 column-out2 || fail=1
>   fold --characters -w 10 input2 > character-out2 || fail=1
>   compare character-exp2 character-out2 || fail=1
>
> +# Ensure bounded memory operation
> +vm=$(get_min_ulimit_v_ fold /dev/null) && {
> +  yes | tr -d '\n' | (ulimit -v $(($vm+8000)) && fold 2>err) | head -n10 || 
> fail=1
> +  compare /dev/null err || fail=1
> +}
> +
>   Exit $fail

Thanks, I'll probably add this if I implement the --unicode option.
Since that will certainly require allocating memory based on my skimming
of unilbrk.h.

Collin

>From fb9016d50505b2cb66179dd526249063bc393846 Mon Sep 17 00:00:00 2001
Message-ID: <fb9016d50505b2cb66179dd526249063bc393846.1756069082.git.collin.fu...@gmail.com>
From: Collin Funk <[email protected]>
Date: Sun, 24 Aug 2025 12:05:41 -0700
Subject: [PATCH v2] fold: use fread instead of getline

* src/fold.c: Include ioblksize.h.
(fold_file): Use two IO_BUFSIZE-sized buffers. Use fread instead of
getline. Check for if we reached the end of file.
---
 src/fold.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/src/fold.c b/src/fold.c
index 153ce15a6..7bf30cd0b 100644
--- a/src/fold.c
+++ b/src/fold.c
@@ -25,6 +25,7 @@
 
 #include "system.h"
 #include "fadvise.h"
+#include "ioblksize.h"
 #include "mcel.h"
 #include "xdectoint.h"
 
@@ -136,11 +137,9 @@ fold_file (char const *filename, size_t width)
   FILE *istream;
   size_t column = 0;		/* Screen column where next char will go. */
   idx_t offset_out = 0;		/* Index in 'line_out' for next char. */
-  static char *line_out = nullptr;
-  static idx_t allocated_out = 0;
-  static char *line_in = nullptr;
-  static size_t allocated_in = 0;
-  static ssize_t length_in = 0;
+  static char line_out[IO_BUFSIZE];
+  static char line_in[IO_BUFSIZE];
+  static size_t length_in = 0;
   int saved_errno;
 
   if (STREQ (filename, "-"))
@@ -159,7 +158,7 @@ fold_file (char const *filename, size_t width)
 
   fadvise (istream, FADVISE_SEQUENTIAL);
 
-  while (0 <= (length_in = getline (&line_in, &allocated_in, istream)))
+  while (0 < (length_in = fread (line_in, 1, sizeof line_in, istream)))
     {
       char *p = line_in;
       char *lim = p + length_in;
@@ -167,9 +166,6 @@ fold_file (char const *filename, size_t width)
       for (; p < lim; p += g.len)
         {
           g = mcel_scan (p, lim);
-          if (allocated_out - offset_out <= g.len)
-            line_out = xpalloc (line_out, &allocated_out, g.len, -1,
-                                sizeof *line_out);
           if (g.ch == '\n')
             {
               memcpy (line_out + offset_out, p, g.len);
@@ -243,6 +239,8 @@ fold_file (char const *filename, size_t width)
           memcpy (line_out + offset_out, p, g.len);
           offset_out += g.len;
         }
+      if (feof (istream))
+        break;
     }
 
   saved_errno = errno;
-- 
2.51.0

Reply via email to