The counting sort in flush_line() uses int for the count array,
save, and tot variables. When reverse line feeds (\v) cause many
characters to be placed at the same column, these overflow.
The sort counts the characters in each column, then converts the
counts into running totals that serve as start indices into the
sorted output array:

    static int *count, save, tot;
    ...
    for (i = nchars, c = l->l_line; i-- > 0; c++)
        count[c->c_column]++;
    ...
    for (tot = 0, i = 0; i <= l->l_max_col; i++) {
        save = count[i];
        count[i] = tot;
        tot += save;
    }

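For illustration only, the same technique as a standalone sketch with
size_t counters. struct cell and sort_by_column() are hypothetical
names, not taken from col.c, and the third pass is the usual way to
finish a counting sort rather than a copy of the code in flush_line():

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for col's CHAR: a character and its column. */
struct cell {
	size_t column;
	char ch;
};

/*
 * Stable counting sort of in[0..n) by column into out[0..n).
 * max_col must be the largest column present in the input.
 * Returns 0 on success, -1 if the count array cannot be allocated.
 */
static int
sort_by_column(const struct cell *in, struct cell *out, size_t n,
    size_t max_col)
{
	size_t *count, i, save, tot;

	if (max_col == SIZE_MAX)	/* max_col + 1 would wrap to 0 */
		return -1;
	count = calloc(max_col + 1, sizeof(*count));
	if (count == NULL)
		return -1;

	/* Pass 1: count the characters in each column. */
	for (i = 0; i < n; i++) {
		assert(in[i].column <= max_col);
		count[in[i].column]++;
	}

	/* Pass 2: replace each count with the start index of its column. */
	for (tot = 0, i = 0; i <= max_col; i++) {
		save = count[i];
		count[i] = tot;
		tot += save;
	}

	/* Pass 3: copy each character to the next free slot of its column. */
	for (i = 0; i < n; i++)
		out[count[in[i].column]++] = in[i];

	free(count);
	return 0;
}
```

Because every counter is size_t, none of them can exceed the range
needed to index an array of n elements.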
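nchars (the total character count) is size_t. The per-column
counts in count[] sum to nchars, and tot reaches nchars by the end
of the second loop. When nchars exceeds INT_MAX, tot += save
overflows signed int, which is undefined behavior;
count[c->c_column]++ overflows as well if a single column holds
more than INT_MAX characters.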
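The arithmetic can be shown without invoking undefined behavior. The
sketch below sums per-column counts in an int, as the old code did, but
checks each addition first. int_prefix_total() is a hypothetical
helper written for this message, not part of col.c:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Sum count[0..ncols) in an int, checking before each addition.
 * Counts are assumed to be non-negative.
 * Returns 0 and stores the sum in *total on success, or -1 at the
 * first step where "tot += count[i]" would exceed INT_MAX.
 */
static int
int_prefix_total(const int *count, size_t ncols, int *total)
{
	int tot = 0;
	size_t i;

	for (i = 0; i < ncols; i++) {
		assert(count[i] >= 0);
		if (count[i] > INT_MAX - tot)	/* addition would overflow */
			return -1;
		tot += count[i];
	}
	*total = tot;
	return 0;
}
```

With counts of INT_MAX and 1 the helper reports failure, while the same
sum held in a size_t is simply (size_t)INT_MAX + 1.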
The 5-byte input "1\v2\n\n" triggers the overflow: the vertical
tab causes characters to be placed at the same column via reverse
line feed, requiring the sort path.
Fix: change count, save, and tot from int to size_t, matching the
existing nchars type. Also use sizeof(*count) in the allocation,
consistent with the memset() call that follows it.
Found by AFL++ fuzzing with UBSan.
Index: usr.bin/col/col.c
===================================================================
RCS file: /cvs/src/usr.bin/col/col.c,v
retrieving revision 1.20
diff -u -p -r1.20 col.c
--- usr.bin/col/col.c 4 Dec 2022 23:50:47 -0000 1.20
+++ usr.bin/col/col.c
@@ -388,7 +388,7 @@ flush_line(LINE *l)
if (l->l_needs_sort) {
static CHAR *sorted;
static size_t count_size, i, sorted_size;
- static int *count, save, tot;
+ static size_t *count, save, tot;
/*
* Do an O(n) sort on l->l_line by column being careful to
@@ -402,7 +402,7 @@ flush_line(LINE *l)
if (l->l_max_col >= count_size) {
count_size = l->l_max_col + 1;
count = xreallocarray(count,
- count_size, sizeof(int));
+ count_size, sizeof(*count));
}
memset(count, 0, sizeof(*count) * (l->l_max_col + 1));
for (i = nchars, c = l->l_line; i-- > 0; c++)