The compile_delimited() function in sed(1) reads one byte past
the input buffer when a backslash escape appears at the end of
the input line.
At lines 382-383, when processing a backslash followed by a
character that is neither the delimiter nor 'n', two bytes are
copied unconditionally:

	*d++ = *p++; /* copies the backslash */
	*d++ = *p++; /* copies the next character */

The outer while loop at line 365 only checks that p[0] != '\0'.
If the backslash is the last character of the input, the byte
after it is the null terminator. The second *p++ then copies
the null byte and advances p past it, so on the next iteration
the loop condition reads p[0] one byte past the allocated buffer.
A sed script consisting of a regex with ~1000 backslashes
triggers this: the getline() buffer is exactly 1024 bytes,
and the last backslash's escape sequence extends past it.
Fix: check that the character after the backslash is not the
null terminator before copying it.
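
As a rough illustration of the guarded copy, the standalone sketch
below models only the escape-copying loop. copy_escapes() is a
hypothetical helper written for this note, not the code in
compile.c; it omits the delimiter and \n handling and assumes the
destination is large enough for the whole input.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified model of the escape-copying loop (hypothetical helper,
 * not the real compile_delimited()). Copies src into dst, passing
 * backslash escapes through unchanged, and returns the number of
 * bytes consumed from src, excluding the terminating NUL.
 * dst must have room for strlen(src) + 1 bytes.
 */
size_t
copy_escapes(const char *src, char *dst)
{
	const char *p = src;
	char *d = dst;

	assert(src != NULL && dst != NULL);
	while (*p != '\0') {
		if (*p == '\\') {
			*d++ = *p++;		/* copy the backslash */
			if (*p != '\0')		/* guard: never step past the NUL */
				*d++ = *p++;	/* copy the escaped character */
			continue;
		}
		*d++ = *p++;			/* ordinary character */
	}
	*d = '\0';
	return (size_t)(p - src);
}
```

With the guard in place, an input that ends in a lone backslash
leaves p on the terminator, so the loop condition never looks
beyond the buffer.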
Found by AFL++ fuzzing.
Index: usr.bin/sed/compile.c
===================================================================
RCS file: /cvs/src/usr.bin/sed/compile.c,v
retrieving revision 1.53
diff -u -p -r1.53 compile.c
--- usr.bin/sed/compile.c 17 Jul 2024 20:57:15 -0000 1.53
+++ usr.bin/sed/compile.c
@@ -380,7 +380,8 @@ compile_delimited(char *p, char *d)
/* Other escapes remain unchanged. */
} else {
*d++ = *p++;
- *d++ = *p++;
+ if (*p)
+ *d++ = *p++;
}
continue;
}