On Fri 2018-11-30 11:26:02, Tetsuo Handa wrote: > Petr Mladek wrote: > > See syslog_print_all() and kmsg_dump_get_buffer(). They > > start with calling msg_print_text() many times to calculate > > how many messages fit into the given buffer. Then they > > blindly copy the messages into the buffer. > > > > Now, the buffer might overflow if the size is calculated > > with printk_time disabled and the messages are copied with > > printk_time enabled. > > > > Indeed, SYSLOG_ACTION_READ_ALL case can write more bytes than userspace > has supplied, resulting in possible memory corruption in userpsace. And > obtaining a snapshot of printk_time under logbuf_lock won't help, for we > need to drop logbuf_lock before calling copy_to_user(). > > > IMHO, passing it as parameter to msg_print_text(), print_prefix(), > > and print_time() looks acceptable. > > Below is a patch for passing a snapshot of printk_time to msg_print_text(), > but we are still failing to close a race explained at bottom.
The patch looks fine to me. > The remaining race is that klogctl(SYSLOG_ACTION_SIZE_UNREAD) after > klogctl(SYSLOG_ACTION_READ) > can return negative value upon success, for SYSLOG_ACTION_SIZE_UNREAD depends > that printk_time > did not change (from true to false) after SYSLOG_ACTION_READ updated > syslog_partial. To close > this race, we need to make sure that printk_time is evaluated only once for > each record. > That is, evaluate printk_time upon e.g. log_store() rather than upon > msg_print_text(). Great catch! > kernel/printk/printk.c | 9 ++++++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > index 1b2a029..c692fbc 100644 > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -421,6 +421,8 @@ __packed __aligned(4) > static u64 clear_seq; > static u32 clear_idx; > > +static bool printk_time = IS_ENABLED(CONFIG_PRINTK_TIME); > + > #define PREFIX_MAX 32 > #define LOG_LINE_MAX (1024 - PREFIX_MAX) > > @@ -620,7 +622,9 @@ static int log_store(int facility, int level, > msg->facility = facility; > msg->level = level & 7; > msg->flags = flags & 0x1f; > - if (ts_nsec > 0) > + if (!printk_time) > + msg->ts_nsec = (u64) -1ULL; > + else if (ts_nsec > 0) > msg->ts_nsec = ts_nsec; > else > msg->ts_nsec = local_clock(); > @@ -1210,14 +1214,13 @@ static inline void boot_delay_msec(int level) > } > #endif > > -static bool printk_time = IS_ENABLED(CONFIG_PRINTK_TIME); > module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR); > > static size_t print_time(u64 ts, char *buf) > { > unsigned long rem_nsec; > > - if (!printk_time) > + if (ts == (u64) -1ULL) Sigh, we need to always store the timestamp. /dev/kmsg prints it timestamps even when printk.time parameter is disabled. Instead, we could use one bit in log_flags instead. But I feel a bit uneasy that we change the semantic and could break something. For example, SYSLOG_ACTION_READ_ALL callers might want to have a consistent output (dmesg -Sr). Also we would need to sync early boot messages with printk.time kernel parameter. I am not fully happy with the solution passing time parameter. It is less secure. But it would not break compatibility. This particular race might be solved the following way: diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 1b2a029360b7..5b489988e9b7 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -1294,6 +1294,7 @@ static size_t msg_print_text(const struct printk_log *msg, bool syslog, char *bu static int syslog_print(char __user *buf, int size) { + static bool time; char *text; struct printk_log *msg; int len = 0; @@ -1318,9 +1319,13 @@ static int syslog_print(char __user *buf, int size) break; } + /* Keep partial line consistent */ + if (!syslog_partial) + time = printk_time; + skip = syslog_partial; msg = log_from_idx(syslog_idx); - n = msg_print_text(msg, true, text, LOG_LINE_MAX + PREFIX_MAX); + n = msg_print_text(msg, true, time, text, LOG_LINE_MAX + PREFIX_MAX); if (n - syslog_partial <= size) { /* message fits into buffer, move forward */ syslog_idx = log_next(syslog_idx); Best Regards, Petr