Re: [PATCH v3 1/3] kasan: switch kunit tests to console tracepoints
On 7 Jan 2024, at 19:22, Paul Heidekrüger wrote: > On 12.12.2023 10:32, Marco Elver wrote: >> On Tue, 12 Dec 2023 at 10:19, Paul Heidekrüger >> wrote: >>> >>> On 12.12.2023 00:37, Andrey Konovalov wrote: On Tue, Dec 12, 2023 at 12:35 AM Paul Heidekrüger wrote: > > Using CONFIG_FTRACE=y instead of CONFIG_TRACEPOINTS=y produces the same > error > for me. > > So > > CONFIG_KUNIT=y > CONFIG_KUNIT_ALL_TESTS=n > CONFIG_FTRACE=y > CONFIG_KASAN=y > CONFIG_KASAN_GENERIC=y > CONFIG_KASAN_KUNIT_TEST=y > > produces > > ➜ ./tools/testing/kunit/kunit.py run > --kunitconfig=mm/kasan/.kunitconfig --arch=arm64 > Configuring KUnit Kernel ... > Regenerating .config ... > Populating config with: > $ make ARCH=arm64 O=.kunit olddefconfig CC=clang > ERROR:root:Not all Kconfig options selected in kunitconfig were > in the generated .config. > This is probably due to unsatisfied dependencies. > Missing: CONFIG_KASAN_KUNIT_TEST=y > > By that error message, CONFIG_FTRACE appears to be present in the > generated > config, but CONFIG_KASAN_KUNIT_TEST still isn't. Presumably, > CONFIG_KASAN_KUNIT_TEST is missing because of an unsatisfied dependency, > which > must be CONFIG_TRACEPOINTS, unless I'm missing something ... > > If I just generate an arm64 defconfig and select CONFIG_FTRACE=y, > CONFIG_TRACEPOINTS=y shows up in my .config. So, maybe this is > kunit.py-related > then? > > Andrey, you said that the tests have been working for you; are you > running them > with kunit.py? No, I just run the kernel built with a config file that I put together based on defconfig. >>> >>> Ah. I believe I've figured it out. >>> >>> When I add CONFIG_STACK_TRACER=y in addition to CONFIG_FTRACE=y, it works. >> >> CONFIG_FTRACE should be enough - maybe also check x86 vs. arm64 to debug >> more. > > See below. > >>> CONFIG_STACK_TRACER selects CONFIG_FUNCTION_TRACER, CONFIG_FUNCTION_TRACER >>> selects CONFIG_GENERIC_TRACER, CONFIG_GENERIC_TRACER selects >>> CONFIG_TRACING, and >>> CONFIG_TRACING selects CONFIG_TRACEPOINTS. >>> >>> CONFIG_BLK_DEV_IO_TRACE=y also works instead of CONFIG_STACK_TRACER=y, as it >>> directly selects CONFIG_TRACEPOINTS. >>> >>> CONFIG_FTRACE=y on its own does not appear suffice for kunit.py on arm64. >> >> When you build manually with just CONFIG_FTRACE, is CONFIG_TRACEPOINTS >> enabled? > > When I add CONFIG_FTRACE and enter-key my way through the FTRACE prompts - I > believe because CONFIG_FTRACE is a menuconfig? - at the beginning of a build, > CONFIG_TRACEPOINTS does get set on arm64, yes. > > On X86, the defconfig already includes CONIFG_TRACEPOINTS. > > I also had a closer look at how kunit.py builds its configs. > I believe it does something along the following lines: > > cp .kunit/.config > make ARCH=arm64 O=.kunit olddefconfig > > On arm64, that isn't enough to set CONFIG_TRACEPOINTS; same behaviour when run > outside of kunit.py. > > For CONFIG_TRACEPOINTS, `make ARCH=arm64 menuconfig` shows: > > Symbol: TRACEPOINTS [=n] > Type : bool > Defined at init/Kconfig:1920 > Selected by [n]: > - TRACING [=n] > - BLK_DEV_IO_TRACE [=n] && FTRACE [=y] && SYSFS [=y] && BLOCK [=y] > > So, CONFIG_TRACING or CONFIG_BLK_DEV_IO_TRACE are the two options that prevent > CONFIG_TRACEPOINTS from being set on arm64. > > For CONFIG_TRACING we have: > > Symbol: TRACING [=n] > Type : bool > Defined at kernel/trace/Kconfig:157 > Selects: RING_BUFFER [=n] && STACKTRACE [=y] && TRACEPOINTS [=n] && > NOP_TRACER [=n] && BINARY_PRINTF [=n] && EVENT_TRACING [=n] && TRACE_CLOCK > [=y] && TASKS_RCU [=n] > Selected by [n]: > - DRM_I915_TRACE_GEM [=n] && HAS_IOMEM [=y] && DRM_I915 [=n] && EXPERT > [=n] && DRM_I915_DEBUG_GEM [=n] > - DRM_I915_TRACE_GTT [=n] && HAS_IOMEM [=y] && DRM_I915 [=n] && EXPERT > [=n] && DRM_I915_DEBUG_GEM [=n] > - PREEMPTIRQ_TRACEPOINTS [=n] && (TRACE_PREEMPT_TOGGLE [=n] || > TRACE_IRQFLAGS [=n]) > - GENERIC_TRACER [=n] > - ENABLE_DEFAULT_TRACERS [=n] && FTRACE [=y] && !GENERIC_TRACER [=n] > - FPROBE_EVENTS [=n] && FTRACE [=y] && FPROBE [=n] && > HAVE_REGS_AND_STACK_ACCESS_API [=y] > - KPROBE_EVENTS [=n] && FTRACE [=y] && KPROBES [=n] && > HAVE_REGS_AND_STACK_ACCESS_API [=y] > - UPROBE_EVENTS [=n] && FTRACE [=y] && ARCH_SUPPORTS_UPROBES [=y] && > MMU [=y] && PERF_EVENTS [=n] > - SYNTH_EVENTS [=n] && FTRACE [=y] > - USER_EVENTS [=n] && FTRACE [=y] > - HIST_TRIGGERS [=n] && FTRACE [=y] && ARCH_HAVE_NMI_SAFE_CMPXCHG [=y] > >>> I believe the reason my .kunitconfig as well as the existing >>> mm/kfence/.kunitconfig work on X86 is because CONFIG_TRACEPOINTS=y is >>> present in >>> an
Re: Re: [PATCH v3 1/3] kasan: switch kunit tests to console tracepoints
On 12.12.2023 10:32, Marco Elver wrote: > On Tue, 12 Dec 2023 at 10:19, Paul Heidekrüger > wrote: > > > > On 12.12.2023 00:37, Andrey Konovalov wrote: > > > On Tue, Dec 12, 2023 at 12:35 AM Paul Heidekrüger > > > wrote: > > > > > > > > Using CONFIG_FTRACE=y instead of CONFIG_TRACEPOINTS=y produces the same > > > > error > > > > for me. > > > > > > > > So > > > > > > > > CONFIG_KUNIT=y > > > > CONFIG_KUNIT_ALL_TESTS=n > > > > CONFIG_FTRACE=y > > > > CONFIG_KASAN=y > > > > CONFIG_KASAN_GENERIC=y > > > > CONFIG_KASAN_KUNIT_TEST=y > > > > > > > > produces > > > > > > > > ➜ ./tools/testing/kunit/kunit.py run > > > > --kunitconfig=mm/kasan/.kunitconfig --arch=arm64 > > > > Configuring KUnit Kernel ... > > > > Regenerating .config ... > > > > Populating config with: > > > > $ make ARCH=arm64 O=.kunit olddefconfig CC=clang > > > > ERROR:root:Not all Kconfig options selected in kunitconfig were > > > > in the generated .config. > > > > This is probably due to unsatisfied dependencies. > > > > Missing: CONFIG_KASAN_KUNIT_TEST=y > > > > > > > > By that error message, CONFIG_FTRACE appears to be present in the > > > > generated > > > > config, but CONFIG_KASAN_KUNIT_TEST still isn't. Presumably, > > > > CONFIG_KASAN_KUNIT_TEST is missing because of an unsatisfied > > > > dependency, which > > > > must be CONFIG_TRACEPOINTS, unless I'm missing something ... > > > > > > > > If I just generate an arm64 defconfig and select CONFIG_FTRACE=y, > > > > CONFIG_TRACEPOINTS=y shows up in my .config. So, maybe this is > > > > kunit.py-related > > > > then? > > > > > > > > Andrey, you said that the tests have been working for you; are you > > > > running them > > > > with kunit.py? > > > > > > No, I just run the kernel built with a config file that I put together > > > based on defconfig. > > > > Ah. I believe I've figured it out. > > > > When I add CONFIG_STACK_TRACER=y in addition to CONFIG_FTRACE=y, it works. > > CONFIG_FTRACE should be enough - maybe also check x86 vs. arm64 to debug more. See below. > > CONFIG_STACK_TRACER selects CONFIG_FUNCTION_TRACER, CONFIG_FUNCTION_TRACER > > selects CONFIG_GENERIC_TRACER, CONFIG_GENERIC_TRACER selects > > CONFIG_TRACING, and > > CONFIG_TRACING selects CONFIG_TRACEPOINTS. > > > > CONFIG_BLK_DEV_IO_TRACE=y also works instead of CONFIG_STACK_TRACER=y, as it > > directly selects CONFIG_TRACEPOINTS. > > > > CONFIG_FTRACE=y on its own does not appear suffice for kunit.py on arm64. > > When you build manually with just CONFIG_FTRACE, is CONFIG_TRACEPOINTS > enabled? When I add CONFIG_FTRACE and enter-key my way through the FTRACE prompts - I believe because CONFIG_FTRACE is a menuconfig? - at the beginning of a build, CONFIG_TRACEPOINTS does get set on arm64, yes. On X86, the defconfig already includes CONIFG_TRACEPOINTS. I also had a closer look at how kunit.py builds its configs. I believe it does something along the following lines: cp .kunit/.config make ARCH=arm64 O=.kunit olddefconfig On arm64, that isn't enough to set CONFIG_TRACEPOINTS; same behaviour when run outside of kunit.py. For CONFIG_TRACEPOINTS, `make ARCH=arm64 menuconfig` shows: Symbol: TRACEPOINTS [=n] Type : bool Defined at init/Kconfig:1920 Selected by [n]: - TRACING [=n] - BLK_DEV_IO_TRACE [=n] && FTRACE [=y] && SYSFS [=y] && BLOCK [=y] So, CONFIG_TRACING or CONFIG_BLK_DEV_IO_TRACE are the two options that prevent CONFIG_TRACEPOINTS from being set on arm64. For CONFIG_TRACING we have: Symbol: TRACING [=n] Type : bool Defined at kernel/trace/Kconfig:157 Selects: RING_BUFFER [=n] && STACKTRACE [=y] && TRACEPOINTS [=n] && NOP_TRACER [=n] && BINARY_PRINTF [=n] && EVENT_TRACING [=n] && TRACE_CLOCK [=y] && TASKS_RCU [=n] Selected by [n]: - DRM_I915_TRACE_GEM [=n] && HAS_IOMEM [=y] && DRM_I915 [=n] && EXPERT [=n] && DRM_I915_DEBUG_GEM [=n] - DRM_I915_TRACE_GTT [=n] && HAS_IOMEM [=y] && DRM_I915 [=n] && EXPERT [=n] && DRM_I915_DEBUG_GEM [=n] - PREEMPTIRQ_TRACEPOINTS [=n] && (TRACE_PREEMPT_TOGGLE [=n] || TRACE_IRQFLAGS [=n]) - GENERIC_TRACER [=n] - ENABLE_DEFAULT_TRACERS [=n] && FTRACE [=y] && !GENERIC_TRACER [=n] - FPROBE_EVENTS [=n] && FTRACE [=y] && FPROBE [=n] && HAVE_REGS_AND_STACK_ACCESS_API [=y] - KPROBE_EVENTS [=n] && FTRACE [=y] && KPROBES [=n] && HAVE_REGS_AND_STACK_ACCESS_API [=y] - UPROBE_EVENTS [=n] && FTRACE [=y] && ARCH_SUPPORTS_UPROBES [=y] && MMU [=y] && PERF_EVENTS [=n] - SYNTH_EVENTS [=n] && FTRACE [=y] - USER_EVENTS [=n] && FTRACE [=y] - HIST_TRIGGERS [=n] && FTRACE [=y] && ARCH_HAVE_NMI_SAFE_CMPXCHG [=y] > > I believe the reason my .kunitconfig as well as the
Re: [PATCH v3 1/3] kasan: switch kunit tests to console tracepoints
On Tue, 12 Dec 2023 at 10:19, Paul Heidekrüger wrote: > > On 12.12.2023 00:37, Andrey Konovalov wrote: > > On Tue, Dec 12, 2023 at 12:35 AM Paul Heidekrüger > > wrote: > > > > > > Using CONFIG_FTRACE=y instead of CONFIG_TRACEPOINTS=y produces the same > > > error > > > for me. > > > > > > So > > > > > > CONFIG_KUNIT=y > > > CONFIG_KUNIT_ALL_TESTS=n > > > CONFIG_FTRACE=y > > > CONFIG_KASAN=y > > > CONFIG_KASAN_GENERIC=y > > > CONFIG_KASAN_KUNIT_TEST=y > > > > > > produces > > > > > > ➜ ./tools/testing/kunit/kunit.py run > > > --kunitconfig=mm/kasan/.kunitconfig --arch=arm64 > > > Configuring KUnit Kernel ... > > > Regenerating .config ... > > > Populating config with: > > > $ make ARCH=arm64 O=.kunit olddefconfig CC=clang > > > ERROR:root:Not all Kconfig options selected in kunitconfig were > > > in the generated .config. > > > This is probably due to unsatisfied dependencies. > > > Missing: CONFIG_KASAN_KUNIT_TEST=y > > > > > > By that error message, CONFIG_FTRACE appears to be present in the > > > generated > > > config, but CONFIG_KASAN_KUNIT_TEST still isn't. Presumably, > > > CONFIG_KASAN_KUNIT_TEST is missing because of an unsatisfied dependency, > > > which > > > must be CONFIG_TRACEPOINTS, unless I'm missing something ... > > > > > > If I just generate an arm64 defconfig and select CONFIG_FTRACE=y, > > > CONFIG_TRACEPOINTS=y shows up in my .config. So, maybe this is > > > kunit.py-related > > > then? > > > > > > Andrey, you said that the tests have been working for you; are you > > > running them > > > with kunit.py? > > > > No, I just run the kernel built with a config file that I put together > > based on defconfig. > > Ah. I believe I've figured it out. > > When I add CONFIG_STACK_TRACER=y in addition to CONFIG_FTRACE=y, it works. CONFIG_FTRACE should be enough - maybe also check x86 vs. arm64 to debug more. > CONFIG_STACK_TRACER selects CONFIG_FUNCTION_TRACER, CONFIG_FUNCTION_TRACER > selects CONFIG_GENERIC_TRACER, CONFIG_GENERIC_TRACER selects CONFIG_TRACING, > and > CONFIG_TRACING selects CONFIG_TRACEPOINTS. > > CONFIG_BLK_DEV_IO_TRACE=y also works instead of CONFIG_STACK_TRACER=y, as it > directly selects CONFIG_TRACEPOINTS. > > CONFIG_FTRACE=y on its own does not appear suffice for kunit.py on arm64. When you build manually with just CONFIG_FTRACE, is CONFIG_TRACEPOINTS enabled? > I believe the reason my .kunitconfig as well as the existing > mm/kfence/.kunitconfig work on X86 is because CONFIG_TRACEPOINTS=y is present > in > an X86 defconfig. > > Does this make sense? > > Would you welcome a patch addressing this for the existing > mm/kfence/.kunitconfig? > > I would also like to submit a patch for an mm/kasan/.kunitconfig. Do you think > that would be helpful too? > > FWICT, kernel/kcsan/.kunitconfig might also be affected since > CONFIG_KCSAN_KUNIT_TEST also depends on CONFIG_TRACEPOITNS, but I would have > to > test that. That could be a third patch. I'd support figuring out the minimal config (CONFIG_FTRACE or something else?) that satisfies the TRACEPOINTS dependency. I always thought CONFIG_FTRACE ought to be the one config option, but maybe something changed. Also maybe one of the tracing maintainers can help untangle what's going on here. Thanks, -- Marco
Re: [PATCH v3 1/3] kasan: switch kunit tests to console tracepoints
On 12.12.2023 00:37, Andrey Konovalov wrote: > On Tue, Dec 12, 2023 at 12:35 AM Paul Heidekrüger > wrote: > > > > Using CONFIG_FTRACE=y instead of CONFIG_TRACEPOINTS=y produces the same > > error > > for me. > > > > So > > > > CONFIG_KUNIT=y > > CONFIG_KUNIT_ALL_TESTS=n > > CONFIG_FTRACE=y > > CONFIG_KASAN=y > > CONFIG_KASAN_GENERIC=y > > CONFIG_KASAN_KUNIT_TEST=y > > > > produces > > > > ➜ ./tools/testing/kunit/kunit.py run > > --kunitconfig=mm/kasan/.kunitconfig --arch=arm64 > > Configuring KUnit Kernel ... > > Regenerating .config ... > > Populating config with: > > $ make ARCH=arm64 O=.kunit olddefconfig CC=clang > > ERROR:root:Not all Kconfig options selected in kunitconfig were in > > the generated .config. > > This is probably due to unsatisfied dependencies. > > Missing: CONFIG_KASAN_KUNIT_TEST=y > > > > By that error message, CONFIG_FTRACE appears to be present in the generated > > config, but CONFIG_KASAN_KUNIT_TEST still isn't. Presumably, > > CONFIG_KASAN_KUNIT_TEST is missing because of an unsatisfied dependency, > > which > > must be CONFIG_TRACEPOINTS, unless I'm missing something ... > > > > If I just generate an arm64 defconfig and select CONFIG_FTRACE=y, > > CONFIG_TRACEPOINTS=y shows up in my .config. So, maybe this is > > kunit.py-related > > then? > > > > Andrey, you said that the tests have been working for you; are you running > > them > > with kunit.py? > > No, I just run the kernel built with a config file that I put together > based on defconfig. Ah. I believe I've figured it out. When I add CONFIG_STACK_TRACER=y in addition to CONFIG_FTRACE=y, it works. CONFIG_STACK_TRACER selects CONFIG_FUNCTION_TRACER, CONFIG_FUNCTION_TRACER selects CONFIG_GENERIC_TRACER, CONFIG_GENERIC_TRACER selects CONFIG_TRACING, and CONFIG_TRACING selects CONFIG_TRACEPOINTS. CONFIG_BLK_DEV_IO_TRACE=y also works instead of CONFIG_STACK_TRACER=y, as it directly selects CONFIG_TRACEPOINTS. CONFIG_FTRACE=y on its own does not appear suffice for kunit.py on arm64. I believe the reason my .kunitconfig as well as the existing mm/kfence/.kunitconfig work on X86 is because CONFIG_TRACEPOINTS=y is present in an X86 defconfig. Does this make sense? Would you welcome a patch addressing this for the existing mm/kfence/.kunitconfig? I would also like to submit a patch for an mm/kasan/.kunitconfig. Do you think that would be helpful too? FWICT, kernel/kcsan/.kunitconfig might also be affected since CONFIG_KCSAN_KUNIT_TEST also depends on CONFIG_TRACEPOITNS, but I would have to test that. That could be a third patch. What do you think? Many thanks, Paul
Re: [PATCH v3 1/3] kasan: switch kunit tests to console tracepoints
On Tue, Dec 12, 2023 at 12:35 AM Paul Heidekrüger wrote: > > Using CONFIG_FTRACE=y instead of CONFIG_TRACEPOINTS=y produces the same error > for me. > > So > > CONFIG_KUNIT=y > CONFIG_KUNIT_ALL_TESTS=n > CONFIG_FTRACE=y > CONFIG_KASAN=y > CONFIG_KASAN_GENERIC=y > CONFIG_KASAN_KUNIT_TEST=y > > produces > > ➜ ./tools/testing/kunit/kunit.py run > --kunitconfig=mm/kasan/.kunitconfig --arch=arm64 > Configuring KUnit Kernel ... > Regenerating .config ... > Populating config with: > $ make ARCH=arm64 O=.kunit olddefconfig CC=clang > ERROR:root:Not all Kconfig options selected in kunitconfig were in > the generated .config. > This is probably due to unsatisfied dependencies. > Missing: CONFIG_KASAN_KUNIT_TEST=y > > By that error message, CONFIG_FTRACE appears to be present in the generated > config, but CONFIG_KASAN_KUNIT_TEST still isn't. Presumably, > CONFIG_KASAN_KUNIT_TEST is missing because of an unsatisfied dependency, which > must be CONFIG_TRACEPOINTS, unless I'm missing something ... > > If I just generate an arm64 defconfig and select CONFIG_FTRACE=y, > CONFIG_TRACEPOINTS=y shows up in my .config. So, maybe this is > kunit.py-related > then? > > Andrey, you said that the tests have been working for you; are you running > them > with kunit.py? No, I just run the kernel built with a config file that I put together based on defconfig.
Re: [PATCH v3 1/3] kasan: switch kunit tests to console tracepoints
On 11.12.2023 23:56, Marco Elver wrote: > On Mon, 11 Dec 2023 at 23:48, Paul Heidekrüger > wrote: > > > > On 11.12.2023 21:51, Andrey Konovalov wrote: > > > On Mon, Dec 11, 2023 at 7:59 PM Paul Heidekrüger > > > wrote: > > > > > > > > > Hi Paul, > > > > > > > > > > I've been successfully running KASAN tests with CONFIG_TRACEPOINTS > > > > > enabled on arm64 since this patch landed. > > > > > > > > Interesting ... > > > > > > > > > What happens when you try running the tests with .kunitconfig? Does > > > > > CONFIG_TRACEPOINTS or CONFIG_KASAN_KUNIT_TEST get disabled during > > > > > kernel building? > > > > > > > > Yes, exactly, that's what's happening. > > > > > > > > Here's the output kunit.py is giving me. I replaced CONFIG_DEBUG_KERNEL > > > > with > > > > CONFIG_TRACEPOINTS in my .kunitconfig. Otherwise, it's identical with > > > > the one I > > > > posted above. > > > > > > > > ➜ ./tools/testing/kunit/kunit.py run > > > > --kunitconfig=mm/kasan/.kunitconfig --arch=arm64 > > > > Configuring KUnit Kernel ... > > > > Regenerating .config ... > > > > Populating config with: > > > > $ make ARCH=arm64 O=.kunit olddefconfig > > > > ERROR:root:Not all Kconfig options selected in kunitconfig were > > > > in the generated .config. > > > > This is probably due to unsatisfied dependencies. > > > > Missing: CONFIG_KASAN_KUNIT_TEST=y, CONFIG_TRACEPOINTS=y > > > > > > > > Does CONFIG_TRACEPOINTS have some dependency I'm not seeing? I couldn't > > > > find a > > > > reason why it would get disabled, but I could definitely be wrong. > > > > > > Does your .kunitconfig include CONFIG_TRACEPOINTS=y? I don't see it in > > > the listing that you sent earlier. > > > > Yes. For the kunit.py output from my previous email, I replaced > > CONFIG_DEBUG_KERNEL=y with CONFIG_TRACEPOINTS=y. So, the .kunitconfig I > > used to > > produce the output above was: > > > > CONFIG_KUNIT=y > > CONFIG_KUNIT_ALL_TESTS=n > > CONFIG_TRACEPOINTS=y > > CONFIG_KASAN=y > > CONFIG_KASAN_GENERIC=y > > CONFIG_KASAN_KUNIT_TEST=y > > > > This more or less mirrors what mm/kfence/.kunitconfig is doing, which also > > isn't > > working on my side; kunit.py reports the same error. > > mm/kfence/.kunitconfig does CONFIG_FTRACE=y. TRACEPOINTS is not user > selectable. I don't think any of this has changed since the initial > discussion above, so CONFIG_FTRACE=y is still needed. Using CONFIG_FTRACE=y instead of CONFIG_TRACEPOINTS=y produces the same error for me. So CONFIG_KUNIT=y CONFIG_KUNIT_ALL_TESTS=n CONFIG_FTRACE=y CONFIG_KASAN=y CONFIG_KASAN_GENERIC=y CONFIG_KASAN_KUNIT_TEST=y produces ➜ ./tools/testing/kunit/kunit.py run --kunitconfig=mm/kasan/.kunitconfig --arch=arm64 Configuring KUnit Kernel ... Regenerating .config ... Populating config with: $ make ARCH=arm64 O=.kunit olddefconfig CC=clang ERROR:root:Not all Kconfig options selected in kunitconfig were in the generated .config. This is probably due to unsatisfied dependencies. Missing: CONFIG_KASAN_KUNIT_TEST=y By that error message, CONFIG_FTRACE appears to be present in the generated config, but CONFIG_KASAN_KUNIT_TEST still isn't. Presumably, CONFIG_KASAN_KUNIT_TEST is missing because of an unsatisfied dependency, which must be CONFIG_TRACEPOINTS, unless I'm missing something ... If I just generate an arm64 defconfig and select CONFIG_FTRACE=y, CONFIG_TRACEPOINTS=y shows up in my .config. So, maybe this is kunit.py-related then? Andrey, you said that the tests have been working for you; are you running them with kunit.py? Many thanks, Paul
Re: [PATCH v3 1/3] kasan: switch kunit tests to console tracepoints
On Mon, 11 Dec 2023 at 23:48, Paul Heidekrüger wrote: > > On 11.12.2023 21:51, Andrey Konovalov wrote: > > On Mon, Dec 11, 2023 at 7:59 PM Paul Heidekrüger > > wrote: > > > > > > > Hi Paul, > > > > > > > > I've been successfully running KASAN tests with CONFIG_TRACEPOINTS > > > > enabled on arm64 since this patch landed. > > > > > > Interesting ... > > > > > > > What happens when you try running the tests with .kunitconfig? Does > > > > CONFIG_TRACEPOINTS or CONFIG_KASAN_KUNIT_TEST get disabled during > > > > kernel building? > > > > > > Yes, exactly, that's what's happening. > > > > > > Here's the output kunit.py is giving me. I replaced CONFIG_DEBUG_KERNEL > > > with > > > CONFIG_TRACEPOINTS in my .kunitconfig. Otherwise, it's identical with the > > > one I > > > posted above. > > > > > > ➜ ./tools/testing/kunit/kunit.py run > > > --kunitconfig=mm/kasan/.kunitconfig --arch=arm64 > > > Configuring KUnit Kernel ... > > > Regenerating .config ... > > > Populating config with: > > > $ make ARCH=arm64 O=.kunit olddefconfig > > > ERROR:root:Not all Kconfig options selected in kunitconfig were > > > in the generated .config. > > > This is probably due to unsatisfied dependencies. > > > Missing: CONFIG_KASAN_KUNIT_TEST=y, CONFIG_TRACEPOINTS=y > > > > > > Does CONFIG_TRACEPOINTS have some dependency I'm not seeing? I couldn't > > > find a > > > reason why it would get disabled, but I could definitely be wrong. > > > > Does your .kunitconfig include CONFIG_TRACEPOINTS=y? I don't see it in > > the listing that you sent earlier. > > Yes. For the kunit.py output from my previous email, I replaced > CONFIG_DEBUG_KERNEL=y with CONFIG_TRACEPOINTS=y. So, the .kunitconfig I used > to > produce the output above was: > > CONFIG_KUNIT=y > CONFIG_KUNIT_ALL_TESTS=n > CONFIG_TRACEPOINTS=y > CONFIG_KASAN=y > CONFIG_KASAN_GENERIC=y > CONFIG_KASAN_KUNIT_TEST=y > > This more or less mirrors what mm/kfence/.kunitconfig is doing, which also > isn't > working on my side; kunit.py reports the same error. mm/kfence/.kunitconfig does CONFIG_FTRACE=y. TRACEPOINTS is not user selectable. I don't think any of this has changed since the initial discussion above, so CONFIG_FTRACE=y is still needed.
Re: [PATCH v3 1/3] kasan: switch kunit tests to console tracepoints
On 11.12.2023 21:51, Andrey Konovalov wrote: > On Mon, Dec 11, 2023 at 7:59 PM Paul Heidekrüger > wrote: > > > > > Hi Paul, > > > > > > I've been successfully running KASAN tests with CONFIG_TRACEPOINTS > > > enabled on arm64 since this patch landed. > > > > Interesting ... > > > > > What happens when you try running the tests with .kunitconfig? Does > > > CONFIG_TRACEPOINTS or CONFIG_KASAN_KUNIT_TEST get disabled during > > > kernel building? > > > > Yes, exactly, that's what's happening. > > > > Here's the output kunit.py is giving me. I replaced CONFIG_DEBUG_KERNEL with > > CONFIG_TRACEPOINTS in my .kunitconfig. Otherwise, it's identical with the > > one I > > posted above. > > > > ➜ ./tools/testing/kunit/kunit.py run > > --kunitconfig=mm/kasan/.kunitconfig --arch=arm64 > > Configuring KUnit Kernel ... > > Regenerating .config ... > > Populating config with: > > $ make ARCH=arm64 O=.kunit olddefconfig > > ERROR:root:Not all Kconfig options selected in kunitconfig were in > > the generated .config. > > This is probably due to unsatisfied dependencies. > > Missing: CONFIG_KASAN_KUNIT_TEST=y, CONFIG_TRACEPOINTS=y > > > > Does CONFIG_TRACEPOINTS have some dependency I'm not seeing? I couldn't > > find a > > reason why it would get disabled, but I could definitely be wrong. > > Does your .kunitconfig include CONFIG_TRACEPOINTS=y? I don't see it in > the listing that you sent earlier. Yes. For the kunit.py output from my previous email, I replaced CONFIG_DEBUG_KERNEL=y with CONFIG_TRACEPOINTS=y. So, the .kunitconfig I used to produce the output above was: CONFIG_KUNIT=y CONFIG_KUNIT_ALL_TESTS=n CONFIG_TRACEPOINTS=y CONFIG_KASAN=y CONFIG_KASAN_GENERIC=y CONFIG_KASAN_KUNIT_TEST=y This more or less mirrors what mm/kfence/.kunitconfig is doing, which also isn't working on my side; kunit.py reports the same error. Many thanks, Paul
Re: [PATCH v3 1/3] kasan: switch kunit tests to console tracepoints
On Mon, Dec 11, 2023 at 7:59 PM Paul Heidekrüger wrote: > > > Hi Paul, > > > > I've been successfully running KASAN tests with CONFIG_TRACEPOINTS > > enabled on arm64 since this patch landed. > > Interesting ... > > > What happens when you try running the tests with .kunitconfig? Does > > CONFIG_TRACEPOINTS or CONFIG_KASAN_KUNIT_TEST get disabled during > > kernel building? > > Yes, exactly, that's what's happening. > > Here's the output kunit.py is giving me. I replaced CONFIG_DEBUG_KERNEL with > CONFIG_TRACEPOINTS in my .kunitconfig. Otherwise, it's identical with the one > I > posted above. > > ➜ ./tools/testing/kunit/kunit.py run > --kunitconfig=mm/kasan/.kunitconfig --arch=arm64 > Configuring KUnit Kernel ... > Regenerating .config ... > Populating config with: > $ make ARCH=arm64 O=.kunit olddefconfig > ERROR:root:Not all Kconfig options selected in kunitconfig were in > the generated .config. > This is probably due to unsatisfied dependencies. > Missing: CONFIG_KASAN_KUNIT_TEST=y, CONFIG_TRACEPOINTS=y > > Does CONFIG_TRACEPOINTS have some dependency I'm not seeing? I couldn't find a > reason why it would get disabled, but I could definitely be wrong. Does your .kunitconfig include CONFIG_TRACEPOINTS=y? I don't see it in the listing that you sent earlier.
Re: [PATCH v3 1/3] kasan: switch kunit tests to console tracepoints
Hi Andrey! On 11.12.2023 18:50, Andrey Konovalov wrote: > On Mon, Dec 11, 2023 at 5:37 PM Paul Heidekrüger > wrote: > > > > Hi all! > > > > On 05.05.2023 09:58, Steven Rostedt wrote: > > > On Mon, 1 May 2023 15:02:37 -0700 > > > Peter Collingbourne wrote: > > > > > > > > > "ftrace" is really for just the function tracing, but CONFIG_FTRACE > > > > > > really should just be for the function tracing infrastructure, and > > > > > > perhaps not even include trace events :-/ But at the time it was > > > > > > created, it was for all the "tracers" (this was added before trace > > > > > > events). > > > > > > > > > > It would be great to see this cleaned up. I found this aspect of how > > > > > tracing works rather confusing. > > > > > > > > > > So do you think it makes sense for the KASAN tests to "select TRACING" > > > > > for now if the code depends on the trace event infrastructure? > > > > > > > > Any thoughts? It looks like someone else got tripped up by this: > > > > https://reviews.llvm.org/D144057 > > > > > > Yeah, it really does need to get cleaned up, but unfortunately it's not > > > going to be a trivial change. We need to make sure it's done in a way that > > > an old .config still keeps the same things enabled with the new config > > > settings. That takes some trickery in the dependency. > > > > > > I'll add this to my todo list, hopefully it doesn't fall into the abyss > > > portion of that list :-p > > > > > > -- Steve > > > > Just adding to Peter's concern re: CONFIG_KASAN_KUNIT_TEST's dependency on > > CONFIG_TRACEPOINTS. > > > > I'm having no luck running the KASan KUnit tests on arm64 with the following > > .kunitconfig on v6.6.0: > > > > CONFIG_KUNIT=y > > CONFIG_KUNIT_ALL_TESTS=n > > CONFIG_DEBUG_KERNEL=y > > CONFIG_KASAN=y > > CINFIG_KASAN_GENERIC=y > > CONFIG_KASAN_KUNIT_TEST=y > > > > CONFIG_TRACEPOINTS, which CONFIG_KASAN_TEST relies on since the patch this > > thread is based on, isn't defined for arm64, AFAICT. > > > > If I comment out the dependency on CONFIG_TRACEPOINTS, the tests appear to > > run, > > but KUnit isn't picking up the KASan output. > > > > If I revert the patch, the above .kunitconfig appears to work fine on arm64 > > and > > the tests pass. > > > > The above .kunitconfig works as intended on X86, no changes necessary. > > > > Am I missing something? > > Hi Paul, > > I've been successfully running KASAN tests with CONFIG_TRACEPOINTS > enabled on arm64 since this patch landed. Interesting ... > What happens when you try running the tests with .kunitconfig? Does > CONFIG_TRACEPOINTS or CONFIG_KASAN_KUNIT_TEST get disabled during > kernel building? Yes, exactly, that's what's happening. Here's the output kunit.py is giving me. I replaced CONFIG_DEBUG_KERNEL with CONFIG_TRACEPOINTS in my .kunitconfig. Otherwise, it's identical with the one I posted above. ➜ ./tools/testing/kunit/kunit.py run --kunitconfig=mm/kasan/.kunitconfig --arch=arm64 Configuring KUnit Kernel ... Regenerating .config ... Populating config with: $ make ARCH=arm64 O=.kunit olddefconfig ERROR:root:Not all Kconfig options selected in kunitconfig were in the generated .config. This is probably due to unsatisfied dependencies. Missing: CONFIG_KASAN_KUNIT_TEST=y, CONFIG_TRACEPOINTS=y Does CONFIG_TRACEPOINTS have some dependency I'm not seeing? I couldn't find a reason why it would get disabled, but I could definitely be wrong. > Or tests just don't get executed? > > Thanks! Many thanks, Paul
Re: [PATCH v3 1/3] kasan: switch kunit tests to console tracepoints
On Mon, Dec 11, 2023 at 5:37 PM Paul Heidekrüger wrote: > > Hi all! > > On 05.05.2023 09:58, Steven Rostedt wrote: > > On Mon, 1 May 2023 15:02:37 -0700 > > Peter Collingbourne wrote: > > > > > > > "ftrace" is really for just the function tracing, but CONFIG_FTRACE > > > > > really should just be for the function tracing infrastructure, and > > > > > perhaps not even include trace events :-/ But at the time it was > > > > > created, it was for all the "tracers" (this was added before trace > > > > > events). > > > > > > > > It would be great to see this cleaned up. I found this aspect of how > > > > tracing works rather confusing. > > > > > > > > So do you think it makes sense for the KASAN tests to "select TRACING" > > > > for now if the code depends on the trace event infrastructure? > > > > > > Any thoughts? It looks like someone else got tripped up by this: > > > https://reviews.llvm.org/D144057 > > > > Yeah, it really does need to get cleaned up, but unfortunately it's not > > going to be a trivial change. We need to make sure it's done in a way that > > an old .config still keeps the same things enabled with the new config > > settings. That takes some trickery in the dependency. > > > > I'll add this to my todo list, hopefully it doesn't fall into the abyss > > portion of that list :-p > > > > -- Steve > > Just adding to Peter's concern re: CONFIG_KASAN_KUNIT_TEST's dependency on > CONFIG_TRACEPOINTS. > > I'm having no luck running the KASan KUnit tests on arm64 with the following > .kunitconfig on v6.6.0: > > CONFIG_KUNIT=y > CONFIG_KUNIT_ALL_TESTS=n > CONFIG_DEBUG_KERNEL=y > CONFIG_KASAN=y > CINFIG_KASAN_GENERIC=y > CONFIG_KASAN_KUNIT_TEST=y > > CONFIG_TRACEPOINTS, which CONFIG_KASAN_TEST relies on since the patch this > thread is based on, isn't defined for arm64, AFAICT. > > If I comment out the dependency on CONFIG_TRACEPOINTS, the tests appear to > run, > but KUnit isn't picking up the KASan output. > > If I revert the patch, the above .kunitconfig appears to work fine on arm64 > and > the tests pass. > > The above .kunitconfig works as intended on X86, no changes necessary. > > Am I missing something? Hi Paul, I've been successfully running KASAN tests with CONFIG_TRACEPOINTS enabled on arm64 since this patch landed. What happens when you try running the tests with .kunitconfig? Does CONFIG_TRACEPOINTS or CONFIG_KASAN_KUNIT_TEST get disabled during kernel building? Or tests just don't get executed? Thanks!
Re: [PATCH v3 1/3] kasan: switch kunit tests to console tracepoints
Hi all! On 05.05.2023 09:58, Steven Rostedt wrote: > On Mon, 1 May 2023 15:02:37 -0700 > Peter Collingbourne wrote: > > > > > "ftrace" is really for just the function tracing, but CONFIG_FTRACE > > > > really should just be for the function tracing infrastructure, and > > > > perhaps not even include trace events :-/ But at the time it was > > > > created, it was for all the "tracers" (this was added before trace > > > > events). > > > > > > It would be great to see this cleaned up. I found this aspect of how > > > tracing works rather confusing. > > > > > > So do you think it makes sense for the KASAN tests to "select TRACING" > > > for now if the code depends on the trace event infrastructure? > > > > Any thoughts? It looks like someone else got tripped up by this: > > https://reviews.llvm.org/D144057 > > Yeah, it really does need to get cleaned up, but unfortunately it's not > going to be a trivial change. We need to make sure it's done in a way that > an old .config still keeps the same things enabled with the new config > settings. That takes some trickery in the dependency. > > I'll add this to my todo list, hopefully it doesn't fall into the abyss > portion of that list :-p > > -- Steve Just adding to Peter's concern re: CONFIG_KASAN_KUNIT_TEST's dependency on CONFIG_TRACEPOINTS. I'm having no luck running the KASan KUnit tests on arm64 with the following .kunitconfig on v6.6.0: CONFIG_KUNIT=y CONFIG_KUNIT_ALL_TESTS=n CONFIG_DEBUG_KERNEL=y CONFIG_KASAN=y CINFIG_KASAN_GENERIC=y CONFIG_KASAN_KUNIT_TEST=y CONFIG_TRACEPOINTS, which CONFIG_KASAN_TEST relies on since the patch this thread is based on, isn't defined for arm64, AFAICT. If I comment out the dependency on CONFIG_TRACEPOINTS, the tests appear to run, but KUnit isn't picking up the KASan output. If I revert the patch, the above .kunitconfig appears to work fine on arm64 and the tests pass. The above .kunitconfig works as intended on X86, no changes necessary. Am I missing something? Many thanks, Paul
[PATCH v2 03/13] serial: stm32: fix incorrect characters on console
Incorrect characters are observed on console during boot. This issue occurs when init/main.c is modifying termios settings to open /dev/console on the rootfs. This patch adds a waiting loop in set_termios to wait for TX shift register empty (and TX FIFO if any) before stopping serial port. Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver") Signed-off-by: Erwan Le Ray diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c index c6ca8f964c69..eae54b8cf5e2 100644 --- a/drivers/tty/serial/stm32-usart.c +++ b/drivers/tty/serial/stm32-usart.c @@ -738,8 +738,9 @@ static void stm32_usart_set_termios(struct uart_port *port, unsigned int baud, bits; u32 usartdiv, mantissa, fraction, oversampling; tcflag_t cflag = termios->c_cflag; - u32 cr1, cr2, cr3; + u32 cr1, cr2, cr3, isr; unsigned long flags; + int ret; if (!stm32_port->hw_flow_control) cflag &= ~CRTSCTS; @@ -748,6 +749,15 @@ static void stm32_usart_set_termios(struct uart_port *port, spin_lock_irqsave(>lock, flags); + ret = readl_relaxed_poll_timeout_atomic(port->membase + ofs->isr, + isr, + (isr & USART_SR_TC), + 10, 10); + + /* Send the TC error message only when ISR_TC is not set. */ + if (ret) + dev_err(port->dev, "Transmission is not complete\n"); + /* Stop serial port and reset value */ writel_relaxed(0, port->membase + ofs->cr1); -- 2.17.1
[PATCH next v4 15/15] printk: console: remove unnecessary safe buffer usage
Upon registering a console, safe buffers are activated when setting up the sequence number to replay the log. However, these are already protected by @console_sem and @syslog_lock. Remove the unnecessary safe buffer usage. Signed-off-by: John Ogness Reviewed-by: Petr Mladek --- kernel/printk/printk.c | 10 +++--- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 602de86d4e76..2f829fbf0a13 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2967,9 +2967,7 @@ void register_console(struct console *newcon) /* * console_unlock(); will print out the buffered messages * for us. -*/ - printk_safe_enter_irqsave(flags); - /* +* * We're about to replay the log buffer. Only do this to the * just-registered console to avoid excessive message spam to * the already-registered consoles. @@ -2982,11 +2980,9 @@ void register_console(struct console *newcon) exclusive_console_stop_seq = console_seq; /* Get a consistent copy of @syslog_seq. */ - raw_spin_lock(_lock); + raw_spin_lock_irqsave(_lock, flags); console_seq = syslog_seq; - raw_spin_unlock(_lock); - - printk_safe_exit_irqrestore(flags); + raw_spin_unlock_irqrestore(_lock, flags); } console_unlock(); console_sysfs_notify(); -- 2.20.1
[PATCH next v3 15/15] printk: console: remove unnecessary safe buffer usage
Upon registering a console, safe buffers are activated when setting up the sequence number to replay the log. However, these are already protected by @console_sem and @syslog_lock. Remove the unnecessary safe buffer usage. Signed-off-by: John Ogness Reviewed-by: Petr Mladek --- kernel/printk/printk.c | 10 +++--- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 15a9bc409e0a..27a748ed0bc7 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2967,9 +2967,7 @@ void register_console(struct console *newcon) /* * console_unlock(); will print out the buffered messages * for us. -*/ - printk_safe_enter_irqsave(flags); - /* +* * We're about to replay the log buffer. Only do this to the * just-registered console to avoid excessive message spam to * the already-registered consoles. @@ -2982,11 +2980,9 @@ void register_console(struct console *newcon) exclusive_console_stop_seq = console_seq; /* Get a consistent copy of @syslog_seq. */ - raw_spin_lock(_lock); + raw_spin_lock_irqsave(_lock, flags); console_seq = syslog_seq; - raw_spin_unlock(_lock); - - printk_safe_exit_irqrestore(flags); + raw_spin_unlock_irqrestore(_lock, flags); } console_unlock(); console_sysfs_notify(); -- 2.20.1
Re: [PATCH printk-rework 14/14] printk: console: remove unnecessary safe buffer usage
On Thu 2021-02-18 09:18:17, John Ogness wrote: > Upon registering a console, safe buffers are activated when setting > up the sequence number to replay the log. However, these are already > protected by @console_sem and @syslog_lock. Remove the unnecessary > safe buffer usage. > > Signed-off-by: John Ogness Reviewed-by: Petr Mladek Best Regards, Petr
[PATCH 03/13] serial: stm32: fix incorrect characters on console
Incorrect characters are observed on console during boot. This issue occurs when init/main.c is modifying termios settings to open /dev/console on the rootfs. This patch adds a waiting loop in set_termios to wait for TX shift register empty (and TX FIFO if any) before stopping serial port. Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver") Signed-off-by: Erwan Le Ray diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c index 33a479062948..7710de947aa3 100644 --- a/drivers/tty/serial/stm32-usart.c +++ b/drivers/tty/serial/stm32-usart.c @@ -738,8 +738,9 @@ static void stm32_usart_set_termios(struct uart_port *port, unsigned int baud, bits; u32 usartdiv, mantissa, fraction, oversampling; tcflag_t cflag = termios->c_cflag; - u32 cr1, cr2, cr3; + u32 cr1, cr2, cr3, isr; unsigned long flags; + int ret; if (!stm32_port->hw_flow_control) cflag &= ~CRTSCTS; @@ -748,6 +749,15 @@ static void stm32_usart_set_termios(struct uart_port *port, spin_lock_irqsave(>lock, flags); + ret = readl_relaxed_poll_timeout_atomic(port->membase + ofs->isr, + isr, + (isr & USART_SR_TC), + 10, 10); + + /* Send the TC error message only when ISR_TC is not set. */ + if (ret) + dev_err(port->dev, "Transmission is not complete\n"); + /* Stop serial port and reset value */ writel_relaxed(0, port->membase + ofs->cr1); -- 2.17.1
[PATCH printk-rework 14/14] printk: console: remove unnecessary safe buffer usage
Upon registering a console, safe buffers are activated when setting up the sequence number to replay the log. However, these are already protected by @console_sem and @syslog_lock. Remove the unnecessary safe buffer usage. Signed-off-by: John Ogness --- kernel/printk/printk.c | 10 +++--- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 23d525e885e7..78eee6c553a5 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2961,9 +2961,7 @@ void register_console(struct console *newcon) /* * console_unlock(); will print out the buffered messages * for us. -*/ - printk_safe_enter_irqsave(flags); - /* +* * We're about to replay the log buffer. Only do this to the * just-registered console to avoid excessive message spam to * the already-registered consoles. @@ -2976,11 +2974,9 @@ void register_console(struct console *newcon) exclusive_console_stop_seq = console_seq; /* Get a consistent copy of @syslog_seq. */ - raw_spin_lock(_lock); + raw_spin_lock_irqsave(_lock, flags); console_seq = syslog_seq; - raw_spin_unlock(_lock); - - printk_safe_exit_irqrestore(flags); + raw_spin_unlock_irqrestore(_lock, flags); } console_unlock(); console_sysfs_notify(); -- 2.20.1
[tip: core/rcu] torture: Remove "Failed to add ttynull console" false positive
The following commit has been merged into the core/rcu branch of tip: Commit-ID: 546eee2d931b3d76357a9c813778203001375fe1 Gitweb: https://git.kernel.org/tip/546eee2d931b3d76357a9c813778203001375fe1 Author:Paul E. McKenney AuthorDate:Wed, 23 Dec 2020 10:35:39 -08:00 Committer: Paul E. McKenney CommitterDate: Mon, 04 Jan 2021 14:01:25 -08:00 torture: Remove "Failed to add ttynull console" false positive Commit 757055ae8ded ("init/console: Use ttynull as a fallback when there is no console") results in the string "Warning: Failed to add ttynull console. No stdin, stdout, and stderr for the init process!" appearing on the console, which the rcutorture scripting interprets as a warning, which causes every rcutorture run to be flagged. However, the rcutorture init process never attempts to do any I/O, and thus does not care that it has no stdin, stdout, or stderr. This commit therefore causes the rcutorture scripting to ignore this message. Signed-off-by: Paul E. McKenney --- tools/testing/selftests/rcutorture/bin/console-badness.sh | 1 + tools/testing/selftests/rcutorture/bin/parse-console.sh | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/rcutorture/bin/console-badness.sh b/tools/testing/selftests/rcutorture/bin/console-badness.sh index 80ae7f0..e6a132d 100755 --- a/tools/testing/selftests/rcutorture/bin/console-badness.sh +++ b/tools/testing/selftests/rcutorture/bin/console-badness.sh @@ -14,4 +14,5 @@ egrep 'Badness|WARNING:|Warn|BUG|===|Call Trace:|Oops:|detected stalls o grep -v 'ODEBUG: ' | grep -v 'This means that this is a DEBUG kernel and it is' | grep -v 'Warning: unable to open an initial console' | +grep -v 'Warning: Failed to add ttynull console. No stdin, stdout, and stderr.*the init process!' | grep -v 'NOHZ tick-stop error: Non-RCU local softirq work is pending, handler' diff --git a/tools/testing/selftests/rcutorture/bin/parse-console.sh b/tools/testing/selftests/rcutorture/bin/parse-console.sh index 263b1be..9f624bd 100755 --- a/tools/testing/selftests/rcutorture/bin/parse-console.sh +++ b/tools/testing/selftests/rcutorture/bin/parse-console.sh @@ -128,7 +128,7 @@ then then summary="$summary Badness: $n_badness" fi - n_warn=`grep -v 'Warning: unable to open an initial console' $file | egrep -c 'WARNING:|Warn'` + n_warn=`grep -v 'Warning: unable to open an initial console' $file | grep -v 'Warning: Failed to add ttynull console. No stdin, stdout, and stderr for the init process' | egrep -c 'WARNING:|Warn'` if test "$n_warn" -ne 0 then summary="$summary Warnings: $n_warn"
Re: [init/console] a91bd6223e: unixbench.score -5.9% regression
Hello, just for record. On Tue 2021-02-09 13:49:25, kernel test robot wrote: > > Greeting, > > FYI, we noticed a -5.9% regression of unixbench.score due to commit: > > > commit: a91bd6223ecd46addc71ee6fcd432206d39365d2 ("Revert "init/console: Use > ttynull as a fallback when there is no console"") > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master > > > in testcase: unixbench > on test machine: 16 threads Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz with 32G > memory > with following parameters: > > runtime: 300s > nr_task: 30% > test: dhry2reg > cpufreq_governor: performance > ucode: 0xde > > test-description: UnixBench is the original BYTE UNIX benchmark suite aims to > test performance of Unix-like system. > test-url: https://github.com/kdlucas/byte-unixbench > > > > If you fix the issue, kindly add following tag > Reported-by: kernel test robot > > > Details are as below: > --> > > > To reproduce: > > git clone https://github.com/intel/lkp-tests.git > cd lkp-tests > bin/lkp installjob.yaml # job file is attached in > this email > bin/lkp split-job --compatible job.yaml > bin/lkp runcompatible-job.yaml > > = > compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase/ucode: > > gcc-9/performance/x86_64-rhel-8.3/30%/debian-10.4-x86_64-20200603.cgz/300s/lkp-cfl-e1/dhry2reg/unixbench/0xde > > commit: > c4cc3b1de3 ("Merge tag 'gcc-plugins-v5.11-rc3' of > git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux") > a91bd6223e ("Revert "init/console: Use ttynull as a fallback when there is > no console"") > > c4cc3b1de31b76f4 a91bd6223ecd46addc71ee6fcd4 > --- >fail:runs %reproductionfail:runs >| | | > 0:42% 0:4 > perf-profile.children.cycles-pp.error_entry > %stddev %change %stddev > \ |\ > 19487-5.9% 18336unixbench.score > 8.875e+10-5.9% 8.351e+10unixbench.workload > 0.50+0.10.61 ± 3% mpstat.cpu.all.irq% > 44280 ± 4% +15.9% 51322 ± 3% softirqs.CPU0.SCHED > 29624 ± 3% +9.6% 32476vmstat.system.in > 2207 ±1927% +3559.5% 80796 ± 18% > sched_debug.cfs_rq:/.spread0.avg > 135343 ± 38% +61.1% 218075 ± 16% sched_debug.cfs_rq:/.spread0.max > 0.01 ± 31% +78.3% 0.01 ± 22% > perf-sched.sch_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.__x64_sys_nanosleep.do_syscall_64 > 0.01 ± 7% +22.8% 0.02 ± 2% > perf-sched.sch_delay.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64 > 0.02 ± 15% +27.1% 0.02 ± 11% > perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64 > 0.02 ± 44%+113.3% 0.04 ± 12% > perf-sched.sch_delay.avg.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown] > 0.02 ± 10% +56.1% 0.03 ± 25% > perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_select > 0.02 ± 44%+113.3% 0.04 ± 12% > perf-sched.sch_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown] > 0.01 ± 11% +37.5% 0.01 ± 19% > perf-sched.total_sch_delay.average.ms > 2001 +120.7% 4416 ± 33% > perf-sched.wait_and_delay.max.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64 > 0.05 ± 54% +56.5% 0.08 ± 7% > perf-sched.wait_time.avg.ms.schedule_timeout.khugepaged.kthread.ret_from_fork > 2001 +120.7% 4416 ± 33% > perf-sched.wait_time.max.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64 > 0.05 ± 54% +56.5% 0.08 ± 7% > perf-sched.wait_time.max.ms.schedule_timeout.khugepaged.kthread.ret_from_fork > 0.07 ±173% +0.20.27 ± 21% > perf-profile.children.cycles-pp.io_serial_in > 0.07 ±173% +0.20.30 ± 21% > perf-profile.children.cycles-pp.serial8250_console_putchar > 0.07 ±173% +0.20.31 ± 20% > perf-profile.children.cycles-pp.wai
[PATCH 5.10 003/142] tty: avoid using vfs_iocb_iter_write() for redirected console writes
From: Linus Torvalds commit a9cbbb80e3e7dd38ceac166e0698f161862a18ae upstream. It turns out that the vfs_iocb_iter_{read,write}() functions are entirely broken, and don't actually use the passed-in file pointer for IO - only for the preparatory work (permission checking and for the write_iter function lookup). That worked fine for overlayfs, which always builds the new iocb with the same file pointer that it passes in, but in the general case it ends up doing nonsensical things (and could cause an iterator call that doesn't even match the passed-in file pointer). This subtly broke the tty conversion to write_iter in commit 9bb48c82aced ("tty: implement write_iter"), because the console redirection didn't actually end up redirecting anything, since the passed-in file pointer was basically ignored, and the actual write was done with the original non-redirected console tty after all. The main visible effect of this is that the console messages were no longer logged to /var/log/boot.log during graphical boot. Fix the issue by simply not using the vfs write "helper" function at all, and just redirecting the write entirely internally to the tty layer. Do the target writability permission checks when actually registering the target tty with TIOCCONS instead of at write time. Fixes: 9bb48c82aced ("tty: implement write_iter") Reported-and-tested-by: Hans de Goede Cc: Greg Kroah-Hartman Cc: sta...@kernel.org Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman --- drivers/tty/tty_io.c | 20 +--- 1 file changed, 17 insertions(+), 3 deletions(-) --- a/drivers/tty/tty_io.c +++ b/drivers/tty/tty_io.c @@ -1027,9 +1027,8 @@ void tty_write_message(struct tty_struct * write method will not be invoked in parallel for each device. */ -static ssize_t tty_write(struct kiocb *iocb, struct iov_iter *from) +static ssize_t file_tty_write(struct file *file, struct kiocb *iocb, struct iov_iter *from) { - struct file *file = iocb->ki_filp; struct tty_struct *tty = file_tty(file); struct tty_ldisc *ld; ssize_t ret; @@ -1052,6 +1051,11 @@ static ssize_t tty_write(struct kiocb *i return ret; } +static ssize_t tty_write(struct kiocb *iocb, struct iov_iter *from) +{ + return file_tty_write(iocb->ki_filp, iocb, from); +} + ssize_t redirected_tty_write(struct kiocb *iocb, struct iov_iter *iter) { struct file *p = NULL; @@ -1061,9 +1065,13 @@ ssize_t redirected_tty_write(struct kioc p = get_file(redirect); spin_unlock(_lock); + /* +* We know the redirected tty is just another tty, we can can +* call file_tty_write() directly with that file pointer. +*/ if (p) { ssize_t res; - res = vfs_iocb_iter_write(p, iocb, iter); + res = file_tty_write(p, iocb, iter); fput(p); return res; } @@ -2306,6 +2314,12 @@ static int tioccons(struct file *file) fput(f); return 0; } + if (file->f_op->write_iter != tty_write) + return -ENOTTY; + if (!(file->f_mode & FMODE_WRITE)) + return -EBADF; + if (!(file->f_mode & FMODE_CAN_WRITE)) + return -EINVAL; spin_lock(_lock); if (redirect) { spin_unlock(_lock);
Re: ARC no console output (was Re: [PATCH 1/2] init/console: Use ttynull as a fallback when there is no console)
>>>>> "Vineet" == Vineet Gupta writes: > On 1/7/21 9:04 AM, Petr Mladek wrote: >> On Thu 2021-01-07 08:43:16, Vineet Gupta wrote: >>> Hi John, >>> >>> On 1/7/21 1:02 AM, John Ogness wrote: >>>> Hi Vineet, >>>> >>>> On 2021-01-06, Vineet Gupta wrote: >>>>> This breaks ARC booting (no output on console). >>>> >>>> Could you provide the kernel boot arguments that you use? This series is >>>> partly about addressing users that have used boot arguments that are >>>> technically incorrect (even if had worked). Seeing the boot arguments of >>>> users that are not experiencing problems may help to reveal some of the >>>> unusual console usages until now. >>> >>> >>> Kernel command line: earlycon=uart8250,mmio32,0xf0005000,115200n8 >>> console=ttyS0,115200n8 debug print-fatal-signals=1 >> >> This is strange, the problematic patch should use ttynull >> only as a fallback. It should not be used when a particular console >> is defined on the command line. > What happens in my case is console_on_rootfs() doesn't find > /dev/console and switching to ttynull. /dev is not present because > devtmpfs doesn't automount for initramfs. But our initramfs/cpio logic ensures that the initramfs has a static /dev/console device node, so how can that be? https://git.buildroot.net/buildroot/tree/fs/cpio/cpio.mk#n25 -- Bye, Peter Korsgaard
Re: [PATCH 3/4] ARM: dts: imx6sl-tolino-shine2hd: correct console uart pinmux
please ignore, that was accidentially sent. On Wed, 13 Jan 2021 00:15:47 +0100 Andreas Kemnade wrote: > Configuration was correct enough to work with the pre-configuration done by > uboot. While at it, also document the location. > > Signed-off-by: Andreas Kemnade > --- > arch/arm/boot/dts/imx6sl-tolino-shine3.dts | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/arm/boot/dts/imx6sl-tolino-shine3.dts > b/arch/arm/boot/dts/imx6sl-tolino-shine3.dts > index 27143ea0f0f1..62d2ebda04ff 100644 > --- a/arch/arm/boot/dts/imx6sl-tolino-shine3.dts > +++ b/arch/arm/boot/dts/imx6sl-tolino-shine3.dts > @@ -156,7 +156,7 @@ > pinctrl_uart1: uart1grp { > fsl,pins = < > MX6SL_PAD_UART1_TXD__UART1_TX_DATA 0x1b0b1 > - MX6SL_PAD_UART1_RXD__UART1_TX_DATA 0x1b0b1 > + MX6SL_PAD_UART1_RXD__UART1_RX_DATA 0x1b0b1 > >; > }; >
Re: [Letux-kernel] [PATCH 3/4] ARM: dts: imx6sl-tolino-shine3: correct console uart pinmux
Ah, there is a v2. Let's hope this gets merged/processed by robots correctly :) > Am 13.01.2021 um 00:15 schrieb Andreas Kemnade : > > Configuration was correct enough to work with the pre-configuration done by > uboot. While at it, also document the location. > > Signed-off-by: Andreas Kemnade > --- > arch/arm/boot/dts/imx6sl-tolino-shine3.dts | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/arm/boot/dts/imx6sl-tolino-shine3.dts > b/arch/arm/boot/dts/imx6sl-tolino-shine3.dts > index 27143ea0f0f1..62d2ebda04ff 100644 > --- a/arch/arm/boot/dts/imx6sl-tolino-shine3.dts > +++ b/arch/arm/boot/dts/imx6sl-tolino-shine3.dts > @@ -156,7 +156,7 @@ > pinctrl_uart1: uart1grp { > fsl,pins = < > MX6SL_PAD_UART1_TXD__UART1_TX_DATA 0x1b0b1 > - MX6SL_PAD_UART1_RXD__UART1_TX_DATA 0x1b0b1 > + MX6SL_PAD_UART1_RXD__UART1_RX_DATA 0x1b0b1 > >; > }; > > -- > 2.20.1 > > ___ > https://projects.goldelico.com/p/gta04-kernel/ > Letux-kernel mailing list > letux-ker...@openphoenux.org > http://lists.goldelico.com/mailman/listinfo.cgi/letux-kernel
Re: [Letux-kernel] [PATCH 3/4] ARM: dts: imx6sl-tolino-shine2hd: correct console uart pinmux
> Am 13.01.2021 um 00:15 schrieb Andreas Kemnade : > > Configuration was correct enough to work with the pre-configuration done by > uboot. While at it, also document the location. > > Signed-off-by: Andreas Kemnade > --- > arch/arm/boot/dts/imx6sl-tolino-shine3.dts | 2 +- seems to have a typo (shine2hd vs. shine3) in the patch subject. > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/arm/boot/dts/imx6sl-tolino-shine3.dts > b/arch/arm/boot/dts/imx6sl-tolino-shine3.dts > index 27143ea0f0f1..62d2ebda04ff 100644 > --- a/arch/arm/boot/dts/imx6sl-tolino-shine3.dts > +++ b/arch/arm/boot/dts/imx6sl-tolino-shine3.dts > @@ -156,7 +156,7 @@ > pinctrl_uart1: uart1grp { > fsl,pins = < > MX6SL_PAD_UART1_TXD__UART1_TX_DATA 0x1b0b1 > - MX6SL_PAD_UART1_RXD__UART1_TX_DATA 0x1b0b1 > + MX6SL_PAD_UART1_RXD__UART1_RX_DATA 0x1b0b1 > >; > }; > > -- > 2.20.1 > > ___ > https://projects.goldelico.com/p/gta04-kernel/ > Letux-kernel mailing list > letux-ker...@openphoenux.org > http://lists.goldelico.com/mailman/listinfo.cgi/letux-kernel
[PATCH 1/4] ARM: dts: imx6sl-tolino-shine2hd: correct console uart pinmux
Configuration was correct enough to work with the pre-configuration done by uboot. While at it, also document the location. Signed-off-by: Andreas Kemnade --- arch/arm/boot/dts/imx6sl-tolino-shine2hd.dts | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/arm/boot/dts/imx6sl-tolino-shine2hd.dts b/arch/arm/boot/dts/imx6sl-tolino-shine2hd.dts index caa279608803..e17c75c360f2 100644 --- a/arch/arm/boot/dts/imx6sl-tolino-shine2hd.dts +++ b/arch/arm/boot/dts/imx6sl-tolino-shine2hd.dts @@ -396,7 +396,7 @@ pinctrl_uart1: uart1grp { fsl,pins = < MX6SL_PAD_UART1_TXD__UART1_TX_DATA 0x1b0b1 - MX6SL_PAD_UART1_RXD__UART1_TX_DATA 0x1b0b1 + MX6SL_PAD_UART1_RXD__UART1_RX_DATA 0x1b0b1 >; }; @@ -543,6 +543,7 @@ }; { + /* J4, through-holes */ pinctrl-names = "default"; pinctrl-0 = <_uart1>; status = "okay"; -- 2.20.1
[PATCH 3/4] ARM: dts: imx6sl-tolino-shine3: correct console uart pinmux
Configuration was correct enough to work with the pre-configuration done by uboot. While at it, also document the location. Signed-off-by: Andreas Kemnade --- arch/arm/boot/dts/imx6sl-tolino-shine3.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/boot/dts/imx6sl-tolino-shine3.dts b/arch/arm/boot/dts/imx6sl-tolino-shine3.dts index 27143ea0f0f1..62d2ebda04ff 100644 --- a/arch/arm/boot/dts/imx6sl-tolino-shine3.dts +++ b/arch/arm/boot/dts/imx6sl-tolino-shine3.dts @@ -156,7 +156,7 @@ pinctrl_uart1: uart1grp { fsl,pins = < MX6SL_PAD_UART1_TXD__UART1_TX_DATA 0x1b0b1 - MX6SL_PAD_UART1_RXD__UART1_TX_DATA 0x1b0b1 + MX6SL_PAD_UART1_RXD__UART1_RX_DATA 0x1b0b1 >; }; -- 2.20.1
[PATCH 3/4] ARM: dts: imx6sl-tolino-shine2hd: correct console uart pinmux
Configuration was correct enough to work with the pre-configuration done by uboot. While at it, also document the location. Signed-off-by: Andreas Kemnade --- arch/arm/boot/dts/imx6sl-tolino-shine3.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/boot/dts/imx6sl-tolino-shine3.dts b/arch/arm/boot/dts/imx6sl-tolino-shine3.dts index 27143ea0f0f1..62d2ebda04ff 100644 --- a/arch/arm/boot/dts/imx6sl-tolino-shine3.dts +++ b/arch/arm/boot/dts/imx6sl-tolino-shine3.dts @@ -156,7 +156,7 @@ pinctrl_uart1: uart1grp { fsl,pins = < MX6SL_PAD_UART1_TXD__UART1_TX_DATA 0x1b0b1 - MX6SL_PAD_UART1_RXD__UART1_TX_DATA 0x1b0b1 + MX6SL_PAD_UART1_RXD__UART1_RX_DATA 0x1b0b1 >; }; -- 2.20.1
Re: [PATCH v1] panic: push panic() messages to the console even from the MCE nmi handler
Thank you for your clarification, and let me explain what I was talking about before providing my last code version. >>>>kmsg_dump(KMSG_DUMP_PANIC); >>>> + panic_flush_to_console(); > > This is wrong. kmsg_dump() flushes the messages into the registered > dumpers, e.g. pstore. It handles only messages in the main > log buffer. > > printk_safe_flush_on_panic() must be called before. It moves any > pending messages from the temporary per-CPU buffers into the main > log buffer so that they are visible for the dumpers. Ok, this is very clear. >>> Why? >> >> Here, we are supposed to push the pending messages, as I could verify >> that they don't reach the console until the console_unblank(). So I >> wanted to push them with panic_flush_to_console() before the possible >> upcoming __crash_kexec() call. > > I do not see how the ordering of printk_safe_flush_on_panic() > and kmsg_dump() would change anything for the upcoming > __crash_kexec() call. I was wrong to remove the call to printk_safe_flush_on_panic() before kmsg_dump() as it was in panic_flush_to_console(). The call panic_flush_to_console() idea here was to push any messages that could have been submitted (even by any dumper) to the console before the possible upcoming __crash_kexec() call. Otherwise they could be discarded (if the crash messages can't be retrieved). And that's the reason why I've changed this code in my previous version to force a call to panic_flush_to_console() before the __crash_kexec(). Which seems to be too insecure as far as I understand it now. In your previous email you also indicate: >> . With the use of "crashkernel", a Fatal MCE injection shows only the >> injection message >> >> [ 80.811708] mce: Machine check injector initialized >> [ 92.298755] mce: Triggering MCE exception on CPU 0 >> [ 92.299362] Disabling lock debugging due to kernel taint >> >>No other messages is displayed and the system reboots immediately. > > But you could find the messages in the crashdump. Aren't you? Yes, but _only_ with the latest kexec-tools-2.0.21.git version ! [My installed kexec-tools.x86_64 2.0.20-34.0.4.el8_3 doesn't seem to work, and I guess there is a separate problem that I will investigate later.] > > It works this way by "design". The idea is the following: > That's a rather disarming statement :) I'm sure we can agree that getting the information directly (and of course safely) from the running kernel console would be a better option, instead of having a 2 steps process to get the crash reason (even when it works) as it adds a risk not to get this crash reason at all. And I know for a fact that Cloud people do want to know why a platform goes away ;) > Taking any locks from NMI context might lead to a deadlock. > Re-initializing the locks might lead to deadlock as well > because of possible double unlock. Ignoring the locks might > lead to problems either. > > A compromise is needed: > > 1. crashdump disabled > >console_flush_on_panic() is called. It tries hard to get the >messages on the console because it is the only chance. > >It does console_trylock(). It is called after >bust_spinlocks(1) so that even the console-specific locks >are taken only with trylock, see oops_in_progress. > >BTW: There are people that do not like this because there >is still a risk of a deadlock. Some code paths >take locks without checking oops_in_progress. >For these people, more reliable reboot is more >important because they want to have the system >back ASAP (cloud people). > > > 2. crashdump enabled: > > Only printk_safe_flush_on_panic() is called. It does the best effort > to flush messages from the per-CPU buffers into the main log buffer > so that they can be found easily in the core. > > It is pretty reliable. It should not be needed at all once the new > lockless ringbuffer gets fully integrated, > > It does not try to flush the messages to the console. Getting > the crash dump is more important than risking a deadlock with > consoles. Thanks again for the good clarification. I did not realize that calling console_flush_on_panic(CONSOLE_FLUSH_PENDING) from panic() before our __crash_kexec() was not safe enough. So here is a new version of a (minimal) fix trying to follow your suggestions: (And you'll probably remove the added calls to printk_safe_flush_on_panic() when integrating the new lockless ringbuffer) What do you think of this simplified possibility ?: [If this fix is correct, I'll submit it as a v2 fix, with an updated comment and Notes] --- kernel/panic.c 2021-01-11 20:33:25.268047057 -0500 *** *** 16
Re: [PATCH 1/1] Revert "init/console: Use ttynull as a fallback when there is no console"
On Fri, Jan 8, 2021 at 9:45 AM Petr Mladek wrote: > > Feel free to push v2 directly. Thanks, done. Linus
Re: [PATCH 1/1] Revert "init/console: Use ttynull as a fallback when there is no console"
On 1/8/21 9:45 AM, Petr Mladek wrote: > On Thu 2021-01-07 11:38:36, Linus Torvalds wrote: >> On Thu, Jan 7, 2021 at 11:15 AM Greg Kroah-Hartman >> wrote: >>> Linus, can you take this directly, or is this going through some other >>> tree? >> I was _assuming_ that I'd get it through the normal printk pull >> request, it doesn't seem to be that timing-critical. >> >> But if there is nothing else pending, I can certainly take it directly >> as that patch too. > This is the only printk-related fix at the moment. > > I have just pushed v2 (updated commit message, tags) into > printk/linux.git for linux-next. It is the patch sent as > https://urldefense.com/v3/__https://lore.kernel.org/lkml/20210108114847.23469-1-pmla...@suse.com/__;!!A4F2R9G_pg!P3n1xyAl156-9Gxs8S-k_6obYq-XjXYYWe1_yGYLUmCFRfU7sqjBq4XE1wuZoiZN$ > > Feel free to push v2 directly. Or I will create pull request the > following week after it spends few days in linux-next. Please have it upstream in the upcoming rc since this affects ARC booting. -Vineet
Re: [PATCH 1/1] Revert "init/console: Use ttynull as a fallback when there is no console"
On Thu 2021-01-07 11:38:36, Linus Torvalds wrote: > On Thu, Jan 7, 2021 at 11:15 AM Greg Kroah-Hartman > wrote: > > > > Linus, can you take this directly, or is this going through some other > > tree? > > I was _assuming_ that I'd get it through the normal printk pull > request, it doesn't seem to be that timing-critical. > > But if there is nothing else pending, I can certainly take it directly > as that patch too. This is the only printk-related fix at the moment. I have just pushed v2 (updated commit message, tags) into printk/linux.git for linux-next. It is the patch sent as https://lore.kernel.org/lkml/20210108114847.23469-1-pmla...@suse.com/ Feel free to push v2 directly. Or I will create pull request the following week after it spends few days in linux-next. Best Regards, Petr
Re: [PATCH v1] panic: push panic() messages to the console even from the MCE nmi handler
On Fri 2021-01-08 01:26:06, William Roche wrote: > On 06/01/2021 05:35, Sergey Senozhatsky wrote: > > On (21/01/04 16:15), “William Roche wrote: > >> @@ -271,9 +280,8 @@ void panic(const char *fmt, ...) > >> */ > >>atomic_notifier_call_chain(_notifier_list, 0, buf); > >> > >> - /* Call flush even twice. It tries harder with a single online CPU */ > >> - printk_safe_flush_on_panic(); > >>kmsg_dump(KMSG_DUMP_PANIC); > >> + panic_flush_to_console(); This is wrong. kmsg_dump() flushes the messages into the registered dumpers, e.g. pstore. It handles only messages in the main log buffer. printk_safe_flush_on_panic() must be called before. It moves any pending messages from the temporary per-CPU buffers into the main log buffer so that they are visible for the dumpers. > > Why? > > Here, we are supposed to push the pending messages, as I could verify > that they don't reach the console until the console_unblank(). So I > wanted to push them with panic_flush_to_console() before the possible > upcoming __crash_kexec() call. I do not see how the ordering of printk_safe_flush_on_panic() and kmsg_dump() would change anything for the upcoming __crash_kexec() call. Best Regards, Petr
Re: [PATCH v1] panic: push panic() messages to the console even from the MCE nmi handler
On Mon 2021-01-04 16:15:55, “William Roche wrote: > From: William Roche > > Force push panic messages to the console as panic() can be called from NMI > interrupt handler functions where printed messages can't always reach the > console without an explicit push provided by printk_safe_flush_on_panic() > and console_flush_on_panic(). > This is the case with the MCE handler that can lead to a system panic > giving information on the fatal MCE root cause that must reach the console. > > Signed-off-by: William Roche > --- > > Notes: > While testing MCE injection and kernel reaction, we discovered a bug > in the way the kernel provides the panic reason information: When dealing > with fatal MCE, the machine (physical or virtual) can reboot without > leaving any message on the console. > > This behavior can be reproduced on Intel with the mce-inject tool > with a simple: > # modprobe mce-inject > # mce-inject test/uncorrected > > The investigations showed that the MCE panic can be totally message-less > or can give a small set of messages. This behavior depends on the use of > the > crash_kexec mechanism (using the "crashkernel" parameter). Not using this > parameter, we get a partial [Hardware Error] information on panic, but > some > important notifications can be missing. And when using it, a fatal MCE can > panic the system without leaving any information. > > . Without "crashkernel", a Fatal MCE injection shows: > > [ 212.153928] mce: Machine check injector initialized > [ 236.730682] mce: Triggering MCE exception on CPU 0 > [ 236.731304] Disabling lock debugging due to kernel taint > [ 236.731947] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 1: > b000 > [ 236.731948] mce: [Hardware Error]: TSC 78418fb4a83f > [ 236.731949] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1605312952 > SOCKET 0 APIC 0 microcode 1 > [ 236.731949] mce: [Hardware Error]: Run the above through 'mcelog > --ascii' > [ 236.731950] mce: [Hardware Error]: Machine check: MCIP not set in MCA > handler > [ 236.731950] Kernel panic - not syncing: Fatal machine check > [ 236.732047] Kernel Offset: disabled > > The system hangs 30 seconds without any additional message, and finally > reboots. > > . With the use of "crashkernel", a Fatal MCE injection shows only the > injection message > > [ 80.811708] mce: Machine check injector initialized > [ 92.298755] mce: Triggering MCE exception on CPU 0 > [ 92.299362] Disabling lock debugging due to kernel taint > > No other messages is displayed and the system reboots immediately. But you could find the messages in the crashdump. Aren't you? It works this way by "design". The idea is the following: Taking any locks from NMI context might lead to a deadlock. Re-initializing the locks might lead to deadlock as well because of possible double unlock. Ignoring the locks might lead to problems either. A compromise is needed: 1. crashdump disabled console_flush_on_panic() is called. It tries hard to get the messages on the console because it is the only chance. It does console_trylock(). It is called after bust_spinlocks(1) so that even the console-specific locks are taken only with trylock, see oops_in_progress. BTW: There are people that do not like this because there is still a risk of a deadlock. Some code paths take locks without checking oops_in_progress. For these people, more reliable reboot is more important because they want to have the system back ASAP (cloud people). 2. crashdump enabled: Only printk_safe_flush_on_panic() is called. It does the best effort to flush messages from the per-CPU buffers into the main log buffer so that they can be found easily in the core. It it pretty reliable. It should not be needed at all once the new lockless ringbuffer gets fully integrated, It does not try to flush the messages to the console. Getting the crash dump is more important than risking a deadlock with consoles. Best Regards, Petr
[PATCH v2] Revert "init/console: Use ttynull as a fallback when there is no console"
This reverts commit 757055ae8dedf5333af17b3b5b4b70ba9bc9da4e. The commit caused that ttynull was used as the default console on several systems[1][2][3]. As a result, the console was blank even when a better alternative existed. It happened when there was no console configured on the command line and ttynull_init() was the first initcall calling register_console(). Or it happened when /dev/ did not exist when console_on_rootfs() was called. It was not able to open /dev/console even though a console driver was registered. It tried to add ttynull console but it obviously did not help. But ttynull became the preferred console and was used by /dev/console when it was available later. The commit tried to fix a historical problem that have been there for ages. The primary motivation was the commit 3cffa06aeef7ece30f6 ("printk/console: Allow to disable console output by using console="" or console=null"). It provided a clean solution for a workaround that was widely used and worked only by chance. This revert causes that the console="" or console=null command line options will again work only by chance. These options will cause that a particular console will be preferred and the default (tty) ones will not get enabled. There will be no console registered at all. As a result there won't be stdin, stdout, and stderr for the init process. But it worked exactly this way even before. The proper solution has to fulfill many conditions: + Register ttynull only when explicitly required or as the ultimate fallback. + ttynull should get associated with /dev/console but it must not become preferred console when used as a fallback. Especially, it must still be possible to replace it by a better console later. Such a change requires clean up of the register_console() code. Otherwise, it would be even harder to follow. Especially, the use of has_preferred_console and CON_CONSDEV flag is tricky. The clean up is risky. The ordering of consoles is not well defined. And any changes tend to break existing user settings. Do the revert at the least risky solution for now. [1] https://lore.kernel.org/linux-kselftest/20201221144302.gr4...@smile.fi.intel.com/ [2] https://lore.kernel.org/lkml/d2a3b3c0-e548-7dd1-730f-59bc5c04e...@synopsys.com/ [3] https://patchwork.ozlabs.org/project/linux-um/patch/20210105120128.10854-1-tho...@m3y3r.de/ Reported-by: Andy Shevchenko Reported-by: Vineet Gupta Reported-by: Thomas Meyer Signed-off-by: Petr Mladek Acked-by: Greg Kroah-Hartman Acked-by: Sergey Senozhatsky --- Changes is v2: + Improved description of the problem + Added links, Reported-by, and Acked-by drivers/tty/Kconfig | 14 ++ drivers/tty/Makefile| 3 ++- drivers/tty/ttynull.c | 18 -- include/linux/console.h | 3 --- init/main.c | 10 ++ 5 files changed, 18 insertions(+), 30 deletions(-) diff --git a/drivers/tty/Kconfig b/drivers/tty/Kconfig index 47a6e42f0d04..e15cd6b5bb99 100644 --- a/drivers/tty/Kconfig +++ b/drivers/tty/Kconfig @@ -401,6 +401,20 @@ config MIPS_EJTAG_FDC_KGDB_CHAN help FDC channel number to use for KGDB. +config NULL_TTY + tristate "NULL TTY driver" + help + Say Y here if you want a NULL TTY which simply discards messages. + + This is useful to allow userspace applications which expect a console + device to work without modifications even when no console is + available or desired. + + In order to use this driver, you should redirect the console to this + TTY, or boot the kernel with console=ttynull. + + If unsure, say N. + config TRACE_ROUTER tristate "Trace data router for MIPI P1149.7 cJTAG standard" depends on TRACE_SINK diff --git a/drivers/tty/Makefile b/drivers/tty/Makefile index 3c1c5a9240a7..b3ccae932660 100644 --- a/drivers/tty/Makefile +++ b/drivers/tty/Makefile @@ -2,7 +2,7 @@ obj-$(CONFIG_TTY) += tty_io.o n_tty.o tty_ioctl.o tty_ldisc.o \ tty_buffer.o tty_port.o tty_mutex.o \ tty_ldsem.o tty_baudrate.o tty_jobctrl.o \ - n_null.o ttynull.o + n_null.o obj-$(CONFIG_LEGACY_PTYS) += pty.o obj-$(CONFIG_UNIX98_PTYS) += pty.o obj-$(CONFIG_AUDIT)+= tty_audit.o @@ -25,6 +25,7 @@ obj-$(CONFIG_ISI) += isicom.o obj-$(CONFIG_MOXA_INTELLIO)+= moxa.o obj-$(CONFIG_MOXA_SMARTIO) += mxser.o obj-$(CONFIG_NOZOMI) += nozomi.o +obj-$(CONFIG_NULL_TTY) += ttynull.o obj-$(CONFIG_ROCKETPORT) += rocket.o obj-$(CONFIG_SYNCLINK_GT) += synclink_gt.o obj-$(CONFIG_PPC_EPAPR_HV_BYTECHAN) += ehv_bytechan.o diff --git a/drivers/tty/ttynull.c b/drivers/tty/ttynull.c index eced70ec54e1..17f05b7eb6d3 100644 --- a/drivers/tty/ttynull.c +++ b/drivers/t
Re: ARC no console output (was Re: [PATCH 1/2] init/console: Use ttynull as a fallback when there is no console)
On Thu 2021-01-07 21:18:20, Vineet Gupta wrote: > On 1/7/21 7:48 PM, Sergey Senozhatsky wrote: > > On (21/01/07 09:58), Vineet Gupta wrote: > > > On 1/7/21 9:04 AM, Petr Mladek wrote: > > > > On Thu 2021-01-07 08:43:16, Vineet Gupta wrote: > > > > > Hi John, > > > > > > > > > > On 1/7/21 1:02 AM, John Ogness wrote: > > > > > > Hi Vineet, > > > > > > > > > > > > On 2021-01-06, Vineet Gupta wrote: > > > > > > > This breaks ARC booting (no output on console). > > > > > > > > > > > > Could you provide the kernel boot arguments that you use? This > > > > > > series is > > > > > > partly about addressing users that have used boot arguments that are > > > > > > technically incorrect (even if had worked). Seeing the boot > > > > > > arguments of > > > > > > users that are not experiencing problems may help to reveal some of > > > > > > the > > > > > > unusual console usages until now. > > > > > > > > > > > > > > > Kernel command line: earlycon=uart8250,mmio32,0xf0005000,115200n8 > > > > > console=ttyS0,115200n8 debug print-fatal-signals=1 > > > > > > > > This is strange, the problematic patch should use ttynull > > > > only as a fallback. It should not be used when a particular console > > > > is defined on the command line. > > > > > > What happens in my case is console_on_rootfs() doesn't find /dev/console > > > and > > > switching to ttynull. /dev is not present because devtmpfs doesn't > > > automount > > > for initramfs. I see. I did not though about a possibility that /dev/console could not be opened from other reasons. > > I wonder if we'll move the nulltty fallback logic into printk code [1] > > will it fix the problem? > > > > [1] > > https://lore.kernel.org/lkml/X6x%2FAxD1qanC6evJ@jagdpanzerIV.localdomain/ > > Your reasoning in the post above makes total sense. > > I tired the patch: adding register_ttynull_console() call in > console_device(), removing from console_on_rootfs() band that works too. IMHO, this worked because you removed the change in console_on_rootfs(). I guess that the change in console_device() did not make any difference. It was likely not called because filp_open("/dev/console", O_RDWR, 0) failed earlier because /dev/ did not exists. Anyway, the proposed change in console_device() has some more problems that I realized this week: + It does not check whether console_drivers really contains any console with tty binding. + register_ttynull_console() calls add_preferred_console(ttynull_console.name, 0, NULL). The ttynull console stays preferred even when any better console gets registered later. As a result, it would stay associated with /dev/console. The right solution would be to enable ttynull console and do _not_ modify the list of preferred consoles. And it makes sense to add the console only there is no console with tty binding at the moment. I still have to think whether console_device() is a better or worse location for adding tyynull as a fallback. Anyway, it has to wait. The proper solution can't be done easily with the existing register_console() code. We need to clean it up first. Best Regards, Petr
Re: ARC no console output (was Re: [PATCH 1/2] init/console: Use ttynull as a fallback when there is no console)
On 1/7/21 7:48 PM, Sergey Senozhatsky wrote: On (21/01/07 09:58), Vineet Gupta wrote: On 1/7/21 9:04 AM, Petr Mladek wrote: On Thu 2021-01-07 08:43:16, Vineet Gupta wrote: Hi John, On 1/7/21 1:02 AM, John Ogness wrote: Hi Vineet, On 2021-01-06, Vineet Gupta wrote: This breaks ARC booting (no output on console). Could you provide the kernel boot arguments that you use? This series is partly about addressing users that have used boot arguments that are technically incorrect (even if had worked). Seeing the boot arguments of users that are not experiencing problems may help to reveal some of the unusual console usages until now. Kernel command line: earlycon=uart8250,mmio32,0xf0005000,115200n8 console=ttyS0,115200n8 debug print-fatal-signals=1 This is strange, the problematic patch should use ttynull only as a fallback. It should not be used when a particular console is defined on the command line. What happens in my case is console_on_rootfs() doesn't find /dev/console and switching to ttynull. /dev is not present because devtmpfs doesn't automount for initramfs. I wonder if we'll move the nulltty fallback logic into printk code [1] will it fix the problem? [1] https://lore.kernel.org/lkml/X6x%2FAxD1qanC6evJ@jagdpanzerIV.localdomain/ Your reasoning in the post above makes total sense. I tired the patch: adding register_ttynull_console() call in console_device(), removing from console_on_rootfs() band that works too. | Warning: unable to open an initial console. Fallback to ttynull. | Warning: Failed to add ttynull console. No stdin, stdout, and stderr for the init process! | Freeing unused kernel memory: 3096K | This architecture does not have kernel memory protection. | Run /init as init process | with arguments: |/init | with environment: |HOME=/ |TERM=linux | Starting System logger (syslogd) | Bringing up loopback device | Starting inetd | Mounting Posix Mqueue filesys | CONFIG_INITRAMFS_SOURCE="~/arc/RAMFS/archs/ramfs_2011-GNU-2020-03-glibc-2.32-tiny" | ** | Welcome to ARCLinux | ** | [ARCLinux]#
Re: ARC no console output (was Re: [PATCH 1/2] init/console: Use ttynull as a fallback when there is no console)
On (21/01/07 09:58), Vineet Gupta wrote: > On 1/7/21 9:04 AM, Petr Mladek wrote: > > On Thu 2021-01-07 08:43:16, Vineet Gupta wrote: > > > Hi John, > > > > > > On 1/7/21 1:02 AM, John Ogness wrote: > > > > Hi Vineet, > > > > > > > > On 2021-01-06, Vineet Gupta wrote: > > > > > This breaks ARC booting (no output on console). > > > > > > > > Could you provide the kernel boot arguments that you use? This series is > > > > partly about addressing users that have used boot arguments that are > > > > technically incorrect (even if had worked). Seeing the boot arguments of > > > > users that are not experiencing problems may help to reveal some of the > > > > unusual console usages until now. > > > > > > > > > Kernel command line: earlycon=uart8250,mmio32,0xf0005000,115200n8 > > > console=ttyS0,115200n8 debug print-fatal-signals=1 > > > > This is strange, the problematic patch should use ttynull > > only as a fallback. It should not be used when a particular console > > is defined on the command line. > > What happens in my case is console_on_rootfs() doesn't find /dev/console and > switching to ttynull. /dev is not present because devtmpfs doesn't automount > for initramfs. I wonder if we'll move the nulltty fallback logic into printk code [1] will it fix the problem? [1] https://lore.kernel.org/lkml/X6x%2FAxD1qanC6evJ@jagdpanzerIV.localdomain/ -ss
Re: [PATCH 1/1] Revert "init/console: Use ttynull as a fallback when there is no console"
On (21/01/07 17:44), Petr Mladek wrote: > > This reverts commit 757055ae8dedf5333af17b3b5b4b70ba9bc9da4e. > > The commit caused that ttynull was used as the default console > on many systems. It happened when there was no console configured > on the command line and ttynull_init() was the first initcall > calling register_console(). > > The commit fixed a historical problem that have been there for ages. > The primary motivation was the commit 3cffa06aeef7ece30f6 > ("printk/console: Allow to disable console output by using console="" > or console=null"). It provided a clean solution > for a workaround that was widely used and worked only by chance. > > This revert causes that the console="" or console=null command line > options will again work only by chance. These options will cause that > a particular console will be preferred and the default (tty) ones > will not get enabled. There will be no console registered at > all. As a result there won't be stdin, stdout, and stderr for > the init process. But it worked exactly this way even before. > > The proper solution has to fulfill many conditions: > > + Register ttynull only when explicitly required or as > the ultimate fallback. > > + ttynull must get associated with /dev/console but it must > not become preferred console when used as a fallback. > Especially, it must still be possible to replace it > by a better console later. > > Such a change requires clean up of the register_console() code. > Otherwise, it would be even harder to follow. Especially, the use > of has_preferred_console and CON_CONSDEV flag is tricky. The clean > up is risky. The ordering of consoles is not well defined. And > any changes tend to break existing user settings. > > Do the revert at the least risky solution for now. > > Signed-off-by: Petr Mladek Acked-by: Sergey Senozhatsky -ss
Re: [PATCH v1] panic: push panic() messages to the console even from the MCE nmi handler
On 06/01/2021 05:35, Sergey Senozhatsky wrote: > On (21/01/04 16:15), “William Roche wrote: > [..] >> diff --git a/kernel/panic.c b/kernel/panic.c >> index 332736a..eb90cc0 100644 >> --- a/kernel/panic.c >> +++ b/kernel/panic.c >> @@ -166,6 +166,15 @@ static void panic_print_sys_info(void) >>ftrace_dump(DUMP_ALL); >> } >> >> +/* >> + * Force flush messages to the console. >> + */ >> +static void panic_flush_to_console(void) >> +{ >> + printk_safe_flush_on_panic(); >> + console_flush_on_panic(CONSOLE_FLUSH_PENDING); >> +} > > You must debug_locks_off() as the very first step here. But see below. Thanks I've missed this point, I will call debug_locks_off() before the first call to console_flush_on_panic(CONSOLE_FLUSH_PENDING); > >> /** >>* panic - halt the system >>* @fmt: The text string to print >> @@ -247,7 +256,7 @@ void panic(const char *fmt, ...) >> * Bypass the panic_cpu check and call __crash_kexec directly. >> */ >>if (!_crash_kexec_post_notifiers) { >> - printk_safe_flush_on_panic(); >> + panic_flush_to_console(); >>__crash_kexec(NULL); > > It's dangerous to call console_flush_on_panic() before we stop secondary > CPUs. console_flush_on_panic() ignores the state console_sem, so if any > of the secondary is currently printing something on the consoles you'll > get corrupted messages - we use `static char buffer` for messages we > push to consoles. I understand that there is a risk here, and the __crash_exec() call is not supposed to return when crashexec is ready. This is our last chance here to push to the console the pending messages. > Another issue is that with this panic_flush_to_console() call console_sem > can end up being locked once (by secondary CPU) and unlocked twice (by > second and panic CPUs) [*] I agree, this is a risk. But what we see is that even the message previously submitted by the pr_emerg() call of the panic thread is currently discarded when __crash_kexec(NULL) is called. In the case described here, no other message is printed out by a secondary CPU, but I understand that it could happen, and we could avoid calling panic_flush_to_console() when kexec_crash isn't loaded so that we can wait for the SMP stop and leave the current steps to run untouched until the last printed messages. > >>/* >> @@ -271,9 +280,8 @@ void panic(const char *fmt, ...) >> */ >>atomic_notifier_call_chain(_notifier_list, 0, buf); >> >> - /* Call flush even twice. It tries harder with a single online CPU */ >> - printk_safe_flush_on_panic(); >>kmsg_dump(KMSG_DUMP_PANIC); >> + panic_flush_to_console(); > > Why? Here, we are supposed to push the pending messages, as I could verify that they don't reach the console until the console_unblank(). So I wanted to push them with panic_flush_to_console() before the possible upcoming __crash_kexec() call. Are you suggesting that I should leave the printk_safe_flush_on_panic() call before the kmsg_dump(KMSG_DUMP_PANIC) call ? But let's leave that untouched too. > >>/* >> * If you doubt kdump always works fine in any situation, >> @@ -298,7 +306,7 @@ void panic(const char *fmt, ...) >> * buffer. Try to acquire the lock then release it regardless of the >> * result. The release will also print the buffers out. Locks debug >> * should be disabled to avoid reporting bad unlock balance when >> - * panic() is not being callled from OOPS. >> + * panic() is not being called from OOPS. >> */ >>debug_locks_off(); >>console_flush_on_panic(CONSOLE_FLUSH_PENDING); >> @@ -314,6 +322,7 @@ void panic(const char *fmt, ...) >> * We can't use the "normal" timers since we just panicked. >> */ >>pr_emerg("Rebooting in %d seconds..\n", panic_timeout); >> + panic_flush_to_console(); > > [*] So this > >>for (i = 0; i < panic_timeout * 1000; i += PANIC_TIMER_STEP) { >>touch_nmi_watchdog(); >> @@ -347,6 +356,7 @@ void panic(const char *fmt, ...) >>disabled_wait(); >> #endif >>pr_emerg("---[ end Kernel panic - not syncing: %s ]---\n", buf); >> + panic_flush_to_console(); > > [*] and this are both very interesting points. > > Those extra console_flush_on_panic() calls indicate that normal printk() > cannot succeed in locking the console_sem so it doesn't try to > consol
Re: [PATCH 1/1] Revert "init/console: Use ttynull as a fallback when there is no console"
On Thu, Jan 7, 2021 at 11:15 AM Greg Kroah-Hartman wrote: > > Linus, can you take this directly, or is this going through some other > tree? I was _assuming_ that I'd get it through the normal printk pull request, it doesn't seem to be that timing-critical. But if there is nothing else pending, I can certainly take it directly as that patch too. Petr? Linus
Re: [PATCH 1/1] Revert "init/console: Use ttynull as a fallback when there is no console"
On Thu, Jan 07, 2021 at 05:44:00PM +0100, Petr Mladek wrote: > This reverts commit 757055ae8dedf5333af17b3b5b4b70ba9bc9da4e. > > The commit caused that ttynull was used as the default console > on many systems. It happened when there was no console configured > on the command line and ttynull_init() was the first initcall > calling register_console(). > > The commit fixed a historical problem that have been there for ages. > The primary motivation was the commit 3cffa06aeef7ece30f6 > ("printk/console: Allow to disable console output by using console="" > or console=null"). It provided a clean solution > for a workaround that was widely used and worked only by chance. > > This revert causes that the console="" or console=null command line > options will again work only by chance. These options will cause that > a particular console will be preferred and the default (tty) ones > will not get enabled. There will be no console registered at > all. As a result there won't be stdin, stdout, and stderr for > the init process. But it worked exactly this way even before. > > The proper solution has to fulfill many conditions: > > + Register ttynull only when explicitly required or as > the ultimate fallback. > > + ttynull must get associated with /dev/console but it must > not become preferred console when used as a fallback. > Especially, it must still be possible to replace it > by a better console later. > > Such a change requires clean up of the register_console() code. > Otherwise, it would be even harder to follow. Especially, the use > of has_preferred_console and CON_CONSDEV flag is tricky. The clean > up is risky. The ordering of consoles is not well defined. And > any changes tend to break existing user settings. > > Do the revert at the least risky solution for now. > > Signed-off-by: Petr Mladek Acked-by: Greg Kroah-Hartman Linus, can you take this directly, or is this going through some other tree? thanks, greg k-h
Re: [PATCH] kunit: tool: Force the use of the 'tty' console for UML
On 1/7/21 9:53 AM, Petr Mladek wrote: On Wed 2021-01-06 12:29:12, David Gow wrote: On Wed, Jan 6, 2021 at 3:52 AM Shuah Khan wrote: On 1/5/21 11:57 AM, Andy Shevchenko wrote: On Tue, Jan 05, 2021 at 09:34:33AM -0700, Shuah Khan wrote: On 1/5/21 9:21 AM, Petr Mladek wrote: On Mon 2021-01-04 09:23:57, Shuah Khan wrote: On 12/22/20 4:11 AM, Andy Shevchenko wrote: On Mon, Dec 21, 2020 at 11:39:00PM -0800, David Gow wrote: kunit_tool relies on the UML console outputting printk() output to the tty in order to get results. Since the default console driver could change, pass 'console=tty' to the kernel. This is triggered by a change[1] to use ttynull as a fallback console driver which -- by chance or by design -- seems to have changed the default console output on UML, breaking kunit_tool. While this may be fixed, we should be less fragile to such changes in the default. [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=757055ae8dedf5333af17b3b5b4b70ba9bc9da4e Reported-by: Andy Shevchenko Tested-by: Andy Shevchenko Thank you all. Now in linux-kselftest kunit-fixes branch. Will send this up for rc3. Sorry for the delay - have been away from the keyboard for a bit. JFYI, I am not sure that this is the right solution. I am looking into it, see https://lore.kernel.org/linux-kselftest/X%2FSRA1P8t+ONZFKb@alley/ for more details. Thanks Petr. I will hold off on sending the patch up to Linus and let you find a the right solution. Please. leave it in Linux Next at least. Otherwise kunit will be broken for a long time which is not good. Yes. That is the plan. It will be in there until real fix comes in. The real fix would be too complicated for 5.11-rc3. Instead, I proposed to revert the problematic commit, see https://lore.kernel.org/lkml/20210107164400.17904-2-pmla...@suse.com/ I would like to push it for 5.11-rc3. Personally, I think that this patch makes some sense to keep even if the underlying issue with ttynull is resolved. Given that kunit.py requires the console output, explicitly stating we want console=tty set is probably worth doing rather than relying on it being the default. I agree that the patch makes sense on its own. kunit depends on the particular console. Note that "tty" is actually the UML-specific stdio console implemented in arch/um/drivers/stdio_console.c. The proposal sounds like revert the problem commit https://lore.kernel.org/lkml/20210107164400.17904-2-pmla...@suse.com/ and also send kunit fix up. Sounds reasonable to me. I will send it for 5.11-rc3 thanks, -- Shuah
Re: ARC no console output (was Re: [PATCH 1/2] init/console: Use ttynull as a fallback when there is no console)
On 1/7/21 9:04 AM, Petr Mladek wrote: On Thu 2021-01-07 08:43:16, Vineet Gupta wrote: Hi John, On 1/7/21 1:02 AM, John Ogness wrote: Hi Vineet, On 2021-01-06, Vineet Gupta wrote: This breaks ARC booting (no output on console). Could you provide the kernel boot arguments that you use? This series is partly about addressing users that have used boot arguments that are technically incorrect (even if had worked). Seeing the boot arguments of users that are not experiencing problems may help to reveal some of the unusual console usages until now. Kernel command line: earlycon=uart8250,mmio32,0xf0005000,115200n8 console=ttyS0,115200n8 debug print-fatal-signals=1 This is strange, the problematic patch should use ttynull only as a fallback. It should not be used when a particular console is defined on the command line. What happens in my case is console_on_rootfs() doesn't find /dev/console and switching to ttynull. /dev is not present because devtmpfs doesn't automount for initramfs. The only explanation would be that ttyS0 gets registered too late and ttynull is added as a fallback in the meantime. I don't know if ttyS0 console should have registered already but even if it did - the /dev node missing would not have helped ? Anyway, I propose the revert the problematic patch for 5.11-rc3, see https://lore.kernel.org/lkml/20210107164400.17904-2-pmla...@suse.com/ This mystery is a good reason to avoid bigger changes at this stage. Best Regards, Petr
Re: [PATCH 1/1] Revert "init/console: Use ttynull as a fallback when there is no console"
On Thu, Jan 7, 2021 at 9:45 AM Andy Shevchenko wrote: >> > Shouldn't it have Fixes tag along with Reported-by ones and explanation what > was the actual problem reported? No need for a "Fixes" tag for a revert - the commit it reverts _is_ the thing it fixes, so that's implicitly there. But yes, a reported-by with an explanation of the actual case that broke would be a very good idea, so that the revert documents the particular case that caused it, not just the "big picture" case. Linus
Re: [PATCH 1/1] Revert "init/console: Use ttynull as a fallback when there is no console"
On Thu, Jan 07, 2021 at 05:44:00PM +0100, Petr Mladek wrote: > This reverts commit 757055ae8dedf5333af17b3b5b4b70ba9bc9da4e. > > The commit caused that ttynull was used as the default console > on many systems. It happened when there was no console configured > on the command line and ttynull_init() was the first initcall > calling register_console(). > > The commit fixed a historical problem that have been there for ages. > The primary motivation was the commit 3cffa06aeef7ece30f6 > ("printk/console: Allow to disable console output by using console="" > or console=null"). It provided a clean solution > for a workaround that was widely used and worked only by chance. > > This revert causes that the console="" or console=null command line > options will again work only by chance. These options will cause that > a particular console will be preferred and the default (tty) ones > will not get enabled. There will be no console registered at > all. As a result there won't be stdin, stdout, and stderr for > the init process. But it worked exactly this way even before. > > The proper solution has to fulfill many conditions: > > + Register ttynull only when explicitly required or as > the ultimate fallback. > > + ttynull must get associated with /dev/console but it must > not become preferred console when used as a fallback. > Especially, it must still be possible to replace it > by a better console later. > > Such a change requires clean up of the register_console() code. > Otherwise, it would be even harder to follow. Especially, the use > of has_preferred_console and CON_CONSDEV flag is tricky. The clean > up is risky. The ordering of consoles is not well defined. And > any changes tend to break existing user settings. > > Do the revert at the least risky solution for now. Shouldn't it have Fixes tag along with Reported-by ones and explanation what was the actual problem reported? -- With Best Regards, Andy Shevchenko
Re: ARC no console output (was Re: [PATCH 1/2] init/console: Use ttynull as a fallback when there is no console)
On Thu 2021-01-07 08:43:16, Vineet Gupta wrote: > Hi John, > > On 1/7/21 1:02 AM, John Ogness wrote: > > Hi Vineet, > > > > On 2021-01-06, Vineet Gupta wrote: > > > This breaks ARC booting (no output on console). > > > > Could you provide the kernel boot arguments that you use? This series is > > partly about addressing users that have used boot arguments that are > > technically incorrect (even if had worked). Seeing the boot arguments of > > users that are not experiencing problems may help to reveal some of the > > unusual console usages until now. > > > Kernel command line: earlycon=uart8250,mmio32,0xf0005000,115200n8 > console=ttyS0,115200n8 debug print-fatal-signals=1 This is strange, the problematic patch should use ttynull only as a fallback. It should not be used when a particular console is defined on the command line. The only explanation would be that ttyS0 gets registered too late and ttynull is added as a fallback in the meantime. Anyway, I propose the revert the problematic patch for 5.11-rc3, see https://lore.kernel.org/lkml/20210107164400.17904-2-pmla...@suse.com/ This mystery is a good reason to avoid bigger changes at this stage. Best Regards, Petr
Re: [PATCH] kunit: tool: Force the use of the 'tty' console for UML
On Wed 2021-01-06 12:29:12, David Gow wrote: > On Wed, Jan 6, 2021 at 3:52 AM Shuah Khan wrote: > > > > On 1/5/21 11:57 AM, Andy Shevchenko wrote: > > > On Tue, Jan 05, 2021 at 09:34:33AM -0700, Shuah Khan wrote: > > >> On 1/5/21 9:21 AM, Petr Mladek wrote: > > >>> On Mon 2021-01-04 09:23:57, Shuah Khan wrote: > > >>>> On 12/22/20 4:11 AM, Andy Shevchenko wrote: > > >>>>> On Mon, Dec 21, 2020 at 11:39:00PM -0800, David Gow wrote: > > >>>>>> kunit_tool relies on the UML console outputting printk() output to > > >>>>>> the > > >>>>>> tty in order to get results. Since the default console driver could > > >>>>>> change, pass 'console=tty' to the kernel. > > >>>>>> > > >>>>>> This is triggered by a change[1] to use ttynull as a fallback console > > >>>>>> driver which -- by chance or by design -- seems to have changed the > > >>>>>> default console output on UML, breaking kunit_tool. While this may be > > >>>>>> fixed, we should be less fragile to such changes in the default. > > >>>>>> > > >>>>>> [1]: > > >>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=757055ae8dedf5333af17b3b5b4b70ba9bc9da4e > > >>>>> > > >>>>> Reported-by: Andy Shevchenko > > >>>>> Tested-by: Andy Shevchenko > > >>>>> > > >>>> > > >>>> Thank you all. Now in linux-kselftest kunit-fixes branch. > > >>>> > > >>>> Will send this up for rc3. > > >>>> > > >>>> Sorry for the delay - have been away from the keyboard for a > > >>>> bit. > > >>> > > >>> JFYI, I am not sure that this is the right solution. I am > > >>> looking into it, see > > >>> https://lore.kernel.org/linux-kselftest/X%2FSRA1P8t+ONZFKb@alley/ > > >>> for more details. > > >>> > > >> > > >> Thanks Petr. I will hold off on sending the patch up to Linus and > > >> let you find a the right solution. > > > > > > Please. leave it in Linux Next at least. Otherwise kunit will be broken > > > for a > > > long time which is not good. > > > > > > > > > > Yes. That is the plan. It will be in there until real fix comes in. The real fix would be too complicated for 5.11-rc3. Instead, I proposed to revert the problematic commit, see https://lore.kernel.org/lkml/20210107164400.17904-2-pmla...@suse.com/ I would like to push it for 5.11-rc3. > Personally, I think that this patch makes some sense to keep even if > the underlying issue with ttynull is resolved. Given that kunit.py > requires the console output, explicitly stating we want console=tty > set is probably worth doing rather than relying on it being the > default. I agree that the patch makes sense on its own. kunit depends on the particular console. Note that "tty" is actually the UML-specific stdio console implemented in arch/um/drivers/stdio_console.c. Best Regards, Petr
[PATCH 1/1] Revert "init/console: Use ttynull as a fallback when there is no console"
This reverts commit 757055ae8dedf5333af17b3b5b4b70ba9bc9da4e. The commit caused that ttynull was used as the default console on many systems. It happened when there was no console configured on the command line and ttynull_init() was the first initcall calling register_console(). The commit fixed a historical problem that have been there for ages. The primary motivation was the commit 3cffa06aeef7ece30f6 ("printk/console: Allow to disable console output by using console="" or console=null"). It provided a clean solution for a workaround that was widely used and worked only by chance. This revert causes that the console="" or console=null command line options will again work only by chance. These options will cause that a particular console will be preferred and the default (tty) ones will not get enabled. There will be no console registered at all. As a result there won't be stdin, stdout, and stderr for the init process. But it worked exactly this way even before. The proper solution has to fulfill many conditions: + Register ttynull only when explicitly required or as the ultimate fallback. + ttynull must get associated with /dev/console but it must not become preferred console when used as a fallback. Especially, it must still be possible to replace it by a better console later. Such a change requires clean up of the register_console() code. Otherwise, it would be even harder to follow. Especially, the use of has_preferred_console and CON_CONSDEV flag is tricky. The clean up is risky. The ordering of consoles is not well defined. And any changes tend to break existing user settings. Do the revert at the least risky solution for now. Signed-off-by: Petr Mladek --- drivers/tty/Kconfig | 14 ++ drivers/tty/Makefile| 3 ++- drivers/tty/ttynull.c | 18 -- include/linux/console.h | 3 --- init/main.c | 10 ++ 5 files changed, 18 insertions(+), 30 deletions(-) diff --git a/drivers/tty/Kconfig b/drivers/tty/Kconfig index 47a6e42f0d04..e15cd6b5bb99 100644 --- a/drivers/tty/Kconfig +++ b/drivers/tty/Kconfig @@ -401,6 +401,20 @@ config MIPS_EJTAG_FDC_KGDB_CHAN help FDC channel number to use for KGDB. +config NULL_TTY + tristate "NULL TTY driver" + help + Say Y here if you want a NULL TTY which simply discards messages. + + This is useful to allow userspace applications which expect a console + device to work without modifications even when no console is + available or desired. + + In order to use this driver, you should redirect the console to this + TTY, or boot the kernel with console=ttynull. + + If unsure, say N. + config TRACE_ROUTER tristate "Trace data router for MIPI P1149.7 cJTAG standard" depends on TRACE_SINK diff --git a/drivers/tty/Makefile b/drivers/tty/Makefile index 3c1c5a9240a7..b3ccae932660 100644 --- a/drivers/tty/Makefile +++ b/drivers/tty/Makefile @@ -2,7 +2,7 @@ obj-$(CONFIG_TTY) += tty_io.o n_tty.o tty_ioctl.o tty_ldisc.o \ tty_buffer.o tty_port.o tty_mutex.o \ tty_ldsem.o tty_baudrate.o tty_jobctrl.o \ - n_null.o ttynull.o + n_null.o obj-$(CONFIG_LEGACY_PTYS) += pty.o obj-$(CONFIG_UNIX98_PTYS) += pty.o obj-$(CONFIG_AUDIT)+= tty_audit.o @@ -25,6 +25,7 @@ obj-$(CONFIG_ISI) += isicom.o obj-$(CONFIG_MOXA_INTELLIO)+= moxa.o obj-$(CONFIG_MOXA_SMARTIO) += mxser.o obj-$(CONFIG_NOZOMI) += nozomi.o +obj-$(CONFIG_NULL_TTY) += ttynull.o obj-$(CONFIG_ROCKETPORT) += rocket.o obj-$(CONFIG_SYNCLINK_GT) += synclink_gt.o obj-$(CONFIG_PPC_EPAPR_HV_BYTECHAN) += ehv_bytechan.o diff --git a/drivers/tty/ttynull.c b/drivers/tty/ttynull.c index eced70ec54e1..17f05b7eb6d3 100644 --- a/drivers/tty/ttynull.c +++ b/drivers/tty/ttynull.c @@ -2,13 +2,6 @@ /* * Copyright (C) 2019 Axis Communications AB * - * The console is useful for userspace applications which expect a console - * device to work without modifications even when no console is available - * or desired. - * - * In order to use this driver, you should redirect the console to this - * TTY, or boot the kernel with console=ttynull. - * * Based on ttyprintk.c: * Copyright (C) 2010 Samo Pogacnik */ @@ -66,17 +59,6 @@ static struct console ttynull_console = { .device = ttynull_device, }; -void __init register_ttynull_console(void) -{ - if (!ttynull_driver) - return; - - if (add_preferred_console(ttynull_console.name, 0, NULL)) - return; - - register_console(_console); -} - static int __init ttynull_init(void) { struct tty_driver *driver; diff --git a/include/linux/console.h b/include/linux/console.h index dbe
[PATCH 0/1] console: Blank console - userspace regression
The commit 757055ae8dedf5333af17b3b ("init/console: Use ttynull as a fallback when there is no console") caused several regressions. The ttynull console has been selected by default when no console was configured on the command line and ttynull_init() was the first initcall calling register_console(). There were few possible workarounds but I was not happy with them. I worked on a proper solution but it became too complicated to be used at this stage. The console registration code is a sad story. There are many hidden catches. The ordering of registered consoles is not well defined. Any change in the code tends to break some particular system. The good news is that I become more and more familiar with it. I hope that I will be able to clean it up step by step in the future. But it is definitely not a good idea to do any big refactoring in rc phase. Hence I propose to revert the problematic commit as the least risky solution. Petr Mladek (1): Revert "init/console: Use ttynull as a fallback when there is no console" drivers/tty/Kconfig | 14 ++ drivers/tty/Makefile| 3 ++- drivers/tty/ttynull.c | 18 -- include/linux/console.h | 3 --- init/main.c | 10 ++ 5 files changed, 18 insertions(+), 30 deletions(-) -- 2.26.2
Re: ARC no console output (was Re: [PATCH 1/2] init/console: Use ttynull as a fallback when there is no console)
Hi John, On 1/7/21 1:02 AM, John Ogness wrote: Hi Vineet, On 2021-01-06, Vineet Gupta wrote: This breaks ARC booting (no output on console). Could you provide the kernel boot arguments that you use? This series is partly about addressing users that have used boot arguments that are technically incorrect (even if had worked). Seeing the boot arguments of users that are not experiencing problems may help to reveal some of the unusual console usages until now. Kernel command line: earlycon=uart8250,mmio32,0xf0005000,115200n8 console=ttyS0,115200n8 debug print-fatal-signals=1 And we do have serial driver built-in. Thx, -Vineet
Re: [Buildroot] ARC no console output (was Re: [PATCH 1/2] init/console: Use ttynull as a fallback when there is no console)
Vineet, All, On 2021-01-06 15:36 -0800, Vineet Gupta spake thusly: > On 11/11/20 5:54 AM, Petr Mladek wrote: [--SNIP--] > >Make sure that stdin, stdout, stderr, and /dev/console are always > >available by a fallback to the existing ttynull driver. It has > >been implemented for exactly this purpose but it was used only > >when explicitly configured. > > > >Signed-off-by: Petr Mladek > > >--- a/init/main.c > >+++ b/init/main.c > >@@ -1470,8 +1470,14 @@ void __init console_on_rootfs(void) > > struct file *file = filp_open("/dev/console", O_RDWR, 0); > > if (IS_ERR(file)) { > >-pr_err("Warning: unable to open an initial console.\n"); > >-return; > >+pr_err("Warning: unable to open an initial console. Fallback to > >ttynull.\n"); > >+register_ttynull_console(); > >+ > >+file = filp_open("/dev/console", O_RDWR, 0); > >+if (IS_ERR(file)) { > >+pr_err("Warning: Failed to add ttynull console. No > >stdin, stdout, and stderr for the init process!\n"); > >+return; > >+} > > This breaks ARC booting (no output on console). > > Our Buildroot based setup has dynamic /dev where /dev/console doesn't exist > statically and there's a primoridla /init shell script which does following > > /bin/mount -t devtmpfs devtmpfs /dev > exec 0 exec 1>/dev/console > exec 2>/dev/console > exec /sbin/init "$@" I guess you are speaking about the initramfs (cpio) case, right? We've changed that code last August: https://git.buildroot.org/buildroot/commit/fs/cpio/init?id=b9026e83f I.e. if we can't do the redirection, then we don't redirect anything. The change was done for people who explicitly pass an empty console= on their kernel command line. Now, I haven't looked at nulltty yet, and I have (so far) no idea on how it works. Thanks for the hint, I'll have a look. > Buildroot has had this way of handling missing /dev/console since 2011 [1] > and [2]. See also more archaelogy on that topic, referenced in that commit: https://git.buildroot.org/buildroot/commit/fs/cpio/?id=98a6f1fc02e41 > Please advise what needs to be done to unbork boot. This has been present since the 2020.08 release, and has been backported to the maintenance branches: 2020.02.x (LTS) -> f1a83afe2df2a 2020.05.x -> 797f9e40224c9 > Otherwise this seems > like a kernel change which breaks user-space and needs to be backed-out (or > perhaps conditionalize on CONFIG_NULL_TTY. I'm surprised it hasn't been > reported by any other embedded folks I won't speak about whether this is a kernel regression or not, not my call. Regards, Yann E. MORIN. -- .-..--.. | Yann E. MORIN | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: | | +33 662 376 056 | Software Designer | \ / CAMPAIGN | ___ | | +33 561 099 427 `.---: X AGAINST | \e/ There is no | | http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL| v conspiracy. | '--^---^--^'
Re: ARC no console output (was Re: [PATCH 1/2] init/console: Use ttynull as a fallback when there is no console)
Hi John, On 7/1/21 7:02 pm, John Ogness wrote: On 2021-01-06, Vineet Gupta wrote: This breaks ARC booting (no output on console). Could you provide the kernel boot arguments that you use? This series is partly about addressing users that have used boot arguments that are technically incorrect (even if had worked). Seeing the boot arguments of users that are not experiencing problems may help to reveal some of the unusual console usages until now. I can show an example for m68knommu which this change breaks too (with no console output on boot). All the ColdFire dev board targets (arch/m68k/configs/m5*) have a compiled in boot argument which is "root=/dev/mtdblock0". They have no real mechanism to pass boot arguments from their boot loader, so it is compiled in. The default mcf serial driver is the console on these and no "console=ttyS0" argument was required in the past. Regards Greg
Re: ARC no console output (was Re: [PATCH 1/2] init/console: Use ttynull as a fallback when there is no console)
On Wed 2021-01-06 15:36:36, Vineet Gupta wrote: > +CC Buildroot folks > > Hi Petr, > > On 11/11/20 5:54 AM, Petr Mladek wrote: > > stdin, stdout, and stderr standard I/O stream are created for the init > > process. They are not available when there is no console registered > > for /dev/console. It might lead to a crash when the init process > > tries to use them, see the commit 48021f98130880dd742 ("printk: handle > > blank console arguments passed in."). > > > > Normally, ttySX and ttyX consoles are used as a fallback when no consoles > > are defined via the command line, device tree, or SPCR. But there > > will be no console registered when an invalid console name is configured > > or when the configured consoles do not exist on the system. > > > > Users even try to avoid the console intentionally, for example, > > by using console="" or console=null. It is used on production > > systems where the serial port or terminal are not visible to > > users. Pushing messages to these consoles would just unnecessary > > slowdown the system. > > > > Make sure that stdin, stdout, stderr, and /dev/console are always > > available by a fallback to the existing ttynull driver. It has > > been implemented for exactly this purpose but it was used only > > when explicitly configured. > > > > Signed-off-by: Petr Mladek > > > --- a/init/main.c > > +++ b/init/main.c > > @@ -1470,8 +1470,14 @@ void __init console_on_rootfs(void) > > struct file *file = filp_open("/dev/console", O_RDWR, 0); > > if (IS_ERR(file)) { > > - pr_err("Warning: unable to open an initial console.\n"); > > - return; > > + pr_err("Warning: unable to open an initial console. Fallback to > > ttynull.\n"); > > + register_ttynull_console(); > > + > > + file = filp_open("/dev/console", O_RDWR, 0); > > + if (IS_ERR(file)) { > > + pr_err("Warning: Failed to add ttynull console. No > > stdin, stdout, and stderr for the init process!\n"); > > + return; > > + } > > > This breaks ARC booting (no output on console). This is likely the same problem as with kunit and um kernels. It is being discussed at https://lore.kernel.org/linux-kselftest/X%2FSRA1P8t+ONZFKb@alley/#t We have several workarounds. I am still squashing my head about the right solution. The console registration code is like a vasps' nest. It is always a pain when we touch it. I hope that I will send a patchset for review later today. In the worst case, we will revert the patch in the mainline. > Our Buildroot based setup has dynamic /dev where /dev/console doesn't exist > statically and there's a primoridla /init shell script which does following > > /bin/mount -t devtmpfs devtmpfs /dev > exec 0 exec 1>/dev/console > exec 2>/dev/console > exec /sbin/init "$@" > > Buildroot has had this way of handling missing /dev/console since 2011 [1] > and [2]. Good to know. > Please advise what needs to be done to unbork boot. Otherwise this seems > like a kernel change which breaks user-space and needs to be backed-out (or > perhaps conditionalize on CONFIG_NULL_TTY. I'm surprised it hasn't been > reported by any other embedded folks Two workarounds can be fount at https://lore.kernel.org/linux-kselftest/X%2FSYhBZyudfnKY1u@alley/ https://lore.kernel.org/linux-kselftest/X%2FW2sl7RMvfaV4Ru@alley/ But I still see them as only a partial solutiuon. I still another sources of potential problems. Best Regards, Petr
Re: ARC no console output (was Re: [PATCH 1/2] init/console: Use ttynull as a fallback when there is no console)
[CC'in linux-um since there is a similar issue] On Thu, Jan 7, 2021 at 12:38 AM Vineet Gupta wrote: > > +CC Buildroot folks > > Hi Petr, > > On 11/11/20 5:54 AM, Petr Mladek wrote: > > stdin, stdout, and stderr standard I/O stream are created for the init > > process. They are not available when there is no console registered > > for /dev/console. It might lead to a crash when the init process > > tries to use them, see the commit 48021f98130880dd742 ("printk: handle > > blank console arguments passed in."). > > > > Normally, ttySX and ttyX consoles are used as a fallback when no consoles > > are defined via the command line, device tree, or SPCR. But there > > will be no console registered when an invalid console name is configured > > or when the configured consoles do not exist on the system. > > > > Users even try to avoid the console intentionally, for example, > > by using console="" or console=null. It is used on production > > systems where the serial port or terminal are not visible to > > users. Pushing messages to these consoles would just unnecessary > > slowdown the system. > > > > Make sure that stdin, stdout, stderr, and /dev/console are always > > available by a fallback to the existing ttynull driver. It has > > been implemented for exactly this purpose but it was used only > > when explicitly configured. > > > > Signed-off-by: Petr Mladek > > > --- a/init/main.c > > +++ b/init/main.c > > @@ -1470,8 +1470,14 @@ void __init console_on_rootfs(void) > > struct file *file = filp_open("/dev/console", O_RDWR, 0); > > > > if (IS_ERR(file)) { > > - pr_err("Warning: unable to open an initial console.\n"); > > - return; > > + pr_err("Warning: unable to open an initial console. Fallback > > to ttynull.\n"); > > + register_ttynull_console(); > > + > > + file = filp_open("/dev/console", O_RDWR, 0); > > + if (IS_ERR(file)) { > > + pr_err("Warning: Failed to add ttynull console. No > > stdin, stdout, and stderr for the init process!\n"); > > + return; > > + } > > > This breaks ARC booting (no output on console). > > Our Buildroot based setup has dynamic /dev where /dev/console doesn't > exist statically and there's a primoridla /init shell script which does > following > > /bin/mount -t devtmpfs devtmpfs /dev > exec 0 exec 1>/dev/console > exec 2>/dev/console > exec /sbin/init "$@" > > Buildroot has had this way of handling missing /dev/console since 2011 > [1] and [2]. > > Please advise what needs to be done to unbork boot. Otherwise this seems > like a kernel change which breaks user-space and needs to be backed-out > (or perhaps conditionalize on CONFIG_NULL_TTY. I'm surprised it hasn't > been reported by any other embedded folks > > Thx, > -Vineet > > [1] http://lists.busybox.net/pipermail/buildroot/2011-July/044505.html > [2] http://lists.busybox.net/pipermail/buildroot/2011-August/044832.html -- Thanks, //richard
Re: ARC no console output (was Re: [PATCH 1/2] init/console: Use ttynull as a fallback when there is no console)
Hi Vineet, On 2021-01-06, Vineet Gupta wrote: > This breaks ARC booting (no output on console). Could you provide the kernel boot arguments that you use? This series is partly about addressing users that have used boot arguments that are technically incorrect (even if had worked). Seeing the boot arguments of users that are not experiencing problems may help to reveal some of the unusual console usages until now. John Ogness
ARC no console output (was Re: [PATCH 1/2] init/console: Use ttynull as a fallback when there is no console)
+CC Buildroot folks Hi Petr, On 11/11/20 5:54 AM, Petr Mladek wrote: stdin, stdout, and stderr standard I/O stream are created for the init process. They are not available when there is no console registered for /dev/console. It might lead to a crash when the init process tries to use them, see the commit 48021f98130880dd742 ("printk: handle blank console arguments passed in."). Normally, ttySX and ttyX consoles are used as a fallback when no consoles are defined via the command line, device tree, or SPCR. But there will be no console registered when an invalid console name is configured or when the configured consoles do not exist on the system. Users even try to avoid the console intentionally, for example, by using console="" or console=null. It is used on production systems where the serial port or terminal are not visible to users. Pushing messages to these consoles would just unnecessary slowdown the system. Make sure that stdin, stdout, stderr, and /dev/console are always available by a fallback to the existing ttynull driver. It has been implemented for exactly this purpose but it was used only when explicitly configured. Signed-off-by: Petr Mladek --- a/init/main.c +++ b/init/main.c @@ -1470,8 +1470,14 @@ void __init console_on_rootfs(void) struct file *file = filp_open("/dev/console", O_RDWR, 0); if (IS_ERR(file)) { - pr_err("Warning: unable to open an initial console.\n"); - return; + pr_err("Warning: unable to open an initial console. Fallback to ttynull.\n"); + register_ttynull_console(); + + file = filp_open("/dev/console", O_RDWR, 0); + if (IS_ERR(file)) { + pr_err("Warning: Failed to add ttynull console. No stdin, stdout, and stderr for the init process!\n"); + return; + } This breaks ARC booting (no output on console). Our Buildroot based setup has dynamic /dev where /dev/console doesn't exist statically and there's a primoridla /init shell script which does following /bin/mount -t devtmpfs devtmpfs /dev exec 0/dev/console exec 2>/dev/console exec /sbin/init "$@" Buildroot has had this way of handling missing /dev/console since 2011 [1] and [2]. Please advise what needs to be done to unbork boot. Otherwise this seems like a kernel change which breaks user-space and needs to be backed-out (or perhaps conditionalize on CONFIG_NULL_TTY. I'm surprised it hasn't been reported by any other embedded folks Thx, -Vineet [1] http://lists.busybox.net/pipermail/buildroot/2011-July/044505.html [2] http://lists.busybox.net/pipermail/buildroot/2011-August/044832.html
[PATCH tip/core/rcu 18/20] torture: Remove "Failed to add ttynull console" false positive
From: "Paul E. McKenney" Commit 757055ae8ded ("init/console: Use ttynull as a fallback when there is no console") results in the string "Warning: Failed to add ttynull console. No stdin, stdout, and stderr for the init process!" appearing on the console, which the rcutorture scripting interprets as a warning, which causes every rcutorture run to be flagged. However, the rcutorture init process never attempts to do any I/O, and thus does not care that it has no stdin, stdout, or stderr. This commit therefore causes the rcutorture scripting to ignore this message. Signed-off-by: Paul E. McKenney --- tools/testing/selftests/rcutorture/bin/console-badness.sh | 1 + tools/testing/selftests/rcutorture/bin/parse-console.sh | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/rcutorture/bin/console-badness.sh b/tools/testing/selftests/rcutorture/bin/console-badness.sh index 80ae7f0..e6a132d 100755 --- a/tools/testing/selftests/rcutorture/bin/console-badness.sh +++ b/tools/testing/selftests/rcutorture/bin/console-badness.sh @@ -14,4 +14,5 @@ egrep 'Badness|WARNING:|Warn|BUG|===|Call Trace:|Oops:|detected stalls o grep -v 'ODEBUG: ' | grep -v 'This means that this is a DEBUG kernel and it is' | grep -v 'Warning: unable to open an initial console' | +grep -v 'Warning: Failed to add ttynull console. No stdin, stdout, and stderr.*the init process!' | grep -v 'NOHZ tick-stop error: Non-RCU local softirq work is pending, handler' diff --git a/tools/testing/selftests/rcutorture/bin/parse-console.sh b/tools/testing/selftests/rcutorture/bin/parse-console.sh index 263b1be..9f624bd 100755 --- a/tools/testing/selftests/rcutorture/bin/parse-console.sh +++ b/tools/testing/selftests/rcutorture/bin/parse-console.sh @@ -128,7 +128,7 @@ then then summary="$summary Badness: $n_badness" fi - n_warn=`grep -v 'Warning: unable to open an initial console' $file | egrep -c 'WARNING:|Warn'` + n_warn=`grep -v 'Warning: unable to open an initial console' $file | grep -v 'Warning: Failed to add ttynull console. No stdin, stdout, and stderr for the init process' | egrep -c 'WARNING:|Warn'` if test "$n_warn" -ne 0 then summary="$summary Warnings: $n_warn" -- 2.9.5
Re: [PATCH v1] panic: push panic() messages to the console even from the MCE nmi handler
On (21/01/04 16:15), “William Roche wrote: [..] > diff --git a/kernel/panic.c b/kernel/panic.c > index 332736a..eb90cc0 100644 > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -166,6 +166,15 @@ static void panic_print_sys_info(void) > ftrace_dump(DUMP_ALL); > } > > +/* > + * Force flush messages to the console. > + */ > +static void panic_flush_to_console(void) > +{ > + printk_safe_flush_on_panic(); > + console_flush_on_panic(CONSOLE_FLUSH_PENDING); > +} You must debug_locks_off() as the very first step here. But see below. > /** > * panic - halt the system > * @fmt: The text string to print > @@ -247,7 +256,7 @@ void panic(const char *fmt, ...) >* Bypass the panic_cpu check and call __crash_kexec directly. >*/ > if (!_crash_kexec_post_notifiers) { > - printk_safe_flush_on_panic(); > + panic_flush_to_console(); > __crash_kexec(NULL); It's dangerous to call console_flush_on_panic() before we stop secondary CPUs. console_flush_on_panic() ignores the state console_sem, so if any of the secondary is currently printing something on the consoles you'll get corrupted messages - we use `static char buffer` for messages we push to consoles. Another issue is that with this panic_flush_to_console() call console_sem can end up being locked once (by secondary CPU) and unlocked twice (by second and panic CPUs) [*] > /* > @@ -271,9 +280,8 @@ void panic(const char *fmt, ...) >*/ > atomic_notifier_call_chain(_notifier_list, 0, buf); > > - /* Call flush even twice. It tries harder with a single online CPU */ > - printk_safe_flush_on_panic(); > kmsg_dump(KMSG_DUMP_PANIC); > + panic_flush_to_console(); Why? > /* >* If you doubt kdump always works fine in any situation, > @@ -298,7 +306,7 @@ void panic(const char *fmt, ...) >* buffer. Try to acquire the lock then release it regardless of the >* result. The release will also print the buffers out. Locks debug >* should be disabled to avoid reporting bad unlock balance when > - * panic() is not being callled from OOPS. > + * panic() is not being called from OOPS. >*/ > debug_locks_off(); > console_flush_on_panic(CONSOLE_FLUSH_PENDING); > @@ -314,6 +322,7 @@ void panic(const char *fmt, ...) >* We can't use the "normal" timers since we just panicked. >*/ > pr_emerg("Rebooting in %d seconds..\n", panic_timeout); > + panic_flush_to_console(); [*] So this > for (i = 0; i < panic_timeout * 1000; i += PANIC_TIMER_STEP) { > touch_nmi_watchdog(); > @@ -347,6 +356,7 @@ void panic(const char *fmt, ...) > disabled_wait(); > #endif > pr_emerg("---[ end Kernel panic - not syncing: %s ]---\n", buf); > + panic_flush_to_console(); [*] and this are both very interesting points. Those extra console_flush_on_panic() calls indicate that normal printk() cannot succeed in locking the console_sem so it doesn't try to console_unlock(): either because we killed the secondary CPU while it was holding the lock, or because we locked it once and unlocked it twice. I think it would make sense to just re-init console_sem, so that normal printk()-s will have chance to grab the console_sem lock and then we don't need any extra panic_flush_to_console() calls. Maybe we can do something like this --- diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index ffdd0dc7ec6d..4bd2e29c8cc0 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2638,6 +2638,7 @@ void console_flush_on_panic(enum con_flush_mode mode) * context and we don't want to get preempted while flushing, * ensure may_schedule is cleared. */ + sema_init(_sem, 1); console_trylock(); console_may_schedule = 0;
Re: [PATCH] kunit: tool: Force the use of the 'tty' console for UML
On Wed, Jan 6, 2021 at 3:52 AM Shuah Khan wrote: > > On 1/5/21 11:57 AM, Andy Shevchenko wrote: > > On Tue, Jan 05, 2021 at 09:34:33AM -0700, Shuah Khan wrote: > >> On 1/5/21 9:21 AM, Petr Mladek wrote: > >>> On Mon 2021-01-04 09:23:57, Shuah Khan wrote: > >>>> On 12/22/20 4:11 AM, Andy Shevchenko wrote: > >>>>> On Mon, Dec 21, 2020 at 11:39:00PM -0800, David Gow wrote: > >>>>>> kunit_tool relies on the UML console outputting printk() output to the > >>>>>> tty in order to get results. Since the default console driver could > >>>>>> change, pass 'console=tty' to the kernel. > >>>>>> > >>>>>> This is triggered by a change[1] to use ttynull as a fallback console > >>>>>> driver which -- by chance or by design -- seems to have changed the > >>>>>> default console output on UML, breaking kunit_tool. While this may be > >>>>>> fixed, we should be less fragile to such changes in the default. > >>>>>> > >>>>>> [1]: > >>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=757055ae8dedf5333af17b3b5b4b70ba9bc9da4e > >>>>> > >>>>> Reported-by: Andy Shevchenko > >>>>> Tested-by: Andy Shevchenko > >>>>> > >>>> > >>>> Thank you all. Now in linux-kselftest kunit-fixes branch. > >>>> > >>>> Will send this up for rc3. > >>>> > >>>> Sorry for the delay - have been away from the keyboard for a > >>>> bit. > >>> > >>> JFYI, I am not sure that this is the right solution. I am > >>> looking into it, see > >>> https://lore.kernel.org/linux-kselftest/X%2FSRA1P8t+ONZFKb@alley/ > >>> for more details. > >>> > >> > >> Thanks Petr. I will hold off on sending the patch up to Linus and > >> let you find a the right solution. > > > > Please. leave it in Linux Next at least. Otherwise kunit will be broken for > > a > > long time which is not good. > > > > > > Yes. That is the plan. It will be in there until real fix comes in. > Thanks, Shuah. Personally, I think that this patch makes some sense to keep even if the underlying issue with ttynull is resolved. Given that kunit.py requires the console output, explicitly stating we want console=tty set is probably worth doing rather than relying on it being the default. That being said, I definitely agree that this patch doesn't fix the underlying issue with UML/ttynull: it just makes the kunit.py script less sensitive to such changes (which, while unlikely, could potentially occur legitimately down the track). Cheers, -- David
Re: [PATCH] kunit: tool: Force the use of the 'tty' console for UML
On 1/5/21 11:57 AM, Andy Shevchenko wrote: On Tue, Jan 05, 2021 at 09:34:33AM -0700, Shuah Khan wrote: On 1/5/21 9:21 AM, Petr Mladek wrote: On Mon 2021-01-04 09:23:57, Shuah Khan wrote: On 12/22/20 4:11 AM, Andy Shevchenko wrote: On Mon, Dec 21, 2020 at 11:39:00PM -0800, David Gow wrote: kunit_tool relies on the UML console outputting printk() output to the tty in order to get results. Since the default console driver could change, pass 'console=tty' to the kernel. This is triggered by a change[1] to use ttynull as a fallback console driver which -- by chance or by design -- seems to have changed the default console output on UML, breaking kunit_tool. While this may be fixed, we should be less fragile to such changes in the default. [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=757055ae8dedf5333af17b3b5b4b70ba9bc9da4e Reported-by: Andy Shevchenko Tested-by: Andy Shevchenko Thank you all. Now in linux-kselftest kunit-fixes branch. Will send this up for rc3. Sorry for the delay - have been away from the keyboard for a bit. JFYI, I am not sure that this is the right solution. I am looking into it, see https://lore.kernel.org/linux-kselftest/X%2FSRA1P8t+ONZFKb@alley/ for more details. Thanks Petr. I will hold off on sending the patch up to Linus and let you find a the right solution. Please. leave it in Linux Next at least. Otherwise kunit will be broken for a long time which is not good. Yes. That is the plan. It will be in there until real fix comes in. thanks, -- Shuah
Re: [PATCH] kunit: tool: Force the use of the 'tty' console for UML
On Tue, Jan 05, 2021 at 09:34:33AM -0700, Shuah Khan wrote: > On 1/5/21 9:21 AM, Petr Mladek wrote: > > On Mon 2021-01-04 09:23:57, Shuah Khan wrote: > > > On 12/22/20 4:11 AM, Andy Shevchenko wrote: > > > > On Mon, Dec 21, 2020 at 11:39:00PM -0800, David Gow wrote: > > > > > kunit_tool relies on the UML console outputting printk() output to the > > > > > tty in order to get results. Since the default console driver could > > > > > change, pass 'console=tty' to the kernel. > > > > > > > > > > This is triggered by a change[1] to use ttynull as a fallback console > > > > > driver which -- by chance or by design -- seems to have changed the > > > > > default console output on UML, breaking kunit_tool. While this may be > > > > > fixed, we should be less fragile to such changes in the default. > > > > > > > > > > [1]: > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=757055ae8dedf5333af17b3b5b4b70ba9bc9da4e > > > > > > > > Reported-by: Andy Shevchenko > > > > Tested-by: Andy Shevchenko > > > > > > > > > > Thank you all. Now in linux-kselftest kunit-fixes branch. > > > > > > Will send this up for rc3. > > > > > > Sorry for the delay - have been away from the keyboard for a > > > bit. > > > > JFYI, I am not sure that this is the right solution. I am > > looking into it, see > > https://lore.kernel.org/linux-kselftest/X%2FSRA1P8t+ONZFKb@alley/ > > for more details. > > > > Thanks Petr. I will hold off on sending the patch up to Linus and > let you find a the right solution. Please. leave it in Linux Next at least. Otherwise kunit will be broken for a long time which is not good. -- With Best Regards, Andy Shevchenko
Re: [PATCH] kunit: tool: Force the use of the 'tty' console for UML
On 1/5/21 9:21 AM, Petr Mladek wrote: On Mon 2021-01-04 09:23:57, Shuah Khan wrote: On 12/22/20 4:11 AM, Andy Shevchenko wrote: On Mon, Dec 21, 2020 at 11:39:00PM -0800, David Gow wrote: kunit_tool relies on the UML console outputting printk() output to the tty in order to get results. Since the default console driver could change, pass 'console=tty' to the kernel. This is triggered by a change[1] to use ttynull as a fallback console driver which -- by chance or by design -- seems to have changed the default console output on UML, breaking kunit_tool. While this may be fixed, we should be less fragile to such changes in the default. [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=757055ae8dedf5333af17b3b5b4b70ba9bc9da4e Reported-by: Andy Shevchenko Tested-by: Andy Shevchenko Thank you all. Now in linux-kselftest kunit-fixes branch. Will send this up for rc3. Sorry for the delay - have been away from the keyboard for a bit. JFYI, I am not sure that this is the right solution. I am looking into it, see https://lore.kernel.org/linux-kselftest/X%2FSRA1P8t+ONZFKb@alley/ for more details. Thanks Petr. I will hold off on sending the patch up to Linus and let you find a the right solution. thanks, -- Shuah
Re: [PATCH] kunit: tool: Force the use of the 'tty' console for UML
On Mon 2021-01-04 09:23:57, Shuah Khan wrote: > On 12/22/20 4:11 AM, Andy Shevchenko wrote: > > On Mon, Dec 21, 2020 at 11:39:00PM -0800, David Gow wrote: > > > kunit_tool relies on the UML console outputting printk() output to the > > > tty in order to get results. Since the default console driver could > > > change, pass 'console=tty' to the kernel. > > > > > > This is triggered by a change[1] to use ttynull as a fallback console > > > driver which -- by chance or by design -- seems to have changed the > > > default console output on UML, breaking kunit_tool. While this may be > > > fixed, we should be less fragile to such changes in the default. > > > > > > [1]: > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=757055ae8dedf5333af17b3b5b4b70ba9bc9da4e > > > > Reported-by: Andy Shevchenko > > Tested-by: Andy Shevchenko > > > > Thank you all. Now in linux-kselftest kunit-fixes branch. > > Will send this up for rc3. > > Sorry for the delay - have been away from the keyboard for a > bit. JFYI, I am not sure that this is the right solution. I am looking into it, see https://lore.kernel.org/linux-kselftest/X%2FSRA1P8t+ONZFKb@alley/ for more details. Best Regards, Petr
[PATCH v1] panic: push panic() messages to the console even from the MCE nmi handler
From: William Roche Force push panic messages to the console as panic() can be called from NMI interrupt handler functions where printed messages can't always reach the console without an explicit push provided by printk_safe_flush_on_panic() and console_flush_on_panic(). This is the case with the MCE handler that can lead to a system panic giving information on the fatal MCE root cause that must reach the console. Signed-off-by: William Roche --- Notes: While testing MCE injection and kernel reaction, we discovered a bug in the way the kernel provides the panic reason information: When dealing with fatal MCE, the machine (physical or virtual) can reboot without leaving any message on the console. This behavior can be reproduced on Intel with the mce-inject tool with a simple: # modprobe mce-inject # mce-inject test/uncorrected The investigations showed that the MCE panic can be totally message-less or can give a small set of messages. This behavior depends on the use of the crash_kexec mechanism (using the "crashkernel" parameter). Not using this parameter, we get a partial [Hardware Error] information on panic, but some important notifications can be missing. And when using it, a fatal MCE can panic the system without leaving any information. . Without "crashkernel", a Fatal MCE injection shows: [ 212.153928] mce: Machine check injector initialized [ 236.730682] mce: Triggering MCE exception on CPU 0 [ 236.731304] Disabling lock debugging due to kernel taint [ 236.731947] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 1: b000 [ 236.731948] mce: [Hardware Error]: TSC 78418fb4a83f [ 236.731949] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1605312952 SOCKET 0 APIC 0 microcode 1 [ 236.731949] mce: [Hardware Error]: Run the above through 'mcelog --ascii' [ 236.731950] mce: [Hardware Error]: Machine check: MCIP not set in MCA handler [ 236.731950] Kernel panic - not syncing: Fatal machine check [ 236.732047] Kernel Offset: disabled The system hangs 30 seconds without any additional message, and finally reboots. . With the use of "crashkernel", a Fatal MCE injection shows only the injection message [ 80.811708] mce: Machine check injector initialized [ 92.298755] mce: Triggering MCE exception on CPU 0 [ 92.299362] Disabling lock debugging due to kernel taint No other messages is displayed and the system reboots immediately. The reason behind this defective behavior is the fact that do_machine_check() MCE handler can call mce_panic() to finish in panic() trying to print out messages on the console. So we end up trying to print out messages directly from this interrupt handler, which can require the help of additional function calls like printk_safe_flush_on_panic() and console_flush_on_panic(). With the suggested fix here, the above output turns into the expected: . Without crashkernel: [ 92.025010] mce: Machine check injector initialized [ 113.006798] mce: Triggering MCE exception on CPU 0 [ 113.007391] Disabling lock debugging due to kernel taint [ 113.008061] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 1: b000 [ 113.008062] mce: [Hardware Error]: TSC 2ec380a0f187 [ 113.008063] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1606223541 SOCKET 0 APIC 0 microcode 1 [ 113.008063] mce: [Hardware Error]: Run the above through 'mcelog --ascii' [ 113.008064] mce: [Hardware Error]: Machine check: MCIP not set in MCA handler [ 113.008064] Kernel panic - not syncing: Fatal machine check [ 113.013010] Kernel Offset: disabled [ 113.013426] Rebooting in 30 seconds.. Note that we now have the important information about the Reboot in 30 seconds. . With crashkernel: [ 75.770143] mce: Machine check injector initialized [ 100.923650] mce: Triggering MCE exception on CPU 0 [ 100.924415] Disabling lock debugging due to kernel taint [ 100.925094] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 1: b000 [ 100.925095] mce: [Hardware Error]: TSC 2f3aa07b03de [ 100.925096] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1606223692 SOCKET 0 APIC 0 microcode 1 [ 100.925097] mce: [Hardware Error]: Run the above through 'mcelog --ascii' [ 100.925098] mce: [Hardware Error]: Machine check: MCIP not set in MCA handler [ 100.925100] Kernel panic - not syncing: Fatal machine check Here the reboot is immediate but the MCE error message reaches the console. Could you please review this fix ? Thanks in advance for any feedback you could have. William. kernel/panic.c | 18 ++ 1 file changed, 14 insertions(+), 4 del
Re: [PATCH] kunit: tool: Force the use of the 'tty' console for UML
On 12/22/20 4:11 AM, Andy Shevchenko wrote: On Mon, Dec 21, 2020 at 11:39:00PM -0800, David Gow wrote: kunit_tool relies on the UML console outputting printk() output to the tty in order to get results. Since the default console driver could change, pass 'console=tty' to the kernel. This is triggered by a change[1] to use ttynull as a fallback console driver which -- by chance or by design -- seems to have changed the default console output on UML, breaking kunit_tool. While this may be fixed, we should be less fragile to such changes in the default. [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=757055ae8dedf5333af17b3b5b4b70ba9bc9da4e Reported-by: Andy Shevchenko Tested-by: Andy Shevchenko Thank you all. Now in linux-kselftest kunit-fixes branch. Will send this up for rc3. Sorry for the delay - have been away from the keyboard for a bit. thanks, -- Shuah
[PATCH 4.4 018/132] USB: sisusbvga: Make console support depend on BROKEN
From: Thomas Gleixner commit 862ee699fefe1e6d6f2c1518395f0b999b8beb15 upstream. The console part of sisusbvga is broken vs. printk(). It uses in_atomic() to detect contexts in which it cannot sleep despite the big fat comment in preempt.h which says: Do not use in_atomic() in driver code. in_atomic() does not work on kernels with CONFIG_PREEMPT_COUNT=n which means that spin/rw_lock held regions are not detected by it. There is no way to make this work by handing context information through to the driver and this only can be solved once the core printk infrastructure supports sleepable console drivers. Make it depend on BROKEN for now. Fixes: 1bbb4f2035d9 ("[PATCH] USB: sisusb[vga] update") Signed-off-by: Thomas Gleixner Cc: Thomas Winischhofer Cc: Greg Kroah-Hartman Cc: linux-...@vger.kernel.org Cc: sta...@vger.kernel.org Link: https://lore.kernel.org/r/20201019101109.603244...@linutronix.de Signed-off-by: Greg Kroah-Hartman --- drivers/usb/misc/sisusbvga/Kconfig |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/usb/misc/sisusbvga/Kconfig +++ b/drivers/usb/misc/sisusbvga/Kconfig @@ -15,7 +15,7 @@ config USB_SISUSBVGA config USB_SISUSBVGA_CON bool "Text console and mode switching support" if USB_SISUSBVGA - depends on VT + depends on VT && BROKEN select FONT_8x16 ---help--- Say Y here if you want a VGA text console via the USB dongle or
[PATCH 4.14 031/242] USB: sisusbvga: Make console support depend on BROKEN
From: Thomas Gleixner commit 862ee699fefe1e6d6f2c1518395f0b999b8beb15 upstream. The console part of sisusbvga is broken vs. printk(). It uses in_atomic() to detect contexts in which it cannot sleep despite the big fat comment in preempt.h which says: Do not use in_atomic() in driver code. in_atomic() does not work on kernels with CONFIG_PREEMPT_COUNT=n which means that spin/rw_lock held regions are not detected by it. There is no way to make this work by handing context information through to the driver and this only can be solved once the core printk infrastructure supports sleepable console drivers. Make it depend on BROKEN for now. Fixes: 1bbb4f2035d9 ("[PATCH] USB: sisusb[vga] update") Signed-off-by: Thomas Gleixner Cc: Thomas Winischhofer Cc: Greg Kroah-Hartman Cc: linux-...@vger.kernel.org Cc: sta...@vger.kernel.org Link: https://lore.kernel.org/r/20201019101109.603244...@linutronix.de Signed-off-by: Greg Kroah-Hartman --- drivers/usb/misc/sisusbvga/Kconfig |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/usb/misc/sisusbvga/Kconfig +++ b/drivers/usb/misc/sisusbvga/Kconfig @@ -15,7 +15,7 @@ config USB_SISUSBVGA config USB_SISUSBVGA_CON bool "Text console and mode switching support" if USB_SISUSBVGA - depends on VT + depends on VT && BROKEN select FONT_8x16 ---help--- Say Y here if you want a VGA text console via the USB dongle or
[PATCH 4.19 045/346] USB: sisusbvga: Make console support depend on BROKEN
From: Thomas Gleixner commit 862ee699fefe1e6d6f2c1518395f0b999b8beb15 upstream. The console part of sisusbvga is broken vs. printk(). It uses in_atomic() to detect contexts in which it cannot sleep despite the big fat comment in preempt.h which says: Do not use in_atomic() in driver code. in_atomic() does not work on kernels with CONFIG_PREEMPT_COUNT=n which means that spin/rw_lock held regions are not detected by it. There is no way to make this work by handing context information through to the driver and this only can be solved once the core printk infrastructure supports sleepable console drivers. Make it depend on BROKEN for now. Fixes: 1bbb4f2035d9 ("[PATCH] USB: sisusb[vga] update") Signed-off-by: Thomas Gleixner Cc: Thomas Winischhofer Cc: Greg Kroah-Hartman Cc: linux-...@vger.kernel.org Cc: sta...@vger.kernel.org Link: https://lore.kernel.org/r/20201019101109.603244...@linutronix.de Signed-off-by: Greg Kroah-Hartman --- drivers/usb/misc/sisusbvga/Kconfig |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/usb/misc/sisusbvga/Kconfig +++ b/drivers/usb/misc/sisusbvga/Kconfig @@ -15,7 +15,7 @@ config USB_SISUSBVGA config USB_SISUSBVGA_CON bool "Text console and mode switching support" if USB_SISUSBVGA - depends on VT + depends on VT && BROKEN select FONT_8x16 ---help--- Say Y here if you want a VGA text console via the USB dongle or
[PATCH 4.9 023/175] USB: sisusbvga: Make console support depend on BROKEN
From: Thomas Gleixner commit 862ee699fefe1e6d6f2c1518395f0b999b8beb15 upstream. The console part of sisusbvga is broken vs. printk(). It uses in_atomic() to detect contexts in which it cannot sleep despite the big fat comment in preempt.h which says: Do not use in_atomic() in driver code. in_atomic() does not work on kernels with CONFIG_PREEMPT_COUNT=n which means that spin/rw_lock held regions are not detected by it. There is no way to make this work by handing context information through to the driver and this only can be solved once the core printk infrastructure supports sleepable console drivers. Make it depend on BROKEN for now. Fixes: 1bbb4f2035d9 ("[PATCH] USB: sisusb[vga] update") Signed-off-by: Thomas Gleixner Cc: Thomas Winischhofer Cc: Greg Kroah-Hartman Cc: linux-...@vger.kernel.org Cc: sta...@vger.kernel.org Link: https://lore.kernel.org/r/20201019101109.603244...@linutronix.de Signed-off-by: Greg Kroah-Hartman --- drivers/usb/misc/sisusbvga/Kconfig |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/usb/misc/sisusbvga/Kconfig +++ b/drivers/usb/misc/sisusbvga/Kconfig @@ -15,7 +15,7 @@ config USB_SISUSBVGA config USB_SISUSBVGA_CON bool "Text console and mode switching support" if USB_SISUSBVGA - depends on VT + depends on VT && BROKEN select FONT_8x16 ---help--- Say Y here if you want a VGA text console via the USB dongle or
Re: [PATCH] kunit: tool: Force the use of the 'tty' console for UML
On Mon, Dec 21, 2020 at 11:39 PM David Gow wrote: > > kunit_tool relies on the UML console outputting printk() output to the > tty in order to get results. Since the default console driver could > change, pass 'console=tty' to the kernel. > > This is triggered by a change[1] to use ttynull as a fallback console > driver which -- by chance or by design -- seems to have changed the > default console output on UML, breaking kunit_tool. While this may be > fixed, we should be less fragile to such changes in the default. > > [1]: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=757055ae8dedf5333af17b3b5b4b70ba9bc9da4e > > Signed-off-by: David Gow > Fixes: 757055ae8ded ("init/console: Use ttynull as a fallback when there is > no console") Acked-by: Brendan Higgins Thanks for taking care of this!
Re: [PATCH] kunit: tool: Force the use of the 'tty' console for UML
On Mon, Dec 21, 2020 at 11:39:00PM -0800, David Gow wrote: > kunit_tool relies on the UML console outputting printk() output to the > tty in order to get results. Since the default console driver could > change, pass 'console=tty' to the kernel. > > This is triggered by a change[1] to use ttynull as a fallback console > driver which -- by chance or by design -- seems to have changed the > default console output on UML, breaking kunit_tool. While this may be > fixed, we should be less fragile to such changes in the default. > > [1]: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=757055ae8dedf5333af17b3b5b4b70ba9bc9da4e Reported-by: Andy Shevchenko Tested-by: Andy Shevchenko Thanks! > Signed-off-by: David Gow > Fixes: 757055ae8ded ("init/console: Use ttynull as a fallback when there is > no console") > --- > tools/testing/kunit/kunit_kernel.py | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/tools/testing/kunit/kunit_kernel.py > b/tools/testing/kunit/kunit_kernel.py > index 57c1724b7e5d..698358c9c0d6 100644 > --- a/tools/testing/kunit/kunit_kernel.py > +++ b/tools/testing/kunit/kunit_kernel.py > @@ -198,7 +198,7 @@ class LinuxSourceTree(object): > return self.validate_config(build_dir) > > def run_kernel(self, args=[], build_dir='', timeout=None): > - args.extend(['mem=1G']) > + args.extend(['mem=1G', 'console=tty']) > self._ops.linux_bin(args, timeout, build_dir) > outfile = get_outfile_path(build_dir) > subprocess.call(['stty', 'sane']) > -- > 2.29.2.729.g45daf8777d-goog > -- With Best Regards, Andy Shevchenko
[PATCH] kunit: tool: Force the use of the 'tty' console for UML
kunit_tool relies on the UML console outputting printk() output to the tty in order to get results. Since the default console driver could change, pass 'console=tty' to the kernel. This is triggered by a change[1] to use ttynull as a fallback console driver which -- by chance or by design -- seems to have changed the default console output on UML, breaking kunit_tool. While this may be fixed, we should be less fragile to such changes in the default. [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=757055ae8dedf5333af17b3b5b4b70ba9bc9da4e Signed-off-by: David Gow Fixes: 757055ae8ded ("init/console: Use ttynull as a fallback when there is no console") --- tools/testing/kunit/kunit_kernel.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py index 57c1724b7e5d..698358c9c0d6 100644 --- a/tools/testing/kunit/kunit_kernel.py +++ b/tools/testing/kunit/kunit_kernel.py @@ -198,7 +198,7 @@ class LinuxSourceTree(object): return self.validate_config(build_dir) def run_kernel(self, args=[], build_dir='', timeout=None): - args.extend(['mem=1G']) + args.extend(['mem=1G', 'console=tty']) self._ops.linux_bin(args, timeout, build_dir) outfile = get_outfile_path(build_dir) subprocess.call(['stty', 'sane']) -- 2.29.2.729.g45daf8777d-goog
[PATCH 5.4 27/34] USB: sisusbvga: Make console support depend on BROKEN
From: Thomas Gleixner commit 862ee699fefe1e6d6f2c1518395f0b999b8beb15 upstream. The console part of sisusbvga is broken vs. printk(). It uses in_atomic() to detect contexts in which it cannot sleep despite the big fat comment in preempt.h which says: Do not use in_atomic() in driver code. in_atomic() does not work on kernels with CONFIG_PREEMPT_COUNT=n which means that spin/rw_lock held regions are not detected by it. There is no way to make this work by handing context information through to the driver and this only can be solved once the core printk infrastructure supports sleepable console drivers. Make it depend on BROKEN for now. Fixes: 1bbb4f2035d9 ("[PATCH] USB: sisusb[vga] update") Signed-off-by: Thomas Gleixner Cc: Thomas Winischhofer Cc: Greg Kroah-Hartman Cc: linux-...@vger.kernel.org Cc: sta...@vger.kernel.org Link: https://lore.kernel.org/r/20201019101109.603244...@linutronix.de Signed-off-by: Greg Kroah-Hartman --- drivers/usb/misc/sisusbvga/Kconfig |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/usb/misc/sisusbvga/Kconfig +++ b/drivers/usb/misc/sisusbvga/Kconfig @@ -16,7 +16,7 @@ config USB_SISUSBVGA config USB_SISUSBVGA_CON bool "Text console and mode switching support" if USB_SISUSBVGA - depends on VT + depends on VT && BROKEN select FONT_8x16 ---help--- Say Y here if you want a VGA text console via the USB dongle or
[PATCH 5.9 43/49] USB: sisusbvga: Make console support depend on BROKEN
From: Thomas Gleixner commit 862ee699fefe1e6d6f2c1518395f0b999b8beb15 upstream. The console part of sisusbvga is broken vs. printk(). It uses in_atomic() to detect contexts in which it cannot sleep despite the big fat comment in preempt.h which says: Do not use in_atomic() in driver code. in_atomic() does not work on kernels with CONFIG_PREEMPT_COUNT=n which means that spin/rw_lock held regions are not detected by it. There is no way to make this work by handing context information through to the driver and this only can be solved once the core printk infrastructure supports sleepable console drivers. Make it depend on BROKEN for now. Fixes: 1bbb4f2035d9 ("[PATCH] USB: sisusb[vga] update") Signed-off-by: Thomas Gleixner Cc: Thomas Winischhofer Cc: Greg Kroah-Hartman Cc: linux-...@vger.kernel.org Cc: sta...@vger.kernel.org Link: https://lore.kernel.org/r/20201019101109.603244...@linutronix.de Signed-off-by: Greg Kroah-Hartman --- drivers/usb/misc/sisusbvga/Kconfig |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/usb/misc/sisusbvga/Kconfig +++ b/drivers/usb/misc/sisusbvga/Kconfig @@ -16,7 +16,7 @@ config USB_SISUSBVGA config USB_SISUSBVGA_CON bool "Text console and mode switching support" if USB_SISUSBVGA - depends on VT + depends on VT && BROKEN select FONT_8x16 help Say Y here if you want a VGA text console via the USB dongle or
[PATCH 5.10 14/16] USB: sisusbvga: Make console support depend on BROKEN
From: Thomas Gleixner commit 862ee699fefe1e6d6f2c1518395f0b999b8beb15 upstream. The console part of sisusbvga is broken vs. printk(). It uses in_atomic() to detect contexts in which it cannot sleep despite the big fat comment in preempt.h which says: Do not use in_atomic() in driver code. in_atomic() does not work on kernels with CONFIG_PREEMPT_COUNT=n which means that spin/rw_lock held regions are not detected by it. There is no way to make this work by handing context information through to the driver and this only can be solved once the core printk infrastructure supports sleepable console drivers. Make it depend on BROKEN for now. Fixes: 1bbb4f2035d9 ("[PATCH] USB: sisusb[vga] update") Signed-off-by: Thomas Gleixner Cc: Thomas Winischhofer Cc: Greg Kroah-Hartman Cc: linux-...@vger.kernel.org Cc: sta...@vger.kernel.org Link: https://lore.kernel.org/r/20201019101109.603244...@linutronix.de Signed-off-by: Greg Kroah-Hartman --- drivers/usb/misc/sisusbvga/Kconfig |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/usb/misc/sisusbvga/Kconfig +++ b/drivers/usb/misc/sisusbvga/Kconfig @@ -16,7 +16,7 @@ config USB_SISUSBVGA config USB_SISUSBVGA_CON bool "Text console and mode switching support" if USB_SISUSBVGA - depends on VT + depends on VT && BROKEN select FONT_8x16 help Say Y here if you want a VGA text console via the USB dongle or
[tip: core/rcu] torture: Force weak-hashed pointers on console log
The following commit has been merged into the core/rcu branch of tip: Commit-ID: c1e06287583e5ec496e4c02bf5b319e5e41a1fd2 Gitweb: https://git.kernel.org/tip/c1e06287583e5ec496e4c02bf5b319e5e41a1fd2 Author:Paul E. McKenney AuthorDate:Mon, 21 Sep 2020 13:18:48 -07:00 Committer: Paul E. McKenney CommitterDate: Fri, 06 Nov 2020 17:13:54 -08:00 torture: Force weak-hashed pointers on console log Although the rcutorture scripting now deals correctly with full-up security-induced pointer obfuscation, it is still counter-productive for kernel hackers who are analyzing console output. This commit therefore sets the debug_boot_weak_hash kernel boot parameter, which enables printing of weak-hashed pointers for torture-test runs. Please note that this change applies only to runs initiated by the kvm.sh scripting. If you are instead using modprobe and rmmod, it is your responsibility to build and boot the underlying kernel to your taste. Please note further that this change does not result in a security hole in normal use. The rcutorture testing runs with a negligible userspace, no networking, and no user interaction. Besides which, there is no data of value that can be extracted from an rcutorture guest OS that could not also be extracted from the host that this guest is running on. Suggested-by: Anna-Maria Gleixner Signed-off-by: Paul E. McKenney --- tools/testing/selftests/rcutorture/bin/functions.sh | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/testing/selftests/rcutorture/bin/functions.sh b/tools/testing/selftests/rcutorture/bin/functions.sh index 51f3464..8266349 100644 --- a/tools/testing/selftests/rcutorture/bin/functions.sh +++ b/tools/testing/selftests/rcutorture/bin/functions.sh @@ -169,6 +169,7 @@ identify_qemu () { # Output arguments for the qemu "-append" string based on CPU type # and the TORTURE_QEMU_INTERACTIVE environment variable. identify_qemu_append () { + echo debug_boot_weak_hash local console=ttyS0 case "$1" in qemu-system-x86_64|qemu-system-i386)
Re: [PATCH] tty: serial: meson: enable console as module
On Thu, Dec 10, 2020 at 04:57:44PM -0800, Kevin Hilman wrote: > Enable serial driver to be built as a module. To do so, init the > console support on driver/module load instead of using > console_initcall(). > > Signed-off-by: Kevin Hilman > --- > Yes, I'm well aware that having the serial console as a module makes > devices hard to debug, so I'm not changing any default behavior. The > goal is just to enable building as a module for environments where > serial debug is not available or needed. It's a good goal, I'm all for it, thanks for the patch! greg k-h
[PATCH] tty: serial: meson: enable console as module
Enable serial driver to be built as a module. To do so, init the console support on driver/module load instead of using console_initcall(). Signed-off-by: Kevin Hilman --- Yes, I'm well aware that having the serial console as a module makes devices hard to debug, so I'm not changing any default behavior. The goal is just to enable building as a module for environments where serial debug is not available or needed. drivers/tty/serial/Kconfig | 2 +- drivers/tty/serial/meson_uart.c | 8 +++- 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig index 1044fc387691..c3fa78e63357 100644 --- a/drivers/tty/serial/Kconfig +++ b/drivers/tty/serial/Kconfig @@ -206,7 +206,7 @@ config SERIAL_MESON config SERIAL_MESON_CONSOLE bool "Support for console on meson" - depends on SERIAL_MESON=y + depends on SERIAL_MESON select SERIAL_CORE_CONSOLE select SERIAL_EARLYCON help diff --git a/drivers/tty/serial/meson_uart.c b/drivers/tty/serial/meson_uart.c index d2c08b760f83..69eeef9edfa5 100644 --- a/drivers/tty/serial/meson_uart.c +++ b/drivers/tty/serial/meson_uart.c @@ -604,7 +604,6 @@ static int __init meson_serial_console_init(void) register_console(_serial_console); return 0; } -console_initcall(meson_serial_console_init); static void meson_serial_early_console_write(struct console *co, const char *s, @@ -634,6 +633,9 @@ OF_EARLYCON_DECLARE(meson, "amlogic,meson-ao-uart", #define MESON_SERIAL_CONSOLE (_serial_console) #else +static int __init meson_serial_console_init(void) { + return 0; +} #define MESON_SERIAL_CONSOLE NULL #endif @@ -824,6 +826,10 @@ static int __init meson_uart_init(void) { int ret; + ret = meson_serial_console_init(); + if (ret) + return ret; + ret = uart_register_driver(_uart_driver); if (ret) return ret; -- 2.29.2
Re: "irq 4: Affinity broken due to vector space exhaustion." warning on restart of ttyS0 console
On Wed, Dec 09, 2020 at 07:28:49PM +0100, Thomas Gleixner wrote: > But the fix is not to tone down the warning. The proper fix is to do the > search in the correct order. Agree. Thank you for the speedy fix! Tested-by: Shung-Hsi Yu
Re: "irq 4: Affinity broken due to vector space exhaustion." warning on restart of ttyS0 console
Hi Shung-Hsi! On Wed, Dec 09 2020 at 14:33, Shung-Hsi Yu wrote: > On Tue, Nov 10, 2020 at 09:56:27PM +0100, Thomas Gleixner wrote: >> The real problem is irqbalanced aggressively exhausting the vector space >> of a _whole_ socket to the point that there is not a single vector left >> for serial. That's the problem you want to fix. > > I believe this warning also gets triggered even when there's _no_ vector > exhaustion. > > This seem to happen when the IRQ's affinity mask is set (wrongly) to CPUs on > a different NUMA node (e.g. cpumask_of_node(1) when the irqd->irq == 0). > > $ lscpu > ... > NUMA node0 CPU(s): 0-25,52-77 > NUMA node1 CPU(s): 26-51,78-103 > > $ cat /sys/kernel/debug/tracing/trace >... > (agetty)-3004[047] d...81.777152: vector_activate: irq=4 > is_managed=0 can_reserve=1 reserve=0 > (agetty)-3004[047] d...81.777157: vector_alloc: irq=4 vector=0 > reserved=1 ret=-22 > > irq_matrix_alloc() failed with > EINVAL because the cpumask > passed in is empty, which is a > result of affmask being > (ff,c000,000f,fc00) > and cpumask_of_node(node) > being > > (00,3fff,fff0,03ff). > > (agetty)-3004[047] d...81.789349: irq_matrix_alloc: bit=33 cpu=1 > online=1 avl=199 alloc=2 managed=1 online_maps=104 global_avl=20688, > global_rsvd=341, total_alloc=216 > (agetty)-3004[047] d...81.789351: vector_alloc: irq=4 vector=33 > reserved=1 ret=0 > (agetty)-3004[047] d...81.789353: vector_update: irq=4 vector=33 > cpu=1 prev_vector=0 prev_cpu=26 > (agetty)-3004[047] d...81.789355: vector_config: irq=4 vector=33 > cpu=1 apicdest=0x0002 > > "irq 4: Affinity broken due to > vector space exhaustion." > warning shows up > Ok. That's a different story. Nice explanation! But the fix is not to tone down the warning. The proper fix is to do the search in the correct order. Thanks, tglx --- arch/x86/kernel/apic/vector.c | 24 ++-- 1 file changed, 14 insertions(+), 10 deletions(-) --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -273,20 +273,24 @@ static int assign_irq_vector_any_locked( const struct cpumask *affmsk = irq_data_get_affinity_mask(irqd); int node = irq_data_get_node(irqd); - if (node == NUMA_NO_NODE) - goto all; - /* Try the intersection of @affmsk and node mask */ - cpumask_and(vector_searchmask, cpumask_of_node(node), affmsk); - if (!assign_vector_locked(irqd, vector_searchmask)) - return 0; - /* Try the node mask */ - if (!assign_vector_locked(irqd, cpumask_of_node(node))) - return 0; -all: + if (node != NUMA_NO_NODE) { + /* Try the intersection of @affmsk and node mask */ + cpumask_and(vector_searchmask, cpumask_of_node(node), affmsk); + if (!assign_vector_locked(irqd, vector_searchmask)) + return 0; + } + /* Try the full affinity mask */ cpumask_and(vector_searchmask, affmsk, cpu_online_mask); if (!assign_vector_locked(irqd, vector_searchmask)) return 0; + + if (node != NUMA_NO_NODE) { + /* Try the node mask */ + if (!assign_vector_locked(irqd, cpumask_of_node(node))) + return 0; + } + /* Try the full online mask */ return assign_vector_locked(irqd, cpu_online_mask); }
Re: "irq 4: Affinity broken due to vector space exhaustion." warning on restart of ttyS0 console
On Wed, Dec 09, 2020 at 02:33:04PM +0800, Shung-Hsi Yu wrote: > Hi Thomas, > > On Tue, Nov 10, 2020 at 09:56:27PM +0100, Thomas Gleixner wrote: > > The real problem is irqbalanced aggressively exhausting the vector space > > of a _whole_ socket to the point that there is not a single vector left > > for serial. That's the problem you want to fix. > > I believe this warning also gets triggered even when there's _no_ vector > exhaustion. > > This seem to happen when the IRQ's affinity mask is set (wrongly) to CPUs on > a different NUMA node (e.g. cpumask_of_node(1) when the irqd->irq == 0). typo ^^ Should be "affinity mask set to cpumask_of_node(1) when irq_data_get_node(irqd) == 0". Shung-Hsi > $ lscpu > ... > NUMA node0 CPU(s): 0-25,52-77 > NUMA node1 CPU(s): 26-51,78-103 > > $ cat /sys/kernel/debug/tracing/trace >... > irqbalance-1994[017] d...74.912799: irq_matrix_alloc: bit=33 cpu=26 > online=1 avl=198 alloc=3 managed=1 online_maps=104 global_avl=20687, > global_rsvd=341, total_alloc=217 > irqbalance-1994[017] d...74.912802: vector_alloc: irq=4 vector=33 > reserved=0 ret=0 > irqbalance-1994[017] d...74.912804: vector_update: irq=4 vector=33 > cpu=26 prev_vector=33 prev_cpu=7 > irqbalance-1994[017] d...74.912805: vector_config: irq=4 vector=33 > cpu=26 apicdest=0x0040 > -0 [007] d.h.74.970733: vector_free_moved: irq=4 cpu=7 > vector=33 is_managed=0 > -0 [007] d.h.74.970738: irq_matrix_free: bit=33 cpu=7 > online=1 avl=200 alloc=1 managed=1 online_maps=104 global_avl=20687, > global_rsvd=341, total_alloc=217 >... > (agetty)-3004[047] d...81.731231: vector_deactivate: irq=4 > is_managed=0 can_reserve=1 reserve=0 > (agetty)-3004[047] d...81.738035: vector_clear: irq=4 vector=33 > cpu=26 prev_vector=0 prev_cpu=7 > (agetty)-3004[047] d...81.738040: irq_matrix_free: bit=33 cpu=26 > online=1 avl=199 alloc=2 managed=1 online_maps=104 global_avl=20689, > global_rsvd=341, total_alloc=215 > (agetty)-3004[047] d...81.738046: irq_matrix_reserve: > online_maps=104 global_avl=20689, global_rsvd=342, total_alloc=215 > (agetty)-3004[047] d...81.766739: vector_reserve: irq=4 ret=0 > (agetty)-3004[047] d...81.766741: vector_config: irq=4 vector=239 > cpu=0 apicdest=0x > (agetty)-3004[047] d...81.777152: vector_activate: irq=4 > is_managed=0 can_reserve=1 reserve=0 > (agetty)-3004[047] d...81.777157: vector_alloc: irq=4 vector=0 > reserved=1 ret=-22 > > irq_matrix_alloc() failed with > EINVAL because the cpumask > passed in is empty, which is a > result of affmask being > (ff,c000,000f,fc00) > and cpumask_of_node(node) > being > > (00,3fff,fff0,03ff). > > (agetty)-3004[047] d...81.789349: irq_matrix_alloc: bit=33 cpu=1 > online=1 avl=199 alloc=2 managed=1 online_maps=104 global_avl=20688, > global_rsvd=341, total_alloc=216 > (agetty)-3004[047] d...81.789351: vector_alloc: irq=4 vector=33 > reserved=1 ret=0 > (agetty)-3004[047] d...81.789353: vector_update: irq=4 vector=33 > cpu=1 prev_vector=0 prev_cpu=26 > (agetty)-3004[047] d...81.789355: vector_config: irq=4 vector=33 > cpu=1 apicdest=0x0002 > > "irq 4: Affinity broken due to > vector space exhaustion." > warning shows up > > (agetty)-3004[047] d...81.900783: irq_matrix_alloc: bit=33 cpu=26 > online=1 avl=198 alloc=3 managed=1 online_maps=104 global_avl=20687, > global_rsvd=341, total_alloc=217 > (agetty)-3004[047] d...82.053535: vector_alloc: irq=4 vector=33 > reserved=0 ret=0 > (agetty)-3004[047] d...82.053536: vector_update: irq=4 vector=33 > cpu=26 prev_vector=33 prev_cpu=1 > (agetty)-3004[047] d...82.053538: vector_config: irq=4 vector=33 > cpu=26 apicdest=0x0040
Re: "irq 4: Affinity broken due to vector space exhaustion." warning on restart of ttyS0 console
Hi Thomas, On Tue, Nov 10, 2020 at 09:56:27PM +0100, Thomas Gleixner wrote: > The real problem is irqbalanced aggressively exhausting the vector space > of a _whole_ socket to the point that there is not a single vector left > for serial. That's the problem you want to fix. I believe this warning also gets triggered even when there's _no_ vector exhaustion. This seem to happen when the IRQ's affinity mask is set (wrongly) to CPUs on a different NUMA node (e.g. cpumask_of_node(1) when the irqd->irq == 0). $ lscpu ... NUMA node0 CPU(s): 0-25,52-77 NUMA node1 CPU(s): 26-51,78-103 $ cat /sys/kernel/debug/tracing/trace ... irqbalance-1994[017] d...74.912799: irq_matrix_alloc: bit=33 cpu=26 online=1 avl=198 alloc=3 managed=1 online_maps=104 global_avl=20687, global_rsvd=341, total_alloc=217 irqbalance-1994[017] d...74.912802: vector_alloc: irq=4 vector=33 reserved=0 ret=0 irqbalance-1994[017] d...74.912804: vector_update: irq=4 vector=33 cpu=26 prev_vector=33 prev_cpu=7 irqbalance-1994[017] d...74.912805: vector_config: irq=4 vector=33 cpu=26 apicdest=0x0040 -0 [007] d.h.74.970733: vector_free_moved: irq=4 cpu=7 vector=33 is_managed=0 -0 [007] d.h.74.970738: irq_matrix_free: bit=33 cpu=7 online=1 avl=200 alloc=1 managed=1 online_maps=104 global_avl=20687, global_rsvd=341, total_alloc=217 ... (agetty)-3004[047] d...81.731231: vector_deactivate: irq=4 is_managed=0 can_reserve=1 reserve=0 (agetty)-3004[047] d...81.738035: vector_clear: irq=4 vector=33 cpu=26 prev_vector=0 prev_cpu=7 (agetty)-3004[047] d...81.738040: irq_matrix_free: bit=33 cpu=26 online=1 avl=199 alloc=2 managed=1 online_maps=104 global_avl=20689, global_rsvd=341, total_alloc=215 (agetty)-3004[047] d...81.738046: irq_matrix_reserve: online_maps=104 global_avl=20689, global_rsvd=342, total_alloc=215 (agetty)-3004[047] d...81.766739: vector_reserve: irq=4 ret=0 (agetty)-3004[047] d...81.766741: vector_config: irq=4 vector=239 cpu=0 apicdest=0x (agetty)-3004[047] d...81.777152: vector_activate: irq=4 is_managed=0 can_reserve=1 reserve=0 (agetty)-3004[047] d...81.777157: vector_alloc: irq=4 vector=0 reserved=1 ret=-22 > irq_matrix_alloc() failed with EINVAL because the cpumask passed in is empty, which is a result of affmask being (ff,c000,000f,fc00) and cpumask_of_node(node) being (00,3fff,fff0,03ff). (agetty)-3004[047] d...81.789349: irq_matrix_alloc: bit=33 cpu=1 online=1 avl=199 alloc=2 managed=1 online_maps=104 global_avl=20688, global_rsvd=341, total_alloc=216 (agetty)-3004[047] d...81.789351: vector_alloc: irq=4 vector=33 reserved=1 ret=0 (agetty)-3004[047] d...81.789353: vector_update: irq=4 vector=33 cpu=1 prev_vector=0 prev_cpu=26 (agetty)-3004[047] d...81.789355: vector_config: irq=4 vector=33 cpu=1 apicdest=0x0002 > "irq 4: Affinity broken due to vector space exhaustion." warning shows up (agetty)-3004[047] d...81.900783: irq_matrix_alloc: bit=33 cpu=26 online=1 avl=198 alloc=3 managed=1 online_maps=104 global_avl=20687, global_rsvd=341, total_alloc=217 (agetty)-3004[047] d...82.053535: vector_alloc: irq=4 vector=33 reserved=0 ret=0 (agetty)-3004[047] d...82.053536: vector_update: irq=4 vector=33 cpu=26 prev_vector=33 prev_cpu=1 (agetty)-3004[047] d...82.053538: vector_config: irq=4 vector=33 cpu=26 apicdest=0x0040 Shung-Hsi Yu
[PATCH 4.9 37/47] tty: serial: imx: keep console clocks always on
From: Fugang Duan commit e67c139c488e84e7eae6c333231e791f0e89b3fb upstream. For below code, there has chance to cause deadlock in SMP system: Thread 1: clk_enable_lock(); pr_info("debug message"); clk_enable_unlock(); Thread 2: imx_uart_console_write() clk_enable() clk_enable_lock(); Thread 1: Acuired clk enable_lock -> printk -> console_trylock_spinning Thread 2: console_unlock() -> imx_uart_console_write -> clk_disable -> Acquite clk enable_lock So the patch is to keep console port clocks always on like other console drivers. Fixes: 1cf93e0d5488 ("serial: imx: remove the uart_console() check") Acked-by: Uwe Kleine-König Signed-off-by: Fugang Duan Link: https://lore.kernel.org/r/2020025136.29818-1-fugang.d...@nxp.com Cc: stable [fix up build warning - gregkh] Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/imx.c | 20 +++- 1 file changed, 3 insertions(+), 17 deletions(-) --- a/drivers/tty/serial/imx.c +++ b/drivers/tty/serial/imx.c @@ -1787,16 +1787,6 @@ imx_console_write(struct console *co, co unsigned int ucr1; unsigned long flags = 0; int locked = 1; - int retval; - - retval = clk_enable(sport->clk_per); - if (retval) - return; - retval = clk_enable(sport->clk_ipg); - if (retval) { - clk_disable(sport->clk_per); - return; - } if (sport->port.sysrq) locked = 0; @@ -1832,9 +1822,6 @@ imx_console_write(struct console *co, co if (locked) spin_unlock_irqrestore(>port.lock, flags); - - clk_disable(sport->clk_ipg); - clk_disable(sport->clk_per); } /* @@ -1935,15 +1922,14 @@ imx_console_setup(struct console *co, ch retval = uart_set_options(>port, co, baud, parity, bits, flow); - clk_disable(sport->clk_ipg); if (retval) { - clk_unprepare(sport->clk_ipg); + clk_disable_unprepare(sport->clk_ipg); goto error_console; } - retval = clk_prepare(sport->clk_per); + retval = clk_prepare_enable(sport->clk_per); if (retval) - clk_unprepare(sport->clk_ipg); + clk_disable_unprepare(sport->clk_ipg); error_console: return retval;
[PATCH 4.19 72/91] tty: serial: imx: keep console clocks always on
From: Fugang Duan commit e67c139c488e84e7eae6c333231e791f0e89b3fb upstream. For below code, there has chance to cause deadlock in SMP system: Thread 1: clk_enable_lock(); pr_info("debug message"); clk_enable_unlock(); Thread 2: imx_uart_console_write() clk_enable() clk_enable_lock(); Thread 1: Acuired clk enable_lock -> printk -> console_trylock_spinning Thread 2: console_unlock() -> imx_uart_console_write -> clk_disable -> Acquite clk enable_lock So the patch is to keep console port clocks always on like other console drivers. Fixes: 1cf93e0d5488 ("serial: imx: remove the uart_console() check") Acked-by: Uwe Kleine-König Signed-off-by: Fugang Duan Link: https://lore.kernel.org/r/2020025136.29818-1-fugang.d...@nxp.com Cc: stable [fix up build warning - gregkh] Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/imx.c | 20 +++- 1 file changed, 3 insertions(+), 17 deletions(-) --- a/drivers/tty/serial/imx.c +++ b/drivers/tty/serial/imx.c @@ -1915,16 +1915,6 @@ imx_uart_console_write(struct console *c unsigned int ucr1; unsigned long flags = 0; int locked = 1; - int retval; - - retval = clk_enable(sport->clk_per); - if (retval) - return; - retval = clk_enable(sport->clk_ipg); - if (retval) { - clk_disable(sport->clk_per); - return; - } if (sport->port.sysrq) locked = 0; @@ -1960,9 +1950,6 @@ imx_uart_console_write(struct console *c if (locked) spin_unlock_irqrestore(>port.lock, flags); - - clk_disable(sport->clk_ipg); - clk_disable(sport->clk_per); } /* @@ -2063,15 +2050,14 @@ imx_uart_console_setup(struct console *c retval = uart_set_options(>port, co, baud, parity, bits, flow); - clk_disable(sport->clk_ipg); if (retval) { - clk_unprepare(sport->clk_ipg); + clk_disable_unprepare(sport->clk_ipg); goto error_console; } - retval = clk_prepare(sport->clk_per); + retval = clk_prepare_enable(sport->clk_per); if (retval) - clk_unprepare(sport->clk_ipg); + clk_disable_unprepare(sport->clk_ipg); error_console: return retval;
[PATCH 5.9 199/252] tty: serial: imx: keep console clocks always on
From: Fugang Duan commit e67c139c488e84e7eae6c333231e791f0e89b3fb upstream. For below code, there has chance to cause deadlock in SMP system: Thread 1: clk_enable_lock(); pr_info("debug message"); clk_enable_unlock(); Thread 2: imx_uart_console_write() clk_enable() clk_enable_lock(); Thread 1: Acuired clk enable_lock -> printk -> console_trylock_spinning Thread 2: console_unlock() -> imx_uart_console_write -> clk_disable -> Acquite clk enable_lock So the patch is to keep console port clocks always on like other console drivers. Fixes: 1cf93e0d5488 ("serial: imx: remove the uart_console() check") Acked-by: Uwe Kleine-König Signed-off-by: Fugang Duan Link: https://lore.kernel.org/r/2020025136.29818-1-fugang.d...@nxp.com Cc: stable [fix up build warning - gregkh] Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/imx.c | 20 +++- 1 file changed, 3 insertions(+), 17 deletions(-) --- a/drivers/tty/serial/imx.c +++ b/drivers/tty/serial/imx.c @@ -2007,16 +2007,6 @@ imx_uart_console_write(struct console *c unsigned int ucr1; unsigned long flags = 0; int locked = 1; - int retval; - - retval = clk_enable(sport->clk_per); - if (retval) - return; - retval = clk_enable(sport->clk_ipg); - if (retval) { - clk_disable(sport->clk_per); - return; - } if (sport->port.sysrq) locked = 0; @@ -2052,9 +2042,6 @@ imx_uart_console_write(struct console *c if (locked) spin_unlock_irqrestore(>port.lock, flags); - - clk_disable(sport->clk_ipg); - clk_disable(sport->clk_per); } /* @@ -2155,15 +2142,14 @@ imx_uart_console_setup(struct console *c retval = uart_set_options(>port, co, baud, parity, bits, flow); - clk_disable(sport->clk_ipg); if (retval) { - clk_unprepare(sport->clk_ipg); + clk_disable_unprepare(sport->clk_ipg); goto error_console; } - retval = clk_prepare(sport->clk_per); + retval = clk_prepare_enable(sport->clk_per); if (retval) - clk_unprepare(sport->clk_ipg); + clk_disable_unprepare(sport->clk_ipg); error_console: return retval;
[PATCH 5.4 124/158] tty: serial: imx: keep console clocks always on
From: Fugang Duan commit e67c139c488e84e7eae6c333231e791f0e89b3fb upstream. For below code, there has chance to cause deadlock in SMP system: Thread 1: clk_enable_lock(); pr_info("debug message"); clk_enable_unlock(); Thread 2: imx_uart_console_write() clk_enable() clk_enable_lock(); Thread 1: Acuired clk enable_lock -> printk -> console_trylock_spinning Thread 2: console_unlock() -> imx_uart_console_write -> clk_disable -> Acquite clk enable_lock So the patch is to keep console port clocks always on like other console drivers. Fixes: 1cf93e0d5488 ("serial: imx: remove the uart_console() check") Acked-by: Uwe Kleine-König Signed-off-by: Fugang Duan Link: https://lore.kernel.org/r/2020025136.29818-1-fugang.d...@nxp.com Cc: stable [fix up build warning - gregkh] Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/imx.c | 20 +++- 1 file changed, 3 insertions(+), 17 deletions(-) --- a/drivers/tty/serial/imx.c +++ b/drivers/tty/serial/imx.c @@ -1942,16 +1942,6 @@ imx_uart_console_write(struct console *c unsigned int ucr1; unsigned long flags = 0; int locked = 1; - int retval; - - retval = clk_enable(sport->clk_per); - if (retval) - return; - retval = clk_enable(sport->clk_ipg); - if (retval) { - clk_disable(sport->clk_per); - return; - } if (sport->port.sysrq) locked = 0; @@ -1987,9 +1977,6 @@ imx_uart_console_write(struct console *c if (locked) spin_unlock_irqrestore(>port.lock, flags); - - clk_disable(sport->clk_ipg); - clk_disable(sport->clk_per); } /* @@ -2090,15 +2077,14 @@ imx_uart_console_setup(struct console *c retval = uart_set_options(>port, co, baud, parity, bits, flow); - clk_disable(sport->clk_ipg); if (retval) { - clk_unprepare(sport->clk_ipg); + clk_disable_unprepare(sport->clk_ipg); goto error_console; } - retval = clk_prepare(sport->clk_per); + retval = clk_prepare_enable(sport->clk_per); if (retval) - clk_unprepare(sport->clk_ipg); + clk_disable_unprepare(sport->clk_ipg); error_console: return retval;
[PATCH 4.14 45/60] tty: serial: imx: keep console clocks always on
From: Fugang Duan commit e67c139c488e84e7eae6c333231e791f0e89b3fb upstream. For below code, there has chance to cause deadlock in SMP system: Thread 1: clk_enable_lock(); pr_info("debug message"); clk_enable_unlock(); Thread 2: imx_uart_console_write() clk_enable() clk_enable_lock(); Thread 1: Acuired clk enable_lock -> printk -> console_trylock_spinning Thread 2: console_unlock() -> imx_uart_console_write -> clk_disable -> Acquite clk enable_lock So the patch is to keep console port clocks always on like other console drivers. Fixes: 1cf93e0d5488 ("serial: imx: remove the uart_console() check") Acked-by: Uwe Kleine-König Signed-off-by: Fugang Duan Link: https://lore.kernel.org/r/2020025136.29818-1-fugang.d...@nxp.com Cc: stable [fix up build warning - gregkh] Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/imx.c | 20 +++- 1 file changed, 3 insertions(+), 17 deletions(-) --- a/drivers/tty/serial/imx.c +++ b/drivers/tty/serial/imx.c @@ -1879,16 +1879,6 @@ imx_console_write(struct console *co, co unsigned int ucr1; unsigned long flags = 0; int locked = 1; - int retval; - - retval = clk_enable(sport->clk_per); - if (retval) - return; - retval = clk_enable(sport->clk_ipg); - if (retval) { - clk_disable(sport->clk_per); - return; - } if (sport->port.sysrq) locked = 0; @@ -1924,9 +1914,6 @@ imx_console_write(struct console *co, co if (locked) spin_unlock_irqrestore(>port.lock, flags); - - clk_disable(sport->clk_ipg); - clk_disable(sport->clk_per); } /* @@ -2027,15 +2014,14 @@ imx_console_setup(struct console *co, ch retval = uart_set_options(>port, co, baud, parity, bits, flow); - clk_disable(sport->clk_ipg); if (retval) { - clk_unprepare(sport->clk_ipg); + clk_disable_unprepare(sport->clk_ipg); goto error_console; } - retval = clk_prepare(sport->clk_per); + retval = clk_prepare_enable(sport->clk_per); if (retval) - clk_unprepare(sport->clk_ipg); + clk_disable_unprepare(sport->clk_ipg); error_console: return retval;
[PATCH 4.4 29/38] tty: serial: imx: keep console clocks always on
From: Fugang Duan commit e67c139c488e84e7eae6c333231e791f0e89b3fb upstream. For below code, there has chance to cause deadlock in SMP system: Thread 1: clk_enable_lock(); pr_info("debug message"); clk_enable_unlock(); Thread 2: imx_uart_console_write() clk_enable() clk_enable_lock(); Thread 1: Acuired clk enable_lock -> printk -> console_trylock_spinning Thread 2: console_unlock() -> imx_uart_console_write -> clk_disable -> Acquite clk enable_lock So the patch is to keep console port clocks always on like other console drivers. Fixes: 1cf93e0d5488 ("serial: imx: remove the uart_console() check") Acked-by: Uwe Kleine-König Signed-off-by: Fugang Duan Link: https://lore.kernel.org/r/2020025136.29818-1-fugang.d...@nxp.com Cc: stable [fix up build warning - gregkh] Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/imx.c | 20 +++- 1 file changed, 3 insertions(+), 17 deletions(-) --- a/drivers/tty/serial/imx.c +++ b/drivers/tty/serial/imx.c @@ -1628,16 +1628,6 @@ imx_console_write(struct console *co, co unsigned int ucr1; unsigned long flags = 0; int locked = 1; - int retval; - - retval = clk_enable(sport->clk_per); - if (retval) - return; - retval = clk_enable(sport->clk_ipg); - if (retval) { - clk_disable(sport->clk_per); - return; - } if (sport->port.sysrq) locked = 0; @@ -1673,9 +1663,6 @@ imx_console_write(struct console *co, co if (locked) spin_unlock_irqrestore(>port.lock, flags); - - clk_disable(sport->clk_ipg); - clk_disable(sport->clk_per); } /* @@ -1776,15 +1763,14 @@ imx_console_setup(struct console *co, ch retval = uart_set_options(>port, co, baud, parity, bits, flow); - clk_disable(sport->clk_ipg); if (retval) { - clk_unprepare(sport->clk_ipg); + clk_disable_unprepare(sport->clk_ipg); goto error_console; } - retval = clk_prepare(sport->clk_per); + retval = clk_prepare_enable(sport->clk_per); if (retval) - clk_unprepare(sport->clk_ipg); + clk_disable_unprepare(sport->clk_ipg); error_console: return retval;
Re: [PATCH 0/2] printk/console: Use ttynull when no console is available or wanted
On Wed 2020-11-11 14:54:48, Petr Mladek wrote: > This is another attempt to solve regression caused by the commit > 48021f98130880dd74 ("printk: handle blank console arguments passed in."). > > It prevented a crash caused by empty console= parameter. But it caused > performance problems on Chromebooks because they use it to disable > all consoles, see > see https://lore.kernel.org/r/20201006025935.GA597@jagdpanzerIV.localdomain > > Solve both problems by using ttynull console driver that was crated > exactly for this purpose. > > The 1st patch should prevent the crash for any invalid console name. > > The 2nd patch allows to used the ttynull driver also with the widely > used console= and console=null parameters. > > Best Regards, > Petr > > Petr Mladek (2): > init/console: Use ttynull as a fallback when there is no console > printk/console: Allow to disable console output by using console="" or > console=null The patchset has been committed into printk/linux.git, branch for-5.11-null-console. Best Regards, Petr
Re: [PATCH v3 0/5] console: Miscellaneous clean-ups, do not use FNTCHARCNT() in fbcon.c
On Thu, Nov 19, 2020 at 04:10:57PM +0100, Daniel Vetter wrote: > On Thu, Nov 19, 2020 at 9:33 AM Peilin Ye wrote: > > setfont seems to work fine, I tried Georgian-Fixed16 (256 chars) and > > Uni2-VGA16 (512 chars) under /usr/share/consolefonts/ in my Ubuntu box, > > including setting all consoles to the same font: > > > > for i in {1..6}; do > > sudo setfont -C /dev/tty${i} > > /usr/share/consolefonts/Georgian-Fixed16.psf.gz > > done > > > > Font rotation also seems to work fine: > > > > for i in {1..4}; do > > echo $i | sudo tee /sys/class/graphics/fbcon/rotate > > sleep 1 > > done > > Thanks a lot for checking all this. Not a problem, watching my console rotating was fun :) > > One last thing I can think of is tile blitting, but I don't have the > > hardware (e.g. a Matrox G400 card, see FB_TILEBLITTING in > > drivers/video/fbdev/Kconfig) at hand, nor did I figure out how to > > simulate it after searching for a while. However based on the other > > tests above I believe vc->vc_font.charcount is set properly. > > tbh I'll just go ahead and delete it if it's broken :-) > > Userspace we have to keep working (and there's actually people > creating new products on top of drm display drivers using fbdev > emulation and /dev/fb/0 interface!), but kernel internal stuff like > fbcon acceleration we can trim pretty aggressively I think. Ah, I see, I'll leave it be, then. Thanks, Peilin Ye
Re: [PATCH v3 0/5] console: Miscellaneous clean-ups, do not use FNTCHARCNT() in fbcon.c
On Thu, Nov 19, 2020 at 9:33 AM Peilin Ye wrote: > > On Sat, Nov 14, 2020 at 01:22:22PM +0100, Greg Kroah-Hartman wrote: > > Ah, here's a hint: > > https://wiki.archlinux.org/index.php/Linux_console#Fonts > > > > The setfont tool should help you out here. > > setfont seems to work fine, I tried Georgian-Fixed16 (256 chars) and > Uni2-VGA16 (512 chars) under /usr/share/consolefonts/ in my Ubuntu box, > including setting all consoles to the same font: > > for i in {1..6}; do > sudo setfont -C /dev/tty${i} > /usr/share/consolefonts/Georgian-Fixed16.psf.gz > done > > Font rotation also seems to work fine: > > for i in {1..4}; do > echo $i | sudo tee /sys/class/graphics/fbcon/rotate > sleep 1 > done Thanks a lot for checking all this. > One last thing I can think of is tile blitting, but I don't have the > hardware (e.g. a Matrox G400 card, see FB_TILEBLITTING in > drivers/video/fbdev/Kconfig) at hand, nor did I figure out how to > simulate it after searching for a while. However based on the other > tests above I believe vc->vc_font.charcount is set properly. tbh I'll just go ahead and delete it if it's broken :-) Userspace we have to keep working (and there's actually people creating new products on top of drm display drivers using fbdev emulation and /dev/fb/0 interface!), but kernel internal stuff like fbcon acceleration we can trim pretty aggressively I think. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH v3 0/5] console: Miscellaneous clean-ups, do not use FNTCHARCNT() in fbcon.c
On Sat, Nov 14, 2020 at 01:22:22PM +0100, Greg Kroah-Hartman wrote: > Ah, here's a hint: > https://wiki.archlinux.org/index.php/Linux_console#Fonts > > The setfont tool should help you out here. setfont seems to work fine, I tried Georgian-Fixed16 (256 chars) and Uni2-VGA16 (512 chars) under /usr/share/consolefonts/ in my Ubuntu box, including setting all consoles to the same font: for i in {1..6}; do sudo setfont -C /dev/tty${i} /usr/share/consolefonts/Georgian-Fixed16.psf.gz done Font rotation also seems to work fine: for i in {1..4}; do echo $i | sudo tee /sys/class/graphics/fbcon/rotate sleep 1 done One last thing I can think of is tile blitting, but I don't have the hardware (e.g. a Matrox G400 card, see FB_TILEBLITTING in drivers/video/fbdev/Kconfig) at hand, nor did I figure out how to simulate it after searching for a while. However based on the other tests above I believe vc->vc_font.charcount is set properly. Thanks, Peilin Ye
VGA text console corruption in 5.9.0 and 5.10-rc4
5.9 introduces VGA console corruption in one of my test PC-s (I do not have VGA console on most). The PC has Intel D2550MUD2 board with Atom D2550. The symptoms include: * missing screen updates on VT switch * fragments of other VT-s appear during scrolling (kernel compilation output on visible VT1 scrolls up, sometimes it includes 5 or so lines from curses application on VT2 or its scroll-back history) * missing up-scrolling of lines/fragments in curses applications. Visible in make menuconfig and mc and maybe more (these are the ones I can describe mostly clearly). 5.9.0 with fbcon (as packaged by debian) does not show these symptoms. 5.9.0 and todays 5.10-rc4+git exhibit this behavior if I let them use VGA text console. $ lspci -nn 00:00.0 Host bridge [0600]: Intel Corporation Atom Processor D2xxx/N2xxx DRAM Controller [8086:0bf3] (rev 04) 00:02.0 VGA compatible controller [0300]: Intel Corporation Atom Processor D2xxx/N2xxx Integrated Graphics Controller [8086:0be2] (rev 0b) 00:1b.0 Audio device [0403]: Intel Corporation NM10/ICH7 Family High Definition Audio Controller [8086:27d8] (rev 02) 00:1c.0 PCI bridge [0604]: Intel Corporation NM10/ICH7 Family PCI Express Port 1 [8086:27d0] (rev 02) 00:1d.0 USB controller [0c03]: Intel Corporation NM10/ICH7 Family USB UHCI Controller #1 [8086:27c8] (rev 02) 00:1d.1 USB controller [0c03]: Intel Corporation NM10/ICH7 Family USB UHCI Controller #2 [8086:27c9] (rev 02) 00:1d.2 USB controller [0c03]: Intel Corporation NM10/ICH7 Family USB UHCI Controller #3 [8086:27ca] (rev 02) 00:1d.3 USB controller [0c03]: Intel Corporation NM10/ICH7 Family USB UHCI Controller #4 [8086:27cb] (rev 02) 00:1d.7 USB controller [0c03]: Intel Corporation NM10/ICH7 Family USB2 EHCI Controller [8086:27cc] (rev 02) 00:1e.0 PCI bridge [0604]: Intel Corporation 82801 Mobile PCI Bridge [8086:2448] (rev e2) 00:1f.0 ISA bridge [0601]: Intel Corporation NM10 Family LPC Controller [8086:27bc] (rev 02) 00:1f.2 SATA controller [0106]: Intel Corporation NM10/ICH7 Family SATA Controller [AHCI mode] [8086:27c1] (rev 02) 00:1f.3 SMBus [0c05]: Intel Corporation NM10/ICH7 Family SMBus Controller [8086:27da] (rev 02) 01:00.0 Ethernet controller [0200]: Intel Corporation 82574L Gigabit Network Connection [8086:10d3] Nothing interesting in dmesg, selected lines: [0.00] Linux version 5.10.0-rc4-00067-g9c87c9f41245 (mroos@d2550) (gcc (Debian 10.2.0-15) 10.2.0, GNU ld (GNU Binutils for Debian) 2.35.1) #8 SMP Tue Nov 17 14:39:11 EET 2020 [0.00] DMI: /D2550MUD2, BIOS MUCDT10N.86A.0075.2013.0427.1548 04/27/2013 [0.001878] MTRR default type: uncachable [0.001881] MTRR fixed ranges enabled: [0.001885] 0-9 write-back [0.001888] A-B uncachable [0.001891] C-D write-protect [0.001893] E-F uncachable [0.001896] MTRR variable ranges enabled: [0.001900] 0 base 0 mask F8000 write-back [0.001903] 1 base 07F00 mask FFF00 uncachable [0.001907] 2 base 0FFE0 mask FFFE0 write-protect [0.001909] 3 disabled [0.001911] 4 disabled [0.001913] 5 disabled [0.001915] 6 disabled [0.002024] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT [0.028625] ACPI: HPET id: 0x8086a201 base: 0xfed0 [0.028636] smpboot: Allowing 4 CPUs, 0 hotplug CPUs [0.095056] Console: colour VGA+ 80x25 [0.099221] printk: console [tty0] enabled [0.226357] smpboot: CPU0: Intel(R) Atom(TM) CPU D2550 @ 1.86GHz (family: 0x6, model: 0x36, stepping: 0x1)[0.095056] Console: colour VGA+ 80x25 [0.227697] smp: Bringing up secondary CPUs ... [0.227697] x86: Booting SMP configuration: [0.227697] node #0, CPUs: #1 [0.010909] Disabled fast string operations [0.228016] #2 [0.010909] Disabled fast string operations [0.231720] #3 [0.010909] Disabled fast string operations [0.233935] smp: Brought up 1 node, 4 CPUs [0.233935] smpboot: Max logical packages: 1 [0.233935] smpboot: Total of 4 processors activated (14934.80 BogoMIPS) [0.238692] PCI: MMCONFIG for domain [bus 00-3f] at [mem 0xe000-0xe3ff] (base 0xe000) [0.238756] PCI: MMCONFIG at [mem 0xe000-0xe3ff] reserved in E820 [0.238824] PCI: Using configuration type 1 for base access [0.243986] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages Machine-specific config, from compiling current git: # # Automatically generated file; DO NOT EDIT. # Linux/x86 5.10.0-rc4 Kernel Configuration # CONFIG_CC_VERSION_TEXT="gcc (Debian 10.2.0-15) 10.2.0" CONFIG_CC_IS_GCC=y CONFIG_GCC_VERSION=100200 CONFIG_LD_VERSION=23501 CONFIG_CLANG_VERSION=0 CONFIG_CC_CAN_LINK=y CONFIG_CC_CAN_LINK_STATIC=y CONFIG_CC_HAS_ASM_GOTO=y CONFIG_CC_HAS_ASM_INLINE=y CONFIG_IRQ_WORK=y CONFIG_BUILDTIME_TABLE_SORT=y CONFIG_THREAD_INFO_IN_TASK=y # # General setup # CONFIG_INIT_ENV_ARG_LIMIT=32 # CONFIG_COMPILE_TEST
Re: [PATCH v3 0/5] console: Miscellaneous clean-ups, do not use FNTCHARCNT() in fbcon.c
On Mon, Nov 16, 2020 at 11:09:49AM +0100, Daniel Vetter wrote: > On Sat, Nov 14, 2020 at 07:47:16AM -0500, Peilin Ye wrote: > > On Sat, Nov 14, 2020 at 01:22:22PM +0100, Greg Kroah-Hartman wrote: > > > On Sat, Nov 14, 2020 at 01:18:06PM +0100, Greg Kroah-Hartman wrote: > > > > On Sat, Nov 14, 2020 at 03:10:21AM -0500, Peilin Ye wrote: > > > > > Thanks for reviewing! Questions about the last patch [5/5] though, it > > > > > depends on the following assumption: > > > > > > > > > > """ > > > > > For each console `idx`, `vc_cons[idx].d->vc_font.data` and > > > > > `fb_display[idx].fontdata` always point to the same buffer. > > > > > """ > > > > > > > > > > Is this true? I think it is true by grepping for `fontdata`. I also > > > > > noticed that fbcon.c is using `vc->vc_font.data` and `p->fontdata` > > > > > interchangeably, see fbcon_get_requirement(): > > > > > > > > > > vc = vc_cons[fg_console].d; > > > > > [...] > > > > > p = _display[fg_console]; > > > > > caps->x = 1 << (vc->vc_font.width - 1); > > > > > ^^^ > > > > > caps->y = 1 << (vc->vc_font.height - 1); > > > > > ^^^ > > > > > caps->len = (p->userfont) ? > > > > > FNTCHARCNT(p->fontdata) : 256; > > > > > ^^^ > > > > > > > > > > If it is true, then what is the point of using `fontdata` in `struct > > > > > fbcon_display`? Just for the `userfont` flag? Should we delete > > > > > `fontdata`, when we no longer need the `userfont` flag? > > > > > > > > Yes, after a quick look, I think your analysis here is correct. So if > > > > you can delete that, it would be nice if possible. > > > > I see, at the moment we still need `userfont` for refcount handling, I > > will try to delete both `fontdata` and `userfont` after refcount is > > taken care of. > > +1 on sunsetting fondata. I think there's a bunch of these in fbcon code, > because the console subsystem is older than the standard "allow drivers to > embed the subsystem struct into their own private struct" subclassing > model. So lots of global arrays indexed by the console id :-/ Yeah, I saw your comments about registered_fb[] :( > We're also trying to start some kind of test suite for fbdev chardev ioctl > interface in the gpu test suite. fbcon tests are maybe more related to > tty/vt, and I have no idea whether anything exists there already. > > But if you want to do some automated testcases for fbcon and there's > absolutely no other home for them, the gpu test suite might be an option > of last resort too: > > https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#testing-and-validation Oh, I didn't know about igt-gpu-tools, thanks for the pointer! Now, since charcount is taken care of, and font_desc now contains all fields of console_font, I think it is a good time to replace console_font with font_desc in vc_data. Then we can get rid of FNTSUM(), FNTSIZE(), then (finally) REFCOUNT(). I will start working on vc_data, after done enough testing on [5/5], ofc. Thanks, Peilin Ye
Re: [PATCH 1/2] init/console: Use ttynull as a fallback when there is no console
On Thu 2020-11-12 10:45:46, Sergey Senozhatsky wrote: > On (20/11/12 09:17), Sergey Senozhatsky wrote: > > On (20/11/11 14:54), Petr Mladek wrote: > > [..] > > > diff --git a/init/main.c b/init/main.c > > > index 130376ec10ba..24413c055a85 100644 > > > --- a/init/main.c > > > +++ b/init/main.c > > > @@ -1470,8 +1470,14 @@ void __init console_on_rootfs(void) > > > struct file *file = filp_open("/dev/console", O_RDWR, 0); > > > > > > if (IS_ERR(file)) { > > > - pr_err("Warning: unable to open an initial console.\n"); > > > - return; > > > + pr_err("Warning: unable to open an initial console. Fallback to > > > ttynull.\n"); > > > + register_ttynull_console(); > > > + > > > > A nit, this probably can be done in console_device() function. > > > > For several reasons: > > > > - we will get covered all the future cases when something other than > > console_on_rootfs() will filp_open("/dev/console") Good point! My concern is that console_device might be called in "unclear" context. For example, it is called under tty_mutex in: + tty_open_by_driver() + tty_lookup_driver() + console_device() Also console_on_rootf() is likely the first code that would actually use the device. Well, there is spk_ttyio_initialise_ldisc() that calls tty_kopen(). I am a bit lazy to investigate whether it is called sooner or later. Anyway, it is accessibility code, so that there should be configured an accessibility console anyway. > And the existing ones (including user-space). For instance, > kernel/bpf/preload/iterators/iterators.c probably fails (?) > on systems with console= > > debug_fd = open("/dev/console", O_WRONLY | O_NOCTTY | O_CLOEXEC); > if (debug_fd < 0) > return 1; > > -ss This looks like an userspace tool, so it should get called after console_on_rootfs(). It might be my laziness. But I would prefer to go with this patchset. We could always improve it when anyone meet the problem. Best Regards, Petr
Re: [PATCH v3 0/5] console: Miscellaneous clean-ups, do not use FNTCHARCNT() in fbcon.c
On Fri, Nov 13, 2020 at 11:47:51PM +0100, Greg Kroah-Hartman wrote: > On Fri, Nov 13, 2020 at 10:16:33PM +0100, Daniel Vetter wrote: > > On Thu, Nov 12, 2020 at 07:02:21AM -0500, Peilin Ye wrote: > > > Hi all, > > > > > > This is a collection of some miscellaneous clean-ups for fbcon and some > > > console drivers. Since v2, I rebased them on linux-next, added some > > > Reviewed-by: tags from Daniel and Greg, and rewrote the commit messages as > > > suggested by Jiri. See [1] for v2 links. > > > > > > It does the following: > > > > > > - Garbage collect KD_FONT_OP_COPY callbacks since we disabled it > > > recently. Mark it as obsolete. > > > - Delete dummy con_font_op() callbacks. (Reviewed by Greg) > > > > > > - Add a charcount field to our new font descriptor, `struct font_desc`. > > > (Reviewed by Daniel) > > > - Do not use a hard-coded 256 for built-in font charcount in > > > console/sticore.c, use the new charcount field of `struct font_desc` > > > instead. (Reviewed by Daniel) > > > - Similarly, in fbcon.c, avoid using the magic negative-indexing macro, > > > FNTCHARCNT(). Set `vc->vc_font.charcount` properly and always use that > > > instead. > > > > > > Daniel, hopefully [5/5] removes FNTCHARCNT() for ever, but I have not > > > tested it sufficiently yet. I remember you mentioned elsewhere that > > > "fbtest.c" is insufficient for framebuffer testing, then how should we > > > test it? The first 4 patches should be fine. > > > > > > Please reference commit messages for more information. Thank you! > > > > > > [1] v2 links: > > > > > > 2/5: > > > https://lore.kernel.org/lkml/c5563eeea36aae7bd72ea2e985bc610d585ece40.1604306433.git.yepeilin...@gmail.com/ > > > 3/5: > > > https://lore.kernel.org/lkml/20201028060533.1206307-1-yepeilin...@gmail.com/ > > > 4/5: > > > https://lore.kernel.org/lkml/c38042bbf5c9777c84900d56c09f3c156b32af48.1603788512.git.yepeilin...@gmail.com/ > > > 5/5: > > > https://lore.kernel.org/lkml/20201028155139.1220549-1-yepeilin...@gmail.com/ > > > > > > Peilin Ye (5): > > > console: Delete unused con_font_copy() callback implementations > > > console: Delete dummy con_font_set() and con_font_default() callback > > > implementations > > > Fonts: Add charcount field to font_desc > > > parisc/sticore: Avoid hard-coding built-in font charcount > > > fbcon: Avoid using FNTCHARCNT() and hard-coded built-in font charcount > > > > Patches all look good to me, if Greg is ok with me applying the entire > > pile to drm-misc-next I'll do that next week. > > Yes, please do! > > Reviewed-by: Greg Kroah-Hartman Pulled entire series into drm-misc-next. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH v3 0/5] console: Miscellaneous clean-ups, do not use FNTCHARCNT() in fbcon.c
On Sat, Nov 14, 2020 at 07:47:16AM -0500, Peilin Ye wrote: > On Sat, Nov 14, 2020 at 01:22:22PM +0100, Greg Kroah-Hartman wrote: > > On Sat, Nov 14, 2020 at 01:18:06PM +0100, Greg Kroah-Hartman wrote: > > > On Sat, Nov 14, 2020 at 03:10:21AM -0500, Peilin Ye wrote: > > > > Thanks for reviewing! Questions about the last patch [5/5] though, it > > > > depends on the following assumption: > > > > > > > > """ > > > > For each console `idx`, `vc_cons[idx].d->vc_font.data` and > > > > `fb_display[idx].fontdata` always point to the same buffer. > > > > """ > > > > > > > > Is this true? I think it is true by grepping for `fontdata`. I also > > > > noticed that fbcon.c is using `vc->vc_font.data` and `p->fontdata` > > > > interchangeably, see fbcon_get_requirement(): > > > > > > > > vc = vc_cons[fg_console].d; > > > > [...] > > > > p = _display[fg_console]; > > > > caps->x = 1 << (vc->vc_font.width - 1); > > > > ^^^ > > > > caps->y = 1 << (vc->vc_font.height - 1); > > > > ^^^ > > > > caps->len = (p->userfont) ? > > > > FNTCHARCNT(p->fontdata) : 256; > > > >^^^ > > > > > > > > If it is true, then what is the point of using `fontdata` in `struct > > > > fbcon_display`? Just for the `userfont` flag? Should we delete > > > > `fontdata`, when we no longer need the `userfont` flag? > > > > > > Yes, after a quick look, I think your analysis here is correct. So if > > > you can delete that, it would be nice if possible. > > I see, at the moment we still need `userfont` for refcount handling, I > will try to delete both `fontdata` and `userfont` after refcount is > taken care of. +1 on sunsetting fondata. I think there's a bunch of these in fbcon code, because the console subsystem is older than the standard "allow drivers to embed the subsystem struct into their own private struct" subclassing model. So lots of global arrays indexed by the console id :-/ > > > > In this sense I think [5/5] needs more testing. Do we have test files > > > > for fbcon, or should I try to write some tests from scratch? > > > > > > I don't know of any direct tests, I usually just booted into that mode > > > and saw if everything looked like it did before. There must be some > > > tool that you can use to change the current font, as it seems to happen > > > at boot time on some distros. I can't remember what it's called at the > > > moment... > > > > Ah, here's a hint: > > https://wiki.archlinux.org/index.php/Linux_console#Fonts > > > > The setfont tool should help you out here. > > Oh, I didn't know about this one. I'll go experimenting with it, > thank you! We're also trying to start some kind of test suite for fbdev chardev ioctl interface in the gpu test suite. fbcon tests are maybe more related to tty/vt, and I have no idea whether anything exists there already. But if you want to do some automated testcases for fbcon and there's absolutely no other home for them, the gpu test suite might be an option of last resort too: https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#testing-and-validation Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH v3 0/5] console: Miscellaneous clean-ups, do not use FNTCHARCNT() in fbcon.c
On Sat, Nov 14, 2020 at 01:22:22PM +0100, Greg Kroah-Hartman wrote: > On Sat, Nov 14, 2020 at 01:18:06PM +0100, Greg Kroah-Hartman wrote: > > On Sat, Nov 14, 2020 at 03:10:21AM -0500, Peilin Ye wrote: > > > Thanks for reviewing! Questions about the last patch [5/5] though, it > > > depends on the following assumption: > > > > > > """ > > > For each console `idx`, `vc_cons[idx].d->vc_font.data` and > > > `fb_display[idx].fontdata` always point to the same buffer. > > > """ > > > > > > Is this true? I think it is true by grepping for `fontdata`. I also > > > noticed that fbcon.c is using `vc->vc_font.data` and `p->fontdata` > > > interchangeably, see fbcon_get_requirement(): > > > > > > vc = vc_cons[fg_console].d; > > > [...] > > > p = _display[fg_console]; > > > caps->x = 1 << (vc->vc_font.width - 1); > > > ^^^ > > > caps->y = 1 << (vc->vc_font.height - 1); > > > ^^^ > > > caps->len = (p->userfont) ? > > > FNTCHARCNT(p->fontdata) : 256; > > > ^^^ > > > > > > If it is true, then what is the point of using `fontdata` in `struct > > > fbcon_display`? Just for the `userfont` flag? Should we delete > > > `fontdata`, when we no longer need the `userfont` flag? > > > > Yes, after a quick look, I think your analysis here is correct. So if > > you can delete that, it would be nice if possible. I see, at the moment we still need `userfont` for refcount handling, I will try to delete both `fontdata` and `userfont` after refcount is taken care of. > > > In this sense I think [5/5] needs more testing. Do we have test files > > > for fbcon, or should I try to write some tests from scratch? > > > > I don't know of any direct tests, I usually just booted into that mode > > and saw if everything looked like it did before. There must be some > > tool that you can use to change the current font, as it seems to happen > > at boot time on some distros. I can't remember what it's called at the > > moment... > > Ah, here's a hint: > https://wiki.archlinux.org/index.php/Linux_console#Fonts > > The setfont tool should help you out here. Oh, I didn't know about this one. I'll go experimenting with it, thank you! Peilin Ye
Re: [PATCH v3 0/5] console: Miscellaneous clean-ups, do not use FNTCHARCNT() in fbcon.c
On Sat, Nov 14, 2020 at 01:18:06PM +0100, Greg Kroah-Hartman wrote: > On Sat, Nov 14, 2020 at 03:10:21AM -0500, Peilin Ye wrote: > > > On Fri, Nov 13, 2020 at 10:16:33PM +0100, Daniel Vetter wrote: > > > > On Thu, Nov 12, 2020 at 07:02:21AM -0500, Peilin Ye wrote: > > > > > Peilin Ye (5): > > > > > console: Delete unused con_font_copy() callback implementations > > > > > console: Delete dummy con_font_set() and con_font_default() > > > > > callback implementations > > > > > Fonts: Add charcount field to font_desc > > > > > parisc/sticore: Avoid hard-coding built-in font charcount > > > > > fbcon: Avoid using FNTCHARCNT() and hard-coded built-in font > > > > > charcount > > > > > > > > Patches all look good to me, if Greg is ok with me applying the entire > > > > pile to drm-misc-next I'll do that next week. > > > > On Fri, Nov 13, 2020 at 11:47:51PM +0100, Greg Kroah-Hartman wrote: > > > Reviewed-by: Greg Kroah-Hartman > > > > Thanks for reviewing! Questions about the last patch [5/5] though, it > > depends on the following assumption: > > > > """ > > For each console `idx`, `vc_cons[idx].d->vc_font.data` and > > `fb_display[idx].fontdata` always point to the same buffer. > > """ > > > > Is this true? I think it is true by grepping for `fontdata`. I also > > noticed that fbcon.c is using `vc->vc_font.data` and `p->fontdata` > > interchangeably, see fbcon_get_requirement(): > > > > vc = vc_cons[fg_console].d; > > [...] > > p = _display[fg_console]; > > caps->x = 1 << (vc->vc_font.width - 1); > > ^^^ > > caps->y = 1 << (vc->vc_font.height - 1); > > ^^^ > > caps->len = (p->userfont) ? > > FNTCHARCNT(p->fontdata) : 256; > >^^^ > > > > If it is true, then what is the point of using `fontdata` in `struct > > fbcon_display`? Just for the `userfont` flag? Should we delete > > `fontdata`, when we no longer need the `userfont` flag? > > Yes, after a quick look, I think your analysis here is correct. So if > you can delete that, it would be nice if possible. > > > In this sense I think [5/5] needs more testing. Do we have test files > > for fbcon, or should I try to write some tests from scratch? > > I don't know of any direct tests, I usually just booted into that mode > and saw if everything looked like it did before. There must be some > tool that you can use to change the current font, as it seems to happen > at boot time on some distros. I can't remember what it's called at the > moment... Ah, here's a hint: https://wiki.archlinux.org/index.php/Linux_console#Fonts The setfont tool should help you out here.