Re: [RFC PATCH 0/7] Trace events to pstore
Hi Rob, (Back from holidays, digging through the email pile). Reply below: On Thu, Sep 3, 2020 at 2:09 PM Rob Herring wrote: > > On Wed, Sep 2, 2020 at 3:47 PM Joel Fernandes wrote: > > > > On Wed, Sep 2, 2020 at 4:01 PM Nachammai Karuppiah > > wrote: > > > > > > Hi, > > > > > > This patch series adds support to store trace events in pstore. > > > > > > Storing trace entries in persistent RAM would help in understanding what > > > happened just before the system went down. The trace events that led to > > > the > > > crash can be retrieved from the pstore after a warm reboot. This will help > > > debug what happened before machine’s last breath. This has to be done in a > > > scalable way so that tracing a live system does not impact the performance > > > of the system. > > > > Just to add, Nachammai was my intern in the recent outreachy program > > and we designed together a way for trace events to be written to > > pstore backed memory directory instead of regular memory. The basic > > idea is to allocate frace's ring buffer on pstore memory and have it > > right there. Then recover it on reboot. Nachammai wrote the code with > > some guidance :) . I talked to Steve as well in the past about the > > basic of idea of this. Steve is on vacation this week though. > > ramoops is already the RAM backend for pstore and ramoops already has > an ftrace region defined. What am I missing? ramoops is too slow for tracing. Honestly, the ftrace functionality in ramoops should be removed in favor of Nachammai's patches (she did it for events but function tracing could be trivially added). No one uses the current ftrace in pstore because it is darned slow. ramoops sits in between the writing of the ftrace record and the memory being written to adding more overhead in the process, while also writing ftrace records in a non-ftrace format. So ramoop's API and infrastructure fundamentally does not meet the requirements of high speed persistent tracing. The idea of this work is to keep the trace events enabled for a long period time (possibly even in production) and low overhead until the problem like machine crashing happens. > From a DT standpoint, we already have a reserved persistent RAM > binding too. There's already too much kernel specifics on how it is > used, we don't need more of that in DT. We're not going to add another > separate region (actually, you can have as many regions defined as you > want. They will just all be 'ramoops' compatible). I agree with the sentiment here on DT. Maybe the DT can be generalized to provide a ram region to which either ramoops or ramtrace can attach. - Joel
Re: [RFC PATCH 0/7] Trace events to pstore
On Wed, Sep 2, 2020 at 3:47 PM Joel Fernandes wrote: > > On Wed, Sep 2, 2020 at 4:01 PM Nachammai Karuppiah > wrote: > > > > Hi, > > > > This patch series adds support to store trace events in pstore. > > > > Storing trace entries in persistent RAM would help in understanding what > > happened just before the system went down. The trace events that led to the > > crash can be retrieved from the pstore after a warm reboot. This will help > > debug what happened before machine’s last breath. This has to be done in a > > scalable way so that tracing a live system does not impact the performance > > of the system. > > Just to add, Nachammai was my intern in the recent outreachy program > and we designed together a way for trace events to be written to > pstore backed memory directory instead of regular memory. The basic > idea is to allocate frace's ring buffer on pstore memory and have it > right there. Then recover it on reboot. Nachammai wrote the code with > some guidance :) . I talked to Steve as well in the past about the > basic of idea of this. Steve is on vacation this week though. ramoops is already the RAM backend for pstore and ramoops already has an ftrace region defined. What am I missing? >From a DT standpoint, we already have a reserved persistent RAM binding too. There's already too much kernel specifics on how it is used, we don't need more of that in DT. We're not going to add another separate region (actually, you can have as many regions defined as you want. They will just all be 'ramoops' compatible). Rob
Re: [RFC PATCH 0/7] Trace events to pstore
On 2020-09-03 03:17, Joel Fernandes wrote: On Wed, Sep 2, 2020 at 4:01 PM Nachammai Karuppiah wrote: Hi, This patch series adds support to store trace events in pstore. Storing trace entries in persistent RAM would help in understanding what happened just before the system went down. The trace events that led to the crash can be retrieved from the pstore after a warm reboot. This will help debug what happened before machine’s last breath. This has to be done in a scalable way so that tracing a live system does not impact the performance of the system. Just to add, Nachammai was my intern in the recent outreachy program and we designed together a way for trace events to be written to pstore backed memory directory instead of regular memory. The basic idea is to allocate frace's ring buffer on pstore memory and have it right there. Then recover it on reboot. Nachammai wrote the code with some guidance :) . I talked to Steve as well in the past about the basic of idea of this. Steve is on vacation this week though. This is similar to what +Sai Prakash Ranjan was trying to do sometime ago: https://lkml.org/lkml/2018/9/8/221 . But that approach involved higher overhead due to synchronization of writing to the otherwise lockless ring buffer. +Brian Norris has also expressed interest for this feature. Great work Nachammai and Joel, I have few boards with warm reboot support and will test this series in coming days. Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: [RFC PATCH 0/7] Trace events to pstore
On Wed, Sep 2, 2020 at 5:47 PM Joel Fernandes wrote: > > On Wed, Sep 2, 2020 at 4:01 PM Nachammai Karuppiah > wrote: > > > > Hi, > > > > This patch series adds support to store trace events in pstore. > > Been a long day... > > Storing trace entries in persistent RAM would help in understanding what > > happened just before the system went down. The trace events that led to the > > crash can be retrieved from the pstore after a warm reboot. This will help > > debug what happened before machine’s last breath. This has to be done in a > > scalable way so that tracing a live system does not impact the performance > > of the system. > > Just to add, Nachammai was my intern in the recent outreachy program > and we designed together a way for trace events to be written to > pstore backed memory directory instead of regular memory. The basic s/directory/directly/ > idea is to allocate frace's ring buffer on pstore memory and have it > right there. Then recover it on reboot. Nachammai wrote the code with s/right/write/ - Joel > some guidance :) . I talked to Steve as well in the past about the > basic of idea of this. Steve is on vacation this week though. > > This is similar to what +Sai Prakash Ranjan was trying to do sometime > ago: https://lkml.org/lkml/2018/9/8/221 . But that approach involved > higher overhead due to synchronization of writing to the otherwise > lockless ring buffer. > > +Brian Norris has also expressed interest for this feature. > > thanks, > > - Joel > > > > > This requires a new backend - ramtrace that allocates pages from > > persistent storage for the tracing utility. This feature can be enabled > > using TRACE_EVENTS_TO_PSTORE. > > In this feature, the new backend is used only as a page allocator and > > once the users chooses to use pstore to record trace entries, the ring > > buffer pages are freed and allocated in pstore. Once this switch is done, > > ring_buffer continues to operate just as before without much overhead. > > Since the ring buffer uses the persistent RAM buffer directly to record > > trace entries, all tracers would also persist across reboot. > > > > To test this feature, I used a simple module that would call panic during > > a write operation to file in tracefs directory. Before writing to the file, > > the ring buffer is moved to persistent RAM buffer through command line > > as shown below, > > > > $echo 1 > /sys/kernel/tracing/options/persist > > > > Writing to the file, > > $echo 1 > /sys/kernel/tracing/crash/panic_on_write > > > > The above write operation results in system crash. After reboot, once the > > pstore is mounted, the trace entries from previous boot are available in > > file, > > /sys/fs/pstore/trace-ramtrace-0 > > > > Looking through this file, gives us the stack trace that led to the crash. > > > ><...>-1 [001] 49.083909: __vfs_write <-vfs_write > ><...>-1 [001] 49.083933: panic <-panic_on_write > ><...>-1 [001] d...49.084195: printk <-panic > ><...>-1 [001] d...49.084201: vprintk_func <-printk > ><...>-1 [001] d...49.084207: vprintk_default <-printk > ><...>-1 [001] d...49.084211: vprintk_emit <-printk > ><...>-1 [001] d...49.084216: __printk_safe_enter > > <-vprintk_emit > ><...>-1 [001] d...49.084219: _raw_spin_lock > > <-vprintk_emit > ><...>-1 [001] d...49.084223: vprintk_store <-vprintk_emit > > > > Patchwise oneline description is given below: > > > > Patch 1 adds support to allocate ring buffer pages from persistent RAM > > buffer. > > > > Patch 2 introduces a new backend, ramtrace. > > > > Patch 3 adds methods to read previous boot pages from pstore. > > > > Patch 4 adds the functionality to allocate page-sized memory from pstore. > > > > Patch 5 adds the seq_operation methods to iterate through trace entries. > > > > Patch 6 modifies ring_buffer to allocate from ramtrace when pstore is used. > > > > Patch 7 adds ramtrace DT node as child-node of /reserved-memory. > > > > Nachammai Karuppiah (7): > > tracing: Add support to allocate pages from persistent memory > > pstore: Support a new backend, ramtrace > > pstore: Read and iterate through trace entries in PSTORE > > pstore: Allocate and free page-sized memory in persistent RAM buffer > > tracing: Add support to iterate through pages retrieved from pstore > > tracing: Use ramtrace alloc and free methods while using persistent > > RAM > > dt-bindings: ramtrace: Add ramtrace DT node > > > > .../bindings/reserved-memory/ramtrace.txt | 13 + > > drivers/of/platform.c | 1 + > > fs/pstore/Makefile | 2 + > > fs/pstore/inode.c | 46 +- > > fs/pstore/platform.c | 1 + > > fs/pstore/ramtrace.c | 821 > >
Re: [RFC PATCH 0/7] Trace events to pstore
On Wed, Sep 2, 2020 at 4:01 PM Nachammai Karuppiah wrote: > > Hi, > > This patch series adds support to store trace events in pstore. > > Storing trace entries in persistent RAM would help in understanding what > happened just before the system went down. The trace events that led to the > crash can be retrieved from the pstore after a warm reboot. This will help > debug what happened before machine’s last breath. This has to be done in a > scalable way so that tracing a live system does not impact the performance > of the system. Just to add, Nachammai was my intern in the recent outreachy program and we designed together a way for trace events to be written to pstore backed memory directory instead of regular memory. The basic idea is to allocate frace's ring buffer on pstore memory and have it right there. Then recover it on reboot. Nachammai wrote the code with some guidance :) . I talked to Steve as well in the past about the basic of idea of this. Steve is on vacation this week though. This is similar to what +Sai Prakash Ranjan was trying to do sometime ago: https://lkml.org/lkml/2018/9/8/221 . But that approach involved higher overhead due to synchronization of writing to the otherwise lockless ring buffer. +Brian Norris has also expressed interest for this feature. thanks, - Joel > > This requires a new backend - ramtrace that allocates pages from > persistent storage for the tracing utility. This feature can be enabled > using TRACE_EVENTS_TO_PSTORE. > In this feature, the new backend is used only as a page allocator and > once the users chooses to use pstore to record trace entries, the ring > buffer pages are freed and allocated in pstore. Once this switch is done, > ring_buffer continues to operate just as before without much overhead. > Since the ring buffer uses the persistent RAM buffer directly to record > trace entries, all tracers would also persist across reboot. > > To test this feature, I used a simple module that would call panic during > a write operation to file in tracefs directory. Before writing to the file, > the ring buffer is moved to persistent RAM buffer through command line > as shown below, > > $echo 1 > /sys/kernel/tracing/options/persist > > Writing to the file, > $echo 1 > /sys/kernel/tracing/crash/panic_on_write > > The above write operation results in system crash. After reboot, once the > pstore is mounted, the trace entries from previous boot are available in file, > /sys/fs/pstore/trace-ramtrace-0 > > Looking through this file, gives us the stack trace that led to the crash. > ><...>-1 [001] 49.083909: __vfs_write <-vfs_write ><...>-1 [001] 49.083933: panic <-panic_on_write ><...>-1 [001] d...49.084195: printk <-panic ><...>-1 [001] d...49.084201: vprintk_func <-printk ><...>-1 [001] d...49.084207: vprintk_default <-printk ><...>-1 [001] d...49.084211: vprintk_emit <-printk ><...>-1 [001] d...49.084216: __printk_safe_enter > <-vprintk_emit ><...>-1 [001] d...49.084219: _raw_spin_lock <-vprintk_emit ><...>-1 [001] d...49.084223: vprintk_store <-vprintk_emit > > Patchwise oneline description is given below: > > Patch 1 adds support to allocate ring buffer pages from persistent RAM buffer. > > Patch 2 introduces a new backend, ramtrace. > > Patch 3 adds methods to read previous boot pages from pstore. > > Patch 4 adds the functionality to allocate page-sized memory from pstore. > > Patch 5 adds the seq_operation methods to iterate through trace entries. > > Patch 6 modifies ring_buffer to allocate from ramtrace when pstore is used. > > Patch 7 adds ramtrace DT node as child-node of /reserved-memory. > > Nachammai Karuppiah (7): > tracing: Add support to allocate pages from persistent memory > pstore: Support a new backend, ramtrace > pstore: Read and iterate through trace entries in PSTORE > pstore: Allocate and free page-sized memory in persistent RAM buffer > tracing: Add support to iterate through pages retrieved from pstore > tracing: Use ramtrace alloc and free methods while using persistent > RAM > dt-bindings: ramtrace: Add ramtrace DT node > > .../bindings/reserved-memory/ramtrace.txt | 13 + > drivers/of/platform.c | 1 + > fs/pstore/Makefile | 2 + > fs/pstore/inode.c | 46 +- > fs/pstore/platform.c | 1 + > fs/pstore/ramtrace.c | 821 > + > include/linux/pstore.h | 3 + > include/linux/ramtrace.h | 28 + > include/linux/ring_buffer.h| 19 + > include/linux/trace.h | 13 + > kernel/trace/Kconfig | 10 + >