Re: [RFC PATCH 0/7] Trace events to pstore

2020-09-10 Thread Joel Fernandes
Hi Rob,
(Back from holidays, digging through the email pile). Reply below:

On Thu, Sep 3, 2020 at 2:09 PM Rob Herring  wrote:
>
> On Wed, Sep 2, 2020 at 3:47 PM Joel Fernandes  wrote:
> >
> > On Wed, Sep 2, 2020 at 4:01 PM Nachammai Karuppiah
> >  wrote:
> > >
> > > Hi,
> > >
> > > This patch series adds support to store trace events in pstore.
> > >
> > > Storing trace entries in persistent RAM would help in understanding what
> > > happened just before the system went down. The trace events that led to 
> > > the
> > > crash can be retrieved from the pstore after a warm reboot. This will help
> > > debug what happened before machine’s last breath. This has to be done in a
> > > scalable way so that tracing a live system does not impact the performance
> > > of the system.
> >
> > Just to add, Nachammai was my intern in the recent outreachy program
> > and we designed together a way for trace events to be written to
> > pstore backed memory directory instead of regular memory. The basic
> > idea is to allocate frace's ring buffer on pstore memory and have it
> > right there. Then recover it on reboot. Nachammai wrote the code with
> > some guidance :) . I talked to Steve as well in the past about the
> > basic of idea of this. Steve is on vacation this week though.
>
> ramoops is already the RAM backend for pstore and ramoops already has
> an ftrace region defined. What am I missing?

ramoops is too slow for tracing. Honestly, the ftrace functionality in
ramoops should be removed in favor of Nachammai's patches (she did it
for events but function tracing could be trivially added). No one uses
the current ftrace in pstore because it is darned slow. ramoops sits
in between the writing of the ftrace record and the memory being
written to adding more overhead in the process, while also writing
ftrace records in a non-ftrace format. So ramoop's API and
infrastructure fundamentally does not meet the requirements of high
speed persistent tracing.  The idea of this work is to keep the trace
events enabled for a long period time (possibly even in production)
and low overhead until the problem like machine crashing happens.

> From a DT standpoint, we already have a reserved persistent RAM
> binding too. There's already too much kernel specifics on how it is
> used, we don't need more of that in DT. We're not going to add another
> separate region (actually, you can have as many regions defined as you
> want. They will just all be 'ramoops' compatible).

I agree with the sentiment here on DT. Maybe the DT can be generalized
to provide a ram region to which either ramoops or ramtrace can
attach.

 - Joel


Re: [RFC PATCH 0/7] Trace events to pstore

2020-09-03 Thread Rob Herring
On Wed, Sep 2, 2020 at 3:47 PM Joel Fernandes  wrote:
>
> On Wed, Sep 2, 2020 at 4:01 PM Nachammai Karuppiah
>  wrote:
> >
> > Hi,
> >
> > This patch series adds support to store trace events in pstore.
> >
> > Storing trace entries in persistent RAM would help in understanding what
> > happened just before the system went down. The trace events that led to the
> > crash can be retrieved from the pstore after a warm reboot. This will help
> > debug what happened before machine’s last breath. This has to be done in a
> > scalable way so that tracing a live system does not impact the performance
> > of the system.
>
> Just to add, Nachammai was my intern in the recent outreachy program
> and we designed together a way for trace events to be written to
> pstore backed memory directory instead of regular memory. The basic
> idea is to allocate frace's ring buffer on pstore memory and have it
> right there. Then recover it on reboot. Nachammai wrote the code with
> some guidance :) . I talked to Steve as well in the past about the
> basic of idea of this. Steve is on vacation this week though.

ramoops is already the RAM backend for pstore and ramoops already has
an ftrace region defined. What am I missing?

>From a DT standpoint, we already have a reserved persistent RAM
binding too. There's already too much kernel specifics on how it is
used, we don't need more of that in DT. We're not going to add another
separate region (actually, you can have as many regions defined as you
want. They will just all be 'ramoops' compatible).

Rob


Re: [RFC PATCH 0/7] Trace events to pstore

2020-09-02 Thread Sai Prakash Ranjan

On 2020-09-03 03:17, Joel Fernandes wrote:

On Wed, Sep 2, 2020 at 4:01 PM Nachammai Karuppiah
 wrote:


Hi,

This patch series adds support to store trace events in pstore.

Storing trace entries in persistent RAM would help in understanding 
what
happened just before the system went down. The trace events that led 
to the
crash can be retrieved from the pstore after a warm reboot. This will 
help
debug what happened before machine’s last breath. This has to be done 
in a
scalable way so that tracing a live system does not impact the 
performance

of the system.


Just to add, Nachammai was my intern in the recent outreachy program
and we designed together a way for trace events to be written to
pstore backed memory directory instead of regular memory. The basic
idea is to allocate frace's ring buffer on pstore memory and have it
right there. Then recover it on reboot. Nachammai wrote the code with
some guidance :) . I talked to Steve as well in the past about the
basic of idea of this. Steve is on vacation this week though.

This is similar to what +Sai Prakash Ranjan was trying to do sometime
ago: https://lkml.org/lkml/2018/9/8/221 . But that approach involved
higher overhead due to synchronization of writing to the otherwise
lockless ring buffer.

+Brian Norris has also expressed interest for this feature.



Great work Nachammai and Joel, I have few boards with warm reboot 
support and will test

this series in coming days.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation


Re: [RFC PATCH 0/7] Trace events to pstore

2020-09-02 Thread Joel Fernandes
On Wed, Sep 2, 2020 at 5:47 PM Joel Fernandes  wrote:
>
> On Wed, Sep 2, 2020 at 4:01 PM Nachammai Karuppiah
>  wrote:
> >
> > Hi,
> >
> > This patch series adds support to store trace events in pstore.
> >

Been a long day...

> > Storing trace entries in persistent RAM would help in understanding what
> > happened just before the system went down. The trace events that led to the
> > crash can be retrieved from the pstore after a warm reboot. This will help
> > debug what happened before machine’s last breath. This has to be done in a
> > scalable way so that tracing a live system does not impact the performance
> > of the system.
>
> Just to add, Nachammai was my intern in the recent outreachy program
> and we designed together a way for trace events to be written to
> pstore backed memory directory instead of regular memory. The basic

s/directory/directly/

> idea is to allocate frace's ring buffer on pstore memory and have it
> right there. Then recover it on reboot. Nachammai wrote the code with

s/right/write/

 - Joel


> some guidance :) . I talked to Steve as well in the past about the
> basic of idea of this. Steve is on vacation this week though.
>
> This is similar to what +Sai Prakash Ranjan was trying to do sometime
> ago: https://lkml.org/lkml/2018/9/8/221 . But that approach involved
> higher overhead due to synchronization of writing to the otherwise
> lockless ring buffer.
>
> +Brian Norris has also expressed interest for this feature.
>
> thanks,
>
>  - Joel
>
> >
> > This requires a new backend - ramtrace that allocates pages from
> > persistent storage for the tracing utility. This feature can be enabled
> > using TRACE_EVENTS_TO_PSTORE.
> > In this feature, the new backend is used only as a page allocator and
> > once the  users chooses to use pstore to record trace entries, the ring
> > buffer pages are freed and allocated in pstore. Once this switch is done,
> > ring_buffer continues to operate just as before without much overhead.
> > Since the ring buffer uses the persistent RAM buffer directly to record
> > trace entries, all tracers would also persist across reboot.
> >
> > To test this feature, I used a simple module that would call panic during
> > a write operation to file in tracefs directory. Before writing to the file,
> > the ring buffer is moved to persistent RAM buffer through command line
> > as shown below,
> >
> > $echo 1 > /sys/kernel/tracing/options/persist
> >
> > Writing to the file,
> > $echo 1 > /sys/kernel/tracing/crash/panic_on_write
> >
> > The above write operation results in system crash. After reboot, once the
> > pstore is mounted, the trace entries from previous boot are available in 
> > file,
> > /sys/fs/pstore/trace-ramtrace-0
> >
> > Looking through this file, gives us the stack trace that led to the crash.
> >
> ><...>-1 [001] 49.083909: __vfs_write <-vfs_write
> ><...>-1 [001] 49.083933: panic <-panic_on_write
> ><...>-1 [001] d...49.084195: printk <-panic
> ><...>-1 [001] d...49.084201: vprintk_func <-printk
> ><...>-1 [001] d...49.084207: vprintk_default <-printk
> ><...>-1 [001] d...49.084211: vprintk_emit <-printk
> ><...>-1 [001] d...49.084216: __printk_safe_enter 
> > <-vprintk_emit
> ><...>-1 [001] d...49.084219: _raw_spin_lock 
> > <-vprintk_emit
> ><...>-1 [001] d...49.084223: vprintk_store <-vprintk_emit
> >
> > Patchwise oneline description is given below:
> >
> > Patch 1 adds support to allocate ring buffer pages from persistent RAM 
> > buffer.
> >
> > Patch 2 introduces a new backend, ramtrace.
> >
> > Patch 3 adds methods to read previous boot pages from pstore.
> >
> > Patch 4 adds the functionality to allocate page-sized memory from pstore.
> >
> > Patch 5 adds the seq_operation methods to iterate through trace entries.
> >
> > Patch 6 modifies ring_buffer to allocate from ramtrace when pstore is used.
> >
> > Patch 7 adds ramtrace DT node as child-node of /reserved-memory.
> >
> > Nachammai Karuppiah (7):
> >   tracing: Add support to allocate pages from persistent memory
> >   pstore: Support a new backend, ramtrace
> >   pstore: Read and iterate through trace entries in PSTORE
> >   pstore: Allocate and free page-sized memory in persistent RAM buffer
> >   tracing: Add support to iterate through pages retrieved from pstore
> >   tracing: Use ramtrace alloc and free methods while using persistent
> > RAM
> >   dt-bindings: ramtrace: Add ramtrace DT node
> >
> >  .../bindings/reserved-memory/ramtrace.txt  |  13 +
> >  drivers/of/platform.c  |   1 +
> >  fs/pstore/Makefile |   2 +
> >  fs/pstore/inode.c  |  46 +-
> >  fs/pstore/platform.c   |   1 +
> >  fs/pstore/ramtrace.c   | 821 
> > 

Re: [RFC PATCH 0/7] Trace events to pstore

2020-09-02 Thread Joel Fernandes
On Wed, Sep 2, 2020 at 4:01 PM Nachammai Karuppiah
 wrote:
>
> Hi,
>
> This patch series adds support to store trace events in pstore.
>
> Storing trace entries in persistent RAM would help in understanding what
> happened just before the system went down. The trace events that led to the
> crash can be retrieved from the pstore after a warm reboot. This will help
> debug what happened before machine’s last breath. This has to be done in a
> scalable way so that tracing a live system does not impact the performance
> of the system.

Just to add, Nachammai was my intern in the recent outreachy program
and we designed together a way for trace events to be written to
pstore backed memory directory instead of regular memory. The basic
idea is to allocate frace's ring buffer on pstore memory and have it
right there. Then recover it on reboot. Nachammai wrote the code with
some guidance :) . I talked to Steve as well in the past about the
basic of idea of this. Steve is on vacation this week though.

This is similar to what +Sai Prakash Ranjan was trying to do sometime
ago: https://lkml.org/lkml/2018/9/8/221 . But that approach involved
higher overhead due to synchronization of writing to the otherwise
lockless ring buffer.

+Brian Norris has also expressed interest for this feature.

thanks,

 - Joel

>
> This requires a new backend - ramtrace that allocates pages from
> persistent storage for the tracing utility. This feature can be enabled
> using TRACE_EVENTS_TO_PSTORE.
> In this feature, the new backend is used only as a page allocator and
> once the  users chooses to use pstore to record trace entries, the ring
> buffer pages are freed and allocated in pstore. Once this switch is done,
> ring_buffer continues to operate just as before without much overhead.
> Since the ring buffer uses the persistent RAM buffer directly to record
> trace entries, all tracers would also persist across reboot.
>
> To test this feature, I used a simple module that would call panic during
> a write operation to file in tracefs directory. Before writing to the file,
> the ring buffer is moved to persistent RAM buffer through command line
> as shown below,
>
> $echo 1 > /sys/kernel/tracing/options/persist
>
> Writing to the file,
> $echo 1 > /sys/kernel/tracing/crash/panic_on_write
>
> The above write operation results in system crash. After reboot, once the
> pstore is mounted, the trace entries from previous boot are available in file,
> /sys/fs/pstore/trace-ramtrace-0
>
> Looking through this file, gives us the stack trace that led to the crash.
>
><...>-1 [001] 49.083909: __vfs_write <-vfs_write
><...>-1 [001] 49.083933: panic <-panic_on_write
><...>-1 [001] d...49.084195: printk <-panic
><...>-1 [001] d...49.084201: vprintk_func <-printk
><...>-1 [001] d...49.084207: vprintk_default <-printk
><...>-1 [001] d...49.084211: vprintk_emit <-printk
><...>-1 [001] d...49.084216: __printk_safe_enter 
> <-vprintk_emit
><...>-1 [001] d...49.084219: _raw_spin_lock <-vprintk_emit
><...>-1 [001] d...49.084223: vprintk_store <-vprintk_emit
>
> Patchwise oneline description is given below:
>
> Patch 1 adds support to allocate ring buffer pages from persistent RAM buffer.
>
> Patch 2 introduces a new backend, ramtrace.
>
> Patch 3 adds methods to read previous boot pages from pstore.
>
> Patch 4 adds the functionality to allocate page-sized memory from pstore.
>
> Patch 5 adds the seq_operation methods to iterate through trace entries.
>
> Patch 6 modifies ring_buffer to allocate from ramtrace when pstore is used.
>
> Patch 7 adds ramtrace DT node as child-node of /reserved-memory.
>
> Nachammai Karuppiah (7):
>   tracing: Add support to allocate pages from persistent memory
>   pstore: Support a new backend, ramtrace
>   pstore: Read and iterate through trace entries in PSTORE
>   pstore: Allocate and free page-sized memory in persistent RAM buffer
>   tracing: Add support to iterate through pages retrieved from pstore
>   tracing: Use ramtrace alloc and free methods while using persistent
> RAM
>   dt-bindings: ramtrace: Add ramtrace DT node
>
>  .../bindings/reserved-memory/ramtrace.txt  |  13 +
>  drivers/of/platform.c  |   1 +
>  fs/pstore/Makefile |   2 +
>  fs/pstore/inode.c  |  46 +-
>  fs/pstore/platform.c   |   1 +
>  fs/pstore/ramtrace.c   | 821 
> +
>  include/linux/pstore.h |   3 +
>  include/linux/ramtrace.h   |  28 +
>  include/linux/ring_buffer.h|  19 +
>  include/linux/trace.h  |  13 +
>  kernel/trace/Kconfig   |  10 +
>