Re: [PATCH net-next v2 1/2] fs/crashdd: add API to collect hardware dump in second kernel
On Tue, Apr 03, 2018 at 08:43:27AM +0300, Alex Vesker wrote:
>
>
> On 4/2/2018 12:12 PM, Jiri Pirko wrote:
> >Fri, Mar 30, 2018 at 05:11:29PM CEST, and...@lunn.ch wrote:
> >>>Please see:
> >>>http://patchwork.ozlabs.org/project/netdev/list/?series=36524
> >>>
> >>>I believe that the solution in the patchset could be used for
> >>>your usecase too.
> >>Hi Jiri
> >>
> >>https://lkml.org/lkml/2018/3/20/436
> >>
> >>How well does this API work for a 2Gbyte snapshot?
> >Ccing Alex who did the tests.
>
> I didn't check the performance for such a large snapshot.
> From my measurement it takes 0.09s for 1 MB of data this means
> about ~3m.

I was not really thinking about performance. More about how well does
the system work when you ask the kernel for 2GB of RAM to put a
snapshot into? And given your current design, you need another 2GB
buffer for the driver to use before calling this new API.

So I'm asking, how well does this API scale? I think you need to
remove the need for a second buffer in the driver. Either the driver
allocates the buffer and hands it over, or your core code allocates
the buffer and gives it to the driver to fill. Maybe look at what
makes most sense for the crash dump code?

Andrew
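The second option above (the core allocates the buffer and the driver fills it in place) could look roughly like the userspace sketch below. It is not code from either patchset: `dump_fill_fn`, `core_collect_dump()`, `struct dump_result` and `demo_fill()` are hypothetical names, and `calloc()` stands in for whatever allocator the kernel side would actually use.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback: the driver writes its dump directly into buf. */
typedef int (*dump_fill_fn)(void *priv, void *buf, size_t size);

/* Hypothetical result handle: the core keeps ownership of the buffer. */
struct dump_result {
	void *buf;
	size_t size;
};

/*
 * Core side: allocate the buffer once and let the driver fill it in place,
 * so the driver never needs a staging buffer of its own.
 */
static int core_collect_dump(dump_fill_fn fill, void *priv, size_t size,
			     struct dump_result *out)
{
	void *buf = calloc(1, size);	/* stand-in for a kernel allocation */

	if (!buf)
		return -1;

	if (fill(priv, buf, size) != 0) {
		free(buf);
		return -1;
	}

	out->buf = buf;
	out->size = size;
	return 0;
}

/* Example driver callback: fills the buffer with a one-byte pattern. */
static int demo_fill(void *priv, void *buf, size_t size)
{
	memset(buf, *(unsigned char *)priv, size);
	return 0;
}
```

With this shape the peak memory cost is one buffer of the dump size rather than two, which is the scaling concern raised for a 2GB snapshot.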
Re: [PATCH net-next v2 1/2] fs/crashdd: add API to collect hardware dump in second kernel
Mon, Apr 02, 2018 at 02:30:45PM CEST, rahul.lakkire...@chelsio.com wrote:
>On Monday, April 04/02/18, 2018 at 14:41:43 +0530, Jiri Pirko wrote:
>> Fri, Mar 30, 2018 at 08:42:00PM CEST, ebied...@xmission.com wrote:
>> >Rahul Lakkireddy writes:
>> >
>> >> On Friday, March 03/30/18, 2018 at 16:09:07 +0530, Jiri Pirko wrote:
>> >>> Sat, Mar 24, 2018 at 11:56:33AM CET, rahul.lakkire...@chelsio.com wrote:
>> >>> >Add a new module crashdd that exports the /sys/kernel/crashdd/
>> >>> >directory in second kernel, containing collected hardware/firmware
>> >>> >dumps.
>> >>> >
>> >>> >The sequence of actions done by device drivers to append their device
>> >>> >specific hardware/firmware logs to /sys/kernel/crashdd/ directory are
>> >>> >as follows:
>> >>> >
>> >>> >1. During probe (before hardware is initialized), device drivers
>> >>> >register to the crashdd module (via crashdd_add_dump()), with
>> >>> >callback function, along with buffer size and log name needed for
>> >>> >firmware/hardware log collection.
>> >>> >
>> >>> >2. Crashdd creates a driver's directory under
>> >>> >/sys/kernel/crashdd/. Then, it allocates the buffer with
>> >>>
>> >>> This smells. I need to identify the exact ASIC instance that produced
>> >>> the dump. To identify by driver name does not help me if I have multiple
>> >>> instances of the same driver. This looks wrong to me. This looks like
>> >>> a job for devlink where you have 1 devlink instance per 1 ASIC instance.
>> >>>
>> >>> Please see:
>> >>> http://patchwork.ozlabs.org/project/netdev/list/?series=36524
>> >>>
>> >>> I believe that the solution in the patchset could be used for
>> >>> your usecase too.
>> >>>
>> >>>
>> >>
>> >> The sysfs approach proposed here had been dropped in favour of exporting
>> >> the dumps as ELF notes in /proc/vmcore.
>> >>
>> >> Will be posting the new patches soon.
>> >
>> >The concern was actually how you identify which device that came from.
>> >Where you read the identifier changes but sysfs or /proc/vmcore the
>> >change remains valid.
>>
>> Yeah. I still don't see how you link the dump and the device.
>
>In our case, the dump and the device are being identified by the
>driver’s name followed by its corresponding pci bus id. I’ve posted an
>example in my v3 series:
>
>https://www.spinics.net/lists/netdev/msg493781.html
>
>Here’s an extract from the link above:
>
># readelf -n /proc/vmcore
>
>Displaying notes found at file offset 0x1000 with length 0x04003288:
>  Owner                    Data size   Description
>  VMCOREDD_cxgb4_:02:00.4  0x02000fd8  Unknown note type:(0x0700)
>  VMCOREDD_cxgb4_:04:00.4  0x02000fd8  Unknown note type:(0x0700)
>  CORE                     0x0150      NT_PRSTATUS (prstatus structure)
>  CORE                     0x0150      NT_PRSTATUS (prstatus structure)
>  CORE                     0x0150      NT_PRSTATUS (prstatus structure)
>  CORE                     0x0150      NT_PRSTATUS (prstatus structure)
>  CORE                     0x0150      NT_PRSTATUS (prstatus structure)
>  CORE                     0x0150      NT_PRSTATUS (prstatus structure)
>  CORE                     0x0150      NT_PRSTATUS (prstatus structure)
>  CORE                     0x0150      NT_PRSTATUS (prstatus structure)
>  VMCOREINFO               0x074f      Unknown note type: (0x)
>
>Here, for my two devices, the dump’s names are
>VMCOREDD_cxgb4_:02:00.4 and VMCOREDD_cxgb4_:04:00.4.
>
>It’s really up to the callers to write their own unique name for the
>dump. The name is appended to “VMCOREDD_” string.
>
>> Rahul, did you look at the patchset I pointed out?
>
>For devlink, I think the dump name would be identified by
>bus_type/device_name; i.e. “pci/:02:00.4” for my example.
>Is my understanding correct?

Yes.

>
>Thanks,
>Rahul
Re: [PATCH net-next v2 1/2] fs/crashdd: add API to collect hardware dump in second kernel
On 4/2/2018 12:12 PM, Jiri Pirko wrote:
>Fri, Mar 30, 2018 at 05:11:29PM CEST, and...@lunn.ch wrote:
>>>Please see:
>>>http://patchwork.ozlabs.org/project/netdev/list/?series=36524
>>>
>>>I believe that the solution in the patchset could be used for
>>>your usecase too.
>>Hi Jiri
>>
>>https://lkml.org/lkml/2018/3/20/436
>>
>>How well does this API work for a 2Gbyte snapshot?
>Ccing Alex who did the tests.

I didn't check the performance for such a large snapshot.
From my measurement it takes 0.09s for 1 MB of data this means
about ~3m. This can be tuned and improved since this is a socket
application.

>>Andrew
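For reference, the "~3m" figure is a linear extrapolation of the 0.09s-per-MB measurement to a 2GB snapshot. A minimal sketch of that arithmetic, assuming 2GB is taken as 2048 MB and that the cost really does scale linearly (`estimate_seconds()` is an illustrative name, not part of any patch):

```c
/* Linear extrapolation of transfer time from a per-megabyte measurement. */
static double estimate_seconds(double size_mb, double seconds_per_mb)
{
	return size_mb * seconds_per_mb;
}
```

`estimate_seconds(2048.0, 0.09)` gives roughly 184 seconds, i.e. a little over 3 minutes.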
Re: [PATCH net-next v2 1/2] fs/crashdd: add API to collect hardware dump in second kernel
On Monday, April 04/02/18, 2018 at 14:41:43 +0530, Jiri Pirko wrote:
> Fri, Mar 30, 2018 at 08:42:00PM CEST, ebied...@xmission.com wrote:
> >Rahul Lakkireddy writes:
> >
> >> On Friday, March 03/30/18, 2018 at 16:09:07 +0530, Jiri Pirko wrote:
> >>> Sat, Mar 24, 2018 at 11:56:33AM CET, rahul.lakkire...@chelsio.com wrote:
> >>> >Add a new module crashdd that exports the /sys/kernel/crashdd/
> >>> >directory in second kernel, containing collected hardware/firmware
> >>> >dumps.
> >>> >
> >>> >The sequence of actions done by device drivers to append their device
> >>> >specific hardware/firmware logs to /sys/kernel/crashdd/ directory are
> >>> >as follows:
> >>> >
> >>> >1. During probe (before hardware is initialized), device drivers
> >>> >register to the crashdd module (via crashdd_add_dump()), with
> >>> >callback function, along with buffer size and log name needed for
> >>> >firmware/hardware log collection.
> >>> >
> >>> >2. Crashdd creates a driver's directory under
> >>> >/sys/kernel/crashdd/. Then, it allocates the buffer with
> >>>
> >>> This smells. I need to identify the exact ASIC instance that produced
> >>> the dump. To identify by driver name does not help me if I have multiple
> >>> instances of the same driver. This looks wrong to me. This looks like
> >>> a job for devlink where you have 1 devlink instance per 1 ASIC instance.
> >>>
> >>> Please see:
> >>> http://patchwork.ozlabs.org/project/netdev/list/?series=36524
> >>>
> >>> I believe that the solution in the patchset could be used for
> >>> your usecase too.
> >>>
> >>>
> >>
> >> The sysfs approach proposed here had been dropped in favour of exporting
> >> the dumps as ELF notes in /proc/vmcore.
> >>
> >> Will be posting the new patches soon.
> >
> >The concern was actually how you identify which device that came from.
> >Where you read the identifier changes but sysfs or /proc/vmcore the
> >change remains valid.
>
> Yeah. I still don't see how you link the dump and the device.

In our case, the dump and the device are being identified by the
driver’s name followed by its corresponding pci bus id. I’ve posted an
example in my v3 series:

https://www.spinics.net/lists/netdev/msg493781.html

Here’s an extract from the link above:

# readelf -n /proc/vmcore

Displaying notes found at file offset 0x1000 with length 0x04003288:
  Owner                    Data size   Description
  VMCOREDD_cxgb4_:02:00.4  0x02000fd8  Unknown note type:(0x0700)
  VMCOREDD_cxgb4_:04:00.4  0x02000fd8  Unknown note type:(0x0700)
  CORE                     0x0150      NT_PRSTATUS (prstatus structure)
  CORE                     0x0150      NT_PRSTATUS (prstatus structure)
  CORE                     0x0150      NT_PRSTATUS (prstatus structure)
  CORE                     0x0150      NT_PRSTATUS (prstatus structure)
  CORE                     0x0150      NT_PRSTATUS (prstatus structure)
  CORE                     0x0150      NT_PRSTATUS (prstatus structure)
  CORE                     0x0150      NT_PRSTATUS (prstatus structure)
  CORE                     0x0150      NT_PRSTATUS (prstatus structure)
  VMCOREINFO               0x074f      Unknown note type: (0x)

Here, for my two devices, the dump’s names are
VMCOREDD_cxgb4_:02:00.4 and VMCOREDD_cxgb4_:04:00.4.

It’s really up to the callers to write their own unique name for the
dump. The name is appended to “VMCOREDD_” string.

> Rahul, did you look at the patchset I pointed out?

For devlink, I think the dump name would be identified by
bus_type/device_name; i.e. “pci/:02:00.4” for my example.
Is my understanding correct?

Thanks,
Rahul
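The two naming schemes discussed above can be sketched in plain C as follows. This is only an illustration of the string layout described in the mail ("VMCOREDD_" followed by a caller-chosen name built from driver name and PCI bus id, and bus_type/device_name for devlink); `build_dump_name()` and `build_devlink_handle()` are hypothetical helpers, not functions from the v3 series or from devlink, and the device string in the usage note is an assumed example value.

```c
#include <stdio.h>

#define VMCOREDD_PREFIX "VMCOREDD_"

/*
 * Builds "VMCOREDD_<driver>_<device>" into out.
 * Returns 0 on success, -1 if the name does not fit in len bytes.
 */
static int build_dump_name(char *out, size_t len,
			   const char *driver, const char *device)
{
	int n = snprintf(out, len, VMCOREDD_PREFIX "%s_%s", driver, device);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Builds the devlink-style handle "<bus_type>/<device_name>" into out.
 * Returns 0 on success, -1 if the handle does not fit in len bytes.
 */
static int build_devlink_handle(char *out, size_t len,
				const char *bus_type, const char *device)
{
	int n = snprintf(out, len, "%s/%s", bus_type, device);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For a driver name of `cxgb4` and an assumed device string of `0000:02:00.4`, the first helper yields `VMCOREDD_cxgb4_0000:02:00.4` and the second yields `pci/0000:02:00.4`. Either way the per-instance part of the identifier is the device string, which is what distinguishes two adapters bound to the same driver.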
Re: [PATCH net-next v2 1/2] fs/crashdd: add API to collect hardware dump in second kernel
> >> The sysfs approach proposed here had been dropped in favour of exporting
> >> the dumps as ELF notes in /proc/vmcore.
> >>
> >> Will be posting the new patches soon.
> >
> >The concern was actually how you identify which device that came from.
> >Where you read the identifier changes but sysfs or /proc/vmcore the
> >change remains valid.
>
> Yeah. I still don't see how you link the dump and the device.

Hi Jiri

You can see in the third version the core code accepts a free form
name. The driver builds a name using the driver name and the adaptor
name.

What I think would be good is to try to have one API to the driver
that can be used for both crash dumps and devlink snapshots. These are
used at different times, but have basically the same purpose: get
state from the device.

Andrew
Re: [PATCH net-next v2 1/2] fs/crashdd: add API to collect hardware dump in second kernel
Fri, Mar 30, 2018 at 05:11:29PM CEST, and...@lunn.ch wrote:
>> Please see:
>> http://patchwork.ozlabs.org/project/netdev/list/?series=36524
>>
>> I believe that the solution in the patchset could be used for
>> your usecase too.
>
>Hi Jiri
>
>https://lkml.org/lkml/2018/3/20/436
>
>How well does this API work for a 2Gbyte snapshot?

Ccing Alex who did the tests.

>
>Andrew
Re: [PATCH net-next v2 1/2] fs/crashdd: add API to collect hardware dump in second kernel
Fri, Mar 30, 2018 at 08:42:00PM CEST, ebied...@xmission.com wrote:
>Rahul Lakkireddy writes:
>
>> On Friday, March 03/30/18, 2018 at 16:09:07 +0530, Jiri Pirko wrote:
>>> Sat, Mar 24, 2018 at 11:56:33AM CET, rahul.lakkire...@chelsio.com wrote:
>>> >Add a new module crashdd that exports the /sys/kernel/crashdd/
>>> >directory in second kernel, containing collected hardware/firmware
>>> >dumps.
>>> >
>>> >The sequence of actions done by device drivers to append their device
>>> >specific hardware/firmware logs to /sys/kernel/crashdd/ directory are
>>> >as follows:
>>> >
>>> >1. During probe (before hardware is initialized), device drivers
>>> >register to the crashdd module (via crashdd_add_dump()), with
>>> >callback function, along with buffer size and log name needed for
>>> >firmware/hardware log collection.
>>> >
>>> >2. Crashdd creates a driver's directory under
>>> >/sys/kernel/crashdd/. Then, it allocates the buffer with
>>>
>>> This smells. I need to identify the exact ASIC instance that produced
>>> the dump. To identify by driver name does not help me if I have multiple
>>> instances of the same driver. This looks wrong to me. This looks like
>>> a job for devlink where you have 1 devlink instance per 1 ASIC instance.
>>>
>>> Please see:
>>> http://patchwork.ozlabs.org/project/netdev/list/?series=36524
>>>
>>> I believe that the solution in the patchset could be used for
>>> your usecase too.
>>>
>>>
>>
>> The sysfs approach proposed here had been dropped in favour of exporting
>> the dumps as ELF notes in /proc/vmcore.
>>
>> Will be posting the new patches soon.
>
>The concern was actually how you identify which device that came from.
>Where you read the identifier changes but sysfs or /proc/vmcore the
>change remains valid.

Yeah. I still don't see how you link the dump and the device.

Rahul, did you look at the patchset I pointed out?

Thanks!
Re: [PATCH net-next v2 1/2] fs/crashdd: add API to collect hardware dump in second kernel
Rahul Lakkireddy writes:

> On Friday, March 03/30/18, 2018 at 16:09:07 +0530, Jiri Pirko wrote:
>> Sat, Mar 24, 2018 at 11:56:33AM CET, rahul.lakkire...@chelsio.com wrote:
>> >Add a new module crashdd that exports the /sys/kernel/crashdd/
>> >directory in second kernel, containing collected hardware/firmware
>> >dumps.
>> >
>> >The sequence of actions done by device drivers to append their device
>> >specific hardware/firmware logs to /sys/kernel/crashdd/ directory are
>> >as follows:
>> >
>> >1. During probe (before hardware is initialized), device drivers
>> >register to the crashdd module (via crashdd_add_dump()), with
>> >callback function, along with buffer size and log name needed for
>> >firmware/hardware log collection.
>> >
>> >2. Crashdd creates a driver's directory under
>> >/sys/kernel/crashdd/. Then, it allocates the buffer with
>>
>> This smells. I need to identify the exact ASIC instance that produced
>> the dump. To identify by driver name does not help me if I have multiple
>> instances of the same driver. This looks wrong to me. This looks like
>> a job for devlink where you have 1 devlink instance per 1 ASIC instance.
>>
>> Please see:
>> http://patchwork.ozlabs.org/project/netdev/list/?series=36524
>>
>> I believe that the solution in the patchset could be used for
>> your usecase too.
>>
>>
>
> The sysfs approach proposed here had been dropped in favour of exporting
> the dumps as ELF notes in /proc/vmcore.
>
> Will be posting the new patches soon.

The concern was actually how you identify which device that came from.
Where you read the identifier changes but sysfs or /proc/vmcore the
change remains valid.

Eric
Re: [PATCH net-next v2 1/2] fs/crashdd: add API to collect hardware dump in second kernel
> Please see:
> http://patchwork.ozlabs.org/project/netdev/list/?series=36524
>
> I believe that the solution in the patchset could be used for
> your usecase too.

Hi Jiri

https://lkml.org/lkml/2018/3/20/436

How well does this API work for a 2Gbyte snapshot?

Andrew
Re: [PATCH net-next v2 1/2] fs/crashdd: add API to collect hardware dump in second kernel
On Friday, March 03/30/18, 2018 at 16:09:07 +0530, Jiri Pirko wrote:
> Sat, Mar 24, 2018 at 11:56:33AM CET, rahul.lakkire...@chelsio.com wrote:
> >Add a new module crashdd that exports the /sys/kernel/crashdd/
> >directory in second kernel, containing collected hardware/firmware
> >dumps.
> >
> >The sequence of actions done by device drivers to append their device
> >specific hardware/firmware logs to /sys/kernel/crashdd/ directory are
> >as follows:
> >
> >1. During probe (before hardware is initialized), device drivers
> >register to the crashdd module (via crashdd_add_dump()), with
> >callback function, along with buffer size and log name needed for
> >firmware/hardware log collection.
> >
> >2. Crashdd creates a driver's directory under
> >/sys/kernel/crashdd/. Then, it allocates the buffer with
>
> This smells. I need to identify the exact ASIC instance that produced
> the dump. To identify by driver name does not help me if I have multiple
> instances of the same driver. This looks wrong to me. This looks like
> a job for devlink where you have 1 devlink instance per 1 ASIC instance.
>
> Please see:
> http://patchwork.ozlabs.org/project/netdev/list/?series=36524
>
> I believe that the solution in the patchset could be used for
> your usecase too.
>
>

The sysfs approach proposed here had been dropped in favour of exporting
the dumps as ELF notes in /proc/vmcore.

Will be posting the new patches soon.

> >requested size and invokes the device driver's registered callback
> >function.
> >
> >3. Device driver collects all hardware/firmware logs into the buffer
> >and returns control back to crashdd.
> >
> >4. Crashdd exposes the buffer as a binary file via
> >/sys/kernel/crashdd//.
> >
Re: [PATCH net-next v2 1/2] fs/crashdd: add API to collect hardware dump in second kernel
Sat, Mar 24, 2018 at 11:56:33AM CET, rahul.lakkire...@chelsio.com wrote:
>Add a new module crashdd that exports the /sys/kernel/crashdd/
>directory in second kernel, containing collected hardware/firmware
>dumps.
>
>The sequence of actions done by device drivers to append their device
>specific hardware/firmware logs to /sys/kernel/crashdd/ directory are
>as follows:
>
>1. During probe (before hardware is initialized), device drivers
>register to the crashdd module (via crashdd_add_dump()), with
>callback function, along with buffer size and log name needed for
>firmware/hardware log collection.
>
>2. Crashdd creates a driver's directory under
>/sys/kernel/crashdd/. Then, it allocates the buffer with

This smells. I need to identify the exact ASIC instance that produced
the dump. To identify by driver name does not help me if I have multiple
instances of the same driver. This looks wrong to me. This looks like
a job for devlink where you have 1 devlink instance per 1 ASIC instance.

Please see:
http://patchwork.ozlabs.org/project/netdev/list/?series=36524

I believe that the solution in the patchset could be used for
your usecase too.

>requested size and invokes the device driver's registered callback
>function.
>
>3. Device driver collects all hardware/firmware logs into the buffer
>and returns control back to crashdd.
>
>4. Crashdd exposes the buffer as a binary file via
>/sys/kernel/crashdd//.
>
Re: [PATCH net-next v2 1/2] fs/crashdd: add API to collect hardware dump in second kernel
Hi Rahul,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Rahul-Lakkireddy/fs-crashdd-add-API-to-collect-hardware-dump-in-second-kernel/20180325-191308
config: i386-randconfig-s0-03251817 (attached as .config)
compiler: gcc-6 (Debian 6.4.0-9) 6.4.0 20171026
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386

All warnings (new ones prefixed by >>):

   In file included from fs//crashdd/crashdd.c:8:0:
   fs//crashdd/crashdd_internal.h:13:23: error: field 'bin_attr' has incomplete type
     struct bin_attribute bin_attr; /* Binary dump file's attributes */
                          ^~~~
   fs//crashdd/crashdd.c: In function 'crashdd_read':
   fs//crashdd/crashdd.c:19:43: error: dereferencing pointer to incomplete type 'struct bin_attribute'
     struct crashdd_dump_node *dump = bin_attr->private;
                                              ^~
   fs//crashdd/crashdd.c: In function 'crashdd_mkdir':
   fs//crashdd/crashdd.c:27:9: error: implicit declaration of function 'kobject_create_and_add' [-Werror=implicit-function-declaration]
     return kobject_create_and_add(name, crashdd_kobj);
            ^~
   fs//crashdd/crashdd.c:27:9: warning: return makes pointer from integer without a cast [-Wint-conversion]
     return kobject_create_and_add(name, crashdd_kobj);
            ^~
   fs//crashdd/crashdd.c: In function 'crashdd_add_file':
   fs//crashdd/crashdd.c:39:9: error: implicit declaration of function 'sysfs_create_bin_file' [-Werror=implicit-function-declaration]
     return sysfs_create_bin_file(kobj, >bin_attr);
            ^
   fs//crashdd/crashdd.c: In function 'crashdd_rmdir':
   fs//crashdd/crashdd.c:44:2: error: implicit declaration of function 'kobject_put' [-Werror=implicit-function-declaration]
     kobject_put(kobj);
     ^~~
   In file included from include/linux/kernel.h:10:0,
                    from include/linux/list.h:9,
                    from include/linux/preempt.h:11,
                    from include/linux/spinlock.h:51,
                    from include/linux/vmalloc.h:5,
                    from fs//crashdd/crashdd.c:4:
   fs//crashdd/crashdd.c: In function 'crashdd_get_driver':
   fs//crashdd/crashdd.c:101:25: error: dereferencing pointer to incomplete type 'struct kobject'
      if (!strcmp(node->kobj->name, name)) {
                            ^
   include/linux/compiler.h:58:30: note: in definition of macro '__trace_if'
     if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
                                 ^~~~
>> fs//crashdd/crashdd.c:101:3: note: in expansion of macro 'if'
      if (!strcmp(node->kobj->name, name)) {
      ^~
   fs//crashdd/crashdd.c: In function 'crashdd_init':
   fs//crashdd/crashdd.c:227:51: error: 'kernel_kobj' undeclared (first use in this function)
     crashdd_kobj = kobject_create_and_add("crashdd", kernel_kobj);
                                                      ^~~
   fs//crashdd/crashdd.c:227:51: note: each undeclared identifier is reported only once for each function it appears in
   fs//crashdd/crashdd.c: In function 'crashdd_add_file':
   fs//crashdd/crashdd.c:40:1: warning: control reaches end of non-void function [-Wreturn-type]
    }
    ^
   cc1: some warnings being treated as errors

vim +/if +101 fs//crashdd/crashdd.c

     3
   > 4	#include
     5	#include
     6	#include
     7
     8	#include "crashdd_internal.h"
     9
    10	static LIST_HEAD(crashdd_list);
    11	static DEFINE_MUTEX(crashdd_mutex);
    12
    13	static struct kobject *crashdd_kobj;
    14
    15	static ssize_t crashdd_read(struct file *filp, struct kobject *kobj,
    16				    struct bin_attribute *bin_attr,
    17				    char *buf, loff_t fpos, size_t count)
    18	{
    19		struct crashdd_dump_node *dump = bin_attr->private;
    20
    21		memcpy(buf, dump->buf + fpos, count);
    22		return count;
    23	}
    24
    25	static struct kobject *crashdd_mkdir(const char *name)
    26	{
    27		return kobject_create_and_add(name, crashdd_kobj);
    28	}
    29
    30	static int crashdd_add_file(struct kobject *kobj, const char *name,
    31				    struct crashdd_dump_node *dump)
    32	{
    33		dump->bin_attr.attr.name = name;
    34		dump->bin_attr.attr.mode = 0444;
    35		dump->bin_attr.size = dump->size;
    36		dump->bin_attr.read = crashdd_read;
    37		dump->bin_attr.private = dump;
    38
    39		return sysfs_create_bin_file(kobj, >bin_attr);
    40	}
    41
    42	static void crashdd_rmdir(struct kobject *kobj)
    43	{
    44		kobject_put(kobj);
    45	}
    46
    47	/**
    48	 * crashdd_init_driver -