Re: Crash in msm serial on dragonboard with ftrace bootargs
On 11/16/2018 9:09 AM, Viresh Kumar wrote: On Thu, Nov 15, 2018 at 4:23 PM Srinivas Kandagatla wrote: Yes, this is not the solution, but it proves that the hand-off between booloaders and kernel is the issue. In general there is wider issue with resources hand-off between bootloader and kernel. There has been some proposal in the past by Viresh for a new framework called boot-constriants (https://lkml.org/lkml/2017/12/14/440) which am not sure if its still actively looked at. But something similar should be the way to address such issues. It isn't dead code yet and I am waiting to gain few more use-cases before I attempt to convince Greg again :) Here is the code.. git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux.git boot-constraint -- viresh Maybe you can take this earlycon issue as a usecase. I have added a boot constraint for earlycon on db410c and have sent a patch. Whenever you repitch boot-constraint, you can add that as well :) Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: Crash in msm serial on dragonboard with ftrace bootargs
On Thu, Nov 15, 2018 at 4:23 PM Srinivas Kandagatla wrote: > Yes, this is not the solution, but it proves that the hand-off between > booloaders and kernel is the issue. > > In general there is wider issue with resources hand-off between > bootloader and kernel. > > There has been some proposal in the past by Viresh for a new framework > called boot-constriants (https://lkml.org/lkml/2017/12/14/440) which am > not sure if its still actively looked at. But something similar should > be the way to address such issues. It isn't dead code yet and I am waiting to gain few more use-cases before I attempt to convince Greg again :) Here is the code.. git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux.git boot-constraint -- viresh
Re: Crash in msm serial on dragonboard with ftrace bootargs
On 15/11/18 10:33, Sai Prakash Ranjan wrote: On 11/13/2018 3:14 PM, Srinivas Kandagatla wrote: Hi Sai, On 25/10/18 15:36, saiprakash.ran...@codeaurora.org wrote: "If I disable dma node and LS-UART0, then I don't see any crash and ftrace also works fine" And one more observation is that even without ftrace cmdline, if I use earlycon and disable dma, I face the same crash. So basically this seems to be some kind of earlycon and dma issue and not ftrace(I can be wrong). So adding Srinivas for more info on this dma node. Its Interesting that my old email conversations with SBoyd show that I have investigated this issue in early 2016! My analysis so far: This reason for such behavior is due the common iface clock (GCC_BLSP1_AHB_CLK) across multiple drivers(serial ports, bam dma and other low speed devices). The code flow in DB410C is bit different, as the uart0 is first attempted to set as console and then uart1, this ordering triggers pm state change uart_change_pm(state, UART_PM_STATE_OFF) from serial core while setting up uart0, this would go and disable all the clocks for uart0. As uart1 is not setup Yet, and earlycon is still active, any attempts by earlycon to write to registers would trigger a system reboot as the clock was just disabled by uart0 change_pm code. This can even be triggered with any drivers like spi which uses same clock I guess. Hope it helps, Either earlycon needs to reference the clocks or those clocks needs to be marked always-on (but only with earlycon). Also just for a note: apq8096-db820c.dtsi shows UART0 is disabled because bootloader does not allow access to it. Could this also be the case for db410c? No, this is not the case with DB410c. DB820c has added restrictions in TZ, I think new booloaders should have solved this issue. Hi Srinivas, Thanks a lot for pointing out the cause of crash. I just tried setting GCC_BLSP1_AHB_CLK with flag CLK_IS_CRITICAL and the crash disappears. But I suppose setting CLK_IS_CRITICAL is not the solution? Yes, this is not the solution, but it proves that the hand-off between booloaders and kernel is the issue. In general there is wider issue with resources hand-off between bootloader and kernel. There has been some proposal in the past by Viresh for a new framework called boot-constriants (https://lkml.org/lkml/2017/12/14/440) which am not sure if its still actively looked at. But something similar should be the way to address such issues. --srini Thanks, Sai
Re: Crash in msm serial on dragonboard with ftrace bootargs
On 11/13/2018 3:14 PM, Srinivas Kandagatla wrote: Hi Sai, On 25/10/18 15:36, saiprakash.ran...@codeaurora.org wrote: "If I disable dma node and LS-UART0, then I don't see any crash and ftrace also works fine" And one more observation is that even without ftrace cmdline, if I use earlycon and disable dma, I face the same crash. So basically this seems to be some kind of earlycon and dma issue and not ftrace(I can be wrong). So adding Srinivas for more info on this dma node. Its Interesting that my old email conversations with SBoyd show that I have investigated this issue in early 2016! My analysis so far: This reason for such behavior is due the common iface clock (GCC_BLSP1_AHB_CLK) across multiple drivers(serial ports, bam dma and other low speed devices). The code flow in DB410C is bit different, as the uart0 is first attempted to set as console and then uart1, this ordering triggers pm state change uart_change_pm(state, UART_PM_STATE_OFF) from serial core while setting up uart0, this would go and disable all the clocks for uart0. As uart1 is not setup Yet, and earlycon is still active, any attempts by earlycon to write to registers would trigger a system reboot as the clock was just disabled by uart0 change_pm code. This can even be triggered with any drivers like spi which uses same clock I guess. Hope it helps, Either earlycon needs to reference the clocks or those clocks needs to be marked always-on (but only with earlycon). Also just for a note: apq8096-db820c.dtsi shows UART0 is disabled because bootloader does not allow access to it. Could this also be the case for db410c? No, this is not the case with DB410c. DB820c has added restrictions in TZ, I think new booloaders should have solved this issue. Hi Srinivas, Thanks a lot for pointing out the cause of crash. I just tried setting GCC_BLSP1_AHB_CLK with flag CLK_IS_CRITICAL and the crash disappears. But I suppose setting CLK_IS_CRITICAL is not the solution? Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: Crash in msm serial on dragonboard with ftrace bootargs
Hi Sai, On 25/10/18 15:36, saiprakash.ran...@codeaurora.org wrote: "If I disable dma node and LS-UART0, then I don't see any crash and ftrace also works fine" And one more observation is that even without ftrace cmdline, if I use earlycon and disable dma, I face the same crash. So basically this seems to be some kind of earlycon and dma issue and not ftrace(I can be wrong). So adding Srinivas for more info on this dma node. Its Interesting that my old email conversations with SBoyd show that I have investigated this issue in early 2016! My analysis so far: This reason for such behavior is due the common iface clock (GCC_BLSP1_AHB_CLK) across multiple drivers(serial ports, bam dma and other low speed devices). The code flow in DB410C is bit different, as the uart0 is first attempted to set as console and then uart1, this ordering triggers pm state change uart_change_pm(state, UART_PM_STATE_OFF) from serial core while setting up uart0, this would go and disable all the clocks for uart0. As uart1 is not setup Yet, and earlycon is still active, any attempts by earlycon to write to registers would trigger a system reboot as the clock was just disabled by uart0 change_pm code. This can even be triggered with any drivers like spi which uses same clock I guess. Hope it helps, Either earlycon needs to reference the clocks or those clocks needs to be marked always-on (but only with earlycon). Also just for a note: apq8096-db820c.dtsi shows UART0 is disabled because bootloader does not allow access to it. Could this also be the case for db410c? No, this is not the case with DB410c. DB820c has added restrictions in TZ, I think new booloaders should have solved this issue. Thanks, srini
Re: Crash in msm serial on dragonboard with ftrace bootargs
On 10/19/2018 8:42 PM, Steven Rostedt wrote: On Fri, 19 Oct 2018 20:18:19 +0530 Sai Prakash Ranjan wrote: You said that if you add 'ftrace_filter=msm_read' to the command line, it still crashes? So only tracing that function we have an issue, right? Tracing msm_read does cause the crash, but that is not the only one causing the crash because even after "ftrace_notrace=msm_read" the board crashes which is why I suspect msm earlycon and not ftrace. I'm saying there's a combination of the two. Because when ftrace is not involved, early con doesn't crash. I would focus on why earlycon crashes when only msm_read is traced. That should help narrow down the cause. What we know is: earlycon -> boots ftrace=function -> boots earlycon ftrace=function ftrace_filter=msm_read -> crashes If we remove the "ftrace=function ftrace_filter=msm_read" it boots fine, and I'm assuming that if we just remove earlycon (keeping the ftrace_filter), it also boots too. Thus, ftrace tracing msm_read is causing earlycon to do something that triggers the crash. Hi, Sorry for late response on this, was running out of debug options ;-) Finally after bisecting (which I should have done earlier) this issue till 4.4 kernel where I don't see the crash, I could see that dma node in "apq8016-sbc.dtsi" is somehow causing the crash. dma@7884000 { status = "okay";<--- This one }; serial@78af000 { label = "LS-UART0"; status = "okay"; pinctrl-names = "default", "sleep"; pinctrl-0 = <&blsp1_uart1_default>; pinctrl-1 = <&blsp1_uart1_sleep>; }; serial@78b { label = "LS-UART1"; status = "okay"; pinctrl-names = "default", "sleep"; pinctrl-0 = <&blsp1_uart2_default>; pinctrl-1 = <&blsp1_uart2_sleep>; }; "If I disable dma node and LS-UART0, then I don't see any crash and ftrace also works fine" And one more observation is that even without ftrace cmdline, if I use earlycon and disable dma, I face the same crash. So basically this seems to be some kind of earlycon and dma issue and not ftrace(I can be wrong). So adding Srinivas for more info on this dma node. Also just for a note: apq8096-db820c.dtsi shows UART0 is disabled because bootloader does not allow access to it. Could this also be the case for db410c? Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: Crash in msm serial on dragonboard with ftrace bootargs
On Fri, 19 Oct 2018 20:18:19 +0530 Sai Prakash Ranjan wrote: > > You said that if you add 'ftrace_filter=msm_read' to the command line, > > it still crashes? > > > > So only tracing that function we have an issue, right? > > > > Tracing msm_read does cause the crash, but that is not the only one > causing the crash because even after "ftrace_notrace=msm_read" the board > crashes which is why I suspect msm earlycon and not ftrace. I'm saying there's a combination of the two. Because when ftrace is not involved, early con doesn't crash. I would focus on why earlycon crashes when only msm_read is traced. That should help narrow down the cause. What we know is: earlycon -> boots ftrace=function -> boots earlycon ftrace=function ftrace_filter=msm_read -> crashes If we remove the "ftrace=function ftrace_filter=msm_read" it boots fine, and I'm assuming that if we just remove earlycon (keeping the ftrace_filter), it also boots too. Thus, ftrace tracing msm_read is causing earlycon to do something that triggers the crash. -- Steve
Re: Crash in msm serial on dragonboard with ftrace bootargs
On 10/19/2018 7:21 PM, Steven Rostedt wrote: On Fri, 19 Oct 2018 12:24:05 +0530 Sai Prakash Ranjan wrote: Anyone see any problems here? This seems sane to me, he says in the other thread that he put 'notrace' to the msm serial functions (which AIUI should prevent ftrace instrumentation) and he still sees the issue. Yes I did add notrace to all functions in msm serial and checked the objdump to make sure that those were not instrumented, and yet the target crashed. This doesnt seem like an issue with ftrace but rather with msm early con. The thing that still bothers me is that it boots fine without ftrace running, and that it requires tracing something to cause the problem. This tells me that ftrace has something to do with it. Perhaps it's just changing the timing. Yes tracing does cause issue but only if earlycon is present. If I disable earlycon, then even tracing wont cause any issue and target boots fine. You said that if you add 'ftrace_filter=msm_read' to the command line, it still crashes? So only tracing that function we have an issue, right? Tracing msm_read does cause the crash, but that is not the only one causing the crash because even after "ftrace_notrace=msm_read" the board crashes which is why I suspect msm earlycon and not ftrace. Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: Crash in msm serial on dragonboard with ftrace bootargs
On Fri, 19 Oct 2018 12:24:05 +0530 Sai Prakash Ranjan wrote: > >> Anyone see any problems here? > > > > This seems sane to me, he says in the other thread that he put 'notrace' to > > the msm serial functions (which AIUI should prevent ftrace instrumentation) > > and he still sees the issue. > > > > Yes I did add notrace to all functions in msm serial and checked the > objdump to make sure that those were not instrumented, and yet the > target crashed. This doesnt seem like an issue with ftrace but rather > with msm early con. The thing that still bothers me is that it boots fine without ftrace running, and that it requires tracing something to cause the problem. This tells me that ftrace has something to do with it. Perhaps it's just changing the timing. You said that if you add 'ftrace_filter=msm_read' to the command line, it still crashes? So only tracing that function we have an issue, right? -- Steve
Re: Crash in msm serial on dragonboard with ftrace bootargs
On 10/19/2018 9:47 AM, Joel Fernandes wrote: On Thu, Oct 18, 2018 at 09:17:06AM -0400, Steven Rostedt wrote: On Thu, 18 Oct 2018 10:51:18 +0530 Sai Prakash Ranjan wrote: So something else is causing an issue besides just msm_read. Can you do an objdump -dr of the entire vmlinux binary and gzip it and post it somewhere. Not sure if it would be too big to email. You could try sending it to me privately. I'd like to see the binary that you are using. I have sent the objdump and dot config to you privately. Thanks. I don't see anything that pops out, but then again, my arm asm foo is very rusty (it has been literally decades since I did any arm asm). I wonder if it could simply be a timing issue? 086eb538 : 086eb538: a9be7bfdstp x29, x30, [sp,#-32]! 086eb53c: 910003fdmov x29, sp 086eb540: a90153f3stp x19, x20, [sp,#16] 086eb544: aa0003f4mov x20, x0 086eb548: 2a0103f3mov w19, w1 086eb54c: aa1e03e0mov x0, x30 086eb550: 97e6bae4bl 0809a0e0 <_mcount> The above is changed to nop on boot, but then to: bl ftrace_caller When ftrace is enabled. 086eb554: 8b334280add x0, x20, w19, uxtw 086eb558: b940ldr w0, [x0] 086eb55c: a94153f3ldp x19, x20, [sp,#16] 086eb560: a8c27bfdldp x29, x30, [sp],#32 086eb564: d65f03c0ret 0809a0e4 : 0809a0e4: a9bf7bfdstp x29, x30, [sp,#-16]! 0809a0e8: 910003fdmov x29, sp 0809a0ec: d10013c0sub x0, x30, #0x4 0809a0f0: f94003a1ldr x1, [x29] 0809a0f4: f9400421ldr x1, [x1,#8] 0809a0f8: d1001021sub x1, x1, #0x4 0809a0fc : 0809a0fc: d503201fnop The above nop gets patched to: bl ftrace_ops_no_ops Which will iterate through all the registered functions. 0809a100 : 0809a100: d503201fnop The above only gets set when function graph tracer is enabled, which it is not in this case. 0809a104: a8c17bfdldp x29, x30, [sp],#16 0809a108: d65f03c0ret Anyone see any problems here? This seems sane to me, he says in the other thread that he put 'notrace' to the msm serial functions (which AIUI should prevent ftrace instrumentation) and he still sees the issue. Yes I did add notrace to all functions in msm serial and checked the objdump to make sure that those were not instrumented, and yet the target crashed. This doesnt seem like an issue with ftrace but rather with msm early con. Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: Crash in msm serial on dragonboard with ftrace bootargs
On Thu, Oct 18, 2018 at 09:17:06AM -0400, Steven Rostedt wrote: > On Thu, 18 Oct 2018 10:51:18 +0530 > Sai Prakash Ranjan wrote: > > > > So something else is causing an issue besides just msm_read. > > > > > > Can you do an objdump -dr of the entire vmlinux binary and gzip it and > > > post it somewhere. Not sure if it would be too big to email. You could > > > try sending it to me privately. I'd like to see the binary that you are > > > using. > > > > > > > I have sent the objdump and dot config to you privately. > > Thanks. I don't see anything that pops out, but then again, my arm asm > foo is very rusty (it has been literally decades since I did any arm > asm). I wonder if it could simply be a timing issue? > > 086eb538 : > 086eb538: a9be7bfdstp x29, x30, [sp,#-32]! > 086eb53c: 910003fdmov x29, sp > 086eb540: a90153f3stp x19, x20, [sp,#16] > 086eb544: aa0003f4mov x20, x0 > 086eb548: 2a0103f3mov w19, w1 > 086eb54c: aa1e03e0mov x0, x30 > 086eb550: 97e6bae4bl 0809a0e0 <_mcount> > > The above is changed to nop on boot, but then to: > > bl ftrace_caller > > When ftrace is enabled. > > 086eb554: 8b334280add x0, x20, w19, uxtw > 086eb558: b940ldr w0, [x0] > 086eb55c: a94153f3ldp x19, x20, [sp,#16] > 086eb560: a8c27bfdldp x29, x30, [sp],#32 > 086eb564: d65f03c0ret > > > > 0809a0e4 : > 0809a0e4: a9bf7bfdstp x29, x30, [sp,#-16]! > 0809a0e8: 910003fdmov x29, sp > 0809a0ec: d10013c0sub x0, x30, #0x4 > 0809a0f0: f94003a1ldr x1, [x29] > 0809a0f4: f9400421ldr x1, [x1,#8] > 0809a0f8: d1001021sub x1, x1, #0x4 > > 0809a0fc : > 0809a0fc: d503201fnop > > The above nop gets patched to: > > bl ftrace_ops_no_ops > > Which will iterate through all the registered functions. > > > 0809a100 : > 0809a100: d503201fnop > > The above only gets set when function graph tracer is enabled, which it > is not in this case. > > 0809a104: a8c17bfdldp x29, x30, [sp],#16 > 0809a108: d65f03c0ret > > > Anyone see any problems here? This seems sane to me, he says in the other thread that he put 'notrace' to the msm serial functions (which AIUI should prevent ftrace instrumentation) and he still sees the issue. thanks, - Joel
Re: Crash in msm serial on dragonboard with ftrace bootargs
On Thu, 18 Oct 2018 10:51:18 +0530 Sai Prakash Ranjan wrote: > > So something else is causing an issue besides just msm_read. > > > > Can you do an objdump -dr of the entire vmlinux binary and gzip it and > > post it somewhere. Not sure if it would be too big to email. You could > > try sending it to me privately. I'd like to see the binary that you are > > using. > > > > I have sent the objdump and dot config to you privately. Thanks. I don't see anything that pops out, but then again, my arm asm foo is very rusty (it has been literally decades since I did any arm asm). I wonder if it could simply be a timing issue? 086eb538 : 086eb538: a9be7bfdstp x29, x30, [sp,#-32]! 086eb53c: 910003fdmov x29, sp 086eb540: a90153f3stp x19, x20, [sp,#16] 086eb544: aa0003f4mov x20, x0 086eb548: 2a0103f3mov w19, w1 086eb54c: aa1e03e0mov x0, x30 086eb550: 97e6bae4bl 0809a0e0 <_mcount> The above is changed to nop on boot, but then to: bl ftrace_caller When ftrace is enabled. 086eb554: 8b334280add x0, x20, w19, uxtw 086eb558: b940ldr w0, [x0] 086eb55c: a94153f3ldp x19, x20, [sp,#16] 086eb560: a8c27bfdldp x29, x30, [sp],#32 086eb564: d65f03c0ret 0809a0e4 : 0809a0e4: a9bf7bfdstp x29, x30, [sp,#-16]! 0809a0e8: 910003fdmov x29, sp 0809a0ec: d10013c0sub x0, x30, #0x4 0809a0f0: f94003a1ldr x1, [x29] 0809a0f4: f9400421ldr x1, [x1,#8] 0809a0f8: d1001021sub x1, x1, #0x4 0809a0fc : 0809a0fc: d503201fnop The above nop gets patched to: bl ftrace_ops_no_ops Which will iterate through all the registered functions. 0809a100 : 0809a100: d503201fnop The above only gets set when function graph tracer is enabled, which it is not in this case. 0809a104: a8c17bfdldp x29, x30, [sp],#16 0809a108: d65f03c0ret Anyone see any problems here? -- Steve
Re: Crash in msm serial on dragonboard with ftrace bootargs
On 10/18/2018 8:03 AM, Steven Rostedt wrote: On Wed, 17 Oct 2018 00:36:05 +0530 Sai Prakash Ranjan wrote: On 10/17/2018 12:33 AM, Steven Rostedt wrote: On Wed, 17 Oct 2018 00:31:03 +0530 Sai Prakash Ranjan wrote: Haa seems like you are right! With "ftrace=function ftrace_filter=msm_read" , I can trigger the crash, but sadly "ftrace_notrace=msm_read" also crashes. So there's more than one problem area. What about ftrace_notrace=m* ? That too crashes. So something else is causing an issue besides just msm_read. Can you do an objdump -dr of the entire vmlinux binary and gzip it and post it somewhere. Not sure if it would be too big to email. You could try sending it to me privately. I'd like to see the binary that you are using. I have sent the objdump and dot config to you privately. Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: Crash in msm serial on dragonboard with ftrace bootargs
On Wed, 17 Oct 2018 00:36:05 +0530 Sai Prakash Ranjan wrote: > On 10/17/2018 12:33 AM, Steven Rostedt wrote: > > On Wed, 17 Oct 2018 00:31:03 +0530 > > Sai Prakash Ranjan wrote: > > > >> Haa seems like you are right! With "ftrace=function > >> ftrace_filter=msm_read" , I can trigger the crash, but > >> sadly "ftrace_notrace=msm_read" also crashes. > > > > So there's more than one problem area. > > > > What about ftrace_notrace=m* > > > > ? > > > > That too crashes. > So something else is causing an issue besides just msm_read. Can you do an objdump -dr of the entire vmlinux binary and gzip it and post it somewhere. Not sure if it would be too big to email. You could try sending it to me privately. I'd like to see the binary that you are using. -- Steve
Re: Crash in msm serial on dragonboard with ftrace bootargs
On Wed, Oct 17, 2018 at 10:56 AM, Joel Fernandes wrote: > On Wed, Oct 17, 2018 at 08:19:41PM +0530, Sai Prakash Ranjan wrote: >> On 10/17/2018 5:08 PM, Sai Prakash Ranjan wrote: >> > > >> > > What do you think about the (untested) patch below? It seems to me >> > > that it >> > > should solve the issue of missing early crash dumps, but I have not >> > > tested it >> > > yet. Sai, would you mind trying it out and let me know if you can see the >> > > early crash dumps properly now? >> > > >> > > 8<--- >> > > From: "Joel Fernandes (Google)" >> > > Subject: [RFC] pstore: allocate compression during late_initcall >> > > >> > > ramoop's pstore registration (using pstore_register) has to run during >> > > late_initcall because crypto backend may not be ready during >> > > postcore_initcall. This causes missing of dmesg crash dumps which could >> > > have been caught by pstore. >> > > >> > > Instead, lets allow ramoops pstore registration earlier, and once crypto >> > > is ready we can initialize the compression. >> > > >> > > Reported-by: Sai Prakash Ranjan >> > > Signed-off-by: Joel Fernandes (Google) >> > > --- >> > > fs/pstore/platform.c | 13 + >> > > fs/pstore/ram.c | 2 +- >> > > 2 files changed, 14 insertions(+), 1 deletion(-) >> > > >> > > diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c >> > > index 15e99d5a681d..f09066db2d4d 100644 >> > > --- a/fs/pstore/platform.c >> > > +++ b/fs/pstore/platform.c >> > > @@ -780,6 +780,19 @@ void __init pstore_choose_compression(void) >> > > } >> > > } >> > > +static int __init pstore_compression_late_init(void) >> > > +{ >> > > +/* >> > > + * Check if any pstore backends registered earlier but did not >> > > allocate >> > > + * for compression because crypto was not ready, if so then >> > > initialize >> > > + * compression. >> > > + */ >> > > +if (psinfo && !tfm) >> > > +allocate_buf_for_compression(); >> > > +return 0; >> > > +} >> > > +late_initcall(pstore_compression_late_init); >> > > + >> > > module_param(compress, charp, 0444); >> > > MODULE_PARM_DESC(compress, "Pstore compression to use"); >> > > diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c >> > > index bbd1e357c23d..98e48d1a9776 100644 >> > > --- a/fs/pstore/ram.c >> > > +++ b/fs/pstore/ram.c >> > > @@ -940,7 +940,7 @@ static int __init ramoops_init(void) >> > > ramoops_register_dummy(); >> > > return platform_driver_register(&ramoops_driver); >> > > } >> > > -late_initcall(ramoops_init); >> > > +postcore_initcall(ramoops_init); >> > > static void __exit ramoops_exit(void) >> > > { >> > > >> > >> > Yes I could see the early crash dump. Also I tested with different >> > compression (LZO) instead of deflate just to be sure and it works fine, >> > thanks :) >> > >> > Tested-by: Sai Prakash Ranjan >> > > > Thanks. > >> I just noticed that allocate_buf_for_compression() is also called from >> pstore_register(). Shouldn't that call be removed now that ramoops_init is >> moved to postcore_initcall and allocate_buf_for_compression() will just >> return doing nothing when called from pstore_register()? > > Yes, that is the point. If crypto is not ready then my thought is > allocate_buf_for_compression() called from pstore_register() should do > nothing and pstore will work uncompressed and the kdump infrastructure will > only cause uncompressed writes. But say if the kdump triggered *after* crypto > was ready, then the dump write out would be compressed since pstore is ready > for compression. > > The idea is if a future pstore backend calls pstore_register late, then it > may as well do the allocate_buf_for_compression as well at that time when it > runs. In that cause pstore_compression_late_init would do nothing. > > So this approach is both dynamic and future proof. Yeah, thanks! I think this looks correct, but I'll spend some more time testing it too. -Kees -- Kees Cook Pixel Security
Re: Crash in msm serial on dragonboard with ftrace bootargs
On Wed, Oct 17, 2018 at 08:19:41PM +0530, Sai Prakash Ranjan wrote: > On 10/17/2018 5:08 PM, Sai Prakash Ranjan wrote: > > > > > > What do you think about the (untested) patch below? It seems to me > > > that it > > > should solve the issue of missing early crash dumps, but I have not > > > tested it > > > yet. Sai, would you mind trying it out and let me know if you can see the > > > early crash dumps properly now? > > > > > > 8<--- > > > From: "Joel Fernandes (Google)" > > > Subject: [RFC] pstore: allocate compression during late_initcall > > > > > > ramoop's pstore registration (using pstore_register) has to run during > > > late_initcall because crypto backend may not be ready during > > > postcore_initcall. This causes missing of dmesg crash dumps which could > > > have been caught by pstore. > > > > > > Instead, lets allow ramoops pstore registration earlier, and once crypto > > > is ready we can initialize the compression. > > > > > > Reported-by: Sai Prakash Ranjan > > > Signed-off-by: Joel Fernandes (Google) > > > --- > > > fs/pstore/platform.c | 13 + > > > fs/pstore/ram.c | 2 +- > > > 2 files changed, 14 insertions(+), 1 deletion(-) > > > > > > diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c > > > index 15e99d5a681d..f09066db2d4d 100644 > > > --- a/fs/pstore/platform.c > > > +++ b/fs/pstore/platform.c > > > @@ -780,6 +780,19 @@ void __init pstore_choose_compression(void) > > > } > > > } > > > +static int __init pstore_compression_late_init(void) > > > +{ > > > + /* > > > + * Check if any pstore backends registered earlier but did not > > > allocate > > > + * for compression because crypto was not ready, if so then > > > initialize > > > + * compression. > > > + */ > > > + if (psinfo && !tfm) > > > + allocate_buf_for_compression(); > > > + return 0; > > > +} > > > +late_initcall(pstore_compression_late_init); > > > + > > > module_param(compress, charp, 0444); > > > MODULE_PARM_DESC(compress, "Pstore compression to use"); > > > diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c > > > index bbd1e357c23d..98e48d1a9776 100644 > > > --- a/fs/pstore/ram.c > > > +++ b/fs/pstore/ram.c > > > @@ -940,7 +940,7 @@ static int __init ramoops_init(void) > > > ramoops_register_dummy(); > > > return platform_driver_register(&ramoops_driver); > > > } > > > -late_initcall(ramoops_init); > > > +postcore_initcall(ramoops_init); > > > static void __exit ramoops_exit(void) > > > { > > > > > > > Yes I could see the early crash dump. Also I tested with different > > compression (LZO) instead of deflate just to be sure and it works fine, > > thanks :) > > > > Tested-by: Sai Prakash Ranjan > > Thanks. > I just noticed that allocate_buf_for_compression() is also called from > pstore_register(). Shouldn't that call be removed now that ramoops_init is > moved to postcore_initcall and allocate_buf_for_compression() will just > return doing nothing when called from pstore_register()? Yes, that is the point. If crypto is not ready then my thought is allocate_buf_for_compression() called from pstore_register() should do nothing and pstore will work uncompressed and the kdump infrastructure will only cause uncompressed writes. But say if the kdump triggered *after* crypto was ready, then the dump write out would be compressed since pstore is ready for compression. The idea is if a future pstore backend calls pstore_register late, then it may as well do the allocate_buf_for_compression as well at that time when it runs. In that cause pstore_compression_late_init would do nothing. So this approach is both dynamic and future proof. - Joel
Re: Crash in msm serial on dragonboard with ftrace bootargs
On 10/17/2018 5:08 PM, Sai Prakash Ranjan wrote: What do you think about the (untested) patch below? It seems to me that it should solve the issue of missing early crash dumps, but I have not tested it yet. Sai, would you mind trying it out and let me know if you can see the early crash dumps properly now? 8<--- From: "Joel Fernandes (Google)" Subject: [RFC] pstore: allocate compression during late_initcall ramoop's pstore registration (using pstore_register) has to run during late_initcall because crypto backend may not be ready during postcore_initcall. This causes missing of dmesg crash dumps which could have been caught by pstore. Instead, lets allow ramoops pstore registration earlier, and once crypto is ready we can initialize the compression. Reported-by: Sai Prakash Ranjan Signed-off-by: Joel Fernandes (Google) --- fs/pstore/platform.c | 13 + fs/pstore/ram.c | 2 +- 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c index 15e99d5a681d..f09066db2d4d 100644 --- a/fs/pstore/platform.c +++ b/fs/pstore/platform.c @@ -780,6 +780,19 @@ void __init pstore_choose_compression(void) } } +static int __init pstore_compression_late_init(void) +{ + /* + * Check if any pstore backends registered earlier but did not allocate + * for compression because crypto was not ready, if so then initialize + * compression. + */ + if (psinfo && !tfm) + allocate_buf_for_compression(); + return 0; +} +late_initcall(pstore_compression_late_init); + module_param(compress, charp, 0444); MODULE_PARM_DESC(compress, "Pstore compression to use"); diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c index bbd1e357c23d..98e48d1a9776 100644 --- a/fs/pstore/ram.c +++ b/fs/pstore/ram.c @@ -940,7 +940,7 @@ static int __init ramoops_init(void) ramoops_register_dummy(); return platform_driver_register(&ramoops_driver); } -late_initcall(ramoops_init); +postcore_initcall(ramoops_init); static void __exit ramoops_exit(void) { Yes I could see the early crash dump. Also I tested with different compression (LZO) instead of deflate just to be sure and it works fine, thanks :) Tested-by: Sai Prakash Ranjan I just noticed that allocate_buf_for_compression() is also called from pstore_register(). Shouldn't that call be removed now that ramoops_init is moved to postcore_initcall and allocate_buf_for_compression() will just return doing nothing when called from pstore_register()? Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: Crash in msm serial on dragonboard with ftrace bootargs
On 10/17/2018 4:39 AM, Joel Fernandes wrote: On Tue, Oct 16, 2018 at 05:08:25PM +0530, Sai Prakash Ranjan wrote: Hi, On dragonboard 410c, with "ftrace=function" boot args, the console output slows down and board resets without any backtrace as below. This is tested on latest kernel and seems to exist even in older kernels as well. [2.949164] EINJ: ACPI disabled. [3.133001] Serial: 8250/16550 dri Format: Log Type - Time(microsec) - Message - Optional Info Log Type: B - Since Boot(Power On Reset), D - Delta, S - Statistic But with pstore enabled, able to get the below backtrace: <4>[2.949164] EINJ: ACPI disabled. <6>[3.133001] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled <6>[3.164097] SuperH (H)SCI(F) driver initialized <0>[3.164471] Internal error: synchronous external abort: 9610 [#1] PREEMPT SMP <4>[3.164479] Modules linked in: <4>[3.164495] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc8-8-ge033b9909fff-dirty #175 <4>[3.164501] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) <4>[3.164508] pstate: 4085 (nZcv daIf -PAN -UAO) <4>[3.164514] pc : msm_read.isra.2+0x20/0x50 <4>[3.164520] lr : msm_read.isra.2+0x1c/0x50 <4>[3.164526] sp : 08033a50 <4>[3.164531] x29: 08033a50 x28: 09486018 <4>[3.164548] x27: 0001 x26: 7dfffe7ff070 <4>[3.164565] x25: 0034 x24: 09486000 <4>[3.164582] x23: x22: 0978e190 <4>[3.164599] x21: 095e8228 x20: 0034 <4>[3.164616] x19: 7dfffe7ff008 x18: <4>[3.164632] x17: x16: <4>[3.164649] x15: 094a96c8 x14: 8978e6bf <4>[3.164666] x13: 0978e6cd x12: 0038 <4>[3.164683] x11: 094c6000 x10: 0c24 <4>[3.164699] x9 : 80003c89b400 x8 : 08033970 <4>[3.164716] x7 : 8eb04100 x6 : 000af304 <4>[3.164732] x5 : 0c40 x4 : 80003c06f000 <4>[3.164750] x3 : 80003c89b498 x2 : <4>[3.164766] x1 : 80003ca68000 x0 : 0800 <0>[3.164785] Process swapper/0 (pid: 1, stack limit = 0x(ptrval)) <4>[3.164791] Call trace: <4>[3.164797] msm_read.isra.2+0x20/0x50 <4>[3.164804] msm_reset_dm_count+0x44/0x80 <4>[3.164810] __msm_console_write+0x1c8/0x1d0 <4>[3.164816] msm_serial_early_write_dm+0x3c/0x50 <4>[3.164823] console_unlock.part.6+0x468/0x528 <4>[3.164829] vprintk_emit+0x210/0x218 <4>[3.164835] vprintk_default+0x48/0x58 <4>[3.164841] vprintk_func+0xf0/0x1c0 <4>[3.164847] printk+0x74/0x94 <4>[3.164853] sci_init+0x24/0x3c <4>[3.164859] do_one_initcall+0x54/0x248 <4>[3.164866] kernel_init_freeable+0x210/0x378 <4>[3.164872] kernel_init+0x18/0x118 <4>[3.164878] ret_from_fork+0x10/0x1c <0>[3.164884] Code: aa1e03e0 8b214273 97e616f7 d503201f (b9400260) Seems to be issue with the msm serial driver and not ftrace. Could someone look into it. One more thing is for pstore dmesg-ramoops, I had to change late_initcall to postcore_initcall which brings the question as to why we changed to late_initcall? Simple git blame shows to support crypto compress api, but is it really helpful? A lot of boottime issues can be caught with pstore enabled at postcore_initcall rather than late_initcall, this backtrace is just one example. Is there any way we could change this? Any chance you are able to dig deeper into the stack trace? I would disassemble vmlinux and see which part of msm_read is generating the synchronous external abort and look into that. Yes I had checked the part of msm_read which was generating the abort and it always seems to be in "msm_wait_for_xmitr" at below pointed location. static inline void msm_wait_for_xmitr(struct uart_port *port) { while (!(msm_read(port, UART_SR) & UART_SR_TX_EMPTY)) { <--- if (msm_read(port, UART_ISR) & UART_ISR_TX_READY) break; udelay(1); } msm_write(port, UART_CR_CMD_RESET_TX_READY, UART_CR); } Also I could confirm that this path is entered repeatedly(with tracing register reads/writes from my patch series in https://lore.kernel.org/patchwork/cover/983795/ and tp_printk) and crash is seen at some random time, so could not get much from this. Also similar to what Steve suggested, I wonder if it boots for you if you annotate all the functions in the serial driver with 'notrace'. I have tried this too, but still the target crashes. So I am doubtful if this is ftrace issue? Maybe earlycon uart issue as Stephen is suspecting. Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: Crash in msm serial on dragonboard with ftrace bootargs
On 10/17/2018 3:43 PM, Joel Fernandes wrote: Hi Kees, On Tue, Oct 16, 2018 at 10:02:53AM -0700, Kees Cook wrote: On Tue, Oct 16, 2018 at 8:29 AM, Steven Rostedt wrote: On Tue, 16 Oct 2018 17:08:25 +0530 Sai Prakash Ranjan wrote: One more thing is for pstore dmesg-ramoops, I had to change late_initcall to postcore_initcall which brings the question as to why we changed to late_initcall? Simple git blame shows to support crypto compress api, but is it really helpful? A lot of boottime issues can be caught with pstore enabled at postcore_initcall rather than late_initcall, this backtrace is just one example. Is there any way we could change this? Does it break if the crypto is not initialized? Perhaps add a command line flag to have it happen earlier: ramoops=earlyinit and add a postcore_initcall that checks if that flag is set, and if so, it does the work then, and the late_initcall() will do nothing. That way, you can still have unmodified kernels use pstore when it crashes at boot up. Even better, if we could find a way to make it work with a late initialization of compression (i.e. postcore_initcall() by default means anything caught would be uncompressed, but anything after late_initcall() would be compressed). I'd be very happy to review patches for that! What do you think about the (untested) patch below? It seems to me that it should solve the issue of missing early crash dumps, but I have not tested it yet. Sai, would you mind trying it out and let me know if you can see the early crash dumps properly now? 8<--- From: "Joel Fernandes (Google)" Subject: [RFC] pstore: allocate compression during late_initcall ramoop's pstore registration (using pstore_register) has to run during late_initcall because crypto backend may not be ready during postcore_initcall. This causes missing of dmesg crash dumps which could have been caught by pstore. Instead, lets allow ramoops pstore registration earlier, and once crypto is ready we can initialize the compression. Reported-by: Sai Prakash Ranjan Signed-off-by: Joel Fernandes (Google) --- fs/pstore/platform.c | 13 + fs/pstore/ram.c | 2 +- 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c index 15e99d5a681d..f09066db2d4d 100644 --- a/fs/pstore/platform.c +++ b/fs/pstore/platform.c @@ -780,6 +780,19 @@ void __init pstore_choose_compression(void) } } +static int __init pstore_compression_late_init(void) +{ + /* +* Check if any pstore backends registered earlier but did not allocate +* for compression because crypto was not ready, if so then initialize +* compression. +*/ + if (psinfo && !tfm) + allocate_buf_for_compression(); + return 0; +} +late_initcall(pstore_compression_late_init); + module_param(compress, charp, 0444); MODULE_PARM_DESC(compress, "Pstore compression to use"); diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c index bbd1e357c23d..98e48d1a9776 100644 --- a/fs/pstore/ram.c +++ b/fs/pstore/ram.c @@ -940,7 +940,7 @@ static int __init ramoops_init(void) ramoops_register_dummy(); return platform_driver_register(&ramoops_driver); } -late_initcall(ramoops_init); +postcore_initcall(ramoops_init); static void __exit ramoops_exit(void) { Yes I could see the early crash dump. Also I tested with different compression (LZO) instead of deflate just to be sure and it works fine, thanks :) Tested-by: Sai Prakash Ranjan -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: Crash in msm serial on dragonboard with ftrace bootargs
On 10/17/2018 2:21 AM, Stephen Boyd wrote: Quoting Sai Prakash Ranjan (2018-10-16 12:35:57) On 10/17/2018 12:45 AM, Steven Rostedt wrote: On Wed, 17 Oct 2018 00:36:05 +0530 Sai Prakash Ranjan wrote: On 10/17/2018 12:33 AM, Steven Rostedt wrote: On Wed, 17 Oct 2018 00:31:03 +0530 Sai Prakash Ranjan wrote: Haa seems like you are right! With "ftrace=function ftrace_filter=msm_read" , I can trigger the crash, but sadly "ftrace_notrace=msm_read" also crashes. So there's more than one problem area. What about ftrace_notrace=m* ? That too crashes. Which compiler are you using and can you send me your config. Config attached. Compiler: aarch64-linux-gnu-gcc (Linaro GCC 6.3-2017.02) 6.3.1 20170109 I wonder if there's something screwing up with the way ftrace nops are working on this board. A couple things of note, 1) it works fine after boot up. 2) it crashes in the initcall code, so it's not due to ftrace being enabled too early. I'd like to see the full command line as well. I bet if you remove the qcom,msm-uartdm from the command line, and had just ftrace=function, it may also boot fine too. Can you try that? Kernel command line: root=/dev/disk/by-partlabel/rootfs rw rootwait ftrace=function ftrace_notrace=m* earlycon console=ttyMSM0,115200n8 qcom,msm-uartdm is not in command line, it is the earlycon. So without earlycon(bootconsole), board boots fine as we discussed earlier. Have you tried with earlycon and no ftrace things on the commandline? root=/dev/disk/by-partlabel/rootfs rw rootwait earlycon console=ttyMSM0,115200n8 If earlycon is causing problems, it sounds like this may be another case of earlycon uart handing off to the uart driver and that failing because something gets printed while the uart is transitioning from the earlycon console to the kernel boot console. I recall the uart would trample on itself in interesting ways. Yes I have tried with only earlycon enabled and everything is fine. Issue is reproduced only with ftrace=function cmdline. -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: Crash in msm serial on dragonboard with ftrace bootargs
Hi Kees, On Tue, Oct 16, 2018 at 10:02:53AM -0700, Kees Cook wrote: > On Tue, Oct 16, 2018 at 8:29 AM, Steven Rostedt wrote: > > On Tue, 16 Oct 2018 17:08:25 +0530 > > Sai Prakash Ranjan wrote: > >> One more thing is for pstore dmesg-ramoops, I had to change > >> late_initcall to postcore_initcall which brings the question as to why > >> we changed to late_initcall? > >> Simple git blame shows to support crypto compress api, but is it really > >> helpful? A lot of boottime issues can be caught with pstore enabled at > >> postcore_initcall rather than late_initcall, this backtrace > >> is just one example. Is there any way we could change this? > > > > Does it break if the crypto is not initialized? Perhaps add a command > > line flag to have it happen earlier: > > > > ramoops=earlyinit > > > > and add a postcore_initcall that checks if that flag is set, and if so, > > it does the work then, and the late_initcall() will do nothing. > > > > That way, you can still have unmodified kernels use pstore when it > > crashes at boot up. > > Even better, if we could find a way to make it work with a late > initialization of compression (i.e. postcore_initcall() by default > means anything caught would be uncompressed, but anything after > late_initcall() would be compressed). I'd be very happy to review > patches for that! What do you think about the (untested) patch below? It seems to me that it should solve the issue of missing early crash dumps, but I have not tested it yet. Sai, would you mind trying it out and let me know if you can see the early crash dumps properly now? 8<--- From: "Joel Fernandes (Google)" Subject: [RFC] pstore: allocate compression during late_initcall ramoop's pstore registration (using pstore_register) has to run during late_initcall because crypto backend may not be ready during postcore_initcall. This causes missing of dmesg crash dumps which could have been caught by pstore. Instead, lets allow ramoops pstore registration earlier, and once crypto is ready we can initialize the compression. Reported-by: Sai Prakash Ranjan Signed-off-by: Joel Fernandes (Google) --- fs/pstore/platform.c | 13 + fs/pstore/ram.c | 2 +- 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c index 15e99d5a681d..f09066db2d4d 100644 --- a/fs/pstore/platform.c +++ b/fs/pstore/platform.c @@ -780,6 +780,19 @@ void __init pstore_choose_compression(void) } } +static int __init pstore_compression_late_init(void) +{ + /* +* Check if any pstore backends registered earlier but did not allocate +* for compression because crypto was not ready, if so then initialize +* compression. +*/ + if (psinfo && !tfm) + allocate_buf_for_compression(); + return 0; +} +late_initcall(pstore_compression_late_init); + module_param(compress, charp, 0444); MODULE_PARM_DESC(compress, "Pstore compression to use"); diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c index bbd1e357c23d..98e48d1a9776 100644 --- a/fs/pstore/ram.c +++ b/fs/pstore/ram.c @@ -940,7 +940,7 @@ static int __init ramoops_init(void) ramoops_register_dummy(); return platform_driver_register(&ramoops_driver); } -late_initcall(ramoops_init); +postcore_initcall(ramoops_init); static void __exit ramoops_exit(void) { -- 2.19.1.331.ge82ca0e54c-goog
Re: Crash in msm serial on dragonboard with ftrace bootargs
On Tue, Oct 16, 2018 at 05:08:25PM +0530, Sai Prakash Ranjan wrote: > Hi, > > On dragonboard 410c, with "ftrace=function" boot args, the console output > slows down and board resets without any backtrace as below. This is tested > on latest kernel and seems to exist even in older kernels as well. > > [2.949164] EINJ: ACPI disabled. > [3.133001] Serial: 8250/16550 dri > Format: Log Type - Time(microsec) - Message - Optional Info > Log Type: B - Since Boot(Power On Reset), D - Delta, S - Statistic > > But with pstore enabled, able to get the below backtrace: > <4>[2.949164] EINJ: ACPI disabled. > <6>[3.133001] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled > <6>[3.164097] SuperH (H)SCI(F) driver initialized > <0>[3.164471] Internal error: synchronous external abort: 9610 [#1] > PREEMPT SMP > <4>[3.164479] Modules linked in: > <4>[3.164495] CPU: 2 PID: 1 Comm: swapper/0 Not tainted > 4.19.0-rc8-8-ge033b9909fff-dirty #175 > <4>[3.164501] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC > (DT) > <4>[3.164508] pstate: 4085 (nZcv daIf -PAN -UAO) > <4>[3.164514] pc : msm_read.isra.2+0x20/0x50 > <4>[3.164520] lr : msm_read.isra.2+0x1c/0x50 > <4>[3.164526] sp : 08033a50 > <4>[3.164531] x29: 08033a50 x28: 09486018 > <4>[3.164548] x27: 0001 x26: 7dfffe7ff070 > <4>[3.164565] x25: 0034 x24: 09486000 > <4>[3.164582] x23: x22: 0978e190 > <4>[3.164599] x21: 095e8228 x20: 0034 > <4>[3.164616] x19: 7dfffe7ff008 x18: > <4>[3.164632] x17: x16: > <4>[3.164649] x15: 094a96c8 x14: 8978e6bf > <4>[3.164666] x13: 0978e6cd x12: 0038 > <4>[3.164683] x11: 094c6000 x10: 0c24 > <4>[3.164699] x9 : 80003c89b400 x8 : 08033970 > <4>[3.164716] x7 : 8eb04100 x6 : 000af304 > <4>[3.164732] x5 : 0c40 x4 : 80003c06f000 > <4>[3.164750] x3 : 80003c89b498 x2 : > <4>[3.164766] x1 : 80003ca68000 x0 : 0800 > <0>[3.164785] Process swapper/0 (pid: 1, stack limit = > 0x(ptrval)) > <4>[3.164791] Call trace: > <4>[3.164797] msm_read.isra.2+0x20/0x50 > <4>[3.164804] msm_reset_dm_count+0x44/0x80 > <4>[3.164810] __msm_console_write+0x1c8/0x1d0 > <4>[3.164816] msm_serial_early_write_dm+0x3c/0x50 > <4>[3.164823] console_unlock.part.6+0x468/0x528 > <4>[3.164829] vprintk_emit+0x210/0x218 > <4>[3.164835] vprintk_default+0x48/0x58 > <4>[3.164841] vprintk_func+0xf0/0x1c0 > <4>[3.164847] printk+0x74/0x94 > <4>[3.164853] sci_init+0x24/0x3c > <4>[3.164859] do_one_initcall+0x54/0x248 > <4>[3.164866] kernel_init_freeable+0x210/0x378 > <4>[3.164872] kernel_init+0x18/0x118 > <4>[3.164878] ret_from_fork+0x10/0x1c > <0>[3.164884] Code: aa1e03e0 8b214273 97e616f7 d503201f (b9400260) > > Seems to be issue with the msm serial driver and not ftrace. > Could someone look into it. > > One more thing is for pstore dmesg-ramoops, I had to change late_initcall to > postcore_initcall which brings the question as to why we changed to > late_initcall? > Simple git blame shows to support crypto compress api, but is it really > helpful? A lot of boottime issues can be caught with pstore enabled at > postcore_initcall rather than late_initcall, this backtrace > is just one example. Is there any way we could change this? Any chance you are able to dig deeper into the stack trace? I would disassemble vmlinux and see which part of msm_read is generating the synchronous external abort and look into that. Also similar to what Steve suggested, I wonder if it boots for you if you annotate all the functions in the serial driver with 'notrace'. thanks, - Joel
Re: Crash in msm serial on dragonboard with ftrace bootargs
Quoting Sai Prakash Ranjan (2018-10-16 12:35:57) > On 10/17/2018 12:45 AM, Steven Rostedt wrote: > > On Wed, 17 Oct 2018 00:36:05 +0530 > > Sai Prakash Ranjan wrote: > > > >> On 10/17/2018 12:33 AM, Steven Rostedt wrote: > >>> On Wed, 17 Oct 2018 00:31:03 +0530 > >>> Sai Prakash Ranjan wrote: > >>> > Haa seems like you are right! With "ftrace=function > ftrace_filter=msm_read" , I can trigger the crash, but > sadly "ftrace_notrace=msm_read" also crashes. > >>> > >>> So there's more than one problem area. > >>> > >>> What about ftrace_notrace=m* > >>> > >>> ? > >>> > >> > >> That too crashes. > >> > > > > Which compiler are you using and can you send me your config. > > > Config attached. > > Compiler: aarch64-linux-gnu-gcc (Linaro GCC 6.3-2017.02) 6.3.1 20170109 > > > I wonder if there's something screwing up with the way ftrace nops are > > working on this board. > > > > A couple things of note, 1) it works fine after boot up. 2) it crashes > > in the initcall code, so it's not due to ftrace being enabled too early. > > > > I'd like to see the full command line as well. I bet if you remove the > > qcom,msm-uartdm from the command line, and had just ftrace=function, it > > may also boot fine too. Can you try that? > > > > Kernel command line: root=/dev/disk/by-partlabel/rootfs rw rootwait > ftrace=function ftrace_notrace=m* earlycon console=ttyMSM0,115200n8 > > qcom,msm-uartdm is not in command line, it is the earlycon. So without > earlycon(bootconsole), board boots fine as we discussed earlier. > Have you tried with earlycon and no ftrace things on the commandline? root=/dev/disk/by-partlabel/rootfs rw rootwait earlycon console=ttyMSM0,115200n8 If earlycon is causing problems, it sounds like this may be another case of earlycon uart handing off to the uart driver and that failing because something gets printed while the uart is transitioning from the earlycon console to the kernel boot console. I recall the uart would trample on itself in interesting ways.
Re: Crash in msm serial on dragonboard with ftrace bootargs
On 10/17/2018 12:46 AM, Steven Rostedt wrote: On Tue, 16 Oct 2018 15:15:16 -0400 Steven Rostedt wrote: I'd like to see the full command line as well. I bet if you remove the qcom,msm-uartdm from the command line, and had just ftrace=function, it may also boot fine too. Can you try that? Note, I probably wont respond for the rest of the day. Cool, no problem. Will update if I find anything. Thanks for your time! -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: Crash in msm serial on dragonboard with ftrace bootargs
On Tue, 16 Oct 2018 15:15:16 -0400 Steven Rostedt wrote: > I'd like to see the full command line as well. I bet if you remove the > qcom,msm-uartdm from the command line, and had just ftrace=function, it > may also boot fine too. Can you try that? Note, I probably wont respond for the rest of the day. -- Steve
Re: Crash in msm serial on dragonboard with ftrace bootargs
On Wed, 17 Oct 2018 00:36:05 +0530 Sai Prakash Ranjan wrote: > On 10/17/2018 12:33 AM, Steven Rostedt wrote: > > On Wed, 17 Oct 2018 00:31:03 +0530 > > Sai Prakash Ranjan wrote: > > > >> Haa seems like you are right! With "ftrace=function > >> ftrace_filter=msm_read" , I can trigger the crash, but > >> sadly "ftrace_notrace=msm_read" also crashes. > > > > So there's more than one problem area. > > > > What about ftrace_notrace=m* > > > > ? > > > > That too crashes. > Which compiler are you using and can you send me your config. I wonder if there's something screwing up with the way ftrace nops are working on this board. A couple things of note, 1) it works fine after boot up. 2) it crashes in the initcall code, so it's not due to ftrace being enabled too early. I'd like to see the full command line as well. I bet if you remove the qcom,msm-uartdm from the command line, and had just ftrace=function, it may also boot fine too. Can you try that? -- Steve
Re: Crash in msm serial on dragonboard with ftrace bootargs
On 10/17/2018 12:33 AM, Steven Rostedt wrote: On Wed, 17 Oct 2018 00:31:03 +0530 Sai Prakash Ranjan wrote: Haa seems like you are right! With "ftrace=function ftrace_filter=msm_read" , I can trigger the crash, but sadly "ftrace_notrace=msm_read" also crashes. So there's more than one problem area. What about ftrace_notrace=m* ? That too crashes. -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: Crash in msm serial on dragonboard with ftrace bootargs
On Wed, 17 Oct 2018 00:31:03 +0530 Sai Prakash Ranjan wrote: > Haa seems like you are right! With "ftrace=function > ftrace_filter=msm_read" , I can trigger the crash, but > sadly "ftrace_notrace=msm_read" also crashes. So there's more than one problem area. What about ftrace_notrace=m* ? -- Steve
Re: Crash in msm serial on dragonboard with ftrace bootargs
On 10/17/2018 12:11 AM, Steven Rostedt wrote: On Tue, 16 Oct 2018 23:55:14 +0530 Sai Prakash Ranjan wrote: On 10/16/2018 11:46 PM, Steven Rostedt wrote: [ Removed ivan.iva...@linaro.org due to getting mail delivery errors ] On Tue, 16 Oct 2018 23:35:00 +0530 Sai Prakash Ranjan wrote: Ok got it, this sounds fun. I'll give it a try. Awesome, I'm looking forward to seeing what you come up with. Now, I can trigger this crash reboot with "ftrace=function ftrace_filter=f*,g*,h*,i*,j*,k*,l*,m*,n*,o*,p*,q*" Does it crash with just "f*" or "g*" or any singularity of the above? Hmm, I wonder if it is from tracing msm_read? Haa seems like you are right! With "ftrace=function ftrace_filter=msm_read" , I can trigger the crash, but sadly "ftrace_notrace=msm_read" also crashes. -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: Crash in msm serial on dragonboard with ftrace bootargs
On Tue, 16 Oct 2018 23:55:14 +0530 Sai Prakash Ranjan wrote: > On 10/16/2018 11:46 PM, Steven Rostedt wrote: > > > > [ Removed ivan.iva...@linaro.org due to getting mail delivery errors ] > > > > On Tue, 16 Oct 2018 23:35:00 +0530 > > Sai Prakash Ranjan wrote: > > > >> Ok got it, this sounds fun. I'll give it a try. > > > > Awesome, I'm looking forward to seeing what you come up with. > > > > Now, I can trigger this crash reboot with "ftrace=function > ftrace_filter=f*,g*,h*,i*,j*,k*,l*,m*,n*,o*,p*,q*" Does it crash with just "f*" or "g*" or any singularity of the above? Hmm, I wonder if it is from tracing msm_read? -- Steve > > And pstore dmesg log gives slightly different backtrace than > before(i.e., uart_add_one_port+0x374/0x4c8): > > <4>[1.976945] EINJ: ACPI disabled. > <6>[2.070011] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled > <6>[2.087353] SuperH (H)SCI(F) driver initialized > <6>[2.093708] msm_serial 78af000.serial: msm_serial: detected port #1 > <6>[2.093901] msm_serial 78af000.serial: uartclk = 1920 > <6>[2.099078] 78af000.serial: ttyMSM1 at MMIO 0x78af000 (irq = 9, > base_baud = 120) is a MSM > <6>[2.108386] msm_serial 78b.serial: msm_serial: detected port #0 > <6>[2.113076] msm_serial 78b.serial: uartclk = 7372800 > <6>[2.119300] 78b.serial: ttyMSM0 at MMIO 0x78b (irq = 10, > base_baud = 460800) is a MSM > <0>[2.119769] Internal error: synchronous external abort: 9610 > [#1] PREEMPT SMP > <4>[2.119772] Modules linked in: > <4>[2.119780] CPU: 0 PID: 1 Comm: swapper/0 Not tainted > 4.19.0-rc8-8-ge033b9909fff-dirty #2 > <4>[2.119785] Hardware name: Qualcomm Technologies, Inc. APQ 8016 > SBC (DT) > <4>[2.119789] pstate: 4085 (nZcv daIf -PAN -UAO) > <4>[2.119792] pc : msm_read.isra.2+0x20/0x50 > <4>[2.119796] lr : msm_read.isra.2+0x1c/0x50 > <4>[2.119799] sp : 08033760 > <4>[2.119802] x29: 08033760 x28: 09486018 > <4>[2.119809] x27: 0001 x26: 7dfffe7ff070 > <4>[2.119816] x25: 0062 x24: 09486000 > <4>[2.119824] x23: x22: 0978e190 > <4>[2.119831] x21: 095e8228 x20: 0062 > <4>[2.119838] x19: 7dfffe7ff008 x18: > <4>[2.119845] x17: x16: > <4>[2.119852] x15: 094a96c8 x14: 3d20647561625f65 > <4>[2.119859] x13: 736162202c303120 x12: 3d20717269282030 > <4>[2.119866] x11: 3030306238377830 x10: 0134 > <4>[2.119874] x9 : 80003c356400 x8 : 08033680 > <4>[2.119881] x7 : 80003c354100 x6 : 000d9b72 > <4>[2.119888] x5 : 0150 x4 : 80003bc8f000 > <4>[2.119895] x3 : 80003c356498 x2 : > <4>[2.119902] x1 : 80003bf1 x0 : 0800 > <0>[2.119911] Process swapper/0 (pid: 1, stack limit = > 0x(ptrval)) > <4>[2.119914] Call trace: > <4>[2.119917] msm_read.isra.2+0x20/0x50 > <4>[2.119920] msm_reset_dm_count+0x44/0x80 > <4>[2.119924] __msm_console_write+0x1c8/0x1d0 > <4>[2.119928] msm_serial_early_write_dm+0x3c/0x50 > <4>[2.119931] console_unlock.part.6+0x468/0x528 > <4>[2.119935] vprintk_emit+0x210/0x218 > <4>[2.119938] vprintk_default+0x48/0x58 > <4>[2.119942] vprintk_func+0xf0/0x1c0 > <4>[2.119945] printk+0x74/0x94 > <4>[2.119949] uart_add_one_port+0x374/0x4c8 > <4>[2.119952] msm_serial_probe+0x140/0x1c8 > <4>[2.119955] platform_drv_probe+0x58/0xb0 > <4>[2.119959] really_probe+0x1f4/0x3c8 > <4>[2.119963] driver_probe_device+0x12c/0x150 > <4>[2.119966] __driver_attach+0x144/0x148 > <4>[2.119969] bus_for_each_dev+0x78/0xe0 > <4>[2.119973] driver_attach+0x30/0x40 > <4>[2.119976] bus_add_driver+0x1c8/0x2a8 > <4>[2.119979] driver_register+0x68/0x118 > <4>[2.119983] __platform_driver_register+0x54/0x60 > <4>[2.119987] msm_serial_init+0x40/0x70 > <4>[2.119990] do_one_initcall+0x54/0x248 > <4>[2.119993] kernel_init_freeable+0x210/0x378 > <4>[2.119997] kernel_init+0x18/0x118 > <4>[2.12] ret_from_fork+0x10/0x1c > <0>[2.120004] Code: aa1e03e0 8b214273 97e61727 d503201f (b9400260) >
Re: Crash in msm serial on dragonboard with ftrace bootargs
On 10/16/2018 11:46 PM, Steven Rostedt wrote: [ Removed ivan.iva...@linaro.org due to getting mail delivery errors ] On Tue, 16 Oct 2018 23:35:00 +0530 Sai Prakash Ranjan wrote: Ok got it, this sounds fun. I'll give it a try. Awesome, I'm looking forward to seeing what you come up with. Now, I can trigger this crash reboot with "ftrace=function ftrace_filter=f*,g*,h*,i*,j*,k*,l*,m*,n*,o*,p*,q*" And pstore dmesg log gives slightly different backtrace than before(i.e., uart_add_one_port+0x374/0x4c8): <4>[1.976945] EINJ: ACPI disabled. <6>[2.070011] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled <6>[2.087353] SuperH (H)SCI(F) driver initialized <6>[2.093708] msm_serial 78af000.serial: msm_serial: detected port #1 <6>[2.093901] msm_serial 78af000.serial: uartclk = 1920 <6>[2.099078] 78af000.serial: ttyMSM1 at MMIO 0x78af000 (irq = 9, base_baud = 120) is a MSM <6>[2.108386] msm_serial 78b.serial: msm_serial: detected port #0 <6>[2.113076] msm_serial 78b.serial: uartclk = 7372800 <6>[2.119300] 78b.serial: ttyMSM0 at MMIO 0x78b (irq = 10, base_baud = 460800) is a MSM <0>[2.119769] Internal error: synchronous external abort: 9610 [#1] PREEMPT SMP <4>[2.119772] Modules linked in: <4>[2.119780] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc8-8-ge033b9909fff-dirty #2 <4>[2.119785] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) <4>[2.119789] pstate: 4085 (nZcv daIf -PAN -UAO) <4>[2.119792] pc : msm_read.isra.2+0x20/0x50 <4>[2.119796] lr : msm_read.isra.2+0x1c/0x50 <4>[2.119799] sp : 08033760 <4>[2.119802] x29: 08033760 x28: 09486018 <4>[2.119809] x27: 0001 x26: 7dfffe7ff070 <4>[2.119816] x25: 0062 x24: 09486000 <4>[2.119824] x23: x22: 0978e190 <4>[2.119831] x21: 095e8228 x20: 0062 <4>[2.119838] x19: 7dfffe7ff008 x18: <4>[2.119845] x17: x16: <4>[2.119852] x15: 094a96c8 x14: 3d20647561625f65 <4>[2.119859] x13: 736162202c303120 x12: 3d20717269282030 <4>[2.119866] x11: 3030306238377830 x10: 0134 <4>[2.119874] x9 : 80003c356400 x8 : 08033680 <4>[2.119881] x7 : 80003c354100 x6 : 000d9b72 <4>[2.119888] x5 : 0150 x4 : 80003bc8f000 <4>[2.119895] x3 : 80003c356498 x2 : <4>[2.119902] x1 : 80003bf1 x0 : 0800 <0>[2.119911] Process swapper/0 (pid: 1, stack limit = 0x(ptrval)) <4>[2.119914] Call trace: <4>[2.119917] msm_read.isra.2+0x20/0x50 <4>[2.119920] msm_reset_dm_count+0x44/0x80 <4>[2.119924] __msm_console_write+0x1c8/0x1d0 <4>[2.119928] msm_serial_early_write_dm+0x3c/0x50 <4>[2.119931] console_unlock.part.6+0x468/0x528 <4>[2.119935] vprintk_emit+0x210/0x218 <4>[2.119938] vprintk_default+0x48/0x58 <4>[2.119942] vprintk_func+0xf0/0x1c0 <4>[2.119945] printk+0x74/0x94 <4>[2.119949] uart_add_one_port+0x374/0x4c8 <4>[2.119952] msm_serial_probe+0x140/0x1c8 <4>[2.119955] platform_drv_probe+0x58/0xb0 <4>[2.119959] really_probe+0x1f4/0x3c8 <4>[2.119963] driver_probe_device+0x12c/0x150 <4>[2.119966] __driver_attach+0x144/0x148 <4>[2.119969] bus_for_each_dev+0x78/0xe0 <4>[2.119973] driver_attach+0x30/0x40 <4>[2.119976] bus_add_driver+0x1c8/0x2a8 <4>[2.119979] driver_register+0x68/0x118 <4>[2.119983] __platform_driver_register+0x54/0x60 <4>[2.119987] msm_serial_init+0x40/0x70 <4>[2.119990] do_one_initcall+0x54/0x248 <4>[2.119993] kernel_init_freeable+0x210/0x378 <4>[2.119997] kernel_init+0x18/0x118 <4>[2.12] ret_from_fork+0x10/0x1c <0>[2.120004] Code: aa1e03e0 8b214273 97e61727 d503201f (b9400260) -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: Crash in msm serial on dragonboard with ftrace bootargs
[ Removed ivan.iva...@linaro.org due to getting mail delivery errors ] On Tue, 16 Oct 2018 23:35:00 +0530 Sai Prakash Ranjan wrote: > Ok got it, this sounds fun. I'll give it a try. Awesome, I'm looking forward to seeing what you come up with. -- Steve
Re: Crash in msm serial on dragonboard with ftrace bootargs
On 10/16/2018 11:18 PM, Steven Rostedt wrote: On Tue, 16 Oct 2018 23:06:24 +0530 Sai Prakash Ranjan wrote: On 10/16/2018 10:27 PM, Steven Rostedt wrote: OK, can you add to the command line: ftrace=function ftrace_filter=*schedule* to see if it's a specific function that may be causing the issue (but hopefully it's not one of the scheduling functions that caused it). Target boots fine with this. So its not scheduling functions that is causing it. Also I tried with ftrace_filter=*msm* just to be sure if tracing driver functions is causing any issue but its NOT. OK, seems that something is being traced that shouldn't be. When this happens after boot up, it's easy to bisect down to the problem function. But since it's at boot up, it will take a lot longer. I would suggest to start by going down the alphabet. ftrace_filter=a* ftrace_filter=b* ftrace_filter=c* [...] And at least find the letter the bad function starts with. Note, it could be more than one function (I've had that a couple of times), and to find that out, you can test with "ftrace_notrace". Say you find that the problem function starts with 'x'. You can do: ftrace_notrace=x* Which will trace all functions except those that start with an 'x', to make sure it still boots. Remember, you still need to have ftrace=function for all of this. Once you find the letter of the function, you can try the next letter, or perhaps come up with another method. I would say look at the functions in /sys/kernel/debug/tracing/available_filter_functions, but they don't list the init function (that can be traced). But you can use /proc/kallsyms instead. Ok got it, this sounds fun. I'll give it a try. Thanks -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: Crash in msm serial on dragonboard with ftrace bootargs
On Tue, 16 Oct 2018 23:06:24 +0530 Sai Prakash Ranjan wrote: > On 10/16/2018 10:27 PM, Steven Rostedt wrote: > > > > OK, can you add to the command line: > > > > ftrace=function ftrace_filter=*schedule* > > > > to see if it's a specific function that may be causing the issue (but > > hopefully it's not one of the scheduling functions that caused it). > > > > Target boots fine with this. So its not scheduling functions that is > causing it. Also I tried with ftrace_filter=*msm* just to be sure if > tracing driver functions is causing any issue but its NOT. OK, seems that something is being traced that shouldn't be. When this happens after boot up, it's easy to bisect down to the problem function. But since it's at boot up, it will take a lot longer. I would suggest to start by going down the alphabet. ftrace_filter=a* ftrace_filter=b* ftrace_filter=c* [...] And at least find the letter the bad function starts with. Note, it could be more than one function (I've had that a couple of times), and to find that out, you can test with "ftrace_notrace". Say you find that the problem function starts with 'x'. You can do: ftrace_notrace=x* Which will trace all functions except those that start with an 'x', to make sure it still boots. Remember, you still need to have ftrace=function for all of this. Once you find the letter of the function, you can try the next letter, or perhaps come up with another method. I would say look at the functions in /sys/kernel/debug/tracing/available_filter_functions, but they don't list the init function (that can be traced). But you can use /proc/kallsyms instead. -- Steve
Re: Crash in msm serial on dragonboard with ftrace bootargs
On 10/16/2018 10:27 PM, Steven Rostedt wrote: OK, can you add to the command line: ftrace=function ftrace_filter=*schedule* to see if it's a specific function that may be causing the issue (but hopefully it's not one of the scheduling functions that caused it). Target boots fine with this. So its not scheduling functions that is causing it. Also I tried with ftrace_filter=*msm* just to be sure if tracing driver functions is causing any issue but its NOT. Does it break if the crypto is not initialized? Perhaps add a command line flag to have it happen earlier: I didnt see any breakage, have been using ramoops with postcore_initcall for sometime now. ramoops=earlyinit and add a postcore_initcall that checks if that flag is set, and if so, it does the work then, and the late_initcall() will do nothing. That way, you can still have unmodified kernels use pstore when it crashes at boot up. Sounds good. Great, I guess you can write a patch to do that ;-) Sure I can :) but as Kees said it would be better if we could find a way to make it work with a late initialization of compression. I will try on that. Thanks, -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: Crash in msm serial on dragonboard with ftrace bootargs
On Tue, Oct 16, 2018 at 8:29 AM, Steven Rostedt wrote: > On Tue, 16 Oct 2018 17:08:25 +0530 > Sai Prakash Ranjan wrote: >> One more thing is for pstore dmesg-ramoops, I had to change >> late_initcall to postcore_initcall which brings the question as to why >> we changed to late_initcall? >> Simple git blame shows to support crypto compress api, but is it really >> helpful? A lot of boottime issues can be caught with pstore enabled at >> postcore_initcall rather than late_initcall, this backtrace >> is just one example. Is there any way we could change this? > > Does it break if the crypto is not initialized? Perhaps add a command > line flag to have it happen earlier: > > ramoops=earlyinit > > and add a postcore_initcall that checks if that flag is set, and if so, > it does the work then, and the late_initcall() will do nothing. > > That way, you can still have unmodified kernels use pstore when it > crashes at boot up. Even better, if we could find a way to make it work with a late initialization of compression (i.e. postcore_initcall() by default means anything caught would be uncompressed, but anything after late_initcall() would be compressed). I'd be very happy to review patches for that! -Kees -- Kees Cook Pixel Security
Re: Crash in msm serial on dragonboard with ftrace bootargs
On Tue, 16 Oct 2018 22:05:23 +0530 Sai Prakash Ranjan wrote: > On 10/16/2018 8:59 PM, Steven Rostedt wrote: > > On Tue, 16 Oct 2018 17:08:25 +0530 > > Sai Prakash Ranjan wrote: > > > >> Hi, > >> > >> On dragonboard 410c, with "ftrace=function" boot args, the console > >> output slows down and board resets without any backtrace as below. This > >> is tested on latest kernel and seems to exist even in older kernels as > >> well. > > > > So this only happens when ftrace=function is on the boot console. > > > > Yes. If I do not use boot console, target does not crash. > > > > > Does function tracing work after boot up? That is, without the > > ftrace=function, can you do: > > > > echo function > /sys/kernel/debug/tracing/current_tracer > > > > without any issue? > > > > Yes ftrace in general works without any issue. I have also tested on > db820c and sdm845 where "ftrace=function" works fine. I am seeing this > issue only on db410c board. OK, can you add to the command line: ftrace=function ftrace_filter=*schedule* to see if it's a specific function that may be causing the issue (but hopefully it's not one of the scheduling functions that caused it). > > >> > >> One more thing is for pstore dmesg-ramoops, I had to change > >> late_initcall to postcore_initcall which brings the question as to why > >> we changed to late_initcall? > >> Simple git blame shows to support crypto compress api, but is it really > >> helpful? A lot of boottime issues can be caught with pstore enabled at > >> postcore_initcall rather than late_initcall, this backtrace > >> is just one example. Is there any way we could change this? > > > > Does it break if the crypto is not initialized? Perhaps add a command > > line flag to have it happen earlier: > > > > I didnt see any breakage, have been using ramoops with postcore_initcall > for sometime now. > > > ramoops=earlyinit > > > > and add a postcore_initcall that checks if that flag is set, and if so, > > it does the work then, and the late_initcall() will do nothing. > > > > That way, you can still have unmodified kernels use pstore when it > > crashes at boot up. > > > > Sounds good. Great, I guess you can write a patch to do that ;-) -- Steve
Re: Crash in msm serial on dragonboard with ftrace bootargs
On 10/16/2018 8:59 PM, Steven Rostedt wrote: On Tue, 16 Oct 2018 17:08:25 +0530 Sai Prakash Ranjan wrote: Hi, On dragonboard 410c, with "ftrace=function" boot args, the console output slows down and board resets without any backtrace as below. This is tested on latest kernel and seems to exist even in older kernels as well. So this only happens when ftrace=function is on the boot console. Yes. If I do not use boot console, target does not crash. [2.949164] EINJ: ACPI disabled. [3.133001] Serial: 8250/16550 dri Format: Log Type - Time(microsec) - Message - Optional Info Log Type: B - Since Boot(Power On Reset), D - Delta, S - Statistic But with pstore enabled, able to get the below backtrace: <4>[2.949164] EINJ: ACPI disabled. <6>[3.133001] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled <6>[3.164097] SuperH (H)SCI(F) driver initialized <0>[3.164471] Internal error: synchronous external abort: 9610 [#1] PREEMPT SMP <4>[3.164479] Modules linked in: <4>[3.164495] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc8-8-ge033b9909fff-dirty #175 <4>[3.164501] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) <4>[3.164508] pstate: 4085 (nZcv daIf -PAN -UAO) <4>[3.164514] pc : msm_read.isra.2+0x20/0x50 <4>[3.164520] lr : msm_read.isra.2+0x1c/0x50 <4>[3.164526] sp : 08033a50 <4>[3.164531] x29: 08033a50 x28: 09486018 <4>[3.164548] x27: 0001 x26: 7dfffe7ff070 <4>[3.164565] x25: 0034 x24: 09486000 <4>[3.164582] x23: x22: 0978e190 <4>[3.164599] x21: 095e8228 x20: 0034 <4>[3.164616] x19: 7dfffe7ff008 x18: <4>[3.164632] x17: x16: <4>[3.164649] x15: 094a96c8 x14: 8978e6bf <4>[3.164666] x13: 0978e6cd x12: 0038 <4>[3.164683] x11: 094c6000 x10: 0c24 <4>[3.164699] x9 : 80003c89b400 x8 : 08033970 <4>[3.164716] x7 : 8eb04100 x6 : 000af304 <4>[3.164732] x5 : 0c40 x4 : 80003c06f000 <4>[3.164750] x3 : 80003c89b498 x2 : <4>[3.164766] x1 : 80003ca68000 x0 : 0800 <0>[3.164785] Process swapper/0 (pid: 1, stack limit = 0x(ptrval)) <4>[3.164791] Call trace: <4>[3.164797] msm_read.isra.2+0x20/0x50 <4>[3.164804] msm_reset_dm_count+0x44/0x80 <4>[3.164810] __msm_console_write+0x1c8/0x1d0 <4>[3.164816] msm_serial_early_write_dm+0x3c/0x50 <4>[3.164823] console_unlock.part.6+0x468/0x528 <4>[3.164829] vprintk_emit+0x210/0x218 <4>[3.164835] vprintk_default+0x48/0x58 <4>[3.164841] vprintk_func+0xf0/0x1c0 <4>[3.164847] printk+0x74/0x94 <4>[3.164853] sci_init+0x24/0x3c <4>[3.164859] do_one_initcall+0x54/0x248 <4>[3.164866] kernel_init_freeable+0x210/0x378 <4>[3.164872] kernel_init+0x18/0x118 <4>[3.164878] ret_from_fork+0x10/0x1c <0>[3.164884] Code: aa1e03e0 8b214273 97e616f7 d503201f (b9400260) Seems to be issue with the msm serial driver and not ftrace. Could someone look into it. I'm guessing that there may have been an issue with ftrace, it tried to print, but the printing caused a bug that rebooted the box. Does function tracing work after boot up? That is, without the ftrace=function, can you do: echo function > /sys/kernel/debug/tracing/current_tracer without any issue? Yes ftrace in general works without any issue. I have also tested on db820c and sdm845 where "ftrace=function" works fine. I am seeing this issue only on db410c board. One more thing is for pstore dmesg-ramoops, I had to change late_initcall to postcore_initcall which brings the question as to why we changed to late_initcall? Simple git blame shows to support crypto compress api, but is it really helpful? A lot of boottime issues can be caught with pstore enabled at postcore_initcall rather than late_initcall, this backtrace is just one example. Is there any way we could change this? Does it break if the crypto is not initialized? Perhaps add a command line flag to have it happen earlier: I didnt see any breakage, have been using ramoops with postcore_initcall for sometime now. ramoops=earlyinit and add a postcore_initcall that checks if that flag is set, and if so, it does the work then, and the late_initcall() will do nothing. That way, you can still have unmodified kernels use pstore when it crashes at boot up. Sounds good. Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: Crash in msm serial on dragonboard with ftrace bootargs
On Tue, 16 Oct 2018 17:08:25 +0530 Sai Prakash Ranjan wrote: > Hi, > > On dragonboard 410c, with "ftrace=function" boot args, the console > output slows down and board resets without any backtrace as below. This > is tested on latest kernel and seems to exist even in older kernels as well. So this only happens when ftrace=function is on the boot console. > > [2.949164] EINJ: ACPI disabled. > [3.133001] Serial: 8250/16550 dri > Format: Log Type - Time(microsec) - Message - Optional Info > Log Type: B - Since Boot(Power On Reset), D - Delta, S - Statistic > > But with pstore enabled, able to get the below backtrace: > > <4>[2.949164] EINJ: ACPI disabled. > <6>[3.133001] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled > <6>[3.164097] SuperH (H)SCI(F) driver initialized > <0>[3.164471] Internal error: synchronous external abort: 9610 > [#1] PREEMPT SMP > <4>[3.164479] Modules linked in: > <4>[3.164495] CPU: 2 PID: 1 Comm: swapper/0 Not tainted > 4.19.0-rc8-8-ge033b9909fff-dirty #175 > <4>[3.164501] Hardware name: Qualcomm Technologies, Inc. APQ 8016 > SBC (DT) > <4>[3.164508] pstate: 4085 (nZcv daIf -PAN -UAO) > <4>[3.164514] pc : msm_read.isra.2+0x20/0x50 > <4>[3.164520] lr : msm_read.isra.2+0x1c/0x50 > <4>[3.164526] sp : 08033a50 > <4>[3.164531] x29: 08033a50 x28: 09486018 > <4>[3.164548] x27: 0001 x26: 7dfffe7ff070 > <4>[3.164565] x25: 0034 x24: 09486000 > <4>[3.164582] x23: x22: 0978e190 > <4>[3.164599] x21: 095e8228 x20: 0034 > <4>[3.164616] x19: 7dfffe7ff008 x18: > <4>[3.164632] x17: x16: > <4>[3.164649] x15: 094a96c8 x14: 8978e6bf > <4>[3.164666] x13: 0978e6cd x12: 0038 > <4>[3.164683] x11: 094c6000 x10: 0c24 > <4>[3.164699] x9 : 80003c89b400 x8 : 08033970 > <4>[3.164716] x7 : 8eb04100 x6 : 000af304 > <4>[3.164732] x5 : 0c40 x4 : 80003c06f000 > <4>[3.164750] x3 : 80003c89b498 x2 : > <4>[3.164766] x1 : 80003ca68000 x0 : 0800 > <0>[3.164785] Process swapper/0 (pid: 1, stack limit = > 0x(ptrval)) > <4>[3.164791] Call trace: > <4>[3.164797] msm_read.isra.2+0x20/0x50 > <4>[3.164804] msm_reset_dm_count+0x44/0x80 > <4>[3.164810] __msm_console_write+0x1c8/0x1d0 > <4>[3.164816] msm_serial_early_write_dm+0x3c/0x50 > <4>[3.164823] console_unlock.part.6+0x468/0x528 > <4>[3.164829] vprintk_emit+0x210/0x218 > <4>[3.164835] vprintk_default+0x48/0x58 > <4>[3.164841] vprintk_func+0xf0/0x1c0 > <4>[3.164847] printk+0x74/0x94 > <4>[3.164853] sci_init+0x24/0x3c > <4>[3.164859] do_one_initcall+0x54/0x248 > <4>[3.164866] kernel_init_freeable+0x210/0x378 > <4>[3.164872] kernel_init+0x18/0x118 > <4>[3.164878] ret_from_fork+0x10/0x1c > <0>[3.164884] Code: aa1e03e0 8b214273 97e616f7 d503201f (b9400260) > > Seems to be issue with the msm serial driver and not ftrace. > Could someone look into it. I'm guessing that there may have been an issue with ftrace, it tried to print, but the printing caused a bug that rebooted the box. Does function tracing work after boot up? That is, without the ftrace=function, can you do: echo function > /sys/kernel/debug/tracing/current_tracer without any issue? > > One more thing is for pstore dmesg-ramoops, I had to change > late_initcall to postcore_initcall which brings the question as to why > we changed to late_initcall? > Simple git blame shows to support crypto compress api, but is it really > helpful? A lot of boottime issues can be caught with pstore enabled at > postcore_initcall rather than late_initcall, this backtrace > is just one example. Is there any way we could change this? Does it break if the crypto is not initialized? Perhaps add a command line flag to have it happen earlier: ramoops=earlyinit and add a postcore_initcall that checks if that flag is set, and if so, it does the work then, and the late_initcall() will do nothing. That way, you can still have unmodified kernels use pstore when it crashes at boot up. -- Steve
Re: Crash in msm serial on dragonboard with ftrace bootargs
On 10/16/2018 5:14 PM, Greg Kroah-Hartman wrote: On Tue, Oct 16, 2018 at 05:08:25PM +0530, Sai Prakash Ranjan wrote: Hi, On dragonboard 410c, with "ftrace=function" boot args, the console output slows down and board resets without any backtrace as below. This is tested on latest kernel and seems to exist even in older kernels as well. [2.949164] EINJ: ACPI disabled. [3.133001] Serial: 8250/16550 dri Format: Log Type - Time(microsec) - Message - Optional Info Log Type: B - Since Boot(Power On Reset), D - Delta, S - Statistic But with pstore enabled, able to get the below backtrace: <4>[2.949164] EINJ: ACPI disabled. <6>[3.133001] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled <6>[3.164097] SuperH (H)SCI(F) driver initialized <0>[3.164471] Internal error: synchronous external abort: 9610 [#1] PREEMPT SMP <4>[3.164479] Modules linked in: <4>[3.164495] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc8-8-ge033b9909fff-dirty #175 <4>[3.164501] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) <4>[3.164508] pstate: 4085 (nZcv daIf -PAN -UAO) <4>[3.164514] pc : msm_read.isra.2+0x20/0x50 <4>[3.164520] lr : msm_read.isra.2+0x1c/0x50 <4>[3.164526] sp : 08033a50 <4>[3.164531] x29: 08033a50 x28: 09486018 <4>[3.164548] x27: 0001 x26: 7dfffe7ff070 <4>[3.164565] x25: 0034 x24: 09486000 <4>[3.164582] x23: x22: 0978e190 <4>[3.164599] x21: 095e8228 x20: 0034 <4>[3.164616] x19: 7dfffe7ff008 x18: <4>[3.164632] x17: x16: <4>[3.164649] x15: 094a96c8 x14: 8978e6bf <4>[3.164666] x13: 0978e6cd x12: 0038 <4>[3.164683] x11: 094c6000 x10: 0c24 <4>[3.164699] x9 : 80003c89b400 x8 : 08033970 <4>[3.164716] x7 : 8eb04100 x6 : 000af304 <4>[3.164732] x5 : 0c40 x4 : 80003c06f000 <4>[3.164750] x3 : 80003c89b498 x2 : <4>[3.164766] x1 : 80003ca68000 x0 : 0800 <0>[3.164785] Process swapper/0 (pid: 1, stack limit = 0x(ptrval)) <4>[3.164791] Call trace: <4>[3.164797] msm_read.isra.2+0x20/0x50 <4>[3.164804] msm_reset_dm_count+0x44/0x80 <4>[3.164810] __msm_console_write+0x1c8/0x1d0 <4>[3.164816] msm_serial_early_write_dm+0x3c/0x50 <4>[3.164823] console_unlock.part.6+0x468/0x528 <4>[3.164829] vprintk_emit+0x210/0x218 <4>[3.164835] vprintk_default+0x48/0x58 <4>[3.164841] vprintk_func+0xf0/0x1c0 <4>[3.164847] printk+0x74/0x94 <4>[3.164853] sci_init+0x24/0x3c <4>[3.164859] do_one_initcall+0x54/0x248 <4>[3.164866] kernel_init_freeable+0x210/0x378 <4>[3.164872] kernel_init+0x18/0x118 <4>[3.164878] ret_from_fork+0x10/0x1c <0>[3.164884] Code: aa1e03e0 8b214273 97e616f7 d503201f (b9400260) Seems to be issue with the msm serial driver and not ftrace. Could someone look into it. As you have the hardware to reproduce this, and it has always been an issue, you are the best placed to help resolve this. I would definitely test if serial guys could point somewhere, not that the backtrace is not pointing :) but I am not aware of this serial driver. Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: Crash in msm serial on dragonboard with ftrace bootargs
On Tue, Oct 16, 2018 at 05:08:25PM +0530, Sai Prakash Ranjan wrote: > Hi, > > On dragonboard 410c, with "ftrace=function" boot args, the console output > slows down and board resets without any backtrace as below. This is tested > on latest kernel and seems to exist even in older kernels as well. > > [2.949164] EINJ: ACPI disabled. > [3.133001] Serial: 8250/16550 dri > Format: Log Type - Time(microsec) - Message - Optional Info > Log Type: B - Since Boot(Power On Reset), D - Delta, S - Statistic > > But with pstore enabled, able to get the below backtrace: > > <4>[2.949164] EINJ: ACPI disabled. > <6>[3.133001] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled > <6>[3.164097] SuperH (H)SCI(F) driver initialized > <0>[3.164471] Internal error: synchronous external abort: 9610 [#1] > PREEMPT SMP > <4>[3.164479] Modules linked in: > <4>[3.164495] CPU: 2 PID: 1 Comm: swapper/0 Not tainted > 4.19.0-rc8-8-ge033b9909fff-dirty #175 > <4>[3.164501] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC > (DT) > <4>[3.164508] pstate: 4085 (nZcv daIf -PAN -UAO) > <4>[3.164514] pc : msm_read.isra.2+0x20/0x50 > <4>[3.164520] lr : msm_read.isra.2+0x1c/0x50 > <4>[3.164526] sp : 08033a50 > <4>[3.164531] x29: 08033a50 x28: 09486018 > <4>[3.164548] x27: 0001 x26: 7dfffe7ff070 > <4>[3.164565] x25: 0034 x24: 09486000 > <4>[3.164582] x23: x22: 0978e190 > <4>[3.164599] x21: 095e8228 x20: 0034 > <4>[3.164616] x19: 7dfffe7ff008 x18: > <4>[3.164632] x17: x16: > <4>[3.164649] x15: 094a96c8 x14: 8978e6bf > <4>[3.164666] x13: 0978e6cd x12: 0038 > <4>[3.164683] x11: 094c6000 x10: 0c24 > <4>[3.164699] x9 : 80003c89b400 x8 : 08033970 > <4>[3.164716] x7 : 8eb04100 x6 : 000af304 > <4>[3.164732] x5 : 0c40 x4 : 80003c06f000 > <4>[3.164750] x3 : 80003c89b498 x2 : > <4>[3.164766] x1 : 80003ca68000 x0 : 0800 > <0>[3.164785] Process swapper/0 (pid: 1, stack limit = > 0x(ptrval)) > <4>[3.164791] Call trace: > <4>[3.164797] msm_read.isra.2+0x20/0x50 > <4>[3.164804] msm_reset_dm_count+0x44/0x80 > <4>[3.164810] __msm_console_write+0x1c8/0x1d0 > <4>[3.164816] msm_serial_early_write_dm+0x3c/0x50 > <4>[3.164823] console_unlock.part.6+0x468/0x528 > <4>[3.164829] vprintk_emit+0x210/0x218 > <4>[3.164835] vprintk_default+0x48/0x58 > <4>[3.164841] vprintk_func+0xf0/0x1c0 > <4>[3.164847] printk+0x74/0x94 > <4>[3.164853] sci_init+0x24/0x3c > <4>[3.164859] do_one_initcall+0x54/0x248 > <4>[3.164866] kernel_init_freeable+0x210/0x378 > <4>[3.164872] kernel_init+0x18/0x118 > <4>[3.164878] ret_from_fork+0x10/0x1c > <0>[3.164884] Code: aa1e03e0 8b214273 97e616f7 d503201f (b9400260) > > Seems to be issue with the msm serial driver and not ftrace. > Could someone look into it. As you have the hardware to reproduce this, and it has always been an issue, you are the best placed to help resolve this. Good luck! greg k-h
Crash in msm serial on dragonboard with ftrace bootargs
Hi, On dragonboard 410c, with "ftrace=function" boot args, the console output slows down and board resets without any backtrace as below. This is tested on latest kernel and seems to exist even in older kernels as well. [2.949164] EINJ: ACPI disabled. [3.133001] Serial: 8250/16550 dri Format: Log Type - Time(microsec) - Message - Optional Info Log Type: B - Since Boot(Power On Reset), D - Delta, S - Statistic But with pstore enabled, able to get the below backtrace: <4>[2.949164] EINJ: ACPI disabled. <6>[3.133001] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled <6>[3.164097] SuperH (H)SCI(F) driver initialized <0>[3.164471] Internal error: synchronous external abort: 9610 [#1] PREEMPT SMP <4>[3.164479] Modules linked in: <4>[3.164495] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc8-8-ge033b9909fff-dirty #175 <4>[3.164501] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) <4>[3.164508] pstate: 4085 (nZcv daIf -PAN -UAO) <4>[3.164514] pc : msm_read.isra.2+0x20/0x50 <4>[3.164520] lr : msm_read.isra.2+0x1c/0x50 <4>[3.164526] sp : 08033a50 <4>[3.164531] x29: 08033a50 x28: 09486018 <4>[3.164548] x27: 0001 x26: 7dfffe7ff070 <4>[3.164565] x25: 0034 x24: 09486000 <4>[3.164582] x23: x22: 0978e190 <4>[3.164599] x21: 095e8228 x20: 0034 <4>[3.164616] x19: 7dfffe7ff008 x18: <4>[3.164632] x17: x16: <4>[3.164649] x15: 094a96c8 x14: 8978e6bf <4>[3.164666] x13: 0978e6cd x12: 0038 <4>[3.164683] x11: 094c6000 x10: 0c24 <4>[3.164699] x9 : 80003c89b400 x8 : 08033970 <4>[3.164716] x7 : 8eb04100 x6 : 000af304 <4>[3.164732] x5 : 0c40 x4 : 80003c06f000 <4>[3.164750] x3 : 80003c89b498 x2 : <4>[3.164766] x1 : 80003ca68000 x0 : 0800 <0>[3.164785] Process swapper/0 (pid: 1, stack limit = 0x(ptrval)) <4>[3.164791] Call trace: <4>[3.164797] msm_read.isra.2+0x20/0x50 <4>[3.164804] msm_reset_dm_count+0x44/0x80 <4>[3.164810] __msm_console_write+0x1c8/0x1d0 <4>[3.164816] msm_serial_early_write_dm+0x3c/0x50 <4>[3.164823] console_unlock.part.6+0x468/0x528 <4>[3.164829] vprintk_emit+0x210/0x218 <4>[3.164835] vprintk_default+0x48/0x58 <4>[3.164841] vprintk_func+0xf0/0x1c0 <4>[3.164847] printk+0x74/0x94 <4>[3.164853] sci_init+0x24/0x3c <4>[3.164859] do_one_initcall+0x54/0x248 <4>[3.164866] kernel_init_freeable+0x210/0x378 <4>[3.164872] kernel_init+0x18/0x118 <4>[3.164878] ret_from_fork+0x10/0x1c <0>[3.164884] Code: aa1e03e0 8b214273 97e616f7 d503201f (b9400260) Seems to be issue with the msm serial driver and not ftrace. Could someone look into it. One more thing is for pstore dmesg-ramoops, I had to change late_initcall to postcore_initcall which brings the question as to why we changed to late_initcall? Simple git blame shows to support crypto compress api, but is it really helpful? A lot of boottime issues can be caught with pstore enabled at postcore_initcall rather than late_initcall, this backtrace is just one example. Is there any way we could change this? Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation