Hi,

On 14/11/2017 15:41, Brian J. Johnson wrote:
On 11/14/2017 11:23 AM, Andrew Fish wrote:

On Nov 14, 2017, at 8:33 AM, Brian J. Johnson <brian.john...@hpe.com> wrote:

On 11/14/2017 09:37 AM, Paulo Alcantara wrote:
Hi Fan,
On 14/11/2017 12:03, Fan Jeff wrote:
Paul,

I like this feature very much. Actually, I did a POC a year ago, but I did not finalize it.

In my POC, I could use EBP to track the stack frame on the IA32 arch.

But for X64, I tried to use the --keepexceptiontable flag to interpret the stack frame from the debug section of the image.

It may work on the MSFT toolchain, but it did not work well for the GCC toolchain.

I think Eric could help verify your patch with MSFT. If it works well, that will be great!

Again, I like this feature! :-)
Cool! Your help would be really appreciated! If we get this working for X64 in both toolchains, it should be easy to port to IA-32 as well.
Thank you very much for being willing to help with that.
Paulo

Great feature!  You do need some sort of sanity check on the RIP and RBP values, though, so if the stack gets corrupted or the RIP is nonsense from following a bad pointer, you don't start dereferencing garbage addresses and trigger an exception loop.


Brian,

This was a long time ago and my memory might be fuzzy... I think we talked to some debugger folks about unwinding the stack, and they mentioned it was common for the C runtime to set a return address or frame pointer to zero so the unwind logic knows when to stop. This is in addition to generic sanity checking.

We got an extra push $0 added to the stack switch to help with stack unwind. https://github.com/tianocore/edk2/blob/master/MdePkg/Library/BaseLib/X64/SwitchStack.S

It might be a good idea to have a PCD for the maximum number of stack frames to display, as a fallback for the error check. For X64 you may also have to add a check for a non-canonical address, as dereferencing one will cause a GP fault.
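
A minimal sketch of the checks discussed above (zero terminator, a frame-count cap, and a canonical-address test), written as plain C so it can be tried outside firmware. STACK_VIEW, ReadStackWord, WalkFrames and MAX_STACK_FRAMES are hypothetical names, not part of the proposed patch; the cap stands in for the suggested PCD, and the canonical test assumes 48-bit virtual addresses (4-level paging).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_STACK_FRAMES  32  /* hypothetical stand-in for a PCD value */

/* A bounded view of stack memory: Words[i] lives at address Base + 8*i. */
typedef struct {
  uint64_t        Base;
  size_t          Count;
  const uint64_t  *Words;
} STACK_VIEW;

/* Canonical means bits 63..47 are all zeros or all ones (48-bit VA assumed). */
static bool
IsCanonicalAddress (uint64_t Address)
{
  uint64_t  Upper = Address >> 47;

  return Upper == 0 || Upper == 0x1FFFF;
}

/* Read the 8-byte word at Address only if it is aligned and inside the view. */
static bool
ReadStackWord (const STACK_VIEW *View, uint64_t Address, uint64_t *Value)
{
  uint64_t  Offset;

  if (Address < View->Base || (Address & 7) != 0) {
    return false;
  }
  Offset = (Address - View->Base) / 8;
  if (Offset >= View->Count) {
    return false;
  }
  *Value = View->Words[Offset];
  return true;
}

/* Follow the saved-RBP chain; returns the number of return addresses stored. */
static size_t
WalkFrames (const STACK_VIEW *View, uint64_t Rbp, uint64_t *Rips, size_t MaxFrames)
{
  size_t    Found = 0;
  uint64_t  NextRbp;
  uint64_t  Rip;

  while (Found < MaxFrames) {
    /* [Rbp] holds the caller's RBP, [Rbp + 8] holds the return address. */
    if (!ReadStackWord (View, Rbp, &NextRbp) ||
        !ReadStackWord (View, Rbp + 8, &Rip)) {
      break;
    }
    if (Rip == 0 || !IsCanonicalAddress (Rip)) {
      break;  /* zero terminator, or an address that would fault */
    }
    Rips[Found++] = Rip;
    if (NextRbp <= Rbp) {
      break;  /* zero terminator, or a chain that fails to move up the stack */
    }
    Rbp = NextRbp;
  }
  return Found;
}
```

In firmware, Words would simply alias the real stack region, and the bounds of the view would come from whatever stack limits the platform knows about.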


Good idea.

Regarding sanity checks:  I've had good luck validating code locations (EIP values) by using a modified PeCoffExtraActionLib to track the top and bottom of the range where images have been loaded.  (I've actually used two ranges:  one for code executed from firmware space, and one for code executed from RAM.)
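
The modified PeCoffExtraActionLib is not shown in this thread, so the following is only a sketch of how such top/bottom tracking might look. CODE_RANGE and its helpers are hypothetical names; presumably CodeRangeAddImage would be called from the library's image-load hook with the image's address and size. Note that a single range is coarse: gaps between images also count as "inside".

```c
#include <stdbool.h>
#include <stdint.h>

/* Lowest address and one-past-highest address of any image seen so far. */
typedef struct {
  uint64_t  Bottom;
  uint64_t  Top;
} CODE_RANGE;

/* Start with an empty range that contains no address. */
static void
CodeRangeInit (CODE_RANGE *Range)
{
  Range->Bottom = UINT64_MAX;
  Range->Top    = 0;
}

/* Grow the range to cover a newly loaded image. */
static void
CodeRangeAddImage (CODE_RANGE *Range, uint64_t ImageBase, uint64_t ImageSize)
{
  /* Ignore empty images and sizes that would wrap around. */
  if (ImageSize == 0 || ImageBase > UINT64_MAX - ImageSize) {
    return;
  }
  if (ImageBase < Range->Bottom) {
    Range->Bottom = ImageBase;
  }
  if (ImageBase + ImageSize > Range->Top) {
    Range->Top = ImageBase + ImageSize;
  }
}

/* True if Address falls between the lowest and highest image seen. */
static bool
CodeRangeContains (const CODE_RANGE *Range, uint64_t Address)
{
  return Address >= Range->Bottom && Address < Range->Top;
}
```

Keeping two CODE_RANGE instances, one for firmware space and one for RAM, would match the two-range approach described above.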

I'm not sure offhand if there's a platform-independent way to validate stack pointer values.  For most PC-like systems, just ensuring that it's larger than 1 or 2M (to avoid NULL pointers and the legacy spaces) and less than about 3G (or the low memory size, if that's known) may be enough to avoid an exception loop.
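
As a rough illustration of that heuristic, assuming a PC-like memory map: the limits below (2 MiB and 3 GiB) are just the example figures from the paragraph above, the names are hypothetical, and the 8-byte alignment test is an extra assumption that seems reasonable for X64 frame pointers.

```c
#include <stdbool.h>
#include <stdint.h>

#define STACK_LOW_LIMIT   0x00200000ULL  /* 2 MiB: skips NULL and legacy areas */
#define STACK_HIGH_LIMIT  0xC0000000ULL  /* 3 GiB: or low memory size if known */

/* True if Sp is 8-byte aligned and lies in [Low, High). */
static bool
IsPlausibleStackPointer (uint64_t Sp, uint64_t Low, uint64_t High)
{
  return (Sp & 7) == 0 && Sp >= Low && Sp < High;
}
```

A platform that knows its actual stack base and size could pass those bounds instead of the defaults.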

Yeah, I agree with you guys. We certainly should be validating the RIP and RSP values and then avoiding the exception loop.

For the RIP value, I think we should validate it in PeCoffSearchImageBase(), so if it's outside the PE/COFF image's address space, we return an address of zero and no trace is printed.

Since we already have a "SizeOfImage" field in the PE/COFF Optional Header and it's available in the in-memory image, we might end up checking whether RIP lies between ImageBase and ImageBase + SizeOfImage - 1.
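
The range test itself could look roughly like this. IsAddressInImage is a hypothetical helper, not something in the patch; it assumes ImageBase and SizeOfImage have already been read from a validated PE/COFF header.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if Rip lies in [ImageBase, ImageBase + SizeOfImage - 1]. */
static bool
IsAddressInImage (uint64_t Rip, uint64_t ImageBase, uint32_t SizeOfImage)
{
  /* Subtract first so that ImageBase + SizeOfImage can never overflow. */
  return Rip >= ImageBase && (Rip - ImageBase) < SizeOfImage;
}
```

A SizeOfImage of zero makes the test fail for every address, which seems like the safe outcome for a bogus header.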

For the RSP value, I have no idea :-)

Thanks!
Paulo


Brian

Thanks,

Andrew Fish


For at least some versions of Microsoft's IA32 compiler, it's possible to compile using EBP as a stack frame base pointer (like gcc) by using the "/Oy-" switch.  The proposed unwind code should work in that case. The X64 compiler doesn't support this switch, though.

AFAIK the only way to unwind the stack with Microsoft's X64 compilers is to parse the unwind info in the .pdata and .xdata sections.  Genfw.exe usually strips those sections, but the "--keepexceptiontable" flag will preserve them, as Jeff pointed out.  I've looked hard for open source code to decode them, but haven't found any, even though the format is well documented.  And I haven't gotten around to writing it myself.  I'd love it if someone could contribute the code!
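
For anyone who wants to start on this, here is a sketch of the first step only: finding the .pdata entry that covers a given RVA. It assumes the documented x64 layout (three 32-bit RVAs per entry, table sorted by start address) and does not decode the unwind codes in .xdata at all. The struct and function names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* One .pdata entry on x64; all three fields are RVAs from the image base. */
typedef struct {
  uint32_t  BeginAddress;       /* first byte of the function */
  uint32_t  EndAddress;         /* one past the last byte of the function */
  uint32_t  UnwindInfoAddress;  /* RVA of the unwind info record in .xdata */
} RUNTIME_FUNCTION_ENTRY;

/* Binary search for the entry whose [BeginAddress, EndAddress) holds Rva. */
static const RUNTIME_FUNCTION_ENTRY *
FindRuntimeFunction (const RUNTIME_FUNCTION_ENTRY *Table, size_t Count, uint32_t Rva)
{
  size_t  Low  = 0;
  size_t  High = Count;

  while (Low < High) {
    size_t  Mid = Low + (High - Low) / 2;

    if (Rva < Table[Mid].BeginAddress) {
      High = Mid;
    } else if (Rva >= Table[Mid].EndAddress) {
      Low = Mid + 1;
    } else {
      return &Table[Mid];
    }
  }
  return NULL;  /* no entry covers this RVA */
}
```

A NULL result is not necessarily an error: as far as I understand the documented convention, leaf functions have no entry, and in that case the return address sits directly at [RSP].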

Another possibility is to use the branch history MSRs available on some x86-family processors.  Recent Intel processors can use them as a stack, as opposed to a circular list, so they can record a backtrace directly. (I'm not familiar with AMD processors' capabilities.)  You can enable call stack recording like this:

 #define IA32_DEBUGCTL       0x1D9
 #define LBR_ON_FLAG         0x0000000000000001
 #define MSR_LBR_SELECT      0x1C8
 #define CALL_STACK_SET_FLAG 0x3C4
 #define CALL_STACK_CLR_FLAG 0xFC7

 UINT64  LbControl;
 UINT64  LbSelect;

 //
 // Enable branch recording (set bit 0 of IA32_DEBUGCTL)
 //
 LbControl = AsmReadMsr64 ((UINT32)IA32_DEBUGCTL);
 LbControl |= LBR_ON_FLAG;
 AsmWriteMsr64 ((UINT32)IA32_DEBUGCTL, LbControl);

 //
 // Configure for call stack: keep only the bits in CALL_STACK_CLR_FLAG,
 // then set the bits in CALL_STACK_SET_FLAG
 //
 LbSelect = AsmReadMsr64 ((UINT32)MSR_LBR_SELECT);
 LbSelect &= CALL_STACK_CLR_FLAG;
 LbSelect |= CALL_STACK_SET_FLAG;
 AsmWriteMsr64 ((UINT32)MSR_LBR_SELECT, LbSelect);

The EIP/RIP values are logged in MSR_SANDY_BRIDGE_LASTBRANCH_n_FROM_IP and MSR_SANDY_BRIDGE_LASTBRANCH_n_TO_IP, and the current depth is tracked in MSR_LASTBRANCH_TOS.  This works quite well.  Gen10 (Skylake) processors support 32 LASTBRANCH_n MSR pairs, which is sufficient in almost all cases.

Different processor generations have different branch recording capabilities, and different numbers of LASTBRANCH_n MSRs; see Intel's manuals for details.
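
Once the MSRs have been read, turning them into a backtrace is mostly a matter of walking the entries newest-first from the TOS index. The sketch below works on plain arrays, so it assumes the caller has already copied the FROM_IP and TO_IP values (for example with AsmReadMsr64) and knows the depth for the processor at hand. It also assumes that TOS indexes the most recent record, that never-written entries read as zero, and that any format-specific flag bits in the upper bits have been masked off. BRANCH_RECORD and CollectBranchRecords are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
  uint64_t  From;  /* address of the branch (call) instruction */
  uint64_t  To;    /* address the branch went to */
} BRANCH_RECORD;

/* Copy up to MaxOut records newest-first, starting at Tos and wrapping. */
static size_t
CollectBranchRecords (
  const uint64_t  *FromIp,
  const uint64_t  *ToIp,
  size_t          Depth,
  size_t          Tos,
  BRANCH_RECORD   *Out,
  size_t          MaxOut
  )
{
  size_t  Found = 0;
  size_t  Index = Tos;

  if (Depth == 0 || Tos >= Depth) {
    return 0;  /* nonsense input, report nothing */
  }
  while (Found < Depth && Found < MaxOut) {
    if (FromIp[Index] == 0 && ToIp[Index] == 0) {
      break;  /* assumed to be an unused entry */
    }
    Out[Found].From = FromIp[Index];
    Out[Found].To   = ToIp[Index];
    Found++;
    /* Step to the next older entry, wrapping below index 0. */
    Index = (Index == 0) ? Depth - 1 : Index - 1;
  }
  return Found;
}
```

The Found < Depth bound guarantees termination even if every entry is populated.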

Thanks,
Brian


Thanks!

Jeff

From: Paulo Alcantara <pca...@zytor.com>
Sent: November 14, 2017 21:23
To: edk2-devel@lists.01.org
Cc: Rick Bramley <richard.bram...@hp.com>; Laszlo Ersek <ler...@redhat.com>; Andrew Fish <af...@apple.com>; Eric Dong <eric.d...@intel.com>
Subject: Re: [edk2] [RFC 0/1] Stack trace support in X64 exception handling

Hi,

On 14/11/2017 10:47, Paulo Alcantara wrote:
Hi,

This series adds stack trace support during an X64 CPU exception.

Information such as the backtrace, stack contents and image module names
(that were part of the call stack) will be dumped out.

We already have such support in ARM/AArch64 (IIRC) exception handling
(thanks to Ard), and then I thought we'd also deserve it in X64 and
IA-32 platforms.

What do you think, guys?

BTW, I've tested this only with OVMF (X64 only), using:
- gcc-6.3.0, GCC5, NOOPT

Any other testing would be really appreciated.

I've attached a file to show you what the trace would look like.

Thanks!
Paulo


Thanks!
Paulo

Repo: https://github.com/pcacjr/edk2.git
Branch: stacktrace_x64

Cc: Rick Bramley <richard.bram...@hp.com>
Cc: Andrew Fish <af...@apple.com>
Cc: Eric Dong <eric.d...@intel.com>
Cc: Laszlo Ersek <ler...@redhat.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Paulo Alcantara <pca...@zytor.com>
---

Paulo Alcantara (1):
UefiCpuPkg/CpuExceptionHandlerLib/X64: Add stack trace support

  UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ArchExceptionHandler.c | 344 +++++++++++++++++++-
1 file changed, 342 insertions(+), 2 deletions(-)


_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org <mailto:edk2-devel@lists.01.org>
https://lists.01.org/mailman/listinfo/edk2-devel


--

                                               Brian

--------------------------------------------------------------------

  "Most people would like to be delivered from temptation but would
   like it to keep in touch."
                                          -- Robert Orben