Re: [edk2] EFI GOP with manual vsync trigger

Michael Zimmermann Fri, 13 Nov 2015 22:54:04 -0800

I think I've made some progress.
Thx to the fact that the only thing stuck is the ARM thread mode and the
timer interrupts are still fired(I verified that my callback is called from
an interrupt, being it a fluke or not) I could easily dump the current PC
of the DxeCore context.


This way I discovered that the code is stuck in the delay functions of the
timerLib because they're not thread safe.

The reason is that a hw counter gets reset and started for this and if this
happens from an interrupt(UART uses delay functions) while the thread mode
is whiting for the counter to reach an specific value it will just wait
forever because the counter gets disabled after the delay.

Code:
usecs = (usecs * 33 + 1000 - 33) / 1000;

writel(0, GPT_CLEAR);
writel(0, GPT_ENABLE);
while (readl(GPT_COUNT_VAL) != 0) ;

writel(GPT_ENABLE_EN, GPT_ENABLE);
while (readl(GPT_COUNT_VAL) < usecs) ;

writel(0, GPT_ENABLE);
writel(0, GPT_CLEAR);

I guess I need to try keeping the counter running and detect when it rolled
over to make it thread-safe.

Michael

On Fri, Nov 13, 2015 at 1:14 AM, Andrew Fish <af...@apple.com> wrote:

>
> > On Nov 12, 2015, at 11:21 AM, Michael Zimmermann <
> sigmaepsilo...@gmail.com> wrote:
> >
> > thx for this information.
> > I don't have any debug hw but UART.
> >
> > After some careful DEBUG printing and tracing the call stack using
> > '__builtin_return_address(0)'
>
> FYI the edk2 has a portable form of this RETURN_ADDRESS(0)
>
> > I found that the application's dead happens
> > at 'WaitForEvent'
>
> The WaitForEvent() is what happens at the shell prompt when it is waiting
> for keyboard input. So that is functioning as intended. gBS->WaitForEvent()
> is the power efficient way to wait.
>
> If you look at CoreWaitForEvent() it checks all the events, and then calls
> CoreSignalEvent with the gIdleLoopEvent. If the DXE Cpu driver supports
> this feature gIdleLoopEvent will put the CPU in a low power state until the
> next timer tick. Since  CoreWaitForEvent() checked all the event state it
> can not change until the next timer tick so this lets the system spend its
> idle time in a lower power mode.
>
>
> https://github.com/tianocore/edk2/blob/master/MdeModulePkg/Core/Dxe/Event/Event.c#L685
>   for(;;) {
>
>     for(Index = 0; Index < NumberOfEvents; Index++) {
>
>       Status = CoreCheckEvent (UserEvents[Index]);
>
>       //
>       // provide index of event that caused problem
>       //
>       if (Status != EFI_NOT_READY) {
>         if (UserIndex != NULL) {
>           *UserIndex = Index;
>         }
>         return Status;
>       }
>     }
>
>     //
>     // Signal the Idle event
>     //
>     CoreSignalEvent (gIdleLoopEvent);
>   }
>
> So if you are really stuck here it could imply your Simple Text In is
> hung? Maybe your issue is data corruption vs an event issue?
>
> > and that there aren't any long running events(it returns
> > from CoreDispatchEventNotifies and CoreRestoreTpl everytime).
> >
> > I'm not sure this is always the case though because without the DEBUG's
> the
> > application can even stop when the edk2 shell is printing the device map
> > and WaitForEvent usually is only used for key input.
> >
> > isn't there some GCC option to save the stacktrace so I can dump it after
> > some seconds runtime(when the system died) and just print the whole
> > backtrace?(like the linux kernel does).
> >
>
> Well this is when it gets tricky. For VC++ you can only _ReturnAddress(),
> and that is equivalent of __builtin_return_address(0) on the GCC side. But
> it gets worse... To get a stack trace for X64 with VC++ you need symbols as
> the unwind is in the .PDB file.
>
> For clang, and I think with the right GCC compiler flags, you can walk the
> stack as each routine saves the frame pointer on entry. This is a simple
> X64 C function that just returns 0. The push of the %rbp and saving %rsp
> allow the stack to be unwound in software.
>
>         pushq   %rbp
>         movq    %rsp, %rbp
>         xorl            %eax, %eax
>         popq    %rbp
>         retq
>
> So it is possible to unwind the stack in software.
>
> Simple answer is try RETURN_ADDRESS(1), RETURN_ADDRESS(2), etc. If it is
> supported the compiler will generate the code to unwind for you.
>
> Complex answer is write the stack unwind code, if it is possible? Which
> Processor and I assume you are using GCC?
>
> You can use the GetImageName() function in
> https://github.com/tianocore/edk2/blob/master/ArmPkg/Library/DefaultExceptionHandlerLib/DefaultExceptionHandlerUefi.c#L64
> to see how you can use the PE/COFF library functions to convert a
> FaultAddress to the ImageBase, and name of the image. If you know the
> offset into the image relative to the start you can load image in gdb from
> the build output directory and see what code it maps to.
>
> The X86 version of all this lives:
> https://github.com/tianocore/edk2/blob/master/UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.c
>
> Thanks,
>
> Andrew Fish
>
> > On Thu, Nov 12, 2015 at 7:39 PM, Andrew Fish <af...@apple.com> wrote:
> >
> >>
> >> On Nov 12, 2015, at 9:05 AM, Michael Zimmermann <
> sigmaepsilo...@gmail.com>
> >> wrote:
> >>
> >> thx, this perfectly explains my situation(that EDK2 shell stops while
> >> printing sth. like the map or waiting for this 5s startup.nsh timeout).
> >>
> >> So this means that processing the event queue is caused by any API call
> >> which never returns to the DXE phase for some reason.
> >>
> >>
> >> You should also check your code looking for TPL violations. Calling EFI
> >> Services at to high a TPL can lead to undefined behavior.
> >>
> >> If you look at the UEFI 2.5 spec, Section 6.1 Event, Timer, and Task
> >> Priority Services Table 23 lists TPL Restrictions. You can check that
> your
> >> code is not violating any of these restrictions. The natural reaction
> for
> >> some one having performance issues is to cheat and try to elevate their
> >> TPL.
> >>
> >>
> >> ...
> >>
> >> The class of bugs created by calling EFI Services at to hight a TPL are
> >> usually reentrancy related. Basically the locks in the DXE Core raise
> TPL
> >> to protect critical sections, like the protocol data base or event
> queue.
> >> Calling at an illegal TPL can cause corruption of some of these
> structures,
> >> and thus undefined behavior.
> >>
> >> Another possibility is an event TPL deadlock. When your event runs at a
> >> TPL only events of a higher TPL can run. Your event is blocking code at
> <=
> >> current TPL from running. So if your event is waiting on something to
> >> complete that needs to run at <= the current TPL that code is starved
> and
> >> will never get run. The TPL Restriction table is the guide here also.
> For
> >> example if you driver implements Block IO it can be called at TPL <=
> >> TPL_CALLBACK, so if your driver has an event that needs to complete to
> make
> >> forward progress it needs to run at TPL_NOTIFY.
> >>
> >> Since you are producing GOP, there is going to be system code that
> >> implements Simple Text Output on top of GOP so you inherit those TPL
> >> restrictions.
> >>
> >> Is there a special way to debug the event system or do I have to put
> DEBUG
> >> calls all over the place?
> >>
> >>
> >> Actually putting DEBUG prints all over the place can make it worse as
> you
> >> can increase the amount of time spent processing events.
> >>
> >> I'm not sure if you have a debugger, but if you do....
> >> 1) I'd break in a few times during the hang to try and get an idea of
> >> which event is taking all the time from the back trace
> >> 2) Pay attention to the TPL when you break in, dump gEfiCurrentTpl
> >> 3) You can walk the gEventQueue[] LIST_ENTRY to see what events are
> >> active. >= your current TPL.
> >> 4) As I mentioned I have lldb scripts to dump queues so you can figure
> out
> >> the high frequency timer functions and inspect that code.
> >>
> >> If you don't have a debugger...
> >> 1) Add DEBUG print to CoreCreateEventInternal() and print out info about
> >> timer events that get created. Then you can look at the timer event
> >> functions.
> >> 2) CoreDispatchEventNotifies() calls the event notify functions.
> >>  Event->NotifyFunction (Event, Event->NotifyContext);
> >> You can use the TimerLib to try and figure out what event is taking al
> the
> >> time. You can DEBUG print the slowest function, and maybe only do it
> every
> >> Nth time so you don't do it 1,000 times a second.
> >> You could also add code to figure out if the function is running longer
> >> than (or some large percentage of) its period.
> >>
> >>
> >> Caveat Emptor ....
> >> Assume it is your event and figure out if you are doing something slow
> at
> >> an elevated TPL,  calling a service at too hight a TPL., or have TPL
> >> related deadlock based on the TPL restrictions.
> >>
> >> Thanks,
> >>
> >> Andrew Fish
> >>
> >> PS Sorry if this is more detail than you needed, but I figured other
> folks
> >> might find this useful.
> >>
> >>
> >> On Thu, Nov 12, 2015 at 5:44 PM, Andrew Fish <af...@apple.com> wrote:
> >>
> >>>
> >>>> On Nov 12, 2015, at 3:28 AM, Andrew Fish <af...@apple.com> wrote:
> >>>>
> >>>>>
> >>>>> On Nov 12, 2015, at 3:22 AM, Andrew Fish <af...@apple.com> wrote:
> >>>>>
> >>>>>
> >>>>>> On Nov 12, 2015, at 12:49 AM, Michael Zimmermann <
> >>> sigmaepsilo...@gmail.com> wrote:
> >>>>>>
> >>>>>> Stall was just an example, I can also use DEBUG. My timer interval
> is
> >>>>>> 100ms(I converted it from the 100ns unit).
> >>>>>>
> >>>>>> What I meant with "thread mode code" is the "normal" non IRQ context
> >>> code
> >>>>>> running on the CPU. That means that there are actually two contexts
> in
> >>>>>> EDK2, the exception(i.e. IRQ's like the Timer/Watchdog) and the
> >>> normal mode
> >>>>>> all code is running in.
> >>>>>>
> >>>>>
> >>>>> OK thanks that helps. The timer ISR runs in interrupt context, and
> >>> from an ARM point of view EFI is running the ARM "Thread mode".  The
> ISR
> >>> context is an implementation detail and not really defined by the
> >>> specification.
> >>>>>
> >>>>> I'd also point out the Events dispatch independent of the interrupt
> >>> context. gBS->RestoreTpl() can cause events to dispatch. For example a
> lot
> >>> of the EFI Protocol services have locks that raise TPL to prevent
> >>> recursion. When these locks are released and the TPL is restored
> events are
> >>> dispatched. So just calling EFI services can cause events to run. So
> >>> conceptually you can ignore the interrupt context, as that is used to
> >>> implement the timer tick. Every thing else is an event that runs in the
> >>> main EFI context.
> >>>>>
> >>>>>
> >>>
> https://github.com/tianocore/edk2/blob/master/MdeModulePkg/Core/Dxe/Event/Tpl.c#L126
> >>>>>
> >>>>
> >>>> As Kinney pointed out the events are cooperative and there is no
> >>> scheduler, so if you are getting stuck some chunk of code is running
> too
> >>> long at an elevated TPL. You may need to performance profile to figure
> out
> >>> the bad code.
> >>>>
> >>>
> >>> The events are managed by queues in the DXE Core. The events are
> >>> described by the IEVENT data structure and linked into queues based on
> >>> state. The state transition to the event is calling a C function
> stored in
> >>> the IEVENT. The event must return, or call an EFI service, to give
> control
> >>> back to the DXE Core.
> >>>
> >>>
> >>>
> https://github.com/tianocore/edk2/blob/master/MdeModulePkg/Core/Dxe/Event/Event.c
> >>>
> >>> ///
> >>> /// gEventQueueLock - Protects the event queues
> >>> ///
> >>> EFI_LOCK gEventQueueLock = EFI_INITIALIZE_LOCK_VARIABLE
> (TPL_HIGH_LEVEL);
> >>>
> >>> ///
> >>> /// gEventQueue - A list of event's to notify for each priority level
> >>> ///
> >>> LIST_ENTRY      gEventQueue[TPL_HIGH_LEVEL + 1];
> >>>
> >>> ///
> >>> /// gEventPending - A bitmask of the EventQueues that are pending
> >>> ///
> >>> UINTN           gEventPending = 0;
> >>>
> >>> ///
> >>> /// gEventSignalQueue - A list of events to signal based on EventGroup
> >>> type
> >>> ///
> >>> LIST_ENTRY      gEventSignalQueue = INITIALIZE_LIST_HEAD_VARIABLE
> >>> (gEventSignalQueue);
> >>>
> >>> It is possible to write a debugger script to dump out the info about
> the
> >>> events if you have source level debug available for the DXE Core.
> >>>
> >>> Thanks,
> >>>
> >>> Andrew Fish
> >>>
> >>>> Thanks,
> >>>>
> >>>> Andrew Fish
> >>>>
> >>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Andrew Fish
> >>>>>
> >>>>>
> >>>>>> On Thu, Nov 12, 2015 at 9:27 AM, Andrew Fish <af...@apple.com>
> wrote:
> >>>>>>
> >>>>>>>
> >>>>>>>> On Nov 11, 2015, at 11:51 PM, Michael Zimmermann <
> >>>>>>> sigmaepsilo...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> I've started investigating in the timer event problem and I think
> I
> >>> have
> >>>>>>>> some weird problem with my platform drivers(I hope, so it's not a
> >>> EDK2
> >>>>>>> bug).
> >>>>>>>>
> >>>>>>>> If I create a timer that runs every 100ms which does nothing but a
> >>>>>>>> stall(1), the thread mode code stops after some random
> time(usually
> >>> in
> >>>>>>> edk2
> >>>>>>>> shell so I guess it's a race condition which needs some cpu load).
> >>>>>>>>
> >>>>>>>
> >>>>>>> Michael,
> >>>>>>>
> >>>>>>> Watch out as Stall() is Microseconds, and SetTimer() is 100ns I've
> >>> seen
> >>>>>>> bugs like that before in code.
> >>>>>>>
> >>>>>>>> When threadmode code is stopped the timer continues getting called
> >>> and
> >>>>>>> even
> >>>>>>>> if I stop the timer afterwards(with CloseEvent) it keeps being
> >>> stopped.
> >>>>>>>>
> >>>>>>>> Is there a way to get the threadmode context from inside a timer
> >>>>>>> callback?
> >>>>>>>> This way I could read the PC to check what's going on.
> >>>>>>>>
> >>>>>>>
> >>>>>>> There are no threads. EFI is an event model. If your code is
> running
> >>> you
> >>>>>>> are blocking every one else's forward progress. Only code running
> at
> >>> a
> >>>>>>> higher TPL can preempt. So when your event is running it is
> blocking
> >>> the
> >>>>>>> main flow and any event trying to run at <= TPL of your event from
> >>> making
> >>>>>>> forward progress. gBS->Stall() does not yield, it is no different
> >>> than
> >>>>>>> running code.
> >>>>>>>
> >>>>>>> So there is only one context, you can print it out any time you
> want.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>> Andrew Fish
> >>>>>>>
> >>>>>>> PS When folks yell at us for not having threads in EFI we point
> them
> >>> at:
> >>>>>>> https://web.stanford.edu/~ouster/cgi-bin/papers/threads.pdf
> >>>>>>>
> >>>>>>>
> >>>>>>>> Michael
> >>>>>>>>
> >>>>>>>> On Wed, Nov 11, 2015 at 8:10 PM, Kinney, Michael D <
> >>>>>>>> michael.d.kin...@intel.com> wrote:
> >>>>>>>>
> >>>>>>>>> Michael,
> >>>>>>>>>
> >>>>>>>>> A periodic event timer at 30 times a second should not cause
> pauses
> >>>>>>>>> forever, unless the action you are performing in the event
> >>> notification
> >>>>>>>>> function takes more than 1/30 of a second to complete.  You
> should
> >>> be
> >>>>>>> able
> >>>>>>>>> to just add a periodic event handler that does nothing, so you
> can
> >>>>>>> measure
> >>>>>>>>> what the overhead is.
> >>>>>>>>>
> >>>>>>>>> Another option is to use the performance counter in the TimerLib
> >>> each
> >>>>>>> time
> >>>>>>>>> a Blt() is called (GetPerformanceCounterProperties() and
> >>>>>>>>> GetPerformanceCounter()).  When Blt() is called frequently, the
> >>> amount
> >>>>>>> of
> >>>>>>>>> time since last vsync will have elapsed, and you can go the vsync
> >>> action
> >>>>>>>>> within the Blt() call.  If you also set a one shot timer event,
> so
> >>> if
> >>>>>>> the
> >>>>>>>>> last call to Blt() did not do a vsync and there are no more Blt()
> >>> calls,
> >>>>>>>>> 1/30th of a second later, the vsync action can be done.  Every
> time
> >>>>>>> Blt()
> >>>>>>>>> is called, the one shot timer can be re-armed.  This way, the one
> >>> shot
> >>>>>>>>> timer event is not actually executed very often.
> >>>>>>>>>
> >>>>>>>>> Mike
> >>>>>>>>>
> >>>>>>>>>> -----Original Message-----
> >>>>>>>>>> From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On
> >>> Behalf Of
> >>>>>>>>> Michael Zimmermann
> >>>>>>>>>> Sent: Wednesday, November 11, 2015 12:32 AM
> >>>>>>>>>> To: edk2-devel@lists.01.org
> >>>>>>>>>> Subject: [edk2] EFI GOP with manual vsync trigger
> >>>>>>>>>>
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> my Graphics HW uses a manual vsync trigger. That means that
> after
> >>>>>>> drawing
> >>>>>>>>>> to the framebuffer I need to manually trigger vsync(you can
> >>> compare it
> >>>>>>> to
> >>>>>>>>>> switching between double buffers).
> >>>>>>>>>>
> >>>>>>>>>> The problem is that UEFI's GraphicsOutputProtocol(GOP) doesn't
> >>> take
> >>>>>>> care
> >>>>>>>>> of
> >>>>>>>>>> HW that needs a flush.
> >>>>>>>>>> While issuing the trigger after every Blt Operation works, this
> >>>>>>> obviously
> >>>>>>>>>> causes extremely slow rendering for applications like the Shell
> >>> which
> >>>>>>> call
> >>>>>>>>>> Blt very often(like for every character).
> >>>>>>>>>>
> >>>>>>>>>> Also I can't use a timer to set the trigger(like 30times a
> second)
> >>>>>>> because
> >>>>>>>>>> it takes too much time and the Timer Interrupt ends up consuming
> >>> too
> >>>>>>> much
> >>>>>>>>>> time and the "normal" code gets paused forever.
> >>>>>>>>>>
> >>>>>>>>>> Do you have any other ideas how to handle this?
> >>>>>>>>>>
> >>>>>>>>>> Thx
> >>>>>>>>>> Michael
> >>>>>>>>>> _______________________________________________
> >>>>>>>>>> edk2-devel mailing list
> >>>>>>>>>> edk2-devel@lists.01.org
> >>>>>>>>>> https://lists.01.org/mailman/listinfo/edk2-devel
> >>>>>>>>>
> >>>>>>>> _______________________________________________
> >>>>>>>> edk2-devel mailing list
> >>>>>>>> edk2-devel@lists.01.org
> >>>>>>>> https://lists.01.org/mailman/listinfo/edk2-devel
> >>>>>>>
> >>>>>>>
> >>>>>> _______________________________________________
> >>>>>> edk2-devel mailing list
> >>>>>> edk2-devel@lists.01.org
> >>>>>> https://lists.01.org/mailman/listinfo/edk2-devel
> >>>>>
> >>>>> _______________________________________________
> >>>>> edk2-devel mailing list
> >>>>> edk2-devel@lists.01.org
> >>>>> https://lists.01.org/mailman/listinfo/edk2-devel
> >>>
> >>>
> >>
> >>
> >>
> > _______________________________________________
> > edk2-devel mailing list
> > edk2-devel@lists.01.org
> > https://lists.01.org/mailman/listinfo/edk2-devel
>
>
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Re: [edk2] EFI GOP with manual vsync trigger

Reply via email to