Hi folks,
I'd like to keep the option of wrapping Perfmon's spinlocks,
especially those that do irqsave/restore. Why? Because it could allow
the implementation of NMIs on systems that don't support them in
hardware.
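To make the wrapping idea concrete, here is a minimal userspace sketch, not kernel code. Only the name pfm_spin_lock_irqsave comes from this thread. pfm_spin_unlock_irqrestore is assumed by analogy, and the sim_* helpers and pfm_demo_critical_section are hypothetical stand-ins for the real interrupt and lock primitives. The point is only that an architecture could define its own versions of the macros ahead of the generic fallback.

```c
#include <stdbool.h>

/* Userspace stand-ins for kernel primitives (hypothetical, for illustration). */
static bool irqs_enabled = true;   /* models the CPU interrupt-enable state */

typedef struct { int locked; } sim_spinlock_t;

/* Save the current interrupt state, then mask interrupts. */
static unsigned long sim_irq_save(void)
{
    unsigned long flags = irqs_enabled ? 1UL : 0UL;
    irqs_enabled = false;
    return flags;
}

/* Restore the interrupt state saved by sim_irq_save(). */
static void sim_irq_restore(unsigned long flags)
{
    irqs_enabled = (flags != 0);
}

/*
 * Generic fallback wrappers. An architecture that wants different masking
 * behaviour would define these macros before this point, and the #ifndef
 * would then skip the defaults.
 */
#ifndef pfm_spin_lock_irqsave
#define pfm_spin_lock_irqsave(l, f) \
    do { (f) = sim_irq_save(); (l)->locked = 1; } while (0)
#define pfm_spin_unlock_irqrestore(l, f) \
    do { (l)->locked = 0; sim_irq_restore(f); } while (0)
#endif

/* Runs a critical section; returns true if it ran locked with interrupts masked. */
static bool pfm_demo_critical_section(sim_spinlock_t *lock)
{
    unsigned long flags;
    bool masked;

    pfm_spin_lock_irqsave(lock, flags);
    masked = !irqs_enabled && lock->locked == 1;
    pfm_spin_unlock_irqrestore(lock, flags);
    return masked;
}
```

Callers only ever use the pfm_spin_* names, so swapping the masking strategy for one architecture would not touch the common perfmon code.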
Phil
On May 13, 2008, at 2:41 AM, Corey J Ashford wrote:
Hi Stephane,
I tried changing the wrapper for the PMU interrupt from
STD_EXCEPTION_PSERIES to MASKABLE_EXCEPTION_PSERIES, then ran the
test that usually crashed within a minute or two. It ran for 20
minutes non-stop. Then I tried removing POWER's defines for
pfm_spin_lock_irqsave (and so on), and that worked as expected: no
problems.
After seeing all of the complications working around the issue of
the wrapper used for the PMU interrupt, I'm about 95% convinced that
we should go with an #ifdef for CONFIG_PERFMON to use the
MASKABLE_EXCEPTION_PSERIES wrapper, and then remove POWER's
definitions for the locking macros. Optionally, we could get rid of
the pfm_spin* macros as well from perfmon, simplifying the code
(back to what it was originally).
The only downside of this is that if someone configures their kernel
with perfmon and then uses OProfile, they will not get any samples
from interrupt-disabled code in the kernel.
Any thoughts?
- Corey
Corey Ashford
Software Engineer
IBM Linux Technology Center, Linux Toolchain
Beaverton, OR
503-578-3507
[EMAIL PROTECTED]
[EMAIL PROTECTED] wrote on 05/12/2008 03:34:16 PM:
> Hi Stephane,
>
> Thanks for your response.
>
> I was thinking of trying out a kernel without the change, but I have
> a strong suspicion that we are in the interrupt handler because of
> the dreaded "soft interrupt disabling" optimization in POWER.
>
> I'm going to start looking at the schedule code, but I'm afraid we
> might be in a bind here, because the interrupt disable call is in
> common code, rather than in perfmon code where we could change it to
> a hard disable.
>
> I may have to start pushing for the PMU interrupt to use the
> MASKABLE_EXCEPTION_PSERIES wrapper, perhaps only when
> CONFIG_PERFMON is true.
>
> Regards,
>
> - Corey
>
> Corey Ashford
> Software Engineer
> IBM Linux Technology Center, Linux Toolchain
> Beaverton, OR
> 503-578-3507
> [EMAIL PROTECTED]
>
> "stephane eranian" <[EMAIL PROTECTED]> wrote on 05/12/2008 01:58:23 PM:
>
> > Corey,
> >
> > It looks like you have an interrupt masking issue. You should not get
> > into the interrupt handler while executing in schedule. Does this
> > happen when you do get into a resend_irq situation, or when you
> > don't?
> >
> >
> >
> > On Sat, May 10, 2008 at 1:55 AM, Corey Ashford <[EMAIL PROTECTED]> wrote:
> > > Hello Stephane,
> > >
> > > While trying to test my implementation of resend_irq for POWER, I ran
> > > into a perfmon2 problem, I think.
> > >
> > > In order to increase the number of PMU interrupts I'm getting, I decided
> > > to start with a PAPI C test case "first.c" and modify it so that it
> > > records both user and kernel domain, and adds a call to PAPI_overflow to
> > > set the threshold on PAPI_TOT_CYC to 1 million so that I'd get about
> > > 2000 PMU interrupts per second (2 GHz processor) just as a starting point.
> > >
> > > The overflow handler passed to PAPI_overflow() does nothing other than
> > > print a count of overflows received every 1000 counts.
> > >
> > > Well, I can run this test case to completion sometimes, but fairly often
> > > it will hang in the kernel with a stack trace similar to this:
> > >
> > > 3:mon> c0
> > > 0:mon> t
> > > [c00000002e32eef0] c0000000004b18d4 ._spin_lock+0x5c/0x88
> > > [c00000002e32ef70] c00000000004991c .task_rq_lock+0x68/0xcc
> > > [c00000002e32f010] c000000000049b48 .try_to_wake_up+0x40/0x1c0
> > > [c00000002e32f0d0] c000000000062758 .signal_wake_up+0x48/0x74
> > > [c00000002e32f160] c000000000062a88 .__group_send_sig_info+0xa8/0xcc
> > > [c00000002e32f200] c0000000000631fc .group_send_sig_info+0x64/0xa0
> > > [c00000002e32f2b0] c0000000000eca24 .send_sigio+0x124/0x1f0
> > > [c00000002e32f3f0] c0000000000ecb5c .__kill_fasync+0x6c/0xa0
> > > [c00000002e32f480] c0000000000ecbec .kill_fasync+0x5c/0x94
> > > [c00000002e32f520] c00000000024d6c4 .pfm_notify_user+0xd4/0xf0
> > > [c00000002e32f5a0] c00000000024d850 .pfm_ovfl_notify+0x170/0x198
> > > [c00000002e32f640] c00000000024b314 .pfm_interrupt_handler+0xbec/0xf9c
> > > [c00000002e32f7b0] d0000000000e82f4 .pfm_power5_irq_handler+0x40/0x80 [perfmon_power5]
> > > [c00000002e32f840] c000000000043dbc .powerpc_irq_handler+0x60/0x78
> > > [c00000002e32f8c0] c000000000023488 .performance_monitor_exception+0x38/0x50
> > > [c00000002e32f940] c000000000003d80 performance_monitor_common+0x100/0x180
> > > --- Exception: f00 (Performance Monitor) at c000000000045e38 .__enqueue_entity+0x3c/0xb8
> > > [c00000002e32fcb0] c00000000004e5e8 .put_prev_task_fair+0x74/0x98
> > > [c00000002e32fd40] c0000000004af708 .schedule+0x46c/0x768
> > > [c00000002e32fe30] c000000000008c54 do_work+0x14/0x34
> > > --- Exception: 901 (Decrementer) at 0000000010001ff8
> > > SP (ffffd320) is in userspace
> > >
> > >
> > > So what we have here is that spin_lock is getting called in the context
> > > of schedule(). That doesn't seem good to me, but I am not wise enough
> > > in the ways of the Linux kernel. Do you think this should work correctly?
> > >
> > > To make forward progress on resend_irq, I'm going to switch away from
> > > using an overflow handler to a sampler test case, so this shouldn't stop
> > > my progress, it will just slow me down a bit.
> > >
> > > Regards,
> > >
> > > - Corey
> > >
> > > --
> > > Corey Ashford
> > > Software Engineer
> > > IBM Linux Technology Center, Linux Toolchain
> > > Beaverton, OR
> > > 503-578-3507
> > > [EMAIL PROTECTED]
> > >
> > >
> > >
> > > _______________________________________________
> > > perfmon2-devel mailing list
> > > [email protected]
> > > https://lists.sourceforge.net/lists/listinfo/perfmon2-devel