Hi folks,

I'd like to keep the option of wrapping Perfmon's spinlocks, especially those that do irqsave/restore. Why? Because it could allow the implementation of NMIs on systems that don't support them in hardware.

Phil

On May 13, 2008, at 2:41 AM, Corey J Ashford wrote:

Hi Stephane,

I changed the wrapper for the PMU interrupt from STD_EXCEPTION_PSERIES to MASKABLE_EXCEPTION_PSERIES and reran the test that usually crashed within a minute or two; this time it ran for 20 minutes non-stop. I then removed POWER's defines for pfm_spin_lock_irqsave (and so on), and that also worked as expected, with no problems.

After seeing all of the complications working around the issue of the wrapper used for the PMU interrupt, I'm about 95% convinced that we should go with an #ifdef for CONFIG_PERFMON to use the MASKABLE_EXCEPTION_PSERIES wrapper, and then remove POWER's definitions for the locking macros. Optionally, we could get rid of the pfm_spin* macros as well from perfmon, simplifying the code (back to what it was originally).

The only downside of this is that if someone configures their kernel with perfmon and then uses OProfile, they will not get any samples during interrupt-disabled code in the kernel.

Any thoughts?

- Corey

Corey Ashford
Software Engineer
IBM Linux Technology Center, Linux Toolchain
Beaverton, OR
503-578-3507
[EMAIL PROTECTED]

[EMAIL PROTECTED] wrote on 05/12/2008 03:34:16 PM:

> Hi Stephane,
>
> Thanks for your response.
>
> I was thinking of trying out a kernel without the change, but I have
> a strong suspicion that we are in the interrupt handler because of
> the dreaded "soft interrupt disabling" optimization in POWER.
>
> I'm going to start looking at the schedule code, but I'm afraid we
> might be in a bind here, because the interrupt disable call is in
> common code, rather than in perfmon code where we could change it to
> a hard disable.
>
> I may have to start pushing for the PMU interrupt to use the
> MASKABLE_EXCEPTION_PSERIES wrapper, perhaps only when CONFIG_PERFMON is true.
>
> Regards,
>
> - Corey
>
> Corey Ashford
> Software Engineer
> IBM Linux Technology Center, Linux Toolchain
> Beaverton, OR
> 503-578-3507
> [EMAIL PROTECTED]
>
> "stephane eranian" <[EMAIL PROTECTED]> wrote on 05/12/2008 01:58:23 PM:
>
> > Corey,
> >
> > It looks like you have an interrupt masking issue. You should not get
> > into the interrupt handler while executing in schedule. Does this happen
> > when you get into a resend_irq situation, or when you don't?
> >
> >
> >
> > On Sat, May 10, 2008 at 1:55 AM, Corey Ashford <[EMAIL PROTECTED] > wrote:
> > > Hello Stephane,
> > >
> > > While trying to test my implementation of resend_irq for POWER, I ran
> > > into a perfmon2 problem, I think.
> > >
> > > In order to increase the number of PMU interrupts I'm getting, I decided
> > > to start with a PAPI C test case "first.c" and modify it so that it
> > > records both user and kernel domain, and adds a call to PAPI_overflow to
> > > set the threshold on PAPI_TOT_CYC to 1 million so that I'd get about
> > > 2000 PMU interrupts per second (2 GHz processor) just as a starting point.
> > >
> > > The overflow handler passed to PAPI_overflow() does nothing other than
> > > print a count of overflows received every 1000 counts.
> > >
> > > Well, I can run this test case to completion sometimes, but fairly often
> > > it will hang in the kernel with a stack trace similar to this:
> > >
> > > 3:mon> c0
> > > 0:mon> t
> > > [c00000002e32eef0] c0000000004b18d4 ._spin_lock+0x5c/0x88
> > > [c00000002e32ef70] c00000000004991c .task_rq_lock+0x68/0xcc
> > > [c00000002e32f010] c000000000049b48 .try_to_wake_up+0x40/0x1c0
> > > [c00000002e32f0d0] c000000000062758 .signal_wake_up+0x48/0x74
> > > [c00000002e32f160] c000000000062a88 .__group_send_sig_info+0xa8/0xcc
> > > [c00000002e32f200] c0000000000631fc .group_send_sig_info+0x64/0xa0
> > > [c00000002e32f2b0] c0000000000eca24 .send_sigio+0x124/0x1f0
> > > [c00000002e32f3f0] c0000000000ecb5c .__kill_fasync+0x6c/0xa0
> > > [c00000002e32f480] c0000000000ecbec .kill_fasync+0x5c/0x94
> > > [c00000002e32f520] c00000000024d6c4 .pfm_notify_user+0xd4/0xf0
> > > [c00000002e32f5a0] c00000000024d850 .pfm_ovfl_notify+0x170/0x198
> > > [c00000002e32f640] c00000000024b314 .pfm_interrupt_handler+0xbec/0xf9c
> > > [c00000002e32f7b0] d0000000000e82f4 .pfm_power5_irq_handler+0x40/0x80 [perfmon_power5]
> > > [c00000002e32f840] c000000000043dbc .powerpc_irq_handler+0x60/0x78
> > > [c00000002e32f8c0] c000000000023488 .performance_monitor_exception+0x38/0x50
> > > [c00000002e32f940] c000000000003d80 performance_monitor_common+0x100/0x180
> > > --- Exception: f00 (Performance Monitor) at c000000000045e38 .__enqueue_entity+0x3c/0xb8
> > > [c00000002e32fcb0] c00000000004e5e8 .put_prev_task_fair+0x74/0x98
> > > [c00000002e32fd40] c0000000004af708 .schedule+0x46c/0x768
> > > [c00000002e32fe30] c000000000008c54 do_work+0x14/0x34
> > > --- Exception: 901 (Decrementer) at 0000000010001ff8
> > > SP (ffffd320) is in userspace
> > >
> > >
> > > So what we have here is that spin_lock is getting called in the context
> > > of schedule(). That doesn't seem good to me, but I am not wise enough
> > > in the ways of the Linux kernel. Do you think this should work correctly?
> > >
> > > To make forward progress on resend_irq, I'm going to switch away from
> > > using an overflow handler to a sampler test case, so this shouldn't stop
> > > my progress, it will just slow me down a bit.
> > >
> > > Regards,
> > >
> > > - Corey
> > >
> > > --
> > > Corey Ashford
> > > Software Engineer
> > > IBM Linux Technology Center, Linux Toolchain
> > > Beaverton, OR
> > > 503-578-3507
> > > [EMAIL PROTECTED]
> > >
> > >
> > > _______________________________________________
> > > perfmon2-devel mailing list
> > > [email protected]
> > > https://lists.sourceforge.net/lists/listinfo/perfmon2-devel