DRM and/or X trouble (was Re: CFS review)
On 08/31/2007 08:46 AM, Tilman Sauerbeck wrote:

> On 08/29/2007 09:56 PM, Rene Herman wrote:
> > > With X server 1.3, I'm getting consistent crashes with two glxgear
> > > instances running. So, if you're getting any output, it's better
> > > than my situation.
> >
> > Before people focus on software rendering too much -- also with 1.3.0
> > (and a Matrox Millennium G550 AGP, 32M) glxgears also works decidedly
> > crummy using hardware rendering. While I can move the glxgears window
> > itself, the actual spinning wheels stay in the upper-left corner of
> > the screen and the movement leaves a non-repainting trace on the
> > screen.
>
> This sounds like you're running an older version of Mesa. The bugfix
> went into Mesa 6.3 and 7.0.

I have Mesa 6.5.2 it seems (slackware-12.0 standard):

OpenGL renderer string: Mesa DRI G400 20061030 AGP 2x x86/MMX+/3DNow!+/SSE
OpenGL version string: 1.2 Mesa 6.5.2

The bit of the problem sketched above -- the gears just sitting there in
the upper left corner of the screen and not moving alongside their
window -- is fully reproducible. The bit below ...:

> > Running a second instance of glxgears in addition seems to make both
> > instances unkillable -- and when I just now forcefully killed X in
> > this situation (the spinning wheels were covering the upper left
> > corner of all my desktops) I got the below.

[ two kernel BUGs ]

... isn't. This seems to (again) have been a race of sorts that I hit by
accident, since I haven't reproduced it yet. I had the same type of
"racyness" trouble with keyboard behaviour in this version of X earlier.

> Running two instances of glxgears and killing them works for me, too.
> I'm using xorg-server 1.3.0.0, Mesa 7.0.1 with the latest DRM bits from
> http://gitweb.freedesktop.org/?p=mesa/drm.git;a=summary

For me, everything is standard slackware-12.0 (X.org 1.3.0) and kernel
2.6.22 DRM.

> I'm not running CFS though, but I guess the oops wasn't related to
> that.

I've noticed before that the Matrox driver seems to get little
attention/testing, so maybe that's just it.
A G550 is of course, in graphics time, a Model T by now. I'm rather
decidedly not a graphics person so I don't care a lot, but every time I
try to do something fashionable (run Google Earth, for example) I notice
things are horribly, horribly broken. X bugs I do not find very
interesting (there's just too many) and the kernel bugs require more
time to reproduce than I have available. If the BUGs as posted aren't
enough for a diagnosis, please consider the report withdrawn.

Rene.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: CFS review
Rene Herman [2007-08-30 09:05]:
> On 08/29/2007 09:56 PM, Rene Herman wrote:
> > Realised the BUGs may mean the kernel DRM people could want to be in CC...
> >
> > On 08/29/2007 05:57 PM, Keith Packard wrote:
> > > With X server 1.3, I'm getting consistent crashes with two glxgear
> > > instances running. So, if you're getting any output, it's better
> > > than my situation.
> >
> > Before people focus on software rendering too much -- also with 1.3.0
> > (and a Matrox Millennium G550 AGP, 32M) glxgears also works decidedly
> > crummy using hardware rendering. While I can move the glxgears window
> > itself, the actual spinning wheels stay in the upper-left corner of
> > the screen and the movement leaves a non-repainting trace on the
> > screen.

This sounds like you're running an older version of Mesa. The bugfix
went into Mesa 6.3 and 7.0.

> > Running a second instance of glxgears in addition seems to make both
> > instances unkillable -- and when I just now forcefully killed X in
> > this situation (the spinning wheels were covering the upper left
> > corner of all my desktops) I got the below.

Running two instances of glxgears and killing them works for me, too.
I'm using xorg-server 1.3.0.0, Mesa 7.0.1 with the latest DRM bits from
http://gitweb.freedesktop.org/?p=mesa/drm.git;a=summary

I'm not running CFS though, but I guess the oops wasn't related to that.

Regards,
Tilman

--
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Re: CFS review
On 08/30/2007 06:06 PM, Chuck Ebbert wrote:
> On 08/29/2007 03:56 PM, Rene Herman wrote:
> > Before people focus on software rendering too much -- also with 1.3.0
> > (and a Matrox Millennium G550 AGP, 32M) glxgears also works decidedly
> > crummy using hardware rendering. While I can move the glxgears window
> > itself, the actual spinning wheels stay in the upper-left corner of
> > the screen and the movement leaves a non-repainting trace on the
> > screen. Running a second instance of glxgears in addition seems to
> > make both instances unkillable -- and when I just now forcefully
> > killed X in this situation (the spinning wheels were covering the
> > upper left corner of all my desktops) I got the below.
> >
> > Kernel is 2.6.22.5-cfs-v20.5, schedule() is in the traces (but that
> > may be expected anyway).
>
> And this doesn't happen at all with the stock scheduler? (Just
> confirming, in case you didn't compare.)

I didn't compare -- it no doubt will. I know the title of this thread is
"CFS review" but it turned into Keith Packard noticing glxgears being
broken on recent-ish X.org. The start of the thread was about things
being broken using _software_ rendering though, so I thought it might be
useful to remark/report glxgears also being quite broken using hardware
rendering, on my setup at least.

> > BUG: unable to handle kernel NULL pointer dereference at virtual
> > address 0010
> > printing eip: c10ff416 *pde =
> > Oops: [#1] PREEMPT
>
> Try it without preempt?

If you're asking in a "I'll go debug the DRM" way I'll go dig a bit
later (please say), but if you are only interested in the thread due to
CFS, note that I'm aware it's not likely to have anything to do with
CFS. It's not reproducible for you? (Full description of the bug above.)

Rene.
Re: CFS review
On 08/29/2007 03:56 PM, Rene Herman wrote:
>
> Before people focus on software rendering too much -- also with 1.3.0
> (and a Matrox Millennium G550 AGP, 32M) glxgears also works decidedly
> crummy using hardware rendering. While I can move the glxgears window
> itself, the actual spinning wheels stay in the upper-left corner of the
> screen and the movement leaves a non-repainting trace on the screen.
> Running a second instance of glxgears in addition seems to make both
> instances unkillable -- and when I just now forcefully killed X in this
> situation (the spinning wheels were covering the upper left corner of
> all my desktops) I got the below.
>
> Kernel is 2.6.22.5-cfs-v20.5, schedule() is in the traces (but that may
> be expected anyway).

And this doesn't happen at all with the stock scheduler? (Just
confirming, in case you didn't compare.)

> BUG: unable to handle kernel NULL pointer dereference at virtual
> address 0010
> printing eip:
> c10ff416
> *pde =
> Oops: [#1]
> PREEMPT

Try it without preempt?
> Modules linked in: nfsd exportfs lockd nfs_acl sunrpc nls_iso8859_1
> nls_cp437 vfat fat nls_base
> CPU:    0
> EIP:    0060:[c10ff416]    Not tainted VLI
> EFLAGS: 00210246 (2.6.22.5-cfs-v20.5-local #5)
> EIP is at mga_dma_buffers+0x189/0x2e3
> eax:           ebx: efd07200   ecx: 0001   edx: efc32c00
> esi:           edi: c12756cc   ebp: dfea44c0   esp: dddaaec0
> ds: 007b   es: 007b   fs:   gs: 0033  ss: 0068
> Process glxgears (pid: 1775, ti=dddaa000 task=e9daca60 task.ti=dddaa000)
> Stack: efc32c00 0004 e4c3bd20 c10fa54b e4c3bd20 efc32c00 0004 0001 0001
>        bfbdb8bc bfbdb8b8 c10ff28d 0029 c12756cc dfea44c0 c10f87fc
>        bfbdb844
> Call Trace:
>  [c10fa54b] drm_lock+0x255/0x2de
>  [c10ff28d] mga_dma_buffers+0x0/0x2e3
>  [c10f87fc] drm_ioctl+0x142/0x18a
>  [c1005973] do_IRQ+0x97/0xb0
>  [c10f86ba] drm_ioctl+0x0/0x18a
>  [c10f86ba] drm_ioctl+0x0/0x18a
>  [c105b0d7] do_ioctl+0x87/0x9f
>  [c105b32c] vfs_ioctl+0x23d/0x250
>  [c11b533e] schedule+0x2d0/0x2e6
>  [c105b372] sys_ioctl+0x33/0x4d
>  [c1003d1e] syscall_call+0x7/0xb
> ===
> Code: 9a 08 03 00 00 8b 73 30 74 14 c7 44 24 04 28 76 1c c1 c7 04 24 49
> 51 23 c1 e8 b0 74 f1 ff 8b 83 d8 00 00 00 83 3d 1c 47 30 c1 00 <8b> 40
> 10 8b a8 58 1e 00 00 8b 43 28 8b b8 64 01 00 00 74 32 8b
> EIP: [c10ff416] mga_dma_buffers+0x189/0x2e3 SS:ESP 0068:dddaaec0

dev->dev_private->mmio is NULL when trying to access mmio.handle
Re: CFS review
* Rene Herman <[EMAIL PROTECTED]> wrote:

> Realised the BUGs may mean the kernel DRM people could want to be in CC...

and note that the schedule() call in there is not part of the crash
backtrace:

> > Call Trace:
> >  [c10fa54b] drm_lock+0x255/0x2de
> >  [c10ff28d] mga_dma_buffers+0x0/0x2e3
> >  [c10f87fc] drm_ioctl+0x142/0x18a
> >  [c1005973] do_IRQ+0x97/0xb0
> >  [c10f86ba] drm_ioctl+0x0/0x18a
> >  [c10f86ba] drm_ioctl+0x0/0x18a
> >  [c105b0d7] do_ioctl+0x87/0x9f
> >  [c105b32c] vfs_ioctl+0x23d/0x250
> >  [c11b533e] schedule+0x2d0/0x2e6
> >  [c105b372] sys_ioctl+0x33/0x4d
> >  [c1003d1e] syscall_call+0x7/0xb

it just happened to be on the kernel stack. Nor is the do_IRQ() entry
real. Both are frequent functions (and were executed recently); that's
why they were still in the stackframe.

	Ingo
Re: CFS review
On 08/29/2007 09:56 PM, Rene Herman wrote:

> Realised the BUGs may mean the kernel DRM people could want to be in CC...

On 08/29/2007 05:57 PM, Keith Packard wrote:

> With X server 1.3, I'm getting consistent crashes with two glxgear
> instances running. So, if you're getting any output, it's better than
> my situation.

Before people focus on software rendering too much -- also with 1.3.0
(and a Matrox Millennium G550 AGP, 32M) glxgears also works decidedly
crummy using hardware rendering. While I can move the glxgears window
itself, the actual spinning wheels stay in the upper-left corner of the
screen and the movement leaves a non-repainting trace on the screen.

Running a second instance of glxgears in addition seems to make both
instances unkillable -- and when I just now forcefully killed X in this
situation (the spinning wheels were covering the upper left corner of
all my desktops) I got the below.

Kernel is 2.6.22.5-cfs-v20.5, schedule() is in the traces (but that may
be expected anyway).
BUG: unable to handle kernel NULL pointer dereference at virtual address 0010
printing eip:
c10ff416
*pde =
Oops: [#1]
PREEMPT
Modules linked in: nfsd exportfs lockd nfs_acl sunrpc nls_iso8859_1
nls_cp437 vfat fat nls_base
CPU:    0
EIP:    0060:[c10ff416]    Not tainted VLI
EFLAGS: 00210246 (2.6.22.5-cfs-v20.5-local #5)
EIP is at mga_dma_buffers+0x189/0x2e3
eax:           ebx: efd07200   ecx: 0001   edx: efc32c00
esi:           edi: c12756cc   ebp: dfea44c0   esp: dddaaec0
ds: 007b   es: 007b   fs:   gs: 0033  ss: 0068
Process glxgears (pid: 1775, ti=dddaa000 task=e9daca60 task.ti=dddaa000)
Stack: efc32c00 0004 e4c3bd20 c10fa54b e4c3bd20 efc32c00 0004 0001 0001
       bfbdb8bc bfbdb8b8 c10ff28d 0029 c12756cc dfea44c0 c10f87fc
       bfbdb844
Call Trace:
 [c10fa54b] drm_lock+0x255/0x2de
 [c10ff28d] mga_dma_buffers+0x0/0x2e3
 [c10f87fc] drm_ioctl+0x142/0x18a
 [c1005973] do_IRQ+0x97/0xb0
 [c10f86ba] drm_ioctl+0x0/0x18a
 [c10f86ba] drm_ioctl+0x0/0x18a
 [c105b0d7] do_ioctl+0x87/0x9f
 [c105b32c] vfs_ioctl+0x23d/0x250
 [c11b533e] schedule+0x2d0/0x2e6
 [c105b372] sys_ioctl+0x33/0x4d
 [c1003d1e] syscall_call+0x7/0xb
===
Code: 9a 08 03 00 00 8b 73 30 74 14 c7 44 24 04 28 76 1c c1 c7 04 24 49
51 23 c1 e8 b0 74 f1 ff 8b 83 d8 00 00 00 83 3d 1c 47 30 c1 00 <8b> 40
10 8b a8 58 1e 00 00 8b 43 28 8b b8 64 01 00 00 74 32 8b
EIP: [c10ff416] mga_dma_buffers+0x189/0x2e3 SS:ESP 0068:dddaaec0

BUG: unable to handle kernel NULL pointer dereference at virtual address 0010
printing eip:
c10ff416
*pde =
Oops: [#2]
PREEMPT
Modules linked in: nfsd exportfs lockd nfs_acl sunrpc nls_iso8859_1
nls_cp437 vfat fat nls_base
CPU:    0
EIP:    0060:[c10ff416]    Not tainted VLI
EFLAGS: 00210246 (2.6.22.5-cfs-v20.5-local #5)
EIP is at mga_dma_buffers+0x189/0x2e3
eax:           ebx: efd07200   ecx: 0001   edx: efc32c00
esi:           edi: c12756cc   ebp: dfea4780   esp: e0552ec0
ds: 007b   es: 007b   fs:   gs: 0033  ss: 0068
Process glxgears (pid: 1776, ti=e0552000 task=c19ec000 task.ti=e0552000)
Stack: efc32c00 0003 efc64b40 c10fa54b efc64b40 efc32c00 0003 0001 0001
       bf8dbdcc bf8dbdc8 c10ff28d 0029 c12756cc dfea4780 c10f87fc
       bf8dbd54
Call Trace:
 [c10fa54b] drm_lock+0x255/0x2de
 [c10ff28d] mga_dma_buffers+0x0/0x2e3
 [c10f87fc] drm_ioctl+0x142/0x18a
 [c11b53f6] preempt_schedule+0x4e/0x5a
 [c10f86ba] drm_ioctl+0x0/0x18a
 [c10f86ba] drm_ioctl+0x0/0x18a
 [c105b0d7] do_ioctl+0x87/0x9f
 [c105b32c] vfs_ioctl+0x23d/0x250
 [c11b52a9] schedule+0x23b/0x2e6
 [c11b533e] schedule+0x2d0/0x2e6
 [c105b372] sys_ioctl+0x33/0x4d
 [c1003d1e] syscall_call+0x7/0xb
===
Code: 9a 08 03 00 00 8b 73 30 74 14 c7 44 24 04 28 76 1c c1 c7 04 24 49
51 23 c1 e8 b0 74 f1 ff 8b 83 d8 00 00 00 83 3d 1c 47 30 c1 00 <8b> 40
10 8b a8 58 1e 00 00 8b 43 28 8b b8 64 01 00 00 74 32 8b
EIP: [c10ff416] mga_dma_buffers+0x189/0x2e3 SS:ESP 0068:e0552ec0
[drm:drm_release] *ERROR* Device busy: 2 0

Rene.
Re: CFS review
On Wed, 2007-08-29 at 10:04 +0200, Ingo Molnar wrote:

> is that old enough to not have the smart X scheduler?

The smart scheduler went into the server in like 2000. I don't think
you've got any systems that old. XFree86 4.1 or 4.2, I can't remember
which.

> (probably the GLX bug you mentioned) so i cannot reproduce the bug.

With X server 1.3, I'm getting consistent crashes with two glxgear
instances running. So, if you're getting any output, it's better than my
situation.

--
[EMAIL PROTECTED]
Re: CFS review
Ingo Molnar wrote:
> * Bill Davidsen <[EMAIL PROTECTED]> wrote:
>
> > There is another way to show the problem visually under X
> > (vesa-driver), by starting 3 gears simultaneously, which after laying
> > them out side-by-side need some settling time before smoothing out.
> > Without __update_curr it's absolutely smooth from the start. I posted
> > a LOT of stuff using the glitch1 script, and finally found a set of
> > tuning values which make the test script run smooth. See back posts,
> > I don't have them here.
>
> but you have real 3D hw and DRI enabled, correct? In that case X uses
> up almost no CPU time and glxgears makes most of the processing. That
> is quite different from the above software-rendering case, where X
> spends most of the CPU time.

No, my test machine for that is a compile server, and uses the built-in
motherboard graphics, which are very limited. This is not in any sense a
graphics powerhouse; it is used to build custom kernels and
applications, and for testing of kvm and xen. I grabbed it because it
had the only Core2 CPU I could reboot to try new kernel versions and
"from cold boot" testing. I discovered the graphics smoothness issue by
having several windows open on compiles, and developed the glitch1
script as a way to reproduce it.

The settings I used, features=14, granularity=50, work to improve
smoothness on other machines for other uses, but they do seem to impact
performance for compiles, video processing, etc, so they are not optimal
for general use. I regard the existence of these tuning knobs as one of
the real strengths of CFS: when you change the tuning it has a visible
effect.

--
bill davidsen <[EMAIL PROTECTED]>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979
Re: CFS review
Ingo Molnar wrote: > * Keith Packard <[EMAIL PROTECTED]> wrote: > > Make sure the X server isn't running with the smart scheduler > > disabled; that will cause precisely the symptoms you're seeing here. > > In the normal upstream sources, you'd have to use '-dumbSched' as an X > > server command line option. > > > > The old 'scheduler' would run an entire X client's input buffer dry > > before looking for requests from another client. Because glxgears > > requests are small but time consuming, this can cause very long delays > > between client switching. > > on the old box where i've reproduced this i've got an ancient X version: > > neptune:~> X -version > > X Window System Version 6.8.2 > Release Date: 9 February 2005 > X Protocol Version 11, Revision 0, Release 6.8.2 > Build Operating System: Linux 2.6.9-22.ELsmp i686 [ELF] > > is that old enough to not have the smart X scheduler? > > on newer systems i dont see correctly updated glxgears output (probably > the GLX bug you mentioned) so i cannot reproduce the bug. > > Al, could you send us your 'X -version' output?

This is the one I have been talking about:

XFree86 Version 4.3.0
Release Date: 27 February 2003
X Protocol Version 11, Revision 0, Release 6.6
Build Operating System: Linux 2.4.21-0.13mdksmp i686 [ELF]

I also tried the gears test just now on this:

X Window System Version 6.8.1
Release Date: 17 September 2004
X Protocol Version 11, Revision 0, Release 6.8.1
Build Operating System: Linux 2.6.9-1.860_ELsmp i686 [ELF]

but it completely locks up. Disabling add_wait_runtime seems to fix it.

Thanks!

--
Al
Re: CFS review
* Keith Packard <[EMAIL PROTECTED]> wrote: > Make sure the X server isn't running with the smart scheduler > disabled; that will cause precisely the symptoms you're seeing here. > In the normal upstream sources, you'd have to use '-dumbSched' as an X > server command line option. > > The old 'scheduler' would run an entire X client's input buffer dry > before looking for requests from another client. Because glxgears > requests are small but time consuming, this can cause very long delays > between client switching.

on the old box where i've reproduced this i've got an ancient X version:

neptune:~> X -version

X Window System Version 6.8.2
Release Date: 9 February 2005
X Protocol Version 11, Revision 0, Release 6.8.2
Build Operating System: Linux 2.6.9-22.ELsmp i686 [ELF]

is that old enough to not have the smart X scheduler?

on newer systems i dont see correctly updated glxgears output (probably the GLX bug you mentioned) so i cannot reproduce the bug.

Al, could you send us your 'X -version' output?

Ingo
Re: CFS review
On Wed, 2007-08-29 at 06:46 +0200, Ingo Molnar wrote: > ok, i finally managed to reproduce the "artifact" myself on an older > box. It goes like this: start up X with the vesa driver (or with NoDRI) > to force software rendering. Then start up a couple of glxgears > instances. Those glxgears instances update in a very "chunky", > "stuttering" way - each glxgears instance runs/stops/runs/stops at a > rate of about once per second, and this was reported to me as a > potential CPU scheduler regression.

Hmm. I can't even run two copies of glxgears on software GL code today; it's broken in every X server I have available. Someone broke it a while ago, but no-one noticed. However, this shouldn't be GLX related as the software rasterizer is no different from any other rendering code. Testing with my smart-scheduler case (many copies of 'plaid') shows that at least with git master, things are working as designed. When GLX is working again, I'll try that as well.

> at a quick glance this is not a CPU scheduler thing: X uses up 99% of > CPU time, all the glxgears tasks (i needed 8 parallel instances to see > the stallings) are using up the remaining 1% of CPU time. The ordering > of the requests from the glxgears tasks is X's choice - and for a > pathological overload situation like this we cannot blame X at all for > not producing a completely smooth output. (although Xorg could perhaps > try to schedule such requests more smoothly, in a more finegrained way?)

It does. It should switch between clients every 20ms; that's why X spends so much time asking the kernel for the current time. Make sure the X server isn't running with the smart scheduler disabled; that will cause precisely the symptoms you're seeing here. In the normal upstream sources, you'd have to use '-dumbSched' as an X server command line option. The old 'scheduler' would run an entire X client's input buffer dry before looking for requests from another client.
Because glxgears requests are small but time consuming, this can cause very long delays between client switching. -- [EMAIL PROTECTED]
Re: CFS review
* Al Boldi <[EMAIL PROTECTED]> wrote: > > se.sleep_max : 2194711437 > > se.block_max : 0 > > se.exec_max : 977446 > > se.wait_max : 1912321 > > > > the scheduler itself had a worst-case scheduling delay of 1.9 > > milliseconds for that glxgears instance (which is perfectly good - in > > fact - excellent interactivity) - but the task had a maximum sleep time > > of 2.19 seconds. So the 'glitch' was not caused by the scheduler. > > 2.19sec is probably the time you need to lay them out side by side. > [...] nope, i cleared the stats after i laid the glxgears out, via: for N in /proc/*/sched; do echo 0 > $N; done and i did the strace (which showed a 1+ seconds latency) while the glxgears was not manipulated in any way. > [...] You see, gears sleeps when it is covered by another window, > [...] none of the gear windows in my test were overlaid... > [...] so once you lay them out it starts running, and that's when they > start to stutter for about 10sec. After that they should run > smoothly, because they used up all the sleep bonus. that's plain wrong - at least in the test i've reproduced. In any case, if that were the case then that would be visible in the stats. So please send me your cfs-debug-info.sh output captured while the test is running (with a CONFIG_SCHEDSTATS=y and CONFIG_SCHED_DEBUG=y kernel) - you can download it from: http://people.redhat.com/mingo/cfs-scheduler/tools/cfs-debug-info.sh for best data, execute this before running it: for N in /proc/*/sched; do echo 0 > $N; done > If you like, I can send you my straces, but they are kind of big > though, and you need to strace each gear, as stracing itself changes > the workload balance. sure, send them along or upload them somewhere - but more importantly, please send the cfs-debug-info.sh output. 
Ingo
Re: CFS review
Ingo Molnar wrote: > * Al Boldi <[EMAIL PROTECTED]> wrote: > > I have narrowed it down a bit to add_wait_runtime. > > the scheduler is a red herring here. Could you "strace -ttt -TTT" one of > the glxgears instances (and send us the cfs-debug-info.sh output, with > CONFIG_SCHED_DEBUG=y and CONFIG_SCHEDSTATS=y as requested before) so > that we can have a closer look? > > i reproduced something similar and there the stall is caused by 1+ > second select() delays on the X client<->server socket. The scheduler > stats agree with that: > > se.sleep_max : 2194711437 > se.block_max : 0 > se.exec_max : 977446 > se.wait_max : 1912321 > > the scheduler itself had a worst-case scheduling delay of 1.9 > milliseconds for that glxgears instance (which is perfectly good - in > fact - excellent interactivity) - but the task had a maximum sleep time > of 2.19 seconds. So the 'glitch' was not caused by the scheduler.

2.19sec is probably the time you need to lay them out side by side. You see, gears sleeps when it is covered by another window, so once you lay them out it starts running, and that's when they start to stutter for about 10sec. After that they should run smoothly, because they used up all the sleep bonus.

If you like, I can send you my straces, but they are kind of big though, and you need to strace each gear, as stracing itself changes the workload balance.

Let's first make sure what we are looking for:

1. start # gears & gears & gears &
2. lay them out side by side, don't worry about sleep times yet.
3. now they start stuttering for about 10sec
4. now they run out of sleep bonuses and smooth out

If this is the sequence you get on your machine, then try disabling add_wait_runtime to see the difference.

Thanks!

--
Al
Re: CFS review
On 08/29/2007 05:57 PM, Keith Packard wrote: With X server 1.3, I'm getting consistent crashes with two glxgear instances running. So, if you're getting any output, it's better than my situation.

Before people focus on software rendering too much -- also with 1.3.0 (and a Matrox Millennium G550 AGP, 32M) glxgears works decidedly crummy using hardware rendering. While I can move the glxgears window itself, the actual spinning wheels stay in the upper-left corner of the screen and the movement leaves a non-repainting trace on the screen. Running a second instance of glxgears in addition seems to make both instances unkillable -- and when I just now forcefully killed X in this situation (the spinning wheels were covering the upper left corner of all my desktops) I got the below.

Kernel is 2.6.22.5-cfs-v20.5, schedule() is in the traces (but that may be expected anyway).

BUG: unable to handle kernel NULL pointer dereference at virtual address 0010
printing eip: c10ff416
*pde =
Oops: [#1] PREEMPT
Modules linked in: nfsd exportfs lockd nfs_acl sunrpc nls_iso8859_1 nls_cp437 vfat fat nls_base
CPU: 0
EIP: 0060:[c10ff416] Not tainted VLI
EFLAGS: 00210246 (2.6.22.5-cfs-v20.5-local #5)
EIP is at mga_dma_buffers+0x189/0x2e3
eax: ebx: efd07200 ecx: 0001 edx: efc32c00
esi: edi: c12756cc ebp: dfea44c0 esp: dddaaec0
ds: 007b es: 007b fs: gs: 0033 ss: 0068
Process glxgears (pid: 1775, ti=dddaa000 task=e9daca60 task.ti=dddaa000)
Stack: efc32c00 0004 e4c3bd20 c10fa54b e4c3bd20 efc32c00 0004 0001 0001 bfbdb8bc bfbdb8b8 c10ff28d 0029 c12756cc dfea44c0 c10f87fc bfbdb844
Call Trace:
 [c10fa54b] drm_lock+0x255/0x2de
 [c10ff28d] mga_dma_buffers+0x0/0x2e3
 [c10f87fc] drm_ioctl+0x142/0x18a
 [c1005973] do_IRQ+0x97/0xb0
 [c10f86ba] drm_ioctl+0x0/0x18a
 [c10f86ba] drm_ioctl+0x0/0x18a
 [c105b0d7] do_ioctl+0x87/0x9f
 [c105b32c] vfs_ioctl+0x23d/0x250
 [c11b533e] schedule+0x2d0/0x2e6
 [c105b372] sys_ioctl+0x33/0x4d
 [c1003d1e] syscall_call+0x7/0xb
===
Code: 9a 08 03 00 00 8b 73 30 74 14 c7 44 24 04 28 76 1c c1 c7 04 24 49 51 23 c1 e8 b0 74 f1 ff 8b 83 d8 00 00 00 83 3d 1c 47 30 c1 00 8b 40 10 8b a8 58 1e 00 00 8b 43 28 8b b8 64 01 00 00 74 32 8b
EIP: [c10ff416] mga_dma_buffers+0x189/0x2e3 SS:ESP 0068:dddaaec0

BUG: unable to handle kernel NULL pointer dereference at virtual address 0010
printing eip: c10ff416
*pde =
Oops: [#2] PREEMPT
Modules linked in: nfsd exportfs lockd nfs_acl sunrpc nls_iso8859_1 nls_cp437 vfat fat nls_base
CPU: 0
EIP: 0060:[c10ff416] Not tainted VLI
EFLAGS: 00210246 (2.6.22.5-cfs-v20.5-local #5)
EIP is at mga_dma_buffers+0x189/0x2e3
eax: ebx: efd07200 ecx: 0001 edx: efc32c00
esi: edi: c12756cc ebp: dfea4780 esp: e0552ec0
ds: 007b es: 007b fs: gs: 0033 ss: 0068
Process glxgears (pid: 1776, ti=e0552000 task=c19ec000 task.ti=e0552000)
Stack: efc32c00 0003 efc64b40 c10fa54b efc64b40 efc32c00 0003 0001 0001 bf8dbdcc bf8dbdc8 c10ff28d 0029 c12756cc dfea4780 c10f87fc bf8dbd54
Call Trace:
 [c10fa54b] drm_lock+0x255/0x2de
 [c10ff28d] mga_dma_buffers+0x0/0x2e3
 [c10f87fc] drm_ioctl+0x142/0x18a
 [c11b53f6] preempt_schedule+0x4e/0x5a
 [c10f86ba] drm_ioctl+0x0/0x18a
 [c10f86ba] drm_ioctl+0x0/0x18a
 [c105b0d7] do_ioctl+0x87/0x9f
 [c105b32c] vfs_ioctl+0x23d/0x250
 [c11b52a9] schedule+0x23b/0x2e6
 [c11b533e] schedule+0x2d0/0x2e6
 [c105b372] sys_ioctl+0x33/0x4d
 [c1003d1e] syscall_call+0x7/0xb
===
Code: 9a 08 03 00 00 8b 73 30 74 14 c7 44 24 04 28 76 1c c1 c7 04 24 49 51 23 c1 e8 b0 74 f1 ff 8b 83 d8 00 00 00 83 3d 1c 47 30 c1 00 8b 40 10 8b a8 58 1e 00 00 8b 43 28 8b b8 64 01 00 00 74 32 8b
EIP: [c10ff416] mga_dma_buffers+0x189/0x2e3 SS:ESP 0068:e0552ec0
[drm:drm_release] *ERROR* Device busy: 2 0

Rene.
Re: CFS review
* Al Boldi <[EMAIL PROTECTED]> wrote: > I have narrowed it down a bit to add_wait_runtime.

the scheduler is a red herring here. Could you "strace -ttt -TTT" one of the glxgears instances (and send us the cfs-debug-info.sh output, with CONFIG_SCHED_DEBUG=y and CONFIG_SCHEDSTATS=y as requested before) so that we can have a closer look?

i reproduced something similar and there the stall is caused by 1+ second select() delays on the X client<->server socket. The scheduler stats agree with that:

se.sleep_max : 2194711437
se.block_max : 0
se.exec_max : 977446
se.wait_max : 1912321

the scheduler itself had a worst-case scheduling delay of 1.9 milliseconds for that glxgears instance (which is perfectly good - in fact - excellent interactivity) - but the task had a maximum sleep time of 2.19 seconds. So the 'glitch' was not caused by the scheduler.

Ingo
Re: CFS review
On Wed, 2007-08-29 at 06:18 +0200, Ingo Molnar wrote: > * Al Boldi <[EMAIL PROTECTED]> wrote: > > > No need for framebuffer. All you need is X using the X.org > > vesa-driver. Then start gears like this: > > > > # gears & gears & gears & > > > > Then lay them out side by side to see the periodic stallings for > > ~10sec. > > i just tried something similar (by adding Option "NoDRI" to xorg.conf) > and i'm wondering how it can be smooth on vesa-driver at all. I tested > it on a Core2Duo box and software rendering manages to do about 3 frames > per second. (although glxgears itself thinks it does ~600 fps) If i > start 3 glxgears then they do ~1 frame per second each. This is on > Fedora 7 with xorg-x11-server-Xorg-1.3.0.0-9.fc7 and > xorg-x11-drv-i810-2.0.0-4.fc7.

At least you can run the darn test... the third instance of glxgears here means say bye bye to GUI instantly.

-Mike
Re: CFS review
On Wed, 2007-08-29 at 06:18 +0200, Ingo Molnar wrote: > > Then lay them out side by side to see the periodic stallings for > > ~10sec.

The X scheduling code isn't really designed to handle software GL well; the requests can be very expensive to execute, and yet are specified as atomic operations (sigh).

> i just tried something similar (by adding Option "NoDRI" to xorg.conf) > and i'm wondering how it can be smooth on vesa-driver at all. I tested > it on a Core2Duo box and software rendering manages to do about 3 frames > per second. (although glxgears itself thinks it does ~600 fps) If i > start 3 glxgears then they do ~1 frame per second each. This is on > Fedora 7 with xorg-x11-server-Xorg-1.3.0.0-9.fc7 and > xorg-x11-drv-i810-2.0.0-4.fc7.

Are you attempting to measure the visible updates by eye? Or are you using some other metric? In any case, attempting to measure anything using glxgears is a bad idea; it's not representative of *any* real applications. And then using software GL on top of that... What was the question again?

-- [EMAIL PROTECTED]
Re: CFS review
Ingo Molnar wrote: > * Linus Torvalds <[EMAIL PROTECTED]> wrote: > > On Tue, 28 Aug 2007, Al Boldi wrote: > > > I like your analysis, but how do you explain that these stalls > > > vanish when __update_curr is disabled? > > > > It's entirely possible that what happens is that the X scheduling is > > just a slightly unstable system - which effectively would turn a small > > scheduling difference into a *huge* visible difference. > > i think it's because disabling __update_curr() in essence removes the > ability of scheduler to preempt tasks - that hack in essence results in > a non-scheduler. Hence the gears + X pair of tasks becomes a synchronous > pair of tasks in essence - and thus gears cannot "overload" X.

I have narrowed it down a bit to add_wait_runtime. Patch 2.6.22.5-v20.4 like this:

346-     * the two values are equal)
347-     * [Note: delta_mine - delta_exec is negative]:
348-     */
349://  add_wait_runtime(cfs_rq, curr, delta_mine - delta_exec);
350-}
351-
352-static void update_curr(struct cfs_rq *cfs_rq)

When disabling add_wait_runtime the stalls are gone. With this change the scheduler is still usable, but it does not constitute a fix.

Now, even with this hack, uneven nice-levels between X and gears causes a return of the stalls, so make sure both X and gears run on the same nice-level when testing.

Again, the whole point of this workload is to expose scheduler glitches regardless of whether X is broken or not, and my hunch is that this problem looks suspiciously like an ia-boosting bug. What's important to note is that by adjusting the scheduler we can effect a correction in behaviour, and as such should yield this problem as fixable. It's probably a good idea to look further into add_wait_runtime.

Thanks!

--
Al
Re: CFS review
* Al Boldi <[EMAIL PROTECTED]> wrote: > No need for framebuffer. All you need is X using the X.org > vesa-driver. Then start gears like this: > > # gears & gears & gears & > > Then lay them out side by side to see the periodic stallings for > ~10sec.

i just tried something similar (by adding Option "NoDRI" to xorg.conf) and i'm wondering how it can be smooth on vesa-driver at all. I tested it on a Core2Duo box and software rendering manages to do about 3 frames per second. (although glxgears itself thinks it does ~600 fps) If i start 3 glxgears then they do ~1 frame per second each. This is on Fedora 7 with xorg-x11-server-Xorg-1.3.0.0-9.fc7 and xorg-x11-drv-i810-2.0.0-4.fc7.

Ingo
Re: CFS review
* Bill Davidsen <[EMAIL PROTECTED]> wrote: > > There is another way to show the problem visually under X > > (vesa-driver), by starting 3 gears simultaneously, which after > > laying them out side-by-side need some settling time before > > smoothing out. Without __update_curr it's absolutely smooth from > > the start. > > I posted a LOT of stuff using the glitch1 script, and finally found a > set of tuning values which make the test script run smooth. See back > posts, I don't have them here.

but you have real 3D hw and DRI enabled, correct? In that case X uses up almost no CPU time and glxgears makes most of the processing. That is quite different from the above software-rendering case, where X spends most of the CPU time.

Ingo
Re: CFS review
Ingo Molnar wrote:
> * Al Boldi <[EMAIL PROTECTED]> wrote:
> > > ok. I think i might finally have found the bug causing this. Could
> > > you try the fix below, does your webserver thread-startup test work
> > > any better?
> >
> > It seems to help somewhat, but the problem is still visible. Even
> > v20.3 on 2.6.22.5 didn't help. It does look related to ia-boosting,
> > so I turned off __update_curr like Roman mentioned, which had an
> > enormous smoothing effect, but then nice levels completely break down
> > and lock up the system.
>
> you can turn sleeper-fairness off via:
>
>     echo 28 > /proc/sys/kernel/sched_features
>
> another thing to try would be:
>
>     echo 12 > /proc/sys/kernel/sched_features
>
> or 14, and drop the granularity to 50. (that's the new-task penalty
> turned off.)
>
> Another thing to try would be to edit this:
>
>     if (sysctl_sched_features & SCHED_FEAT_START_DEBIT)
>             p->se.wait_runtime = -(sched_granularity(cfs_rq) / 2);
>
> to:
>
>     if (sysctl_sched_features & SCHED_FEAT_START_DEBIT)
>             p->se.wait_runtime = -sched_granularity(cfs_rq);
>
> and could you also check 20.4 on 2.6.22.5 perhaps, or very latest
> -git? (Peter has experienced smaller spikes with that.)
>
> 	Ingo

--
Bill Davidsen <[EMAIL PROTECTED]>
"We have more to fear from the bungling of the incompetent than from the machinations of the wicked." - from Slashdot
Re: CFS review
Al Boldi wrote:
> Ingo Molnar wrote:
> > * Al Boldi <[EMAIL PROTECTED]> wrote:
> > > The problem is that consecutive runs don't give consistent results
> > > and sometimes stall. You may want to try that.
> >
> > well, there's a natural saturation point after a few hundred tasks
> > (depending on your CPU's speed), at which point there's no idle time
> > left. From that point on things get progressively slower (and the
> > ability of the shell to start new ping tasks is impacted as well),
> > but that's expected on an overloaded system, isn't it?
>
> Of course, things should get slower with higher load, but it should be
> consistent and without stalls.
>
> To see this problem, make sure you boot into /bin/sh with the normal
> VGA console (ie. not fb-console). Then try each loop a few times to
> show the different behaviour; loops like:
>
> # for ((i=0; i<; i++)); do ping 10.1 -A > /dev/null & done
> # for ((i=0; i<; i++)); do nice -99 ping 10.1 -A > /dev/null & done
> # { for ((i=0; i<; i++)); do ping 10.1 -A > /dev/null & done } > /dev/null 2>&1
>
> Especially the last one sometimes causes a complete console lock-up,
> while the other two sometimes stall and then surge periodically.
>
> > ok. I think i might finally have found the bug causing this. Could
> > you try the fix below, does your webserver thread-startup test work
> > any better?
>
> It seems to help somewhat, but the problem is still visible. Even v20.3
> on 2.6.22.5 didn't help. It does look related to ia-boosting, so I
> turned off __update_curr like Roman mentioned, which had an enormous
> smoothing effect, but then nice levels completely break down and lock
> up the system.
>
> There is another way to show the problem visually under X
> (vesa-driver), by starting 3 gears simultaneously, which after laying
> them out side-by-side need some settling time before smoothing out.
> Without __update_curr it's absolutely smooth from the start.

I posted a LOT of stuff using the glitch1 script, and finally found a set of tuning values which make the test script run smooth. See back posts, I don't have them here.
--
Bill Davidsen <[EMAIL PROTECTED]>
"We have more to fear from the bungling of the incompetent than from the machinations of the wicked." - from Slashdot
Re: CFS review
On Mon, 27 Aug 2007 22:05:37 PDT, Linus Torvalds said:
>
> On Tue, 28 Aug 2007, Al Boldi wrote:
> >
> > No need for framebuffer. All you need is X using the X.org vesa-driver.
> > Then start gears like this:
> >
> > # gears & gears & gears &
> >
> > Then lay them out side by side to see the periodic stallings for ~10sec.
>
> I don't think this is a good test.
>
> Why?
>
> If you're not using direct rendering, what you have is the X server doing
> all the rendering, which in turn means that what you are testing is quite
> possibly not so much about the *kernel* scheduling, but about *X-server*
> scheduling!

I wonder - can people who are doing this as a test please specify whether they're using an older X with the libX11 code or the newer libxcb code? That may have a similar impact as well.

(libxcb is pretty new - it landed in Fedora Rawhide just about a month ago, after Fedora 7 shipped. Not sure what other distros have it now...)
Re: CFS review
* Willy Tarreau <[EMAIL PROTECTED]> wrote:

> On Tue, Aug 28, 2007 at 10:02:18AM +0200, Ingo Molnar wrote:
> >
> > * Xavier Bestel <[EMAIL PROTECTED]> wrote:
> >
> > > Are you sure they are stalled? What you may have is simply gears
> > > running at a multiple of your screen refresh rate, so they only
> > > appear stalled.
> > >
> > > Plus, as Linus said, you're not really testing the kernel
> > > scheduler. gears is a really bad benchmark, it should die.
> >
> > i like glxgears as long as it runs on _real_ 3D hardware, because
> > there it has minimal interaction with X and so it's an excellent
> > visual test of the consistency of scheduling. You can immediately see
> > (literally) scheduling hiccups down to the millisecond range (!). In
> > this sense, if done and interpreted carefully, glxgears gives more
> > feedback than many audio tests. (audio latency problems are audible,
> > but on most sound hw it takes quite a bit of latency to produce an
> > xrun.) So basically glxgears is the "early warning system" that tells
> > us about the potential for xruns earlier than an xrun would happen
> > for real.
> >
> > [ of course you can also run all the other tools to get numeric
> >   results, but glxgears is nice in that it gives immediate visual
> >   feedback. ]
>
> Al could also test ocbench, which brings visual feedback without
> stressing the X server: http://linux.1wt.eu/sched/
>
> I packaged it exactly for this problem and it has already helped. It
> uses X only after each loop, so if you run it with a large run time, X
> is hardly solicited at all.

yeah, and ocbench is one of my favorite cross-task-fairness tests - i don't release a CFS patch without checking it with ocbench first :-)

	Ingo
Re: CFS review
On Tue, Aug 28, 2007 at 10:02:18AM +0200, Ingo Molnar wrote:
>
> * Xavier Bestel <[EMAIL PROTECTED]> wrote:
>
> > Are you sure they are stalled? What you may have is simply gears
> > running at a multiple of your screen refresh rate, so they only
> > appear stalled.
> >
> > Plus, as Linus said, you're not really testing the kernel scheduler.
> > gears is a really bad benchmark, it should die.
>
> i like glxgears as long as it runs on _real_ 3D hardware, because there
> it has minimal interaction with X and so it's an excellent visual test
> of the consistency of scheduling. You can immediately see (literally)
> scheduling hiccups down to the millisecond range (!). In this sense, if
> done and interpreted carefully, glxgears gives more feedback than many
> audio tests. (audio latency problems are audible, but on most sound hw
> it takes quite a bit of latency to produce an xrun.) So basically
> glxgears is the "early warning system" that tells us about the
> potential for xruns earlier than an xrun would happen for real.
>
> [ of course you can also run all the other tools to get numeric
>   results, but glxgears is nice in that it gives immediate visual
>   feedback. ]

Al could also test ocbench, which brings visual feedback without stressing the X server: http://linux.1wt.eu/sched/

I packaged it exactly for this problem and it has already helped. It uses X only after each loop, so if you run it with a large run time, X is hardly solicited at all.

Willy
Re: CFS review
* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> On Tue, 28 Aug 2007, Al Boldi wrote:
> >
> > I like your analysis, but how do you explain that these stalls
> > vanish when __update_curr is disabled?
>
> It's entirely possible that what happens is that the X scheduling is
> just a slightly unstable system - which effectively would turn a small
> scheduling difference into a *huge* visible difference.

i think it's because disabling __update_curr() in essence removes the ability of the scheduler to preempt tasks - that hack in essence results in a non-scheduler. Hence the gears + X pair of tasks becomes a synchronous pair of tasks - and thus gears cannot "overload" X.

Normally gears + X is an asynchronous pair of tasks, with gears (or xperf, or devel versions of firefox, etc.) not being throttled at all and thus being able to overload/spam the X server with requests. (And we generally want to _reward_ asynchronicity and want to allow tasks to overlap each other and we want each task to go as fast and as parallel as it can.)

Eventually X's built-in "bad, abusive client" throttling code kicks in, which, AFAIK, is pretty crude and might yield such artifacts. But ... it would be nice for an X person to confirm - and in any case i'll try Al's workload - i thought i had a reproducer but i barked up the wrong tree :-) My laptop doesn't run with the vesa driver, so i have no easy reproducer for now.

( also, it would be nice if Al could try rc4 plus my latest scheduler tree as well - just on the odd chance that something got fixed meanwhile. In particular Mike's sleeper-bonus-limit fix could be related. )

	Ingo
Re: CFS review
On Tue, 28 Aug 2007 09:34:03 -0700 (PDT) Linus Torvalds <[EMAIL PROTECTED]> wrote:

> On Tue, 28 Aug 2007, Al Boldi wrote:
> >
> > I like your analysis, but how do you explain that these stalls
> > vanish when __update_curr is disabled?
>
> It's entirely possible that what happens is that the X scheduling is
> just a slightly unstable system - which effectively would turn a
> small scheduling difference into a *huge* visible difference.

one thing that happens if you remove __update_curr is the following pattern (since no apps will get preempted involuntarily):

app 1 submits a full frame worth of 3D stuff to X
app 1 then sleeps/waits for that to complete
X gets to run, has 1 full frame to render, does this
X now waits for more input
app 2 now gets to run and submits a full frame
app 2 then sleeps again
X gets to run again to process and complete
X goes to sleep
app 3 gets to run and submits a full frame
app 3 then sleeps
X runs
X sleeps
app 1 gets to submit a frame
etc etc

so without preemption happening, you can get "perfect" behavior, just because everything is perfectly doing 1 thing at a time, cooperatively. Once you start doing timeslices and enforcing limits on them, this "perfect pattern" will break down (remember this is all software rendering in the problem being described), and whatever you get won't be as perfect as this.
Re: CFS review
On Tue, 28 Aug 2007, Al Boldi wrote:
>
> I like your analysis, but how do you explain that these stalls vanish
> when __update_curr is disabled?

It's entirely possible that what happens is that the X scheduling is just a slightly unstable system - which effectively would turn a small scheduling difference into a *huge* visible difference.

And the "small scheduling difference" might be as simple as "if the process slept for a while, we give it a bit more CPU time". And then you get into some unbalanced setup where the X scheduler makes it sleep even more, because it fills its buffers. Or something.

I can easily see two schedulers that are trying to *individually* be "fair" fighting it out in a way where the end result is not very good.

I do suspect it's probably a very interesting load, so I hope Ingo looks more at it, but I also suspect it's more than just the kernel scheduler.

		Linus
Re: CFS review
* Xavier Bestel <[EMAIL PROTECTED]> wrote:

> Are you sure they are stalled? What you may have is simply gears
> running at a multiple of your screen refresh rate, so they only appear
> stalled.
>
> Plus, as Linus said, you're not really testing the kernel scheduler.
> gears is a really bad benchmark, it should die.

i like glxgears as long as it runs on _real_ 3D hardware, because there it has minimal interaction with X and so it's an excellent visual test of the consistency of scheduling. You can immediately see (literally) scheduling hiccups down to the millisecond range (!). In this sense, if done and interpreted carefully, glxgears gives more feedback than many audio tests. (audio latency problems are audible, but on most sound hw it takes quite a bit of latency to produce an xrun.) So basically glxgears is the "early warning system" that tells us about the potential for xruns earlier than an xrun would happen for real.

[ of course you can also run all the other tools to get numeric results, but glxgears is nice in that it gives immediate visual feedback. ]

but i agree that on a non-accelerated X setup glxgears is not really meaningful. It can have similar "spam the X server" effects as xperf.

	Ingo
Re: CFS review
On Tue, 2007-08-28 at 07:37 +0300, Al Boldi wrote:

> start gears like this:
>
> # gears & gears & gears &
>
> Then lay them out side by side to see the periodic stallings for
> ~10sec.

Are you sure they are stalled? What you may have is simply gears running at a multiple of your screen refresh rate, so they only appear stalled.

Plus, as Linus said, you're not really testing the kernel scheduler. gears is a really bad benchmark, it should die.

	Xav
Re: CFS review
* Mike Galbraith <[EMAIL PROTECTED]> wrote:

> > I like your analysis, but how do you explain that these stalls
> > vanish when __update_curr is disabled?
>
> When you disable __update_curr(), you're utterly destroying the
> scheduler. There may well be a scheduler connection, but disabling
> __update_curr() doesn't tell you anything meaningful. Basically,
> you're letting all tasks run uninterrupted for just as long as they
> please (which is why busy loops lock your box solid as a rock). I'd
> suggest gathering some sched_debug stats or something... [...]

the output of the following would be nice:

  http://people.redhat.com/mingo/cfs-scheduler/tools/cfs-debug-info.sh

captured while the gears are running.

	Ingo
Re: CFS review
On Tue, 2007-08-28 at 08:23 +0300, Al Boldi wrote:
> Linus Torvalds wrote:
> > On Tue, 28 Aug 2007, Al Boldi wrote:
> > > No need for framebuffer. All you need is X using the X.org
> > > vesa-driver. Then start gears like this:
> > >
> > > # gears & gears & gears &
> > >
> > > Then lay them out side by side to see the periodic stallings for
> > > ~10sec.
> >
> > I don't think this is a good test.
> >
> > Why?
> >
> > If you're not using direct rendering, what you have is the X server
> > doing all the rendering, which in turn means that what you are
> > testing is quite possibly not so much about the *kernel* scheduling,
> > but about *X-server* scheduling!
> >
> > I'm sure the kernel scheduler has an impact, but what's more likely
> > to be going on is that you're seeing effects that are indirect, and
> > not necessarily at all even "good".
> >
> > For example, if the X server is the scheduling point, it's entirely
> > possible that it ends up showing effects that are more due to the
> > queueing of the X command stream than due to the scheduler - and that
> > those stalls are simply due to *that*.
> >
> > One thing to try is to run the X connection in synchronous mode,
> > which minimizes queueing issues. I don't know if gears has a flag to
> > turn on synchronous X messaging, though. Many X programs take the
> > "[+-]sync" flag to turn on synchronous mode, iirc.
>
> I like your analysis, but how do you explain that these stalls vanish
> when __update_curr is disabled?

When you disable __update_curr(), you're utterly destroying the scheduler. There may well be a scheduler connection, but disabling __update_curr() doesn't tell you anything meaningful. Basically, you're letting all tasks run uninterrupted for just as long as they please (which is why busy loops lock your box solid as a rock). I'd suggest gathering some sched_debug stats or something... shoot, _anything_ but what you did :)

	-Mike
Re: CFS review
On Tue, 2007-08-28 at 08:23 +0300, Al Boldi wrote: Linus Torvalds wrote: On Tue, 28 Aug 2007, Al Boldi wrote: No need for framebuffer. All you need is X using the X.org vesa-driver. Then start gears like this: # gears gears gears Then lay them out side by side to see the periodic stallings for ~10sec. I don't think this is a good test. Why? If you're not using direct rendering, what you have is the X server doing all the rendering, which in turn means that what you are testing is quite possibly not so much about the *kernel* scheduling, but about *X-server* scheduling! I'm sure the kernel scheduler has an impact, but what's more likely to be going on is that you're seeing effects that are indirect, and not necessarily at all even good. For example, if the X server is the scheduling point, it's entirely possible that it ends up showing effects that are more due to the queueing of the X command stream than due to the scheduler - and that those stalls are simply due to *that*. One thing to try is to run the X connection in synchronous mode, which minimizes queueing issues. I don't know if gears has a flag to turn on synchronous X messaging, though. Many X programs take the [+-]sync flag to turn on synchronous mode, iirc. I like your analysis, but how do you explain that these stalls vanish when __update_curr is disabled? When you disable __update_curr(), you're utterly destroying the scheduler. There may well be a scheduler connection, but disabling __update_curr() doesn't tell you anything meaningful. Basically, you're letting all tasks run uninterrupted for just as long as they please (which is why busy loops lock your box solid as a rock). I'd suggest gathering some sched_debug stats or something... 
shoot, _anything_ but what you did :) -Mike - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: CFS review
* Mike Galbraith [EMAIL PROTECTED] wrote: I like your analysis, but how do you explain that these stalls vanish when __update_curr is disabled? When you disable __update_curr(), you're utterly destroying the scheduler. There may well be a scheduler connection, but disabling __update_curr() doesn't tell you anything meaningful. Basically, you're letting all tasks run uninterrupted for just as long as they please (which is why busy loops lock your box solid as a rock). I'd suggest gathering some sched_debug stats or something... [...] the output of the following would be nice: http://people.redhat.com/mingo/cfs-scheduler/tools/cfs-debug-info.sh captured while the gears are running. Ingo - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: CFS review
On Tue, 2007-08-28 at 07:37 +0300, Al Boldi wrote: start gears like this: # gears gears gears Then lay them out side by side to see the periodic stallings for ~10sec. Are you sure they are stalled ? What you may have is simple gears running at a multiple of your screen refresh rate, so they only appear stalled. Plus, as said Linus, you're not really testing the kernel scheduler. gears is really bad benchmark, it should die. Xav - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: CFS review
* Xavier Bestel [EMAIL PROTECTED] wrote: Are you sure they are stalled ? What you may have is simple gears running at a multiple of your screen refresh rate, so they only appear stalled. Plus, as said Linus, you're not really testing the kernel scheduler. gears is really bad benchmark, it should die. i like glxgears as long as it runs on _real_ 3D hardware, because there it has minimal interaction with X and so it's an excellent visual test about consistency of scheduling. You can immediately see (literally) scheduling hickups down to a millisecond range (!). In this sense, if done and interpreted carefully, glxgears gives more feedback than many audio tests. (audio latency problems are audible, but on most sound hw it takes quite a bit of latency to produce an xrun.) So basically glxgears is the early warning system that tells us about the potential for xruns earlier than an xrun would happen for real. [ of course you can also run all the other tools to get numeric results, but glxgears is nice in that it gives immediate visual feedback. ] but i agree that on a non-accelerated X setup glxgears is not really meaningful. It can have similar spam the X server effects as xperf. Ingo - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: CFS review
On Tue, 28 Aug 2007, Al Boldi wrote: I like your analysis, but how do you explain that these stalls vanish when __update_curr is disabled? It's entirely possible that what happens is that the X scheduling is just a slightly unstable system - which effectively would turn a small scheduling difference into a *huge* visible difference. And the small scheduling difference might be as simple as if the process slept for a while, we give it a bit more CPU time. And then you get into some unbalanced setup where the X scheduler makes it sleep even more, because it fills its buffers. Or something. I can easily see two schedulers that are trying to *individually* be fair, fighting it out in a way where the end result is not very good. I do suspect it's probably a very interesting load, so I hope Ingo looks more at it, but I also suspect it's more than just the kernel scheduler. Linus - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: CFS review
* Linus Torvalds [EMAIL PROTECTED] wrote: On Tue, 28 Aug 2007, Al Boldi wrote: I like your analysis, but how do you explain that these stalls vanish when __update_curr is disabled? It's entirely possible that what happens is that the X scheduling is just a slightly unstable system - which effectively would turn a small scheduling difference into a *huge* visible difference. i think it's because disabling __update_curr() in essence removes the ability of scheduler to preempt tasks - that hack in essence results in a non-scheduler. Hence the gears + X pair of tasks becomes a synchronous pair of tasks in essence - and thus gears cannot overload X. Normally gears + X is an asynchronous pair of tasks, with gears (or xperf, or devel versions of firefox, etc.) not being throttled at all and thus being able to overload/spam the X server with requests. (And we generally want to _reward_ asynchronity and want to allow tasks to overlap each other and we want each task to go as fast and as parallel as it can.) Eventually X's built-in bad, abusive client throttling code kicks in, which, AFAIK is pretty crude and might yield to such artifacts. But ... it would be nice for an X person to confirm - and in any case i'll try Al's workload - i thought i had a reproducer but i barked up the wrong tree :-) My laptop doesnt run with the vesa driver, so i have no easy reproducer for now. ( also, it would be nice if Al could try rc4 plus my latest scheduler tree as well - just on the odd chance that something got fixed meanwhile. In particular Mike's sleeper-bonus-limit fix could be related. ) Ingo - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: CFS review
On Tue, 28 Aug 2007 09:34:03 -0700 (PDT) Linus Torvalds [EMAIL PROTECTED] wrote: On Tue, 28 Aug 2007, Al Boldi wrote: I like your analysis, but how do you explain that these stalls vanish when __update_curr is disabled? It's entirely possible that what happens is that the X scheduling is just a slightly unstable system - which effectively would turn a small scheduling difference into a *huge* visible difference. one thing that happens if you remove __update_curr is the following pattern (since no apps will get preempted involuntarily) app 1 submits a full frame worth of 3D stuff to X app 1 then sleeps/waits for that to complete X gets to run, has 1 full frame to render, does this X now waits for more input app 2 now gets to run and submits a full frame app 2 then sleeps again X gets to run again to process and complete X goes to sleep app 3 gets to run and submits a full frame app 3 then sleeps X runs X sleeps app 1 gets to submit a frame etc etc so without preemption happening, you can get perfect behavior, just because everything is perfectly doing 1 thing at a time cooperatively. once you start doing timeslices and enforcing limits on them, this perfect pattern will break down (remember this is all software rendering in the problem being described), and whatever you get won't be as perfect as this. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: CFS review
On Tue, Aug 28, 2007 at 10:02:18AM +0200, Ingo Molnar wrote: * Xavier Bestel [EMAIL PROTECTED] wrote: Are you sure they are stalled ? What you may have is simple gears running at a multiple of your screen refresh rate, so they only appear stalled. Plus, as said Linus, you're not really testing the kernel scheduler. gears is really bad benchmark, it should die. i like glxgears as long as it runs on _real_ 3D hardware, because there it has minimal interaction with X and so it's an excellent visual test about consistency of scheduling. You can immediately see (literally) scheduling hickups down to a millisecond range (!). In this sense, if done and interpreted carefully, glxgears gives more feedback than many audio tests. (audio latency problems are audible, but on most sound hw it takes quite a bit of latency to produce an xrun.) So basically glxgears is the early warning system that tells us about the potential for xruns earlier than an xrun would happen for real. [ of course you can also run all the other tools to get numeric results, but glxgears is nice in that it gives immediate visual feedback. ] Al could also test ocbench, which brings visual feedback without harnessing the X server : http://linux.1wt.eu/sched/ I packaged it exactly for this problem and it has already helped. It uses X after each loop, so if you run it with large run time, X is nearly not sollicitated. Willy - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: CFS review
* Willy Tarreau [EMAIL PROTECTED] wrote: On Tue, Aug 28, 2007 at 10:02:18AM +0200, Ingo Molnar wrote: * Xavier Bestel [EMAIL PROTECTED] wrote: Are you sure they are stalled ? What you may have is simple gears running at a multiple of your screen refresh rate, so they only appear stalled. Plus, as said Linus, you're not really testing the kernel scheduler. gears is really bad benchmark, it should die. i like glxgears as long as it runs on _real_ 3D hardware, because there it has minimal interaction with X and so it's an excellent visual test about consistency of scheduling. You can immediately see (literally) scheduling hickups down to a millisecond range (!). In this sense, if done and interpreted carefully, glxgears gives more feedback than many audio tests. (audio latency problems are audible, but on most sound hw it takes quite a bit of latency to produce an xrun.) So basically glxgears is the early warning system that tells us about the potential for xruns earlier than an xrun would happen for real. [ of course you can also run all the other tools to get numeric results, but glxgears is nice in that it gives immediate visual feedback. ] Al could also test ocbench, which brings visual feedback without harnessing the X server : http://linux.1wt.eu/sched/ I packaged it exactly for this problem and it has already helped. It uses X after each loop, so if you run it with large run time, X is nearly not sollicitated. yeah, and ocbench is one of my favorite cross-task-fairness tests - i dont release a CFS patch without checking it with ocbench first :-) Ingo - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: CFS review
On Mon, 27 Aug 2007 22:05:37 PDT, Linus Torvalds said: On Tue, 28 Aug 2007, Al Boldi wrote: No need for framebuffer. All you need is X using the X.org vesa-driver. Then start gears like this: # gears gears gears Then lay them out side by side to see the periodic stallings for ~10sec. I don't think this is a good test. Why? If you're not using direct rendering, what you have is the X server doing all the rendering, which in turn means that what you are testing is quite possibly not so much about the *kernel* scheduling, but about *X-server* scheduling! I wonder - can people who are doing this as a test please specify whether they're using an older X that has the libX11 or the newer libxcb code? That may have a similar impact as well. (libxcb is pretty new - it landed in Fedora Rawhide just about a month ago, after Fedora 7 shipped. Not sure what other distros have it now...) pgpI8maTCY4aR.pgp Description: PGP signature
Re: CFS review
Al Boldi wrote: Ingo Molnar wrote: * Al Boldi [EMAIL PROTECTED] wrote: The problem is that consecutive runs don't give consistent results and sometimes stalls. You may want to try that. well, there's a natural saturation point after a few hundred tasks (depending on your CPU's speed), at which point there's no idle time left. From that point on things get slower progressively (and the ability of the shell to start new ping tasks is impacted as well), but that's expected on an overloaded system, isn't it? Of course, things should get slower with higher load, but it should be consistent without stalls. To see this problem, make sure you boot into /bin/sh with the normal VGA console (i.e. not fb-console). Then try each loop a few times to show the different behaviour; loops like: # for ((i=0; i; i++)); do ping 10.1 -A > /dev/null & done # for ((i=0; i; i++)); do nice -99 ping 10.1 -A > /dev/null & done # { for ((i=0; i; i++)); do ping 10.1 -A > /dev/null & done } > /dev/null 2>&1 Especially the last one sometimes causes a complete console lock-up, while the other two sometimes stall then surge periodically. ok. I think i might finally have found the bug causing this. Could you try the fix below, does your webserver thread-startup test work any better? It seems to help somewhat, but the problem is still visible. Even v20.3 on 2.6.22.5 didn't help. It does look related to ia-boosting, so I turned off __update_curr like Roman mentioned, which had an enormous smoothing effect, but then nice levels completely break down and lock up the system. There is another way to show the problem visually under X (vesa-driver), by starting 3 gears simultaneously, which after laying them out side-by-side need some settling time before smoothing out. Without __update_curr it's absolutely smooth from the start. I posted a LOT of stuff using the glitch1 script, and finally found a set of tuning values which make the test script run smooth. See back posts, I don't have them here. 
-- Bill Davidsen [EMAIL PROTECTED] We have more to fear from the bungling of the incompetent than from the machinations of the wicked. - from Slashdot
Re: CFS review
Ingo Molnar wrote: * Al Boldi [EMAIL PROTECTED] wrote: ok. I think i might finally have found the bug causing this. Could you try the fix below, does your webserver thread-startup test work any better? It seems to help somewhat, but the problem is still visible. Even v20.3 on 2.6.22.5 didn't help. It does look related to ia-boosting, so I turned off __update_curr like Roman mentioned, which had an enormous smoothing effect, but then nice levels completely break down and lock up the system. you can turn sleeper-fairness off via: echo 28 > /proc/sys/kernel/sched_features another thing to try would be: echo 12 > /proc/sys/kernel/sched_features (or 14), and drop the granularity to 50. (that's the new-task penalty turned off.) Another thing to try would be to edit this: if (sysctl_sched_features & SCHED_FEAT_START_DEBIT) p->se.wait_runtime = -(sched_granularity(cfs_rq) / 2); to: if (sysctl_sched_features & SCHED_FEAT_START_DEBIT) p->se.wait_runtime = -sched_granularity(cfs_rq); and could you also check 20.4 on 2.6.22.5 perhaps, or the very latest -git? (Peter has experienced smaller spikes with that.) Ingo -- Bill Davidsen [EMAIL PROTECTED] We have more to fear from the bungling of the incompetent than from the machinations of the wicked. - from Slashdot
Re: CFS review
* Bill Davidsen [EMAIL PROTECTED] wrote: There is another way to show the problem visually under X (vesa-driver), by starting 3 gears simultaneously, which after laying them out side-by-side need some settling time before smoothing out. Without __update_curr it's absolutely smooth from the start. I posted a LOT of stuff using the glitch1 script, and finally found a set of tuning values which make the test script run smooth. See back posts, I don't have them here. but you have real 3D hw and DRI enabled, correct? In that case X uses up almost no CPU time and glxgears does most of the processing. That is quite different from the above software-rendering case, where X spends most of the CPU time. Ingo
Re: CFS review
* Al Boldi [EMAIL PROTECTED] wrote: No need for framebuffer. All you need is X using the X.org vesa-driver. Then start gears like this: # gears & gears & gears & Then lay them out side by side to see the periodic stallings for ~10sec. i just tried something similar (by adding Option NoDRI to xorg.conf) and i'm wondering how it can be smooth on the vesa-driver at all. I tested it on a Core2Duo box and software rendering manages to do about 3 frames per second. (although glxgears itself thinks it does ~600 fps) If i start 3 glxgears then they do ~1 frame per second each. This is on Fedora 7 with xorg-x11-server-Xorg-1.3.0.0-9.fc7 and xorg-x11-drv-i810-2.0.0-4.fc7. Ingo
Re: CFS review
Ingo Molnar wrote: * Linus Torvalds [EMAIL PROTECTED] wrote: On Tue, 28 Aug 2007, Al Boldi wrote: I like your analysis, but how do you explain that these stalls vanish when __update_curr is disabled? It's entirely possible that what happens is that the X scheduling is just a slightly unstable system - which effectively would turn a small scheduling difference into a *huge* visible difference. i think it's because disabling __update_curr() in essence removes the ability of the scheduler to preempt tasks - that hack in essence results in a non-scheduler. Hence the gears + X pair of tasks becomes a synchronous pair of tasks in essence - and thus gears cannot overload X. I have narrowed it down a bit to add_wait_runtime. Patching 2.6.22.5-v20.4 like this:

346- * the two values are equal)
347- * [Note: delta_mine - delta_exec is negative]:
348- */
349:// add_wait_runtime(cfs_rq, curr, delta_mine - delta_exec);
350-}
351-
352-static void update_curr(struct cfs_rq *cfs_rq)

When disabling add_wait_runtime the stalls are gone. With this change the scheduler is still usable, but it does not constitute a fix. Now, even with this hack, uneven nice-levels between X and gears cause a return of the stalls, so make sure both X and gears run at the same nice-level when testing. Again, the whole point of this workload is to expose scheduler glitches regardless of whether X is broken or not, and my hunch is that this problem looks suspiciously like an ia-boosting bug. What's important to note is that by adjusting the scheduler we can effect a correction in behaviour, which suggests this problem is fixable. It's probably a good idea to look further into add_wait_runtime. Thanks! -- Al
Re: CFS review
On Wed, 2007-08-29 at 06:18 +0200, Ingo Molnar wrote: Then lay them out side by side to see the periodic stallings for ~10sec. The X scheduling code isn't really designed to handle software GL well; the requests can be very expensive to execute, and yet are specified as atomic operations (sigh). i just tried something similar (by adding Option NoDRI to xorg.conf) and i'm wondering how it can be smooth on vesa-driver at all. I tested it on a Core2Duo box and software rendering manages to do about 3 frames per second. (although glxgears itself thinks it does ~600 fps) If i start 3 glxgears then they do ~1 frame per second each. This is on Fedora 7 with xorg-x11-server-Xorg-1.3.0.0-9.fc7 and xorg-x11-drv-i810-2.0.0-4.fc7. Are you attempting to measure the visible updates by eye? Or are you using some other metric? In any case, attempting to measure anything using glxgears is a bad idea; it's not representative of *any* real applications. And then using software GL on top of that... What was the question again? -- [EMAIL PROTECTED]
Re: CFS review
On Wed, 2007-08-29 at 06:18 +0200, Ingo Molnar wrote: * Al Boldi [EMAIL PROTECTED] wrote: No need for framebuffer. All you need is X using the X.org vesa-driver. Then start gears like this: # gears & gears & gears & Then lay them out side by side to see the periodic stallings for ~10sec. i just tried something similar (by adding Option NoDRI to xorg.conf) and i'm wondering how it can be smooth on vesa-driver at all. I tested it on a Core2Duo box and software rendering manages to do about 3 frames per second. (although glxgears itself thinks it does ~600 fps) If i start 3 glxgears then they do ~1 frame per second each. This is on Fedora 7 with xorg-x11-server-Xorg-1.3.0.0-9.fc7 and xorg-x11-drv-i810-2.0.0-4.fc7. At least you can run the darn test... the third instance of glxgears here means say bye bye to GUI instantly. -Mike
Re: CFS review
* Al Boldi [EMAIL PROTECTED] wrote: I have narrowed it down a bit to add_wait_runtime. the scheduler is a red herring here. Could you strace -ttt -TTT one of the glxgears instances (and send us the cfs-debug-info.sh output, with CONFIG_SCHED_DEBUG=y and CONFIG_SCHEDSTATS=y as requested before) so that we can have a closer look? i reproduced something similar and there the stall is caused by 1+ second select() delays on the X client-server socket. The scheduler stats agree with that:

se.sleep_max : 2194711437
se.block_max : 0
se.exec_max  : 977446
se.wait_max  : 1912321

the scheduler itself had a worst-case scheduling delay of 1.9 milliseconds for that glxgears instance (which is perfectly good - in fact, excellent interactivity) - but the task had a maximum sleep time of 2.19 seconds. So the 'glitch' was not caused by the scheduler. Ingo
Re: CFS review
Linus Torvalds wrote: > On Tue, 28 Aug 2007, Al Boldi wrote: > > No need for framebuffer. All you need is X using the X.org vesa-driver. > > Then start gears like this: > > > > # gears & gears & gears & > > > > Then lay them out side by side to see the periodic stallings for ~10sec. > > I don't think this is a good test. > > Why? > > If you're not using direct rendering, what you have is the X server doing > all the rendering, which in turn means that what you are testing is quite > possibly not so much about the *kernel* scheduling, but about *X-server* > scheduling! > > I'm sure the kernel scheduler has an impact, but what's more likely to be > going on is that you're seeing effects that are indirect, and not > necessarily at all even "good". > > For example, if the X server is the scheduling point, it's entirely > possible that it ends up showing effects that are more due to the queueing > of the X command stream than due to the scheduler - and that those > stalls are simply due to *that*. > > One thing to try is to run the X connection in synchronous mode, which > minimizes queueing issues. I don't know if gears has a flag to turn on > synchronous X messaging, though. Many X programs take the "[+-]sync" flag > to turn on synchronous mode, iirc. I like your analysis, but how do you explain that these stalls vanish when __update_curr is disabled? Thanks! -- Al
Re: CFS review
On Tue, 28 Aug 2007, Al Boldi wrote: > > No need for framebuffer. All you need is X using the X.org vesa-driver. > Then start gears like this: > > # gears & gears & gears & > > Then lay them out side by side to see the periodic stallings for ~10sec. I don't think this is a good test. Why? If you're not using direct rendering, what you have is the X server doing all the rendering, which in turn means that what you are testing is quite possibly not so much about the *kernel* scheduling, but about *X-server* scheduling! I'm sure the kernel scheduler has an impact, but what's more likely to be going on is that you're seeing effects that are indirect, and not necessarily at all even "good". For example, if the X server is the scheduling point, it's entirely possible that it ends up showing effects that are more due to the queueing of the X command stream than due to the scheduler - and that those stalls are simply due to *that*. One thing to try is to run the X connection in synchronous mode, which minimizes queueing issues. I don't know if gears has a flag to turn on synchronous X messaging, though. Many X programs take the "[+-]sync" flag to turn on synchronous mode, iirc. Linus
Re: CFS review
Ingo Molnar wrote: > * Al Boldi <[EMAIL PROTECTED]> wrote: > > > Could you try the patch below instead, does this make 3x glxgears > > > smooth again? (if yes, could you send me your Signed-off-by line as > > > well.) > > > > The task-startup stalling is still there for ~10sec. > > > > Can you see the problem on your machine? > > nope (i have no framebuffer setup) No need for framebuffer. All you need is X using the X.org vesa-driver. Then start gears like this: # gears & gears & gears & Then lay them out side by side to see the periodic stallings for ~10sec. > - but i can see some chew-max > latencies that occur when new tasks are started up. I _think_ it's > probably the same problem as yours. chew-max is great, but it's too accurate in that it exposes any scheduling glitches and as such hides the startup glitch within the many glitches it exposes. For example, it fluctuates all over the place using this: # for ((i=0;i<9;i++)); do chew-max 60 > /dev/shm/chew$i.log & done Also, chew-max locks-up when disabling __update_curr, which means that the workload of chew-max is different from either the ping-startup loop or the gears. You really should try the gears test by any means, as the problem is really pronounced there. > could you try the patch below (which is the combo patch of my current > queue), ontop of head 50c46637aa? This makes chew-max behave better > during task mass-startup here. Still no improvement. Thanks! -- Al
Re: CFS review
* Al Boldi <[EMAIL PROTECTED]> wrote: > > Could you try the patch below instead, does this make 3x glxgears > > smooth again? (if yes, could you send me your Signed-off-by line as > > well.) > > The task-startup stalling is still there for ~10sec. > > Can you see the problem on your machine? nope (i have no framebuffer setup) - but i can see some chew-max latencies that occur when new tasks are started up. I _think_ it's probably the same problem as yours. could you try the patch below (which is the combo patch of my current queue), ontop of head 50c46637aa? This makes chew-max behave better during task mass-startup here. Ingo -> Index: linux/include/linux/sched.h === --- linux.orig/include/linux/sched.h +++ linux/include/linux/sched.h @@ -904,6 +904,7 @@ struct sched_entity { u64 exec_start; u64 sum_exec_runtime; + u64 prev_sum_exec_runtime; u64 wait_start_fair; u64 sleep_start_fair; Index: linux/kernel/sched.c === --- linux.orig/kernel/sched.c +++ linux/kernel/sched.c @@ -1587,6 +1587,7 @@ static void __sched_fork(struct task_str p->se.wait_start_fair = 0; p->se.exec_start= 0; p->se.sum_exec_runtime = 0; + p->se.prev_sum_exec_runtime = 0; p->se.delta_exec= 0; p->se.delta_fair_run= 0; p->se.delta_fair_sleep = 0; Index: linux/kernel/sched_fair.c === --- linux.orig/kernel/sched_fair.c +++ linux/kernel/sched_fair.c @@ -82,12 +82,12 @@ enum { }; unsigned int sysctl_sched_features __read_mostly = - SCHED_FEAT_FAIR_SLEEPERS*1 | + SCHED_FEAT_FAIR_SLEEPERS*0 | SCHED_FEAT_SLEEPER_AVG *0 | SCHED_FEAT_SLEEPER_LOAD_AVG *1 | SCHED_FEAT_PRECISE_CPU_LOAD *1 | - SCHED_FEAT_START_DEBIT *1 | - SCHED_FEAT_SKIP_INITIAL *0; + SCHED_FEAT_START_DEBIT *0 | + SCHED_FEAT_SKIP_INITIAL *1; extern struct sched_class fair_sched_class; @@ -225,39 +225,15 @@ static struct sched_entity *__pick_next_ * Calculate the preemption granularity needed to schedule every * runnable task once per sysctl_sched_latency amount of time. 
* (down to a sensible low limit on granularity) - * - * For example, if there are 2 tasks running and latency is 10 msecs, - * we switch tasks every 5 msecs. If we have 3 tasks running, we have - * to switch tasks every 3.33 msecs to get a 10 msecs observed latency - * for each task. We do finer and finer scheduling up to until we - * reach the minimum granularity value. - * - * To achieve this we use the following dynamic-granularity rule: - * - *gran = lat/nr - lat/nr/nr - * - * This comes out of the following equations: - * - *kA1 + gran = kB1 - *kB2 + gran = kA2 - *kA2 = kA1 - *kB2 = kB1 - d + d/nr - *lat = d * nr - * - * Where 'k' is key, 'A' is task A (waiting), 'B' is task B (running), - * '1' is start of time, '2' is end of time, 'd' is delay between - * 1 and 2 (during which task B was running), 'nr' is number of tasks - * running, 'lat' is the the period of each task. ('lat' is the - * sched_latency that we aim for.) */ -static long +static unsigned long sched_granularity(struct cfs_rq *cfs_rq) { unsigned int gran = sysctl_sched_latency; unsigned int nr = cfs_rq->nr_running; if (nr > 1) { - gran = gran/nr - gran/nr/nr; + gran = gran/nr; gran = max(gran, sysctl_sched_min_granularity); } @@ -489,6 +465,9 @@ update_stats_wait_end(struct cfs_rq *cfs { unsigned long delta_fair; + if (unlikely(!se->wait_start_fair)) + return; + delta_fair = (unsigned long)min((u64)(2*sysctl_sched_runtime_limit), (u64)(cfs_rq->fair_clock - se->wait_start_fair)); @@ -668,7 +647,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, st /* * Preempt the current task with a newly woken task if needed: */ -static void +static int __check_preempt_curr_fair(struct cfs_rq *cfs_rq, struct sched_entity *se, struct sched_entity *curr, unsigned long granularity) { @@ -679,8 +658,11 @@ __check_preempt_curr_fair(struct cfs_rq * preempt the current task unless the best task has * a larger than sched_granularity fairness advantage: */ - if (__delta > niced_granularity(curr, granularity)) + if (__delta > 
niced_granularity(curr, granularity)) { resched_task(rq_of(cfs_rq)->curr); + return
Re: CFS review
Ingo Molnar wrote: > * Al Boldi <[EMAIL PROTECTED]> wrote: > > > could you send the exact patch that shows what you did? > > > > On 2.6.22.5-v20.3 (not v20.4): > > > > 340-curr->delta_exec += delta_exec; > > 341- > > 342-if (unlikely(curr->delta_exec > sysctl_sched_stat_granularity)) > > { 343:// __update_curr(cfs_rq, curr); > > 344-curr->delta_exec = 0; > > 345-} > > 346-curr->exec_start = rq_of(cfs_rq)->clock; > > ouch - this produces a really broken scheduler - with this we dont do > any run-time accounting (!). Of course it's broken, and it's not meant as a fix, but this change allows you to see the amount of overhead as well as any miscalculations __update_curr incurs. In terms of overhead, __update_curr incurs ~3x slowdown, and in terms of run-time accounting it exhibits a ~10sec task-startup miscalculation. > Could you try the patch below instead, does this make 3x glxgears smooth > again? (if yes, could you send me your Signed-off-by line as well.) The task-startup stalling is still there for ~10sec. Can you see the problem on your machine? Thanks! -- Al
Re: CFS review
* Al Boldi <[EMAIL PROTECTED]> wrote: > > could you send the exact patch that shows what you did? > > On 2.6.22.5-v20.3 (not v20.4): > > 340-curr->delta_exec += delta_exec; > 341- > 342-if (unlikely(curr->delta_exec > sysctl_sched_stat_granularity)) { > 343:// __update_curr(cfs_rq, curr); > 344-curr->delta_exec = 0; > 345-} > 346-curr->exec_start = rq_of(cfs_rq)->clock; ouch - this produces a really broken scheduler - with this we dont do any run-time accounting (!). Could you try the patch below instead, does this make 3x glxgears smooth again? (if yes, could you send me your Signed-off-by line as well.) Ingo > Subject: sched: make the scheduler converge to the ideal latency From: Ingo Molnar <[EMAIL PROTECTED]> de-HZ-ification of the granularity defaults unearthed a pre-existing property of CFS: while it correctly converges to the granularity goal, it does not prevent run-time fluctuations in the range of [-gran ... +gran]. With the increase of the granularity due to the removal of HZ dependencies, this becomes visible in chew-max output (with 5 tasks running): out: 28 . 27. 32 | flu: 0 . 0 | ran:9 . 13 | per: 37 . 40 out: 27 . 27. 32 | flu: 0 . 0 | ran: 17 . 13 | per: 44 . 40 out: 27 . 27. 32 | flu: 0 . 0 | ran:9 . 13 | per: 36 . 40 out: 29 . 27. 32 | flu: 2 . 0 | ran: 17 . 13 | per: 46 . 40 out: 28 . 27. 32 | flu: 0 . 0 | ran:9 . 13 | per: 37 . 40 out: 29 . 27. 32 | flu: 0 . 0 | ran: 18 . 13 | per: 47 . 40 out: 28 . 27. 32 | flu: 0 . 0 | ran:9 . 13 | per: 37 . 40 average slice is the ideal 13 msecs and the period is picture-perfect 40 msecs. But the 'ran' field fluctuates around 13.33 msecs and there's no mechanism in CFS to keep that from happening: it's a perfectly valid solution that CFS finds. the solution is to add a granularity/preemption rule that knows about the "target latency", which makes tasks that run longer than the ideal latency run a bit less. 
The simplest approach is to simply decrease the preemption granularity when a task overruns its ideal latency. For this we have to track how much the task executed since its last preemption. ( this adds a new field to task_struct, but we can eliminate that overhead in 2.6.24 by putting all the scheduler timestamps into an anonymous union. ) with this change in place, chew-max output is fluctuation-less all around: out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40 out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40 out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40 out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40 out: 28 . 27. 39 | flu: 0 . 1 | ran: 13 . 13 | per: 41 . 40 out: 28 . 27. 39 | flu: 0 . 1 | ran: 13 . 13 | per: 41 . 40 this patch has no impact on any fastpath or on any globally observable scheduling property. (unless you have sharp enough eyes to see millisecond-level ruckles in glxgears smoothness :-) Also, with this mechanism in place the formula for adaptive granularity can be simplified down to the obvious "granularity = latency/nr_running" calculation. Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]> Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]> --- include/linux/sched.h |1 + kernel/sched_fair.c | 43 ++- 2 files changed, 15 insertions(+), 29 deletions(-) Index: linux/include/linux/sched.h === --- linux.orig/include/linux/sched.h +++ linux/include/linux/sched.h @@ -904,6 +904,7 @@ struct sched_entity { u64 exec_start; u64 sum_exec_runtime; + u64 prev_sum_exec_runtime; u64 wait_start_fair; u64 sleep_start_fair; Index: linux/kernel/sched_fair.c === --- linux.orig/kernel/sched_fair.c +++ linux/kernel/sched_fair.c @@ -225,30 +225,6 @@ static struct sched_entity *__pick_next_ * Calculate the preemption granularity needed to schedule every * runnable task once per sysctl_sched_latency amount of time. 
* (down to a sensible low limit on granularity) - * - * For example, if there are 2 tasks running and latency is 10 msecs, - * we switch tasks every 5 msecs. If we have 3 tasks running, we have - * to switch tasks every 3.33 msecs to get a 10 msecs observed latency - * for each task. We do finer and finer scheduling up to until we - * reach the minimum granularity value. - * - * To achieve this we use the following dynamic-granularity rule: - * - *gran = lat/nr - lat/nr/nr - * - * This comes out of the following equations: - * - *kA1 + gran = kB1 - *kB2 + gran = kA2 - *kA2 = kA1 - *kB2 = kB1 - d + d/nr - *
Re: CFS review
* Al Boldi [EMAIL PROTECTED] wrote: could you send the exact patch that shows what you did? On 2.6.22.5-v20.3 (not v20.4): 340-curr-delta_exec += delta_exec; 341- 342-if (unlikely(curr-delta_exec sysctl_sched_stat_granularity)) { 343:// __update_curr(cfs_rq, curr); 344-curr-delta_exec = 0; 345-} 346-curr-exec_start = rq_of(cfs_rq)-clock; ouch - this produces a really broken scheduler - with this we dont do any run-time accounting (!). Could you try the patch below instead, does this make 3x glxgears smooth again? (if yes, could you send me your Signed-off-by line as well.) Ingo Subject: sched: make the scheduler converge to the ideal latency From: Ingo Molnar [EMAIL PROTECTED] de-HZ-ification of the granularity defaults unearthed a pre-existing property of CFS: while it correctly converges to the granularity goal, it does not prevent run-time fluctuations in the range of [-gran ... +gran]. With the increase of the granularity due to the removal of HZ dependencies, this becomes visible in chew-max output (with 5 tasks running): out: 28 . 27. 32 | flu: 0 . 0 | ran:9 . 13 | per: 37 . 40 out: 27 . 27. 32 | flu: 0 . 0 | ran: 17 . 13 | per: 44 . 40 out: 27 . 27. 32 | flu: 0 . 0 | ran:9 . 13 | per: 36 . 40 out: 29 . 27. 32 | flu: 2 . 0 | ran: 17 . 13 | per: 46 . 40 out: 28 . 27. 32 | flu: 0 . 0 | ran:9 . 13 | per: 37 . 40 out: 29 . 27. 32 | flu: 0 . 0 | ran: 18 . 13 | per: 47 . 40 out: 28 . 27. 32 | flu: 0 . 0 | ran:9 . 13 | per: 37 . 40 average slice is the ideal 13 msecs and the period is picture-perfect 40 msecs. But the 'ran' field fluctuates around 13.33 msecs and there's no mechanism in CFS to keep that from happening: it's a perfectly valid solution that CFS finds. the solution is to add a granularity/preemption rule that knows about the target latency, which makes tasks that run longer than the ideal latency run a bit less. The simplest approach is to simply decrease the preemption granularity when a task overruns its ideal latency. 
For this we have to track how much the task executed since its last preemption. ( this adds a new field to task_struct, but we can eliminate that overhead in 2.6.24 by putting all the scheduler timestamps into an anonymous union. ) with this change in place, chew-max output is fluctuation-less all around: out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40 out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40 out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40 out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40 out: 28 . 27. 39 | flu: 0 . 1 | ran: 13 . 13 | per: 41 . 40 out: 28 . 27. 39 | flu: 0 . 1 | ran: 13 . 13 | per: 41 . 40 this patch has no impact on any fastpath or on any globally observable scheduling property. (unless you have sharp enough eyes to see millisecond-level ruckles in glxgears smoothness :-) Also, with this mechanism in place the formula for adaptive granularity can be simplified down to the obvious granularity = latency/nr_running calculation. Signed-off-by: Ingo Molnar [EMAIL PROTECTED] Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] --- include/linux/sched.h |1 + kernel/sched_fair.c | 43 ++- 2 files changed, 15 insertions(+), 29 deletions(-) Index: linux/include/linux/sched.h === --- linux.orig/include/linux/sched.h +++ linux/include/linux/sched.h @@ -904,6 +904,7 @@ struct sched_entity { u64 exec_start; u64 sum_exec_runtime; + u64 prev_sum_exec_runtime; u64 wait_start_fair; u64 sleep_start_fair; Index: linux/kernel/sched_fair.c === --- linux.orig/kernel/sched_fair.c +++ linux/kernel/sched_fair.c @@ -225,30 +225,6 @@ static struct sched_entity *__pick_next_ * Calculate the preemption granularity needed to schedule every * runnable task once per sysctl_sched_latency amount of time. * (down to a sensible low limit on granularity) - * - * For example, if there are 2 tasks running and latency is 10 msecs, - * we switch tasks every 5 msecs. 
If we have 3 tasks running, we have - * to switch tasks every 3.33 msecs to get a 10 msecs observed latency - * for each task. We do finer and finer scheduling up to until we - * reach the minimum granularity value. - * - * To achieve this we use the following dynamic-granularity rule: - * - *gran = lat/nr - lat/nr/nr - * - * This comes out of the following equations: - * - *kA1 + gran = kB1 - *kB2 + gran = kA2 - *kA2 = kA1 - *kB2 = kB1 - d + d/nr - *lat = d * nr - * - * Where
Re: CFS review
Ingo Molnar wrote: * Al Boldi [EMAIL PROTECTED] wrote: could you send the exact patch that shows what you did? On 2.6.22.5-v20.3 (not v20.4): 340-curr-delta_exec += delta_exec; 341- 342-if (unlikely(curr-delta_exec sysctl_sched_stat_granularity)) { 343:// __update_curr(cfs_rq, curr); 344-curr-delta_exec = 0; 345-} 346-curr-exec_start = rq_of(cfs_rq)-clock; ouch - this produces a really broken scheduler - with this we dont do any run-time accounting (!). Of course it's broken, and it's not meant as a fix, but this change allows you to see the amount of overhead as well as any miscalculations __update_curr incurs. In terms of overhead, __update_curr incurs ~3x slowdown, and in terms of run-time accounting it exhibits a ~10sec task-startup miscalculation. Could you try the patch below instead, does this make 3x glxgears smooth again? (if yes, could you send me your Signed-off-by line as well.) The task-startup stalling is still there for ~10sec. Can you see the problem on your machine? Thanks! -- Al - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: CFS review
* Al Boldi <[EMAIL PROTECTED]> wrote:
> > Could you try the patch below instead, does this make 3x glxgears
> > smooth again? (if yes, could you send me your Signed-off-by line as
> > well.)
>
> The task-startup stalling is still there for ~10sec. Can you see the
> problem on your machine?

nope (i have no framebuffer setup) - but i can see some chew-max latencies that occur when new tasks are started up. I _think_ it's probably the same problem as yours.

could you try the patch below (which is the combo patch of my current queue), ontop of head 50c46637aa? This makes chew-max behave better during task mass-startup here.

	Ingo

-
Index: linux/include/linux/sched.h
===
--- linux.orig/include/linux/sched.h
+++ linux/include/linux/sched.h
@@ -904,6 +904,7 @@ struct sched_entity {
 	u64			exec_start;
 	u64			sum_exec_runtime;
+	u64			prev_sum_exec_runtime;
 	u64			wait_start_fair;
 	u64			sleep_start_fair;
Index: linux/kernel/sched.c
===
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -1587,6 +1587,7 @@ static void __sched_fork(struct task_str
 	p->se.wait_start_fair		= 0;
 	p->se.exec_start		= 0;
 	p->se.sum_exec_runtime		= 0;
+	p->se.prev_sum_exec_runtime	= 0;
 	p->se.delta_exec		= 0;
 	p->se.delta_fair_run		= 0;
 	p->se.delta_fair_sleep		= 0;
Index: linux/kernel/sched_fair.c
===
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -82,12 +82,12 @@ enum {
 };

 unsigned int sysctl_sched_features __read_mostly =
-	SCHED_FEAT_FAIR_SLEEPERS	*1 |
+	SCHED_FEAT_FAIR_SLEEPERS	*0 |
 	SCHED_FEAT_SLEEPER_AVG		*0 |
 	SCHED_FEAT_SLEEPER_LOAD_AVG	*1 |
 	SCHED_FEAT_PRECISE_CPU_LOAD	*1 |
-	SCHED_FEAT_START_DEBIT		*1 |
-	SCHED_FEAT_SKIP_INITIAL		*0;
+	SCHED_FEAT_START_DEBIT		*0 |
+	SCHED_FEAT_SKIP_INITIAL		*1;

 extern struct sched_class fair_sched_class;

@@ -225,39 +225,15 @@ static struct sched_entity *__pick_next_
 /*
  * Calculate the preemption granularity needed to schedule every
  * runnable task once per sysctl_sched_latency amount of time.
  * (down to a sensible low limit on granularity)
- *
- * For example, if there are 2 tasks running and latency is 10 msecs,
- * we switch tasks every 5 msecs. If we have 3 tasks running, we have
- * to switch tasks every 3.33 msecs to get a 10 msecs observed latency
- * for each task. We do finer and finer scheduling up to until we
- * reach the minimum granularity value.
- *
- * To achieve this we use the following dynamic-granularity rule:
- *
- *	gran = lat/nr - lat/nr/nr
- *
- * This comes out of the following equations:
- *
- *	kA1 + gran = kB1
- *	kB2 + gran = kA2
- *	kA2 = kA1
- *	kB2 = kB1 - d + d/nr
- *	lat = d * nr
- *
- * Where 'k' is key, 'A' is task A (waiting), 'B' is task B (running),
- * '1' is start of time, '2' is end of time, 'd' is delay between
- * 1 and 2 (during which task B was running), 'nr' is number of tasks
- * running, 'lat' is the the period of each task. ('lat' is the
- * sched_latency that we aim for.)
  */
-static long
+static unsigned long
 sched_granularity(struct cfs_rq *cfs_rq)
 {
 	unsigned int gran = sysctl_sched_latency;
 	unsigned int nr = cfs_rq->nr_running;

 	if (nr > 1) {
-		gran = gran/nr - gran/nr/nr;
+		gran = gran/nr;
 		gran = max(gran, sysctl_sched_min_granularity);
 	}

@@ -489,6 +465,9 @@ update_stats_wait_end(struct cfs_rq *cfs
 {
 	unsigned long delta_fair;

+	if (unlikely(!se->wait_start_fair))
+		return;
+
 	delta_fair = (unsigned long)min((u64)(2*sysctl_sched_runtime_limit),
 			(u64)(cfs_rq->fair_clock - se->wait_start_fair));

@@ -668,7 +647,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, st
 /*
  * Preempt the current task with a newly woken task if needed:
  */
-static void
+static int
 __check_preempt_curr_fair(struct cfs_rq *cfs_rq, struct sched_entity *se,
 			  struct sched_entity *curr, unsigned long granularity)
 {
@@ -679,8 +658,11 @@ __check_preempt_curr_fair(struct cfs_rq
 	 * preempt the current task unless the best task has
 	 * a larger than sched_granularity fairness advantage:
 	 */
-	if (__delta > niced_granularity(curr, granularity))
+	if (__delta > niced_granularity(curr, granularity)) {
 		resched_task(rq_of(cfs_rq)->curr);
+		return 1;
+	}
+	return
Re: CFS review
Ingo Molnar wrote:
> * Al Boldi <[EMAIL PROTECTED]> wrote:
> > > Could you try the patch below instead, does this make 3x glxgears
> > > smooth again? (if yes, could you send me your Signed-off-by line as
> > > well.)
> >
> > The task-startup stalling is still there for ~10sec. Can you see the
> > problem on your machine?
>
> nope (i have no framebuffer setup)

No need for framebuffer. All you need is X using the X.org vesa-driver. Then start gears like this:

  # gears & gears & gears &

Then lay them out side by side to see the periodic stallings for ~10sec.

> - but i can see some chew-max latencies that occur when new tasks are
> started up. I _think_ it's probably the same problem as yours.

chew-max is great, but it's too accurate in that it exposes any scheduling glitches and as such hides the startup glitch within the many glitches it exposes. For example, it fluctuates all over the place using this:

  # for ((i=0; i<9; i++)); do chew-max 60 > /dev/shm/chew$i.log & done

Also, chew-max locks-up when disabling __update_curr, which means that the workload of chew-max is different from either the ping-startup loop or the gears. You really should try the gears test by any means, as the problem is really pronounced there.

> could you try the patch below (which is the combo patch of my current
> queue), ontop of head 50c46637aa? This makes chew-max behave better
> during task mass-startup here.

Still no improvement.

Thanks!

--
Al
Re: CFS review
Linus Torvalds wrote:
> On Tue, 28 Aug 2007, Al Boldi wrote:
> > No need for framebuffer. All you need is X using the X.org
> > vesa-driver. Then start gears like this:
> >
> >   # gears & gears & gears &
> >
> > Then lay them out side by side to see the periodic stallings for
> > ~10sec.
>
> I don't think this is a good test. Why? If you're not using direct
> rendering, what you have is the X server doing all the rendering, which
> in turn means that what you are testing is quite possibly not so much
> about the *kernel* scheduling, but about *X-server* scheduling!
>
> I'm sure the kernel scheduler has an impact, but what's more likely to
> be going on is that you're seeing effects that are indirect, and not
> necessarily at all even good.
>
> For example, if the X server is the scheduling point, it's entirely
> possible that it ends up showing effects that are more due to the
> queueing of the X command stream than due to the scheduler - and that
> those stalls are simply due to *that*.
>
> One thing to try is to run the X connection in synchronous mode, which
> minimizes queueing issues. I don't know if gears has a flag to turn on
> synchronous X messaging, though. Many X programs take the [+-]sync flag
> to turn on synchronous mode, iirc.

I like your analysis, but how do you explain that these stalls vanish when __update_curr is disabled?

Thanks!

--
Al
Re: CFS review
On Tue, 28 Aug 2007, Al Boldi wrote:
> No need for framebuffer. All you need is X using the X.org vesa-driver.
> Then start gears like this:
>
>   # gears & gears & gears &
>
> Then lay them out side by side to see the periodic stallings for ~10sec.

I don't think this is a good test. Why? If you're not using direct rendering, what you have is the X server doing all the rendering, which in turn means that what you are testing is quite possibly not so much about the *kernel* scheduling, but about *X-server* scheduling!

I'm sure the kernel scheduler has an impact, but what's more likely to be going on is that you're seeing effects that are indirect, and not necessarily at all even good.

For example, if the X server is the scheduling point, it's entirely possible that it ends up showing effects that are more due to the queueing of the X command stream than due to the scheduler - and that those stalls are simply due to *that*.

One thing to try is to run the X connection in synchronous mode, which minimizes queueing issues. I don't know if gears has a flag to turn on synchronous X messaging, though. Many X programs take the [+-]sync flag to turn on synchronous mode, iirc.

		Linus
Re: CFS review
Ingo Molnar wrote:
> * Al Boldi <[EMAIL PROTECTED]> wrote:
> > > and could you also check 20.4 on 2.6.22.5 perhaps, or very latest
> > > -git? (Peter has experienced smaller spikes with that.)
> >
> > Ok, I tried all your suggestions, but nothing works as smooth as
> > removing __update_curr.
>
> could you send the exact patch that shows what you did?

On 2.6.22.5-v20.3 (not v20.4):

340-curr->delta_exec += delta_exec;
341-
342-if (unlikely(curr->delta_exec > sysctl_sched_stat_granularity)) {
343:// __update_curr(cfs_rq, curr);
344-curr->delta_exec = 0;
345-}
346-curr->exec_start = rq_of(cfs_rq)->clock;

> And could you
> also please describe it exactly which aspect of the workload you call
> 'smooth'. Could it be made quantitative somehow?

The 3x gears test shows the startup problem in a really noticeable way. With v20.4 they startup surging and stalling periodically for about 10sec, then they are smooth. With v20.3 + above patch they startup completely smooth.

Thanks!

--
Al
Re: CFS review
* Al Boldi <[EMAIL PROTECTED]> wrote:
> > and could you also check 20.4 on 2.6.22.5 perhaps, or very latest
> > -git? (Peter has experienced smaller spikes with that.)
>
> Ok, I tried all your suggestions, but nothing works as smooth as
> removing __update_curr.

could you send the exact patch that shows what you did? And could you also please describe it exactly which aspect of the workload you call 'smooth'. Could it be made quantitative somehow?

	Ingo
Re: CFS review
Ingo Molnar wrote:
> * Al Boldi <[EMAIL PROTECTED]> wrote:
> > > ok. I think i might finally have found the bug causing this. Could
> > > you try the fix below, does your webserver thread-startup test work
> > > any better?
> >
> > It seems to help somewhat, but the problem is still visible. Even
> > v20.3 on 2.6.22.5 didn't help.
> >
> > It does look related to ia-boosting, so I turned off __update_curr
> > like Roman mentioned, which had an enormous smoothing effect, but then
> > nice levels completely break down and lockup the system.
>
> you can turn sleeper-fairness off via:
>
>    echo 28 > /proc/sys/kernel/sched_features
>
> another thing to try would be:
>
>    echo 12 > /proc/sys/kernel/sched_features
>
> (that's the new-task penalty turned off.)
>
> Another thing to try would be to edit this:
>
>         if (sysctl_sched_features & SCHED_FEAT_START_DEBIT)
>                 p->se.wait_runtime = -(sched_granularity(cfs_rq) / 2);
>
> to:
>
>         if (sysctl_sched_features & SCHED_FEAT_START_DEBIT)
>                 p->se.wait_runtime = -(sched_granularity(cfs_rq);
>
> and could you also check 20.4 on 2.6.22.5 perhaps, or very latest -git?
> (Peter has experienced smaller spikes with that.)

Ok, I tried all your suggestions, but nothing works as smooth as removing __update_curr.

Does the problem show on your machine with the 3x gears under X-vesa test?

Thanks!

--
Al
Re: CFS review
* Al Boldi <[EMAIL PROTECTED]> wrote:
> > ok. I think i might finally have found the bug causing this. Could
> > you try the fix below, does your webserver thread-startup test work
> > any better?
>
> It seems to help somewhat, but the problem is still visible. Even
> v20.3 on 2.6.22.5 didn't help.
>
> It does look related to ia-boosting, so I turned off __update_curr
> like Roman mentioned, which had an enormous smoothing effect, but then
> nice levels completely break down and lockup the system.

you can turn sleeper-fairness off via:

   echo 28 > /proc/sys/kernel/sched_features

another thing to try would be:

   echo 12 > /proc/sys/kernel/sched_features

(that's the new-task penalty turned off.)

Another thing to try would be to edit this:

        if (sysctl_sched_features & SCHED_FEAT_START_DEBIT)
                p->se.wait_runtime = -(sched_granularity(cfs_rq) / 2);

to:

        if (sysctl_sched_features & SCHED_FEAT_START_DEBIT)
                p->se.wait_runtime = -(sched_granularity(cfs_rq);

and could you also check 20.4 on 2.6.22.5 perhaps, or very latest -git? (Peter has experienced smaller spikes with that.)

	Ingo
Re: CFS review
Ingo Molnar wrote: > * Al Boldi <[EMAIL PROTECTED]> wrote: > > > > The problem is that consecutive runs don't give consistent results > > > > and sometimes stalls. You may want to try that. > > > > > > well, there's a natural saturation point after a few hundred tasks > > > (depending on your CPU's speed), at which point there's no idle time > > > left. From that point on things get slower progressively (and the > > > ability of the shell to start new ping tasks is impacted as well), > > > but that's expected on an overloaded system, isnt it? > > > > Of course, things should get slower with higher load, but it should be > > consistent without stalls. > > > > To see this problem, make sure you boot into /bin/sh with the normal > > VGA console (ie. not fb-console). Then try each loop a few times to > > show different behaviour; loops like: > > > > # for ((i=0; i<; i++)); do ping 10.1 -A > /dev/null & done > > > > # for ((i=0; i<; i++)); do nice -99 ping 10.1 -A > /dev/null & done > > > > # { for ((i=0; i<; i++)); do > > ping 10.1 -A > /dev/null & > > done } > /dev/null 2>&1 > > > > Especially the last one sometimes causes a complete console lock-up, > > while the other two sometimes stall then surge periodically. > > ok. I think i might finally have found the bug causing this. Could you > try the fix below, does your webserver thread-startup test work any > better? It seems to help somewhat, but the problem is still visible. Even v20.3 on 2.6.22.5 didn't help. It does look related to ia-boosting, so I turned off __update_curr like Roman mentioned, which had an enormous smoothing effect, but then nice levels completely break down and lockup the system. There is another way to show the problem visually under X (vesa-driver), by starting 3 gears simultaneously, which after laying them out side-by-side need some settling time before smoothing out. Without __update_curr it's absolutely smooth from the start. Thanks! 
--
Al
Re: CFS review
* Al Boldi <[EMAIL PROTECTED]> wrote: > > > The problem is that consecutive runs don't give consistent results > > > and sometimes stalls. You may want to try that. > > > > well, there's a natural saturation point after a few hundred tasks > > (depending on your CPU's speed), at which point there's no idle time > > left. From that point on things get slower progressively (and the > > ability of the shell to start new ping tasks is impacted as well), > > but that's expected on an overloaded system, isnt it? > > Of course, things should get slower with higher load, but it should be > consistent without stalls. > > To see this problem, make sure you boot into /bin/sh with the normal > VGA console (ie. not fb-console). Then try each loop a few times to > show different behaviour; loops like: > > # for ((i=0; i<; i++)); do ping 10.1 -A > /dev/null & done > > # for ((i=0; i<; i++)); do nice -99 ping 10.1 -A > /dev/null & done > > # { for ((i=0; i<; i++)); do > ping 10.1 -A > /dev/null & > done } > /dev/null 2>&1 > > Especially the last one sometimes causes a complete console lock-up, > while the other two sometimes stall then surge periodically. ok. I think i might finally have found the bug causing this. Could you try the fix below, does your webserver thread-startup test work any better? Ingo ---> Subject: sched: fix startup penalty calculation From: Ingo Molnar <[EMAIL PROTECTED]> fix task startup penalty miscalculation: sysctl_sched_granularity is unsigned int and wait_runtime is long so we first have to convert it to long before turning it negative ... 
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 kernel/sched_fair.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/kernel/sched_fair.c
===
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -1048,7 +1048,7 @@ static void task_new_fair(struct rq *rq,
 	 * -granularity/2, so initialize the task with that:
 	 */
 	if (sysctl_sched_features & SCHED_FEAT_START_DEBIT)
-		p->se.wait_runtime = -(sysctl_sched_granularity / 2);
+		p->se.wait_runtime = -((long)sysctl_sched_granularity / 2);

 	__enqueue_entity(cfs_rq, se);
 }
Re: CFS review
Ingo Molnar wrote: > * Al Boldi <[EMAIL PROTECTED]> wrote: > > There is one workload that still isn't performing well; it's a > > web-server workload that spawns 1K+ client procs. It can be emulated > > by using this: > > > > for i in `seq 1 to `; do ping 10.1 -A > /dev/null & done > > on bash i did this as: > > for ((i=0; i<; i++)); do ping 10.1 -A > /dev/null & done > > and this quickly creates a monster-runqueue with tons of ping tasks > pending. (i replaced 10.1 with the IP of another box on the same LAN as > the testbox) Is this what should happen? Yes, sometimes they start pending and sometimes they run immediately. > > The problem is that consecutive runs don't give consistent results and > > sometimes stalls. You may want to try that. > > well, there's a natural saturation point after a few hundred tasks > (depending on your CPU's speed), at which point there's no idle time > left. From that point on things get slower progressively (and the > ability of the shell to start new ping tasks is impacted as well), but > that's expected on an overloaded system, isnt it? Of course, things should get slower with higher load, but it should be consistent without stalls. To see this problem, make sure you boot into /bin/sh with the normal VGA console (ie. not fb-console). Then try each loop a few times to show different behaviour; loops like: # for ((i=0; i<; i++)); do ping 10.1 -A > /dev/null & done # for ((i=0; i<; i++)); do nice -99 ping 10.1 -A > /dev/null & done # { for ((i=0; i<; i++)); do ping 10.1 -A > /dev/null & done } > /dev/null 2>&1 Especially the last one sometimes causes a complete console lock-up, while the other two sometimes stall then surge periodically. BTW, I am also wondering how one might test threading behaviour wrt to startup and sync-on-exit with parent thread. This may not show any problems with small number of threads, but how does it scale with 1K+? Thanks! 
--
Al
Re: CFS review
Hi,

On Tue, 21 Aug 2007, Mike Galbraith wrote:
> I thought this was history. With your config, I was finally able to
> reproduce the anomaly (only with your proggy though), and Ingo's patch
> does indeed fix it here.
>
> Freshly reproduced anomaly and patch verification, running 2.6.23-rc3
> with your config, both with and without Ingo's patch reverted:

I did update to 2.6.23-rc3-git1 first, but I ended up reverting the patch, as I didn't notice it had been applied already. Sorry about that. With this patch the underflows are gone, but there are still the overflows, so the questions from the last mail still remain.

bye, Roman
Re: CFS review
* Al Boldi <[EMAIL PROTECTED]> wrote:
> There is one workload that still isn't performing well; it's a
> web-server workload that spawns 1K+ client procs. It can be emulated
> by using this:
>
> for i in `seq 1 to `; do ping 10.1 -A > /dev/null & done

on bash i did this as:

for ((i=0; i<; i++)); do ping 10.1 -A > /dev/null & done

and this quickly creates a monster-runqueue with tons of ping tasks pending. (i replaced 10.1 with the IP of another box on the same LAN as the testbox) Is this what should happen?

> The problem is that consecutive runs don't give consistent results and
> sometimes stalls. You may want to try that.

well, there's a natural saturation point after a few hundred tasks (depending on your CPU's speed), at which point there's no idle time left. From that point on things get slower progressively (and the ability of the shell to start new ping tasks is impacted as well), but that's expected on an overloaded system, isnt it?

	Ingo
Re: CFS review
* Mike Galbraith <[EMAIL PROTECTED]> wrote:
> > It doesn't make much of a difference.
>
> I thought this was history. With your config, I was finally able to
> reproduce the anomaly (only with your proggy though), and Ingo's patch
> does indeed fix it here.
>
> Freshly reproduced anomaly and patch verification, running 2.6.23-rc3
> with your config, both with and without Ingo's patch reverted:
>
> 6561 root 20 0 1696 492 404 S 32.0 0.0 0:30.83 0 lt
> 6562 root 20 0 1696 336 248 R 32.0 0.0 0:30.79 0 lt
> 6563 root 20 0 1696 336 248 R 32.0 0.0 0:30.80 0 lt
> 6564 root 20 0 2888 1236 1028 R 4.6 0.1 0:05.26 0 sh
>
> 6507 root 20 0 2888 1236 1028 R 25.8 0.1 0:30.75 0 sh
> 6504 root 20 0 1696 492 404 R 24.4 0.0 0:29.26 0 lt
> 6505 root 20 0 1696 336 248 R 24.4 0.0 0:29.26 0 lt
> 6506 root 20 0 1696 336 248 R 24.4 0.0 0:29.25 0 lt

oh, great! I'm glad we didnt discard this as a pure sched_clock resolution artifact.

Roman, a quick & easy request: please send the usual cfs-debug-info.sh output captured while your testcase is running. (Preferably try .23-rc3 or later as Mike did, which has the most recent scheduler code, it includes the patch i sent to you already.)

I'll reply to your sleeper-fairness questions separately, but in any case we need to figure out what's happening on your box - if you can still reproduce it with .23-rc3.

	Thanks, Ingo
Re: CFS review
On Tue, 2007-08-21 at 00:19 +0200, Roman Zippel wrote:
> Hi,
>
> On Sat, 11 Aug 2007, Ingo Molnar wrote:
>
> > the only relevant thing that comes to mind at the moment is that last
> > week Peter noticed a buggy aspect of sleeper bonuses (in that we do not
> > rate-limit their output, hence we 'waste' them instead of redistributing
> > them), and i've got the small patch below in my queue to fix that -
> > could you give it a try?
>
> It doesn't make much of a difference.

I thought this was history. With your config, I was finally able to reproduce the anomaly (only with your proggy though), and Ingo's patch does indeed fix it here.

Freshly reproduced anomaly and patch verification, running 2.6.23-rc3 with your config, both with and without Ingo's patch reverted:

 6561 root 20 0 1696 492 404 S 32.0 0.0 0:30.83 0 lt
 6562 root 20 0 1696 336 248 R 32.0 0.0 0:30.79 0 lt
 6563 root 20 0 1696 336 248 R 32.0 0.0 0:30.80 0 lt
 6564 root 20 0 2888 1236 1028 R 4.6 0.1 0:05.26 0 sh

 6507 root 20 0 2888 1236 1028 R 25.8 0.1 0:30.75 0 sh
 6504 root 20 0 1696 492 404 R 24.4 0.0 0:29.26 0 lt
 6505 root 20 0 1696 336 248 R 24.4 0.0 0:29.26 0 lt
 6506 root 20 0 1696 336 248 R 24.4 0.0 0:29.25 0 lt

	-Mike
Re: CFS review
* Al Boldi <[EMAIL PROTECTED]> wrote:

> There is one workload that still isn't performing well; it's a
> web-server workload that spawns 1K+ client procs. It can be emulated
> by using this:
>
>   for i in `seq 1 to `; do ping 10.1 -A > /dev/null & done

on bash I did this as:

  for ((i=0; i<1000; i++)); do ping 10.1 -A > /dev/null & done

and this quickly creates a monster-runqueue with tons of ping tasks
pending. (I replaced 10.1 with the IP of another box on the same LAN
as the testbox.) Is this what should happen?

> The problem is that consecutive runs don't give consistent results
> and sometimes stall. You may want to try that.

well, there's a natural saturation point after a few hundred tasks
(depending on your CPU's speed), at which point there's no idle time
left. From that point on things get slower progressively (and the
ability of the shell to start new ping tasks is impacted as well), but
that's expected on an overloaded system, isn't it?

	Ingo
Re: CFS review
Hi,

On Tue, 21 Aug 2007, Mike Galbraith wrote:

> I thought this was history. With your config, I was finally able to
> reproduce the anomaly (only with your proggy though), and Ingo's patch
> does indeed fix it here.
>
> Freshly reproduced anomaly and patch verification, running 2.6.23-rc3
> with your config, both with and without Ingo's patch reverted:

I did update to 2.6.23-rc3-git1 first, but I ended up reverting the
patch, as I hadn't noticed it had been applied already. Sorry about
that. With this patch the underflows are gone, but the overflows are
still there, so the questions from the last mail still remain.

bye, Roman
Re: CFS review
Ingo Molnar wrote:
> * Al Boldi <[EMAIL PROTECTED]> wrote:
> > There is one workload that still isn't performing well; it's a
> > web-server workload that spawns 1K+ client procs. It can be emulated
> > by using this:
> >
> >   for i in `seq 1 to `; do ping 10.1 -A > /dev/null & done
>
> on bash I did this as:
>
>   for ((i=0; i<1000; i++)); do ping 10.1 -A > /dev/null & done
>
> and this quickly creates a monster-runqueue with tons of ping tasks
> pending. (I replaced 10.1 with the IP of another box on the same LAN
> as the testbox.) Is this what should happen?

Yes, sometimes they start pending and sometimes they run immediately.

> > The problem is that consecutive runs don't give consistent results
> > and sometimes stall. You may want to try that.
>
> well, there's a natural saturation point after a few hundred tasks
> (depending on your CPU's speed), at which point there's no idle time
> left. From that point on things get slower progressively (and the
> ability of the shell to start new ping tasks is impacted as well), but
> that's expected on an overloaded system, isn't it?

Of course, things should get slower with higher load, but it should be
consistent without stalls.

To see this problem, make sure you boot into /bin/sh with the normal
VGA console (ie. not fb-console). Then try each loop a few times to
show different behaviour; loops like:

  # for ((i=0; i<1000; i++)); do ping 10.1 -A > /dev/null & done

  # for ((i=0; i<1000; i++)); do nice -99 ping 10.1 -A > /dev/null & done

  # { for ((i=0; i<1000; i++)); do ping 10.1 -A > /dev/null & done } > /dev/null 2>&1

Especially the last one sometimes causes a complete console lock-up,
while the other two sometimes stall then surge periodically.

BTW, I am also wondering how one might test threading behaviour wrt
startup and sync-on-exit with the parent thread. This may not show any
problems with a small number of threads, but how does it scale with
1K+?

Thanks!
-- Al
Re: CFS review
Hi,

On Sat, 11 Aug 2007, Ingo Molnar wrote:

> the only relevant thing that comes to mind at the moment is that last
> week Peter noticed a buggy aspect of sleeper bonuses (in that we do not
> rate-limit their output, hence we 'waste' them instead of redistributing
> them), and i've got the small patch below in my queue to fix that -
> could you give it a try?

It doesn't make much of a difference. OTOH if I disable the sleeper
code completely in __update_curr(), I get this:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
 3139 roman     20   0  1796  344  256 R 21.7  0.3  0:02.68 lt
 3138 roman     20   0  1796  344  256 R 21.7  0.3  0:02.68 lt
 3137 roman     20   0  1796  520  432 R 21.7  0.4  0:02.68 lt
 3136 roman     20   0  1532  268  216 R 34.5  0.2  0:06.82 l

Disabling this code completely via sched_features makes only a minor
difference:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
 3139 roman     20   0  1796  344  256 R 20.4  0.3  0:09.94 lt
 3138 roman     20   0  1796  344  256 R 20.4  0.3  0:09.94 lt
 3137 roman     20   0  1796  520  432 R 20.4  0.4  0:09.94 lt
 3136 roman     20   0  1532  268  216 R 39.1  0.2  0:19.20 l

> this is just a blind stab into the dark - i couldn't see any real
> impact from that patch in various workloads (and it's not upstream
> yet), so it might not make a big difference.

Can we please skip to the point where you try to explain the intention
a little more? If I had to guess that this is supposed to keep the
runtime balance, then it would be better to use wait_runtime to adjust
fair_clock, from where it would be evenly distributed to all tasks
(but this would have to be done during enqueue and dequeue). OTOH this
would then also have a consequence for the wait queue, as fair_clock
is used to calculate fair_key. IMHO the current wait_runtime should
have some influence in calculating the sleep bonus, so that
wait_runtime doesn't constantly overflow for tasks which only run
occasionally.
bye, Roman