DRM and/or X trouble (was Re: CFS review)

2007-08-31 Thread Rene Herman

On 08/31/2007 08:46 AM, Tilman Sauerbeck wrote:


On 08/29/2007 09:56 PM, Rene Herman wrote:



On 08/29/2007 05:57 PM, Keith Packard wrote:

With X server 1.3, I'm getting consistent crashes with two glxgear
instances running. So, if you're getting any output, it's better than my
situation.

Before people focus on software rendering too much -- also with 1.3.0
(and a Matrox Millennium G550 AGP, 32M) glxgears works decidedly
crummy using hardware rendering. While I can move the glxgears window
itself, the actual spinning wheels stay in the upper-left corner of the
screen and the movement leaves a non-repainting trace on the screen.


This sounds like you're running an older version of Mesa.
The bugfix went into Mesa 6.3 and 7.0.


I have Mesa 6.5.2 it seems (slackware-12.0 standard):

OpenGL renderer string: Mesa DRI G400 20061030 AGP 2x x86/MMX+/3DNow!+/SSE
OpenGL version string: 1.2 Mesa 6.5.2
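
For reference, a minimal way to pull exactly those two strings (a sketch
assuming the stock glxinfo utility that ships with Mesa is installed):

   glxinfo | grep -E 'OpenGL (renderer|version) string'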

The bit of the problem sketched above -- the gears just sitting there in the
upper-left corner of the screen and not moving along with their window -- is
fully reproducible. The bit below ...:



Running a second instance of glxgears in addition seems to make both
instances unkillable -- and when I just now forcefully killed X in this
situation (the spinning wheels were covering the upper left corner of all
my desktops) I got the below.


[ two kernel BUGs ]

... isn't. This seems to (again) have been a race of sorts that I hit by
accident, since I haven't reproduced it yet. I had the same kind of "raciness"
trouble with keyboard behaviour in this version of X earlier.



Running two instances of glxgears and killing them works for me, too.

I'm using xorg-server 1.3.0.0, Mesa 7.0.1 with the latest DRM bits from
http://gitweb.freedesktop.org/?p=mesa/drm.git;a=summary


For me, everything standard slackware-12.0 (X.org 1.3.0) and kernel 2.6.22 DRM.


I'm not running CFS though, but I guess the oops wasn't related to that.


I've noticed before that the Matrox driver seems to get little attention/testing,
so maybe that's just it. A G550 is of course, in graphics terms, a Model T by
now. I'm rather decidedly not a graphics person so I don't care a lot, but
every time I try to do something fashionable (run Google Earth, for example)
I notice things are horribly, horribly broken.


X bugs I do not find very interesting (there are just too many), and the kernel
bugs require more time to reproduce than I have available. If the BUGs as
posted aren't enough for a diagnosis, please consider the report withdrawn.


Rene.



Re: CFS review

2007-08-31 Thread Tilman Sauerbeck
Rene Herman [2007-08-30 09:05]:
> On 08/29/2007 09:56 PM, Rene Herman wrote:
> 
> Realised the BUGs may mean the kernel DRM people could want to be in CC...
> 
> > On 08/29/2007 05:57 PM, Keith Packard wrote:
> > 
> >> With X server 1.3, I'm getting consistent crashes with two glxgear
> >> instances running. So, if you're getting any output, it's better than my
> >> situation.
> > 
> > Before people focus on software rendering too much -- also with 1.3.0
> > (and a Matrox Millennium G550 AGP, 32M) glxgears works decidedly
> > crummy using hardware rendering. While I can move the glxgears window
> > itself, the actual spinning wheels stay in the upper-left corner of the
> > screen and the movement leaves a non-repainting trace on the screen.

This sounds like you're running an older version of Mesa.
The bugfix went into Mesa 6.3 and 7.0.

> > Running a second instance of glxgears in addition seems to make both
> > instances unkillable -- and when I just now forcefully killed X in this
> > situation (the spinning wheels were covering the upper left corner of all
> > my desktops) I got the below.

Running two instances of glxgears and killing them works for me, too.

I'm using xorg-server 1.3.0.0, Mesa 7.0.1 with the latest DRM bits from
http://gitweb.freedesktop.org/?p=mesa/drm.git;a=summary

I'm not running CFS though, but I guess the oops wasn't related to that.

Regards,
Tilman

-- 
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?


Re: CFS review

2007-08-30 Thread Rene Herman

On 08/30/2007 06:06 PM, Chuck Ebbert wrote:


On 08/29/2007 03:56 PM, Rene Herman wrote:



Before people focus on software rendering too much -- also with 1.3.0
(and a Matrox Millennium G550 AGP, 32M) glxgears works decidedly
crummy using hardware rendering. While I can move the glxgears window
itself, the actual spinning wheels stay in the upper-left corner of the
screen and the movement leaves a non-repainting trace on the screen.
Running a second instance of glxgears in addition seems to make both
instances unkillable -- and when I just now forcefully killed X in this
situation (the spinning wheels were covering the upper left corner of
all my desktops) I got the below.

Kernel is 2.6.22.5-cfs-v20.5, schedule() is in the traces (but that may
be expected anyway).



And this doesn't happen at all with the stock scheduler? (Just confirming,
in case you didn't compare.)


I didn't compare -- it no doubt will. I know the title of this thread is 
"CFS review" but it turned into Keith Packard noticing glxgears being broken 
on recent-ish X.org. The start of the thread was about things being broken 
using _software_ rendering though, so I thought it might be useful to 
remark/report glxgears also being quite broken using hardware rendering on 
my setup at least.



BUG: unable to handle kernel NULL pointer dereference at virtual address
0010
 printing eip:
c10ff416
*pde = 
Oops:  [#1]
PREEMPT


Try it without preempt?


If you're asking in an "I'll go debug the DRM" way, I'll go dig a bit later
(please say), but if you are only interested in the thread due to CFS, note
that I'm aware it's not likely to have anything to do with CFS.


It's not reproducible for you? (Full description of the bug above.)

Rene.


Re: CFS review

2007-08-30 Thread Chuck Ebbert
On 08/29/2007 03:56 PM, Rene Herman wrote:
> 
> Before people focus on software rendering too much -- also with 1.3.0 (and
> a Matrox Millennium G550 AGP, 32M) glxgears works decidedly crummy using
> hardware rendering. While I can move the glxgears window itself, the actual
> spinning wheels stay in the upper-left corner of the screen and the movement
> leaves a non-repainting trace on the screen. Running a second instance of
> glxgears in addition seems to make both instances unkillable -- and when
> I just now forcefully killed X in this situation (the spinning wheels were
> covering the upper left corner of all my desktops) I got the below.
> 
> Kernel is 2.6.22.5-cfs-v20.5, schedule() is in the traces (but that may be
> expected anyway).
> 

And this doesn't happen at all with the stock scheduler? (Just confirming,
in case you didn't compare.)

> BUG: unable to handle kernel NULL pointer dereference at virtual address
> 0010
>  printing eip:
> c10ff416
> *pde = 
> Oops:  [#1]
> PREEMPT

Try it without preempt?
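
A minimal sketch of what "without preempt" means in practice: rebuild the
same 2.6.22.5-cfs kernel with kernel preemption turned off and re-run the
two-glxgears test (the grep assumes the build tree's .config is at hand):

   grep '^CONFIG_PREEMPT' .config
   # The crashing kernel shows PREEMPT in the oops, i.e. CONFIG_PREEMPT=y;
   # switch to CONFIG_PREEMPT_NONE=y (or CONFIG_PREEMPT_VOLUNTARY=y) via
   # "make menuconfig", then rebuild, reboot and retest.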

> Modules linked in: nfsd exportfs lockd nfs_acl sunrpc nls_iso8859_1
> nls_cp437 vfat fat nls_base
> CPU:0
> EIP:0060:[<c10ff416>]Not tainted VLI
> EFLAGS: 00210246   (2.6.22.5-cfs-v20.5-local #5)
> EIP is at mga_dma_buffers+0x189/0x2e3
> eax:    ebx: efd07200   ecx: 0001   edx: efc32c00
> esi:    edi: c12756cc   ebp: dfea44c0   esp: dddaaec0
> ds: 007b   es: 007b   fs:   gs: 0033  ss: 0068
> Process glxgears (pid: 1775, ti=dddaa000 task=e9daca60 task.ti=dddaa000)
> Stack: efc32c00  0004 e4c3bd20 c10fa54b e4c3bd20 efc32c00
> 
>0004     0001 0001
> bfbdb8bc
>bfbdb8b8  c10ff28d 0029 c12756cc dfea44c0 c10f87fc
> bfbdb844
> Call Trace:
>  [<c10fa54b>] drm_lock+0x255/0x2de
>  [<c10ff28d>] mga_dma_buffers+0x0/0x2e3
>  [<c10f87fc>] drm_ioctl+0x142/0x18a
>  [<c1005973>] do_IRQ+0x97/0xb0
>  [<c10f86ba>] drm_ioctl+0x0/0x18a
>  [<c10f86ba>] drm_ioctl+0x0/0x18a
>  [<c105b0d7>] do_ioctl+0x87/0x9f
>  [<c105b32c>] vfs_ioctl+0x23d/0x250
>  [<c11b533e>] schedule+0x2d0/0x2e6
>  [<c105b372>] sys_ioctl+0x33/0x4d
>  [<c1003d1e>] syscall_call+0x7/0xb
>  ===
> Code: 9a 08 03 00 00 8b 73 30 74 14 c7 44 24 04 28 76 1c c1 c7 04 24 49
> 51 23 c1 e8 b0 74 f1 ff 8b 83 d8 00 00 00 83 3d 1c 47 30 c1 00 <8b> 40
> 10 8b a8 58 1e 00 00 8b 43 28 8b b8 64 01 00 00 74 32 8b
> EIP: [<c10ff416>] mga_dma_buffers+0x189/0x2e3 SS:ESP 0068:dddaaec0

dev->dev_private->mmio is NULL when trying to access mmio.handle (handle
sits at offset 0x10 in the map structure, which matches the faulting address).


Re: CFS review

2007-08-30 Thread Ingo Molnar

* Rene Herman <[EMAIL PROTECTED]> wrote:

> Realised the BUGs may mean the kernel DRM people could want to be in CC...

and note that the schedule() call in there is not part of the crash 
backtrace:

> >Call Trace:
> > [<c10fa54b>] drm_lock+0x255/0x2de
> > [<c10ff28d>] mga_dma_buffers+0x0/0x2e3
> > [<c10f87fc>] drm_ioctl+0x142/0x18a
> > [<c1005973>] do_IRQ+0x97/0xb0
> > [<c10f86ba>] drm_ioctl+0x0/0x18a
> > [<c10f86ba>] drm_ioctl+0x0/0x18a
> > [<c105b0d7>] do_ioctl+0x87/0x9f
> > [<c105b32c>] vfs_ioctl+0x23d/0x250
> > [<c11b533e>] schedule+0x2d0/0x2e6
> > [<c105b372>] sys_ioctl+0x33/0x4d
> > [<c1003d1e>] syscall_call+0x7/0xb

it just happened to be on the kernel stack. Nor is the do_IRQ() entry
real. Both are frequent functions (and were executed recently); that's
why they were still in the stack frame.

Ingo


Re: CFS review

2007-08-30 Thread Rene Herman

On 08/29/2007 09:56 PM, Rene Herman wrote:

Realised the BUGs may mean the kernel DRM people could want to be in CC...


On 08/29/2007 05:57 PM, Keith Packard wrote:


With X server 1.3, I'm getting consistent crashes with two glxgear
instances running. So, if you're getting any output, it's better than my
situation.


Before people focus on software rendering too much -- also with 1.3.0
(and a Matrox Millennium G550 AGP, 32M) glxgears works decidedly
crummy using hardware rendering. While I can move the glxgears window
itself, the actual spinning wheels stay in the upper-left corner of the
screen and the movement leaves a non-repainting trace on the screen.
Running a second instance of glxgears in addition seems to make both
instances unkillable -- and when I just now forcefully killed X in this
situation (the spinning wheels were covering the upper left corner of all
my desktops) I got the below.

Kernel is 2.6.22.5-cfs-v20.5, schedule() is in the traces (but that may be
expected anyway).

BUG: unable to handle kernel NULL pointer dereference at virtual address 
0010

 printing eip:
c10ff416
*pde = 
Oops:  [#1]
PREEMPT
Modules linked in: nfsd exportfs lockd nfs_acl sunrpc nls_iso8859_1 
nls_cp437 vfat fat nls_base

CPU:0
EIP:0060:[<c10ff416>]Not tainted VLI
EFLAGS: 00210246   (2.6.22.5-cfs-v20.5-local #5)
EIP is at mga_dma_buffers+0x189/0x2e3
eax:    ebx: efd07200   ecx: 0001   edx: efc32c00
esi:    edi: c12756cc   ebp: dfea44c0   esp: dddaaec0
ds: 007b   es: 007b   fs:   gs: 0033  ss: 0068
Process glxgears (pid: 1775, ti=dddaa000 task=e9daca60 task.ti=dddaa000)
Stack: efc32c00  0004 e4c3bd20 c10fa54b e4c3bd20 efc32c00 

   0004     0001 0001 
bfbdb8bc
   bfbdb8b8  c10ff28d 0029 c12756cc dfea44c0 c10f87fc 
bfbdb844

Call Trace:
 [<c10fa54b>] drm_lock+0x255/0x2de
 [<c10ff28d>] mga_dma_buffers+0x0/0x2e3
 [<c10f87fc>] drm_ioctl+0x142/0x18a
 [<c1005973>] do_IRQ+0x97/0xb0
 [<c10f86ba>] drm_ioctl+0x0/0x18a
 [<c10f86ba>] drm_ioctl+0x0/0x18a
 [<c105b0d7>] do_ioctl+0x87/0x9f
 [<c105b32c>] vfs_ioctl+0x23d/0x250
 [<c11b533e>] schedule+0x2d0/0x2e6
 [<c105b372>] sys_ioctl+0x33/0x4d
 [<c1003d1e>] syscall_call+0x7/0xb
 ===
Code: 9a 08 03 00 00 8b 73 30 74 14 c7 44 24 04 28 76 1c c1 c7 04 24 49 
51 23 c1 e8 b0 74 f1 ff 8b 83 d8 00 00 00 83 3d 1c 47 30 c1 00 <8b> 40 
10 8b a8 58 1e 00 00 8b 43 28 8b b8 64 01 00 00 74 32 8b

EIP: [<c10ff416>] mga_dma_buffers+0x189/0x2e3 SS:ESP 0068:dddaaec0
BUG: unable to handle kernel NULL pointer dereference at virtual address 
0010

 printing eip:
c10ff416
*pde = 
Oops:  [#2]
PREEMPT
Modules linked in: nfsd exportfs lockd nfs_acl sunrpc nls_iso8859_1 
nls_cp437 vfat fat nls_base

CPU:0
EIP:0060:[<c10ff416>]Not tainted VLI
EFLAGS: 00210246   (2.6.22.5-cfs-v20.5-local #5)
EIP is at mga_dma_buffers+0x189/0x2e3
eax:    ebx: efd07200   ecx: 0001   edx: efc32c00
esi:    edi: c12756cc   ebp: dfea4780   esp: e0552ec0
ds: 007b   es: 007b   fs:   gs: 0033  ss: 0068
Process glxgears (pid: 1776, ti=e0552000 task=c19ec000 task.ti=e0552000)
Stack: efc32c00  0003 efc64b40 c10fa54b efc64b40 efc32c00 

   0003     0001 0001 
bf8dbdcc
   bf8dbdc8  c10ff28d 0029 c12756cc dfea4780 c10f87fc 
bf8dbd54

Call Trace:
 [<c10fa54b>] drm_lock+0x255/0x2de
 [<c10ff28d>] mga_dma_buffers+0x0/0x2e3
 [<c10f87fc>] drm_ioctl+0x142/0x18a
 [<c11b53f6>] preempt_schedule+0x4e/0x5a
 [<c10f86ba>] drm_ioctl+0x0/0x18a
 [<c10f86ba>] drm_ioctl+0x0/0x18a
 [<c105b0d7>] do_ioctl+0x87/0x9f
 [<c105b32c>] vfs_ioctl+0x23d/0x250
 [<c11b52a9>] schedule+0x23b/0x2e6
 [<c11b533e>] schedule+0x2d0/0x2e6
 [<c105b372>] sys_ioctl+0x33/0x4d
 [<c1003d1e>] syscall_call+0x7/0xb
 ===
Code: 9a 08 03 00 00 8b 73 30 74 14 c7 44 24 04 28 76 1c c1 c7 04 24 49 
51 23 c1 e8 b0 74 f1 ff 8b 83 d8 00 00 00 83 3d 1c 47 30 c1 00 <8b> 40 
10 8b a8 58 1e 00 00 8b 43 28 8b b8 64 01 00 00 74 32 8b

EIP: [<c10ff416>] mga_dma_buffers+0x189/0x2e3 SS:ESP 0068:e0552ec0
[drm:drm_release] *ERROR* Device busy: 2 0


Rene.


Re: CFS review

2007-08-29 Thread Rene Herman

On 08/29/2007 05:57 PM, Keith Packard wrote:


With X server 1.3, I'm getting consistent crashes with two glxgear
instances running. So, if you're getting any output, it's better than my
situation.


Before people focus on software rendering too much -- also with 1.3.0 (and
a Matrox Millennium G550 AGP, 32M) glxgears works decidedly crummy using
hardware rendering. While I can move the glxgears window itself, the actual
spinning wheels stay in the upper-left corner of the screen and the movement
leaves a non-repainting trace on the screen. Running a second instance of
glxgears in addition seems to make both instances unkillable -- and when
I just now forcefully killed X in this situation (the spinning wheels were
covering the upper left corner of all my desktops) I got the below.

Kernel is 2.6.22.5-cfs-v20.5, schedule() is in the traces (but that may be
expected anyway).

BUG: unable to handle kernel NULL pointer dereference at virtual address 
0010
 printing eip:
c10ff416
*pde = 
Oops:  [#1]
PREEMPT
Modules linked in: nfsd exportfs lockd nfs_acl sunrpc nls_iso8859_1 nls_cp437 vfat fat 
nls_base

CPU:0
EIP:0060:[<c10ff416>]Not tainted VLI
EFLAGS: 00210246   (2.6.22.5-cfs-v20.5-local #5)
EIP is at mga_dma_buffers+0x189/0x2e3
eax:    ebx: efd07200   ecx: 0001   edx: efc32c00
esi:    edi: c12756cc   ebp: dfea44c0   esp: dddaaec0
ds: 007b   es: 007b   fs:   gs: 0033  ss: 0068
Process glxgears (pid: 1775, ti=dddaa000 task=e9daca60 task.ti=dddaa000)
Stack: efc32c00  0004 e4c3bd20 c10fa54b e4c3bd20 efc32c00 
   0004     0001 0001 bfbdb8bc
   bfbdb8b8  c10ff28d 0029 c12756cc dfea44c0 c10f87fc bfbdb844
Call Trace:
 [<c10fa54b>] drm_lock+0x255/0x2de
 [<c10ff28d>] mga_dma_buffers+0x0/0x2e3
 [<c10f87fc>] drm_ioctl+0x142/0x18a
 [<c1005973>] do_IRQ+0x97/0xb0
 [<c10f86ba>] drm_ioctl+0x0/0x18a
 [<c10f86ba>] drm_ioctl+0x0/0x18a
 [<c105b0d7>] do_ioctl+0x87/0x9f
 [<c105b32c>] vfs_ioctl+0x23d/0x250
 [<c11b533e>] schedule+0x2d0/0x2e6
 [<c105b372>] sys_ioctl+0x33/0x4d
 [<c1003d1e>] syscall_call+0x7/0xb
 ===
Code: 9a 08 03 00 00 8b 73 30 74 14 c7 44 24 04 28 76 1c c1 c7 04 24 49 51 23 c1 e8 b0 74 
f1 ff 8b 83 d8 00 00 00 83 3d 1c 47 30 c1 00 <8b> 40 10 8b a8 58 1e 00 00 8b 43 28 8b b8 
64 01 00 00 74 32 8b

EIP: [<c10ff416>] mga_dma_buffers+0x189/0x2e3 SS:ESP 0068:dddaaec0
BUG: unable to handle kernel NULL pointer dereference at virtual address 
0010
 printing eip:
c10ff416
*pde = 
Oops:  [#2]
PREEMPT
Modules linked in: nfsd exportfs lockd nfs_acl sunrpc nls_iso8859_1 nls_cp437 vfat fat 
nls_base

CPU:0
EIP:0060:[<c10ff416>]Not tainted VLI
EFLAGS: 00210246   (2.6.22.5-cfs-v20.5-local #5)
EIP is at mga_dma_buffers+0x189/0x2e3
eax:    ebx: efd07200   ecx: 0001   edx: efc32c00
esi:    edi: c12756cc   ebp: dfea4780   esp: e0552ec0
ds: 007b   es: 007b   fs:   gs: 0033  ss: 0068
Process glxgears (pid: 1776, ti=e0552000 task=c19ec000 task.ti=e0552000)
Stack: efc32c00  0003 efc64b40 c10fa54b efc64b40 efc32c00 
   0003     0001 0001 bf8dbdcc
   bf8dbdc8  c10ff28d 0029 c12756cc dfea4780 c10f87fc bf8dbd54
Call Trace:
 [<c10fa54b>] drm_lock+0x255/0x2de
 [<c10ff28d>] mga_dma_buffers+0x0/0x2e3
 [<c10f87fc>] drm_ioctl+0x142/0x18a
 [<c11b53f6>] preempt_schedule+0x4e/0x5a
 [<c10f86ba>] drm_ioctl+0x0/0x18a
 [<c10f86ba>] drm_ioctl+0x0/0x18a
 [<c105b0d7>] do_ioctl+0x87/0x9f
 [<c105b32c>] vfs_ioctl+0x23d/0x250
 [<c11b52a9>] schedule+0x23b/0x2e6
 [<c11b533e>] schedule+0x2d0/0x2e6
 [<c105b372>] sys_ioctl+0x33/0x4d
 [<c1003d1e>] syscall_call+0x7/0xb
 ===
Code: 9a 08 03 00 00 8b 73 30 74 14 c7 44 24 04 28 76 1c c1 c7 04 24 49 51 23 c1 e8 b0 74 
f1 ff 8b 83 d8 00 00 00 83 3d 1c 47 30 c1 00 <8b> 40 10 8b a8 58 1e 00 00 8b 43 28 8b b8 
64 01 00 00 74 32 8b

EIP: [<c10ff416>] mga_dma_buffers+0x189/0x2e3 SS:ESP 0068:e0552ec0
[drm:drm_release] *ERROR* Device busy: 2 0

Rene.


Re: CFS review

2007-08-29 Thread Keith Packard
On Wed, 2007-08-29 at 10:04 +0200, Ingo Molnar wrote:

> is that old enough to not have the smart X scheduler?

The smart scheduler went into the server in like 2000. I don't think
you've got any systems that old. XFree86 4.1 or 4.2, I can't remember
which.

> (probably 
> the GLX bug you mentioned) so i cannot reproduce the bug.

With X server 1.3, I'm getting consistent crashes with two glxgear
instances running. So, if you're getting any output, it's better than my
situation.

-- 
[EMAIL PROTECTED]




Re: CFS review

2007-08-29 Thread Bill Davidsen

Ingo Molnar wrote:

* Bill Davidsen <[EMAIL PROTECTED]> wrote:

  
There is another way to show the problem visually under X 
(vesa-driver), by starting 3 gears simultaneously, which after 
laying them out side-by-side need some settling time before 
smoothing out.  Without __update_curr it's absolutely smooth from 
the start.
  
I posted a LOT of stuff using the glitch1 script, and finally found a 
set of tuning values which make the test script run smooth. See back 
posts, I don't have them here.



but you have real 3D hw and DRI enabled, correct? In that case X uses up 
almost no CPU time and glxgears makes most of the processing. That is 
quite different from the above software-rendering case, where X spends 
most of the CPU time.
  


No, my test machine for that is a compile server, and it uses the built-in
motherboard graphics, which are very limited. This is not in any sense a
graphics powerhouse; it is used to build custom kernels and applications,
and for testing of kvm and xen. I grabbed it because it had the only Core2
CPU I could reboot to try new kernel versions and "from cold boot" testing.
I discovered the graphics smoothness issue by having several windows open
on compiles, and developed the glitch1 script as a way to reproduce it.


The settings I used, features=14 and granularity=50, improve smoothness
on other machines for other uses, but they do seem to impact performance
for compiles, video processing, etc., so they are not optimal for general
use. I regard the existence of these tuning knobs as one of the real
strengths of CFS: when you change the tuning it has a visible effect.
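
The knob names behind "features" and "granularity" are not spelled out in
this thread; as a hedged sketch, assuming the /proc/sys/kernel/sched_*
sysctls exposed by the CFS patches of that era (treat the exact paths and
units as assumptions, not as the documented interface):

   ls /proc/sys/kernel/sched_*                 # list the knobs this kernel exposes
   echo 14 > /proc/sys/kernel/sched_features   # assumed name for "features=14"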


--
bill davidsen <[EMAIL PROTECTED]>
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: CFS review

2007-08-29 Thread Al Boldi
Ingo Molnar wrote:
> * Keith Packard <[EMAIL PROTECTED]> wrote:
> > Make sure the X server isn't running with the smart scheduler
> > disabled; that will cause precisely the symptoms you're seeing here.
> > In the normal upstream sources, you'd have to use '-dumbSched' as an X
> > server command line option.
> >
> > The old 'scheduler' would run an entire X client's input buffer dry
> > before looking for requests from another client. Because glxgears
> > requests are small but time consuming, this can cause very long delays
> > between client switching.
>
> on the old box where i've reproduced this i've got an ancient X version:
>
>   neptune:~> X -version
>
>   X Window System Version 6.8.2
>   Release Date: 9 February 2005
>   X Protocol Version 11, Revision 0, Release 6.8.2
>   Build Operating System: Linux 2.6.9-22.ELsmp i686 [ELF]
>
> is that old enough to not have the smart X scheduler?
>
> on newer systems i dont see correctly updated glxgears output (probably
> the GLX bug you mentioned) so i cannot reproduce the bug.
>
> Al, could you send us your 'X -version' output?

This is the one I have been talking about:

XFree86 Version 4.3.0
Release Date: 27 February 2003
X Protocol Version 11, Revision 0, Release 6.6
Build Operating System: Linux 2.4.21-0.13mdksmp i686 [ELF] 


I also tried the gears test just now on this:

X Window System Version 6.8.1
Release Date: 17 September 2004
X Protocol Version 11, Revision 0, Release 6.8.1
Build Operating System: Linux 2.6.9-1.860_ELsmp i686 [ELF] 

but it completely locks up.  Disabling add_wait_runtime seems to fix it.


Thanks!

--
Al



Re: CFS review

2007-08-29 Thread Ingo Molnar

* Keith Packard <[EMAIL PROTECTED]> wrote:

> Make sure the X server isn't running with the smart scheduler 
> disabled; that will cause precisely the symptoms you're seeing here. 
> In the normal upstream sources, you'd have to use '-dumbSched' as an X 
> server command line option.
> 
> The old 'scheduler' would run an entire X client's input buffer dry 
> before looking for requests from another client. Because glxgears 
> requests are small but time consuming, this can cause very long delays 
> between client switching.

on the old box where i've reproduced this i've got an ancient X version:

  neptune:~> X -version

  X Window System Version 6.8.2
  Release Date: 9 February 2005
  X Protocol Version 11, Revision 0, Release 6.8.2
  Build Operating System: Linux 2.6.9-22.ELsmp i686 [ELF]

is that old enough to not have the smart X scheduler?

on newer systems i dont see correctly updated glxgears output (probably 
the GLX bug you mentioned) so i cannot reproduce the bug.

Al, could you send us your 'X -version' output?

Ingo


Re: CFS review

2007-08-29 Thread Keith Packard
On Wed, 2007-08-29 at 06:46 +0200, Ingo Molnar wrote:

> ok, i finally managed to reproduce the "artifact" myself on an older 
> box. It goes like this: start up X with the vesa driver (or with NoDRI) 
> to force software rendering. Then start up a couple of glxgears 
> instances. Those glxgears instances update in a very "chunky", 
> "stuttering" way - each glxgears instance runs/stops/runs/stops at a 
> rate of a about once per second, and this was reported to me as a 
> potential CPU scheduler regression.

Hmm. I can't even run two copies of glxgears on software GL code today;
it's broken in every X server I have available. Someone broke it a while
ago, but no-one noticed. However, this shouldn't be GLX related as the
software rasterizer is no different from any other rendering code.

Testing with my smart-scheduler case (many copies of 'plaid') shows that
at least with git master, things are working as designed. When GLX is
working again, I'll try that as well.

> at a quick glance this is not a CPU scheduler thing: X uses up 99% of 
> CPU time, all the glxgears tasks (i needed 8 parallel instances to see 
> the stallings) are using up the remaining 1% of CPU time. The ordering 
> of the requests from the glxgears tasks is X's choice - and for a 
> pathological overload situation like this we cannot blame X at all for 
> not producing a completely smooth output. (although Xorg could perhaps 
> try to schedule such requests more smoothly, in a more finegrained way?)

It does. It should switch between clients every 20ms; that's why X spends
so much time asking the kernel for the current time.

Make sure the X server isn't running with the smart scheduler disabled;
that will cause precisely the symptoms you're seeing here. In the normal
upstream sources, you'd have to use '-dumbSched' as an X server command
line option.

The old 'scheduler' would run an entire X client's input buffer dry
before looking for requests from another client. Because glxgears
requests are small but time consuming, this can cause very long delays
between client switching.
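
As a concrete sketch of the '-dumbSched' point: the switch disables the smart
scheduler, so it should only ever appear deliberately, e.g. when starting a
throwaway server to reproduce the drain-one-client-dry behaviour (the display
number here is only an example):

   X :1 -dumbSched &    # test server with the smart scheduler turned off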

-- 
[EMAIL PROTECTED]




Re: CFS review

2007-08-29 Thread Ingo Molnar

* Al Boldi <[EMAIL PROTECTED]> wrote:

> >  se.sleep_max :  2194711437
> >  se.block_max :   0
> >  se.exec_max  :  977446
> >  se.wait_max  : 1912321
> >
> > the scheduler itself had a worst-case scheduling delay of 1.9
> > milliseconds for that glxgears instance (which is perfectly good - in
> > fact - excellent interactivity) - but the task had a maximum sleep time
> > of 2.19 seconds. So the 'glitch' was not caused by the scheduler.
> 
> 2.19sec is probably the time you need to lay them out side by side. 
> [...]

nope, i cleared the stats after i laid the glxgears out, via:

   for N in /proc/*/sched; do echo 0 > $N; done

and i did the strace (which showed a 1+ seconds latency) while the 
glxgears was not manipulated in any way.

> [...]  You see, gears sleeps when it is covered by another window, 
> [...]

none of the gear windows in my test were overlaid...

> [...] so once you lay them out it starts running, and that's when they 
> start to stutter for about 10sec.  After that they should run 
> smoothly, because they used up all the sleep bonus.

that's plain wrong - at least in the test i've reproduced. In any case, 
if that were the case then that would be visible in the stats. So please 
send me your cfs-debug-info.sh output captured while the test is running 
(with a CONFIG_SCHEDSTATS=y and CONFIG_SCHED_DEBUG=y kernel) - you can 
download it from:

   http://people.redhat.com/mingo/cfs-scheduler/tools/cfs-debug-info.sh

for best data, execute this before running it:

   for N in /proc/*/sched; do echo 0 > $N; done

> If you like, I can send you my straces, but they are kind of big 
> though, and you need to strace each gear, as stracing itself changes 
> the workload balance.

sure, send them along or upload them somewhere - but more importantly, 
please send the cfs-debug-info.sh output.
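
Putting the pieces of that request together as a sketch -- stats reset, an
strace of one glxgears instance, and the debug capture; attaching to a PID
with 'strace -p' is just one way to do the strace part:

   for N in /proc/*/sched; do echo 0 > $N; done          # clear per-task stats
   strace -ttt -TTT -o gears.strace -p <glxgears-pid> &  # trace one instance
   sh cfs-debug-info.sh > cfs-debug-info.txt             # capture scheduler state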

Ingo


Re: CFS review

2007-08-29 Thread Al Boldi
Ingo Molnar wrote:
> * Al Boldi <[EMAIL PROTECTED]> wrote:
> > I have narrowed it down a bit to add_wait_runtime.
>
> the scheduler is a red herring here. Could you "strace -ttt -TTT" one of
> the glxgears instances (and send us the cfs-debug-info.sh output, with
> CONFIG_SCHED_DEBUG=y and CONFIG_SCHEDSTATS=y as requested before) so
> that we can have a closer look?
>
> i reproduced something similar and there the stall is caused by 1+
> second select() delays on the X client<->server socket. The scheduler
> stats agree with that:
>
>  se.sleep_max :  2194711437
>  se.block_max :   0
>  se.exec_max  :  977446
>  se.wait_max  : 1912321
>
> the scheduler itself had a worst-case scheduling delay of 1.9
> milliseconds for that glxgears instance (which is perfectly good - in
> fact - excellent interactivity) - but the task had a maximum sleep time
> of 2.19 seconds. So the 'glitch' was not caused by the scheduler.

2.19sec is probably the time you need to lay them out side by side.  You see, 
gears sleeps when it is covered by another window, so once you lay them out 
it starts running, and that's when they start to stutter for about 10sec.  
After that they should run smoothly, because they used up all the sleep 
bonus.

If you like, I can send you my straces, but they are kind of big though, and 
you need to strace each gear, as stracing itself changes the workload 
balance.

Let's first make sure what we are looking for:
1. start # gears & gears & gears &
2. lay them out side by side, don't worry about sleep times yet.
3. now they start stuttering for about 10sec
4. now they run out of sleep bonuses and smooth out

If this is the sequence you get on your machine, then try disabling 
add_wait_runtime to see the difference.


Thanks!

--
Al



Re: CFS review

2007-08-29 Thread Al Boldi
Ingo Molnar wrote:
 * Al Boldi [EMAIL PROTECTED] wrote:
  I have narrowed it down a bit to add_wait_runtime.

 the scheduler is a red herring here. Could you strace -ttt -TTT one of
 the glxgears instances (and send us the cfs-debug-info.sh output, with
 CONFIG_SCHED_DEBUG=y and CONFIG_SCHEDSTATS=y as requested before) so
 that we can have a closer look?

 i reproduced something similar and there the stall is caused by 1+
 second select() delays on the X client-server socket. The scheduler
 stats agree with that:

  se.sleep_max :  2194711437
  se.block_max :   0
  se.exec_max  :  977446
  se.wait_max  : 1912321

 the scheduler itself had a worst-case scheduling delay of 1.9
 milliseconds for that glxgears instance (which is perfectly good - in
 fact - excellent interactivity) - but the task had a maximum sleep time
 of 2.19 seconds. So the 'glitch' was not caused by the scheduler.

2.19sec is probably the time you need to lay them out side by side.  You see, 
gears sleeps when it is covered by another window, so once you lay them out 
it starts running, and that's when they start to stutter for about 10sec.  
After that they should run smoothly, because they used up all the sleep 
bonus.

If you like, I can send you my straces, but they are kind of big though, and 
you need to strace each gear, as stracing itself changes the workload 
balance.

Let's first make sure what we are looking for:
1. start # gears  gears  gears 
2. lay them out side by side, don't worry about sleep times yet.
3. now they start stuttering for about 10sec
4. now they run out of sleep bonuses and smooth out

If this is the sequence you get on your machine, then try disabling 
add_wait_runtime to see the difference.


Thanks!

--
Al

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-29 Thread Ingo Molnar

* Al Boldi [EMAIL PROTECTED] wrote:

   se.sleep_max :  2194711437
   se.block_max :   0
   se.exec_max  :  977446
   se.wait_max  : 1912321
 
  the scheduler itself had a worst-case scheduling delay of 1.9
  milliseconds for that glxgears instance (which is perfectly good - in
  fact - excellent interactivity) - but the task had a maximum sleep time
  of 2.19 seconds. So the 'glitch' was not caused by the scheduler.
 
 2.19sec is probably the time you need to lay them out side by side. 
 [...]

nope, i cleared the stats after i laid the glxgears out, via:

   for N in /proc/*/sched; do echo 0  $N; done

and i did the strace (which showed a 1+ seconds latency) while the 
glxgears was not manipulated in any way.

 [...]  You see, gears sleeps when it is covered by another window, 
 [...]

none of the gear windows in my test were overlaid...

 [...] so once you lay them out it starts running, and that's when they 
 start to stutter for about 10sec.  After that they should run 
 smoothly, because they used up all the sleep bonus.

that's plain wrong - at least in the test i've reproduced. In any case, 
if that were the case then that would be visible in the stats. So please 
send me your cfs-debug-info.sh output captured while the test is running 
(with a CONFIG_SCHEDSTATS=y and CONFIG_SCHED_DEBUG=y kernel) - you can 
download it from:

   http://people.redhat.com/mingo/cfs-scheduler/tools/cfs-debug-info.sh

for best data, execute this before running it:

   for N in /proc/*/sched; do echo 0  $N; done

 If you like, I can send you my straces, but they are kind of big 
 though, and you need to strace each gear, as stracing itself changes 
 the workload balance.

sure, send them along or upload them somewhere - but more importantly, 
please send the cfs-debug-info.sh output.

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-29 Thread Keith Packard
On Wed, 2007-08-29 at 06:46 +0200, Ingo Molnar wrote:

 ok, i finally managed to reproduce the artifact myself on an older 
 box. It goes like this: start up X with the vesa driver (or with NoDRI) 
 to force software rendering. Then start up a couple of glxgears 
 instances. Those glxgears instances update in a very chunky, 
 stuttering way - each glxgears instance runs/stops/runs/stops at a 
 rate of a about once per second, and this was reported to me as a 
 potential CPU scheduler regression.

Hmm. I can't even run two copies of glxgears on software GL code today;
it's broken in every X server I have available. Someone broke it a while
ago, but no-one noticed. However, this shouldn't be GLX related as the
software rasterizer is no different from any other rendering code.

Testing with my smart-scheduler case (many copies of 'plaid') shows that
at least with git master, things are working as designed. When GLX is
working again, I'll try that as well.

 at a quick glance this is not a CPU scheduler thing: X uses up 99% of 
 CPU time, all the glxgears tasks (i needed 8 parallel instances to see 
 the stallings) are using up the remaining 1% of CPU time. The ordering 
 of the requests from the glxgears tasks is X's choice - and for a 
 pathological overload situation like this we cannot blame X at all for 
 not producing a completely smooth output. (although Xorg could perhaps 
 try to schedule such requests more smoothly, in a more finegrained way?)

It does. It should switch between clients ever 20ms; that's why X spends
so much time asking the kernel for the current time.

Make sure the X server isn't running with the smart scheduler disabled;
that will cause precisely the symptoms you're seeing here. In the normal
usptream sources, you'd have to use '-dumbSched' as an X server command
line option.

The old 'scheduler' would run an entire X client's input buffer dry
before looking for requests from another client. Because glxgears
requests are small but time consuming, this can cause very long delays
between client switching.

-- 
[EMAIL PROTECTED]


signature.asc
Description: This is a digitally signed message part


Re: CFS review

2007-08-29 Thread Ingo Molnar

* Keith Packard [EMAIL PROTECTED] wrote:

 Make sure the X server isn't running with the smart scheduler 
 disabled; that will cause precisely the symptoms you're seeing here. 
 In the normal usptream sources, you'd have to use '-dumbSched' as an X 
 server command line option.
 
 The old 'scheduler' would run an entire X client's input buffer dry 
 before looking for requests from another client. Because glxgears 
 requests are small but time consuming, this can cause very long delays 
 between client switching.

on the old box where i've reproduced this i've got an ancient X version:

  neptune:~ X -version

  X Window System Version 6.8.2
  Release Date: 9 February 2005
  X Protocol Version 11, Revision 0, Release 6.8.2
  Build Operating System: Linux 2.6.9-22.ELsmp i686 [ELF]

is that old enough to not have the smart X scheduler?

on newer systems i dont see correctly updated glxgears output (probably 
the GLX bug you mentioned) so i cannot reproduce the bug.

Al, could you send us your 'X -version' output?

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-29 Thread Al Boldi
Ingo Molnar wrote:
 * Keith Packard [EMAIL PROTECTED] wrote:
  Make sure the X server isn't running with the smart scheduler
  disabled; that will cause precisely the symptoms you're seeing here.
  In the normal usptream sources, you'd have to use '-dumbSched' as an X
  server command line option.
 
  The old 'scheduler' would run an entire X client's input buffer dry
  before looking for requests from another client. Because glxgears
  requests are small but time consuming, this can cause very long delays
  between client switching.

 on the old box where i've reproduced this i've got an ancient X version:

   neptune:~ X -version

   X Window System Version 6.8.2
   Release Date: 9 February 2005
   X Protocol Version 11, Revision 0, Release 6.8.2
   Build Operating System: Linux 2.6.9-22.ELsmp i686 [ELF]

 is that old enough to not have the smart X scheduler?

 on newer systems i dont see correctly updated glxgears output (probably
 the GLX bug you mentioned) so i cannot reproduce the bug.

 Al, could you send us your 'X -version' output?

This is the one I have been talking about:

XFree86 Version 4.3.0
Release Date: 27 February 2003
X Protocol Version 11, Revision 0, Release 6.6
Build Operating System: Linux 2.4.21-0.13mdksmp i686 [ELF] 


I also tried the gears test just now on this:

X Window System Version 6.8.1
Release Date: 17 September 2004
X Protocol Version 11, Revision 0, Release 6.8.1
Build Operating System: Linux 2.6.9-1.860_ELsmp i686 [ELF] 

but it completely locks up.  Disabling add_wait_runtime seems to fix it.


Thanks!

--
Al

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-29 Thread Bill Davidsen

Ingo Molnar wrote:

* Bill Davidsen [EMAIL PROTECTED] wrote:

  
There is another way to show the problem visually under X 
(vesa-driver), by starting 3 gears simultaneously, which after 
laying them out side-by-side need some settling time before 
smoothing out.  Without __update_curr it's absolutely smooth from 
the start.
  
I posted a LOT of stuff using the glitch1 script, and finally found a 
set of tuning values which make the test script run smooth. See back 
posts, I don't have them here.



but you have real 3D hw and DRI enabled, correct? In that case X uses up 
almost no CPU time and glxgears makes most of the processing. That is 
quite different from the above software-rendering case, where X spends 
most of the CPU time.
  


No, my test machine for that is a compile server, and uses the built-in 
motherboard graphics which are very limited. This is not in any sense a 
graphics powerhouse, it is used to build custom kernels and 
applications, and for testing of kvm and xen, and I grabbed it because 
it had the only Core2 CPU I could reboot to try new kernel versions and 
from cold boot testing, discovered the graphics smoothness issue by 
having several windows open on compiles, and developed the glitch1 
script as a way to reproduce it.


The settings I used, features=14, granularity=50, work to improve 
smoothness on other machines for other uses, but they do seem to impact 
performance for compiles, video processing, etc, so they are not optimal 
for general use. I regard the existence of these tuning knobs as one of 
the real strengths of CFS, when you change the tuning it has a visible 
effect.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-29 Thread Keith Packard
On Wed, 2007-08-29 at 10:04 +0200, Ingo Molnar wrote:

 is that old enough to not have the smart X scheduler?

The smart scheduler went into the server in like 2000. I don't think
you've got any systems that old. XFree86 4.1 or 4.2, I can't remember
which.

 (probably 
 the GLX bug you mentioned) so i cannot reproduce the bug.

With X server 1.3, I'm getting consistent crashes with two glxgear
instances running. So, if you're getting any output, it's better than my
situation.

-- 
[EMAIL PROTECTED]


signature.asc
Description: This is a digitally signed message part


Re: CFS review

2007-08-29 Thread Rene Herman

On 08/29/2007 05:57 PM, Keith Packard wrote:


> With X server 1.3, I'm getting consistent crashes with two glxgear
> instances running. So, if you're getting any output, it's better than my
> situation.


Before people focuss on software rendering too much -- also with 1.3.0 (and
a Matrox Millenium G550 AGP, 32M) glxgears also works decidedly crummy using
hardware rendering. While I can move the glxgears window itself, the actual
spinning wheels stay in the upper-left corner of the screen and the movement
leaves a non-repainting trace on the screen. Running a second instance of
glxgears in addition seems to make both instances unkillable  -- and when
I just now forcefully killed X in this situation (the spinning wheels were
covering the upper left corner of all my desktops) I got the below.

Kernel is 2.6.22.5-cfs-v20.5, schedule() is in the traces (but that may be
expected anyway).

BUG: unable to handle kernel NULL pointer dereference at virtual address 
0010
 printing eip:
c10ff416
*pde = 
Oops:  [#1]
PREEMPT
Modules linked in: nfsd exportfs lockd nfs_acl sunrpc nls_iso8859_1 nls_cp437 vfat fat 
nls_base

CPU:0
EIP:0060:[c10ff416]Not tainted VLI
EFLAGS: 00210246   (2.6.22.5-cfs-v20.5-local #5)
EIP is at mga_dma_buffers+0x189/0x2e3
eax:    ebx: efd07200   ecx: 0001   edx: efc32c00
esi:    edi: c12756cc   ebp: dfea44c0   esp: dddaaec0
ds: 007b   es: 007b   fs:   gs: 0033  ss: 0068
Process glxgears (pid: 1775, ti=dddaa000 task=e9daca60 task.ti=dddaa000)
Stack: efc32c00  0004 e4c3bd20 c10fa54b e4c3bd20 efc32c00 
   0004     0001 0001 bfbdb8bc
   bfbdb8b8  c10ff28d 0029 c12756cc dfea44c0 c10f87fc bfbdb844
Call Trace:
 [c10fa54b] drm_lock+0x255/0x2de
 [c10ff28d] mga_dma_buffers+0x0/0x2e3
 [c10f87fc] drm_ioctl+0x142/0x18a
 [c1005973] do_IRQ+0x97/0xb0
 [c10f86ba] drm_ioctl+0x0/0x18a
 [c10f86ba] drm_ioctl+0x0/0x18a
 [c105b0d7] do_ioctl+0x87/0x9f
 [c105b32c] vfs_ioctl+0x23d/0x250
 [c11b533e] schedule+0x2d0/0x2e6
 [c105b372] sys_ioctl+0x33/0x4d
 [c1003d1e] syscall_call+0x7/0xb
 ===
Code: 9a 08 03 00 00 8b 73 30 74 14 c7 44 24 04 28 76 1c c1 c7 04 24 49 51 23 c1 e8 b0 74 
f1 ff 8b 83 d8 00 00 00 83 3d 1c 47 30 c1 00 8b 40 10 8b a8 58 1e 00 00 8b 43 28 8b b8 
64 01 00 00 74 32 8b

EIP: [c10ff416] mga_dma_buffers+0x189/0x2e3 SS:ESP 0068:dddaaec0
BUG: unable to handle kernel NULL pointer dereference at virtual address 
0010
 printing eip:
c10ff416
*pde = 
Oops:  [#2]
PREEMPT
Modules linked in: nfsd exportfs lockd nfs_acl sunrpc nls_iso8859_1 nls_cp437 vfat fat 
nls_base

CPU:0
EIP:0060:[c10ff416]Not tainted VLI
EFLAGS: 00210246   (2.6.22.5-cfs-v20.5-local #5)
EIP is at mga_dma_buffers+0x189/0x2e3
eax:    ebx: efd07200   ecx: 0001   edx: efc32c00
esi:    edi: c12756cc   ebp: dfea4780   esp: e0552ec0
ds: 007b   es: 007b   fs:   gs: 0033  ss: 0068
Process glxgears (pid: 1776, ti=e0552000 task=c19ec000 task.ti=e0552000)
Stack: efc32c00  0003 efc64b40 c10fa54b efc64b40 efc32c00 
   0003     0001 0001 bf8dbdcc
   bf8dbdc8  c10ff28d 0029 c12756cc dfea4780 c10f87fc bf8dbd54
Call Trace:
 [c10fa54b] drm_lock+0x255/0x2de
 [c10ff28d] mga_dma_buffers+0x0/0x2e3
 [c10f87fc] drm_ioctl+0x142/0x18a
 [c11b53f6] preempt_schedule+0x4e/0x5a
 [c10f86ba] drm_ioctl+0x0/0x18a
 [c10f86ba] drm_ioctl+0x0/0x18a
 [c105b0d7] do_ioctl+0x87/0x9f
 [c105b32c] vfs_ioctl+0x23d/0x250
 [c11b52a9] schedule+0x23b/0x2e6
 [c11b533e] schedule+0x2d0/0x2e6
 [c105b372] sys_ioctl+0x33/0x4d
 [c1003d1e] syscall_call+0x7/0xb
 ===
Code: 9a 08 03 00 00 8b 73 30 74 14 c7 44 24 04 28 76 1c c1 c7 04 24 49 51 23 c1 e8 b0 74 
f1 ff 8b 83 d8 00 00 00 83 3d 1c 47 30 c1 00 8b 40 10 8b a8 58 1e 00 00 8b 43 28 8b b8 
64 01 00 00 74 32 8b

EIP: [c10ff416] mga_dma_buffers+0x189/0x2e3 SS:ESP 0068:e0552ec0
[drm:drm_release] *ERROR* Device busy: 2 0

Rene.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
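
The two oopses above both die at mga_dma_buffers+0x189 with a NULL pointer at
offset 0x10. For anyone wanting to turn such an EIP into a source line: if the
kernel that produced the oops was built with CONFIG_DEBUG_INFO (and, as the
module list suggests here, the DRM/mga code is built in rather than modular),
standard binutils/gdb can resolve it. A minimal sketch, run against that
kernel's vmlinux:

   # resolve the raw EIP from the oops
   addr2line -f -e vmlinux c10ff416

   # or resolve the symbol+offset form directly
   gdb -batch -ex 'list *(mga_dma_buffers+0x189)' vmlinux

The offsets only match a vmlinux built from the same source, config and
compiler as the crashing kernel.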


Re: CFS review

2007-08-28 Thread Ingo Molnar

* Al Boldi <[EMAIL PROTECTED]> wrote:

> I have narrowed it down a bit to add_wait_runtime.

the scheduler is a red herring here. Could you "strace -ttt -TTT" one of 
the glxgears instances (and send us the cfs-debug-info.sh output, with 
CONFIG_SCHED_DEBUG=y and CONFIG_SCHEDSTATS=y as requested before) so 
that we can have a closer look?

i reproduced something similar and there the stall is caused by 1+ 
second select() delays on the X client<->server socket. The scheduler 
stats agree with that:

 se.sleep_max :  2194711437
 se.block_max :   0
 se.exec_max  :  977446
 se.wait_max  : 1912321

the scheduler itself had a worst-case scheduling delay of 1.9 
milliseconds for that glxgears instance (which is perfectly good - in 
fact - excellent interactivity) - but the task had a maximum sleep time 
of 2.19 seconds. So the 'glitch' was not caused by the scheduler.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
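
The se.* numbers Ingo quotes are the per-task maxima from the CFS debug
statistics, reported in nanoseconds: se.sleep_max of 2194711437 is the 2.19 s
sleep he mentions, and se.wait_max of 1912321 is the 1.9 ms worst-case
scheduling delay. On a kernel built with CONFIG_SCHED_DEBUG and
CONFIG_SCHEDSTATS they can be read from /proc/<pid>/sched; a small sketch,
assuming that file is present on the CFS kernel being tested:

   # snapshot the maxima for every running glxgears instance
   for pid in $(pidof glxgears); do
       echo "== pid $pid =="
       grep -E 'se\.(sleep_max|block_max|exec_max|wait_max)' /proc/$pid/sched
   done

Resetting the counters between runs (where the kernel supports writing to the
same file) makes it easier to attribute a particular stall to one test.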


Re: CFS review

2007-08-28 Thread Mike Galbraith
On Wed, 2007-08-29 at 06:18 +0200, Ingo Molnar wrote:
> * Al Boldi <[EMAIL PROTECTED]> wrote:
> 
> > No need for framebuffer.  All you need is X using the X.org 
> > vesa-driver.  Then start gears like this:
> > 
> >   # gears & gears & gears &
> > 
> > Then lay them out side by side to see the periodic stallings for 
> > ~10sec.
> 
> i just tried something similar (by adding Option "NoDRI" to xorg.conf) 
> and i'm wondering how it can be smooth on vesa-driver at all. I tested 
> it on a Core2Duo box and software rendering manages to do about 3 frames 
> per second. (although glxgears itself thinks it does ~600 fps) If i 
> start 3 glxgears then they do ~1 frame per second each. This is on 
> Fedora 7 with xorg-x11-server-Xorg-1.3.0.0-9.fc7 and 
> xorg-x11-drv-i810-2.0.0-4.fc7.

At least you can run the darn test... the third instance of glxgears
here means say bye bye to GUI instantly.

-Mike

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-28 Thread Keith Packard
On Wed, 2007-08-29 at 06:18 +0200, Ingo Molnar wrote:

> > Then lay them out side by side to see the periodic stallings for 
> > ~10sec.

The X scheduling code isn't really designed to handle software GL well;
the requests can be very expensive to execute, and yet are specified as
atomic operations (sigh).

> i just tried something similar (by adding Option "NoDRI" to xorg.conf) 
> and i'm wondering how it can be smooth on vesa-driver at all. I tested 
> it on a Core2Duo box and software rendering manages to do about 3 frames 
> per second. (although glxgears itself thinks it does ~600 fps) If i 
> start 3 glxgears then they do ~1 frame per second each. This is on 
> Fedora 7 with xorg-x11-server-Xorg-1.3.0.0-9.fc7 and 
> xorg-x11-drv-i810-2.0.0-4.fc7.

Are you attempting to measure the visible updates by eye? Or are you
using some other metric?

In any case, attempting to measure anything using glxgears is a bad
idea; it's not representative of *any* real applications. And then using
software GL on top of that...

What was the question again?

-- 
[EMAIL PROTECTED]


signature.asc
Description: This is a digitally signed message part


Re: CFS review

2007-08-28 Thread Al Boldi
Ingo Molnar wrote:
> * Linus Torvalds <[EMAIL PROTECTED]> wrote:
> > On Tue, 28 Aug 2007, Al Boldi wrote:
> > > I like your analysis, but how do you explain that these stalls
> > > vanish when __update_curr is disabled?
> >
> > It's entirely possible that what happens is that the X scheduling is
> > just a slightly unstable system - which effectively would turn a small
> > scheduling difference into a *huge* visible difference.
>
> i think it's because disabling __update_curr() in essence removes the
> ability of scheduler to preempt tasks - that hack in essence results in
> a non-scheduler. Hence the gears + X pair of tasks becomes a synchronous
> pair of tasks in essence - and thus gears cannot "overload" X.

I have narrowed it down a bit to add_wait_runtime.

Patch 2.6.22.5-v20.4 like this:

346- * the two values are equal)
347- * [Note: delta_mine - delta_exec is negative]:
348- */
349://  add_wait_runtime(cfs_rq, curr, delta_mine - delta_exec);
350-}
351-
352-static void update_curr(struct cfs_rq *cfs_rq)

When disabling add_wait_runtime the stalls are gone.  With this change the 
scheduler is still usable, but it does not constitute a fix.

Now, even with this hack, uneven nice-levels between X and gears causes a 
return of the stalls, so make sure both X and gears run on the same 
nice-level when testing.

Again, the whole point of this workload is to expose scheduler glitches 
regardless of whether X is broken or not, and my hunch is that this problem 
looks suspiciously like an ia-boosting bug.  What's important to note is 
that by adjusting the scheduler we can effect a correction in behaviour, and 
as such should yield this problem as fixable.

It's probably a good idea to look further into add_wait_runtime.


Thanks!

--
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
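
Since Al points out that uneven nice levels between X and gears bring the
stalls back, it is worth verifying the levels before each run. A quick,
illustrative check (the server process may be called X or Xorg depending on
the distribution, and the pid placeholders below are just that):

   # show pid, nice level and command for the server and the gears clients
   ps -o pid,ni,comm -C X,Xorg,glxgears

   # if they differ, put them back on the same level, e.g. as root:
   # renice 0 -p <X_pid> <gears_pids>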


Re: CFS review

2007-08-28 Thread Ingo Molnar

* Al Boldi <[EMAIL PROTECTED]> wrote:

> No need for framebuffer.  All you need is X using the X.org 
> vesa-driver.  Then start gears like this:
> 
>   # gears & gears & gears &
> 
> Then lay them out side by side to see the periodic stallings for 
> ~10sec.

i just tried something similar (by adding Option "NoDRI" to xorg.conf) 
and i'm wondering how it can be smooth on vesa-driver at all. I tested 
it on a Core2Duo box and software rendering manages to do about 3 frames 
per second. (although glxgears itself thinks it does ~600 fps) If i 
start 3 glxgears then they do ~1 frame per second each. This is on 
Fedora 7 with xorg-x11-server-Xorg-1.3.0.0-9.fc7 and 
xorg-x11-drv-i810-2.0.0-4.fc7.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-28 Thread Ingo Molnar

* Bill Davidsen <[EMAIL PROTECTED]> wrote:

> > There is another way to show the problem visually under X 
> > (vesa-driver), by starting 3 gears simultaneously, which after 
> > laying them out side-by-side need some settling time before 
> > smoothing out.  Without __update_curr it's absolutely smooth from 
> > the start.
> 
> I posted a LOT of stuff using the glitch1 script, and finally found a 
> set of tuning values which make the test script run smooth. See back 
> posts, I don't have them here.

but you have real 3D hw and DRI enabled, correct? In that case X uses up 
almost no CPU time and glxgears makes most of the processing. That is 
quite different from the above software-rendering case, where X spends 
most of the CPU time.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-28 Thread Bill Davidsen

Ingo Molnar wrote:
> * Al Boldi <[EMAIL PROTECTED]> wrote:
>
> > > ok. I think i might finally have found the bug causing this. Could 
> > > you try the fix below, does your webserver thread-startup test work 
> > > any better?
> >
> > It seems to help somewhat, but the problem is still visible.  Even 
> > v20.3 on 2.6.22.5 didn't help.
> >
> > It does look related to ia-boosting, so I turned off __update_curr 
> > like Roman mentioned, which had an enormous smoothing effect, but then 
> > nice levels completely break down and lockup the system.
>
> you can turn sleeper-fairness off via:
>
>    echo 28 > /proc/sys/kernel/sched_features
>
> another thing to try would be:
>
>    echo 12 > /proc/sys/kernel/sched_features

14, and drop the granularity to 50.

> (that's the new-task penalty turned off.)
>
> Another thing to try would be to edit this:
>
>         if (sysctl_sched_features & SCHED_FEAT_START_DEBIT)
>                 p->se.wait_runtime = -(sched_granularity(cfs_rq) / 2);
>
> to:
>
>         if (sysctl_sched_features & SCHED_FEAT_START_DEBIT)
>                 p->se.wait_runtime = -sched_granularity(cfs_rq);
>
> and could you also check 20.4 on 2.6.22.5 perhaps, or very latest -git? 
> (Peter has experienced smaller spikes with that.)
>
> Ingo



--
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
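
For anyone wanting to reproduce Bill's "features=14, granularity=50" setup:
the sched_features path is the one Ingo gives above, but the name and unit of
the granularity sysctl differ between CFS versions and backports, so the
sketch below discovers it rather than assuming it (the knob name on the last
line is only an example -- use whatever the listing shows, and scale 50 to
that knob's unit):

   # feature bitmask, as in Ingo's commands above
   echo 14 > /proc/sys/kernel/sched_features

   # find this kernel's granularity knob(s)
   ls /proc/sys/kernel/ | grep -i granularity

   # then write Bill's value to the knob found above, e.g.:
   echo 50 > /proc/sys/kernel/sched_granularity_ns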


Re: CFS review

2007-08-28 Thread Bill Davidsen

Al Boldi wrote:
> Ingo Molnar wrote:
> > * Al Boldi <[EMAIL PROTECTED]> wrote:
> >
> > > > > The problem is that consecutive runs don't give consistent results
> > > > > and sometimes stalls.  You may want to try that.
> > > >
> > > > well, there's a natural saturation point after a few hundred tasks
> > > > (depending on your CPU's speed), at which point there's no idle time
> > > > left. From that point on things get slower progressively (and the
> > > > ability of the shell to start new ping tasks is impacted as well),
> > > > but that's expected on an overloaded system, isnt it?
> > >
> > > Of course, things should get slower with higher load, but it should be
> > > consistent without stalls.
> > >
> > > To see this problem, make sure you boot into /bin/sh with the normal
> > > VGA console (ie. not fb-console).  Then try each loop a few times to
> > > show different behaviour; loops like:
> > >
> > > # for ((i=0; i<; i++)); do ping 10.1 -A > /dev/null & done
> > >
> > > # for ((i=0; i<; i++)); do nice -99 ping 10.1 -A > /dev/null & done
> > >
> > > # { for ((i=0; i<; i++)); do
> > > ping 10.1 -A > /dev/null &
> > > done } > /dev/null 2>&1
> > >
> > > Especially the last one sometimes causes a complete console lock-up,
> > > while the other two sometimes stall then surge periodically.
> >
> > ok. I think i might finally have found the bug causing this. Could you
> > try the fix below, does your webserver thread-startup test work any
> > better?
>
> It seems to help somewhat, but the problem is still visible.  Even v20.3 on 
> 2.6.22.5 didn't help.
>
> It does look related to ia-boosting, so I turned off __update_curr like Roman 
> mentioned, which had an enormous smoothing effect, but then nice levels 
> completely break down and lockup the system.
>
> There is another way to show the problem visually under X (vesa-driver), by 
> starting 3 gears simultaneously, which after laying them out side-by-side 
> need some settling time before smoothing out.  Without __update_curr it's 
> absolutely smooth from the start.

I posted a LOT of stuff using the glitch1 script, and finally found a 
set of tuning values which make the test script run smooth. See back 
posts, I don't have them here.


--
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
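
The loop bounds in the quoted commands were lost in archiving; a runnable
equivalent with an arbitrary, illustrative count (not the original number),
run from a bash-capable shell, looks like this:

   # N is illustrative -- the original loop bound did not survive archiving
   N=400
   for ((i = 0; i < N; i++)); do ping 10.1 -A > /dev/null & done

   # clean up the background pings afterwards
   killall ping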


Re: CFS review

2007-08-28 Thread Valdis . Kletnieks
On Mon, 27 Aug 2007 22:05:37 PDT, Linus Torvalds said:
> 
> 
> On Tue, 28 Aug 2007, Al Boldi wrote:
> > 
> > No need for framebuffer.  All you need is X using the X.org vesa-driver.  
> > Then start gears like this:
> > 
> >   # gears & gears & gears &
> > 
> > Then lay them out side by side to see the periodic stallings for ~10sec.
> 
> I don't think this is a good test.
> 
> Why?
> 
> If you're not using direct rendering, what you have is the X server doing 
> all the rendering, which in turn means that what you are testing is quite 
> possibly not so much about the *kernel* scheduling, but about *X-server* 
> scheduling!

I wonder - can people who are doing this as a test please specify whether
they're using an older X that has the libX11 or the newer libxcb code? That
may have a similar impact as well.

(libxcb is pretty new - it landed in Fedora Rawhide just about a month ago,
after Fedora 7 shipped.  Not sure what other distros have it now...)


pgpI8maTCY4aR.pgp
Description: PGP signature
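
A rough way to answer the libX11-versus-libxcb question for a given test
client is to look at what it (and the Xlib it loads) links against; this is
only a heuristic, since distributions package the xcb-based Xlib differently:

   # does the client, or its libX11, pull in libxcb?
   ldd $(which glxgears) | grep -E 'libX11|libxcb'
   ldd /usr/lib/libX11.so.6 | grep libxcb    # path may be /usr/lib64 etc.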


Re: CFS review

2007-08-28 Thread Ingo Molnar

* Willy Tarreau <[EMAIL PROTECTED]> wrote:

> On Tue, Aug 28, 2007 at 10:02:18AM +0200, Ingo Molnar wrote:
> > 
> > * Xavier Bestel <[EMAIL PROTECTED]> wrote:
> > 
> > > Are you sure they are stalled ? What you may have is simple gears 
> > > running at a multiple of your screen refresh rate, so they only appear 
> > > stalled.
> > > 
> > > Plus, as said Linus, you're not really testing the kernel scheduler. 
> > > gears is really bad benchmark, it should die.
> > 
> > i like glxgears as long as it runs on _real_ 3D hardware, because there 
> > it has minimal interaction with X and so it's an excellent visual test 
> > about consistency of scheduling. You can immediately see (literally) 
> > scheduling hickups down to a millisecond range (!). In this sense, if 
> > done and interpreted carefully, glxgears gives more feedback than many 
> > audio tests. (audio latency problems are audible, but on most sound hw 
> > it takes quite a bit of latency to produce an xrun.) So basically 
> > glxgears is the "early warning system" that tells us about the potential 
> > for xruns earlier than an xrun would happen for real.
> > 
> > [ of course you can also run all the other tools to get numeric results,
> >   but glxgears is nice in that it gives immediate visual feedback. ]
> 
> Al could also test ocbench, which brings visual feedback without 
> harnessing the X server : http://linux.1wt.eu/sched/
> 
> I packaged it exactly for this problem and it has already helped. It 
> uses X after each loop, so if you run it with large run time, X is 
> nearly not sollicitated.

yeah, and ocbench is one of my favorite cross-task-fairness tests - i 
dont release a CFS patch without checking it with ocbench first :-)

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-28 Thread Willy Tarreau
On Tue, Aug 28, 2007 at 10:02:18AM +0200, Ingo Molnar wrote:
> 
> * Xavier Bestel <[EMAIL PROTECTED]> wrote:
> 
> > Are you sure they are stalled ? What you may have is simple gears 
> > running at a multiple of your screen refresh rate, so they only appear 
> > stalled.
> > 
> > Plus, as said Linus, you're not really testing the kernel scheduler. 
> > gears is really bad benchmark, it should die.
> 
> i like glxgears as long as it runs on _real_ 3D hardware, because there 
> it has minimal interaction with X and so it's an excellent visual test 
> about consistency of scheduling. You can immediately see (literally) 
> scheduling hickups down to a millisecond range (!). In this sense, if 
> done and interpreted carefully, glxgears gives more feedback than many 
> audio tests. (audio latency problems are audible, but on most sound hw 
> it takes quite a bit of latency to produce an xrun.) So basically 
> glxgears is the "early warning system" that tells us about the potential 
> for xruns earlier than an xrun would happen for real.
> 
> [ of course you can also run all the other tools to get numeric results,
>   but glxgears is nice in that it gives immediate visual feedback. ]

Al could also test ocbench, which brings visual feedback without harnessing
the X server :  http://linux.1wt.eu/sched/

I packaged it exactly for this problem and it has already helped. It uses
X after each loop, so if you run it with large run time, X is nearly not
sollicitated. 

Willy

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-28 Thread Ingo Molnar

* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> On Tue, 28 Aug 2007, Al Boldi wrote:
> > 
> > I like your analysis, but how do you explain that these stalls 
> > vanish when __update_curr is disabled?
> 
> It's entirely possible that what happens is that the X scheduling is 
> just a slightly unstable system - which effectively would turn a small 
> scheduling difference into a *huge* visible difference.

i think it's because disabling __update_curr() in essence removes the 
ability of scheduler to preempt tasks - that hack in essence results in 
a non-scheduler. Hence the gears + X pair of tasks becomes a synchronous 
pair of tasks in essence - and thus gears cannot "overload" X.

Normally gears + X is an asynchronous pair of tasks, with gears (or 
xperf, or devel versions of firefox, etc.) not being throttled at all 
and thus being able to overload/spam the X server with requests. (And we 
generally want to _reward_ asynchronity and want to allow tasks to 
overlap each other and we want each task to go as fast and as parallel 
as it can.)

Eventually X's built-in "bad, abusive client" throttling code kicks in, 
which, AFAIK is pretty crude and might yield to such artifacts. But ... 
it would be nice for an X person to confirm - and in any case i'll try 
Al's workload - i thought i had a reproducer but i barked up the wrong 
tree :-) My laptop doesnt run with the vesa driver, so i have no easy 
reproducer for now.

( also, it would be nice if Al could try rc4 plus my latest scheduler
  tree as well - just on the odd chance that something got fixed
  meanwhile. In particular Mike's sleeper-bonus-limit fix could be
  related. )

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-28 Thread Arjan van de Ven
On Tue, 28 Aug 2007 09:34:03 -0700 (PDT)
Linus Torvalds <[EMAIL PROTECTED]> wrote:

> 
> 
> On Tue, 28 Aug 2007, Al Boldi wrote:
> > 
> > I like your analysis, but how do you explain that these stalls
> > vanish when __update_curr is disabled?
> 
> It's entirely possible that what happens is that the X scheduling is
> just a slightly unstable system - which effectively would turn a
> small scheduling difference into a *huge* visible difference.

one thing that happens if you remove __update_curr is the following
pattern (since no apps will get preempted involuntarily)

app 1 submits a full frame worth of 3D stuff to X 
app 1 then sleeps/waits for that to complete
X gets to run, has 1 full frame to render, does this
X now waits for more input
app 2 now gets to run and submits a full frame
app 2 then sleeps again
X gets to run again to process and complete
X goes to sleep
app 3 gets to run and submits a full frame
app 3 then sleeps
X runs
X sleeps
app 1 gets to submit a frame

etc etc

so without preemption happening, you can get "perfect" behavior, just
because everything is perfectly doing 1 thing at a time cooperatively.
once you start doing timeslices and enforcing limits on them, this
"perfect pattern" will break down (remember this is all software
rendering in the problem being described), and whatever you get won't
be as perfect as this.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-28 Thread Linus Torvalds


On Tue, 28 Aug 2007, Al Boldi wrote:
> 
> I like your analysis, but how do you explain that these stalls vanish when 
> __update_curr is disabled?

It's entirely possible that what happens is that the X scheduling is just 
a slightly unstable system - which effectively would turn a small 
scheduling difference into a *huge* visible difference.

And the "small scheduling difference" might be as simple as "if the 
process slept for a while, we give it a bit more CPU time". And then you 
get into some unbalanced setup where the X scheduler makes it sleep even 
more, because it fills its buffers.

Or something. I can easily see two schedulers that are trying to 
*individually* be "fair", fighting it out in a way where the end result is 
not very good.

I do suspect it's probably a very interesting load, so I hope Ingo looks 
more at it, but I also suspect it's more than just the kernel scheduler.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-28 Thread Ingo Molnar

* Xavier Bestel <[EMAIL PROTECTED]> wrote:

> Are you sure they are stalled ? What you may have is simple gears 
> running at a multiple of your screen refresh rate, so they only appear 
> stalled.
> 
> Plus, as said Linus, you're not really testing the kernel scheduler. 
> gears is really bad benchmark, it should die.

i like glxgears as long as it runs on _real_ 3D hardware, because there 
it has minimal interaction with X and so it's an excellent visual test 
about consistency of scheduling. You can immediately see (literally) 
scheduling hickups down to a millisecond range (!). In this sense, if 
done and interpreted carefully, glxgears gives more feedback than many 
audio tests. (audio latency problems are audible, but on most sound hw 
it takes quite a bit of latency to produce an xrun.) So basically 
glxgears is the "early warning system" that tells us about the potential 
for xruns earlier than an xrun would happen for real.

[ of course you can also run all the other tools to get numeric results,
  but glxgears is nice in that it gives immediate visual feedback. ]

but i agree that on a non-accelerated X setup glxgears is not really 
meaningful. It can have similar "spam the X server" effects as xperf.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-28 Thread Xavier Bestel
On Tue, 2007-08-28 at 07:37 +0300, Al Boldi wrote:
> start gears like this:
> 
>   # gears & gears & gears &
> 
> Then lay them out side by side to see the periodic stallings for
> ~10sec.

Are you sure they are stalled ? What you may have is simple gears
running at a multiple of your screen refresh rate, so they only appear
stalled.

Plus, as said Linus, you're not really testing the kernel scheduler.
gears is really bad benchmark, it should die.

Xav


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-28 Thread Ingo Molnar

* Mike Galbraith <[EMAIL PROTECTED]> wrote:

> > I like your analysis, but how do you explain that these stalls 
> > vanish when __update_curr is disabled?
> 
> When you disable __update_curr(), you're utterly destroying the
> scheduler.  There may well be a scheduler connection, but disabling
> __update_curr() doesn't tell you anything meaningful.  Basically, you're
> letting all tasks run uninterrupted for just as long as they please
> (which is why busy loops lock your box solid as a rock).  I'd suggest
> gathering some sched_debug stats or something... [...]

the output of the following would be nice:

  http://people.redhat.com/mingo/cfs-scheduler/tools/cfs-debug-info.sh

captured while the gears are running.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
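
A minimal way to capture the requested output while the gears are running
(assuming the script writes its report to stdout and is run as root so it can
read all the debug files):

   # start the workload, let it reach the stalling state, then snapshot
   gears & gears & gears &
   sleep 10
   sh cfs-debug-info.sh > cfs-debug-info-$(date +%s).txt 2>&1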


Re: CFS review

2007-08-28 Thread Mike Galbraith
On Tue, 2007-08-28 at 08:23 +0300, Al Boldi wrote:
> Linus Torvalds wrote:
> > On Tue, 28 Aug 2007, Al Boldi wrote:
> > > No need for framebuffer.  All you need is X using the X.org vesa-driver.
> > > Then start gears like this:
> > >
> > >   # gears & gears & gears &
> > >
> > > Then lay them out side by side to see the periodic stallings for ~10sec.
> >
> > I don't think this is a good test.
> >
> > Why?
> >
> > If you're not using direct rendering, what you have is the X server doing
> > all the rendering, which in turn means that what you are testing is quite
> > possibly not so much about the *kernel* scheduling, but about *X-server*
> > scheduling!
> >
> > I'm sure the kernel scheduler has an impact, but what's more likely to be
> > going on is that you're seeing effects that are indirect, and not
> > necessarily at all even "good".
> >
> > For example, if the X server is the scheduling point, it's entirely
> > possible that it ends up showing effects that are more due to the queueing
> > of the X command stream than due to the scheduler - and that those
> > stalls are simply due to *that*.
> >
> > One thing to try is to run the X connection in synchronous mode, which
> > minimizes queueing issues. I don't know if gears has a flag to turn on
> > synchronous X messaging, though. Many X programs take the "[+-]sync" flag
> > to turn on synchronous mode, iirc.
> 
> I like your analysis, but how do you explain that these stalls vanish when 
> __update_curr is disabled?

When you disable __update_curr(), you're utterly destroying the
scheduler.  There may well be a scheduler connection, but disabling
__update_curr() doesn't tell you anything meaningful.  Basically, you're
letting all tasks run uninterrupted for just as long as they please
(which is why busy loops lock your box solid as a rock).  I'd suggest
gathering some sched_debug stats or something... shoot, _anything_ but
what you did :)

-Mike

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-27 Thread Al Boldi
Linus Torvalds wrote:
> On Tue, 28 Aug 2007, Al Boldi wrote:
> > No need for framebuffer.  All you need is X using the X.org vesa-driver.
> > Then start gears like this:
> >
> >   # gears & gears & gears &
> >
> > Then lay them out side by side to see the periodic stallings for ~10sec.
>
> I don't think this is a good test.
>
> Why?
>
> If you're not using direct rendering, what you have is the X server doing
> all the rendering, which in turn means that what you are testing is quite
> possibly not so much about the *kernel* scheduling, but about *X-server*
> scheduling!
>
> I'm sure the kernel scheduler has an impact, but what's more likely to be
> going on is that you're seeing effects that are indirect, and not
> necessarily at all even "good".
>
> For example, if the X server is the scheduling point, it's entirely
> possible that it ends up showing effects that are more due to the queueing
> of the X command stream than due to the scheduler - and that those
> stalls are simply due to *that*.
>
> One thing to try is to run the X connection in synchronous mode, which
> minimizes queueing issues. I don't know if gears has a flag to turn on
> synchronous X messaging, though. Many X programs take the "[+-]sync" flag
> to turn on synchronous mode, iirc.

I like your analysis, but how do you explain that these stalls vanish when 
__update_curr is disabled?


Thanks!

--
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-27 Thread Linus Torvalds


On Tue, 28 Aug 2007, Al Boldi wrote:
> 
> No need for framebuffer.  All you need is X using the X.org vesa-driver.  
> Then start gears like this:
> 
>   # gears & gears & gears &
> 
> Then lay them out side by side to see the periodic stallings for ~10sec.

I don't think this is a good test.

Why?

If you're not using direct rendering, what you have is the X server doing 
all the rendering, which in turn means that what you are testing is quite 
possibly not so much about the *kernel* scheduling, but about *X-server* 
scheduling!

I'm sure the kernel scheduler has an impact, but what's more likely to be 
going on is that you're seeing effects that are indirect, and not 
necessarily at all even "good".

For example, if the X server is the scheduling point, it's entirely 
possible that it ends up showing effects that are more due to the queueing 
of the X command stream than due to the scheduler - and that those 
stalls are simply due to *that*.

One thing to try is to run the X connection in synchronous mode, which 
minimizes queueing issues. I don't know if gears has a flag to turn on 
synchronous X messaging, though. Many X programs take the "[+-]sync" flag 
to turn on synchronous mode, iirc.
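
(For a client whose source is at hand, synchronous mode can also be forced
from code; a minimal sketch using plain Xlib -- XSynchronize() is standard
Xlib, everything else here is just an assumed toy client:)

#include <X11/Xlib.h>
#include <stdio.h>

int main(void)
{
	Display *dpy = XOpenDisplay(NULL);

	if (!dpy) {
		fprintf(stderr, "cannot open display\n");
		return 1;
	}
	/*
	 * Flush after every request and wait for the reply/error:
	 * this takes client-side queueing out of the picture, at a
	 * large throughput cost.
	 */
	XSynchronize(dpy, True);

	/* ... issue the drawing calls under test here ... */

	XCloseDisplay(dpy);
	return 0;
}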

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-27 Thread Al Boldi
Ingo Molnar wrote:
> * Al Boldi <[EMAIL PROTECTED]> wrote:
> > > Could you try the patch below instead, does this make 3x glxgears
> > > smooth again? (if yes, could you send me your Signed-off-by line as
> > > well.)
> >
> > The task-startup stalling is still there for ~10sec.
> >
> > Can you see the problem on your machine?
>
> nope (i have no framebuffer setup)

No need for framebuffer.  All you need is X using the X.org vesa-driver.  
Then start gears like this:

  # gears & gears & gears &

Then lay them out side by side to see the periodic stallings for ~10sec.

> - but i can see some chew-max
> latencies that occur when new tasks are started up. I _think_ it's
> probably the same problem as yours.

chew-max is great, but it's too accurate in that it exposes any scheduling 
glitches and as such hides the startup glitch within the many glitches it 
exposes.  For example, it fluctuates all over the place using this:

  # for ((i=0;i<9;i++)); do chew-max 60 > /dev/shm/chew$i.log & done

Also, chew-max locks-up when disabling __update_curr, which means that the 
workload of chew-max is different from either the ping-startup loop or the 
gears.  You really should try the gears test by any means, as the problem is 
really pronounced there.

> could you try the patch below (which is the combo patch of my current
> queue), ontop of head 50c46637aa? This makes chew-max behave better
> during task mass-startup here.

Still no improvement.


Thanks!

--
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-27 Thread Ingo Molnar

* Al Boldi <[EMAIL PROTECTED]> wrote:

> > Could you try the patch below instead, does this make 3x glxgears 
> > smooth again? (if yes, could you send me your Signed-off-by line as 
> > well.)
> 
> The task-startup stalling is still there for ~10sec.
> 
> Can you see the problem on your machine?

nope (i have no framebuffer setup) - but i can see some chew-max 
latencies that occur when new tasks are started up. I _think_ it's 
probably the same problem as yours.

could you try the patch below (which is the combo patch of my current 
queue), ontop of head 50c46637aa? This makes chew-max behave better 
during task mass-startup here.

Ingo

->
Index: linux/include/linux/sched.h
===
--- linux.orig/include/linux/sched.h
+++ linux/include/linux/sched.h
@@ -904,6 +904,7 @@ struct sched_entity {
 
u64 exec_start;
u64 sum_exec_runtime;
+   u64 prev_sum_exec_runtime;
u64 wait_start_fair;
u64 sleep_start_fair;
 
Index: linux/kernel/sched.c
===
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -1587,6 +1587,7 @@ static void __sched_fork(struct task_str
p->se.wait_start_fair   = 0;
p->se.exec_start= 0;
p->se.sum_exec_runtime  = 0;
+   p->se.prev_sum_exec_runtime = 0;
p->se.delta_exec= 0;
p->se.delta_fair_run= 0;
p->se.delta_fair_sleep  = 0;
Index: linux/kernel/sched_fair.c
===
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -82,12 +82,12 @@ enum {
 };
 
 unsigned int sysctl_sched_features __read_mostly =
-   SCHED_FEAT_FAIR_SLEEPERS*1 |
+   SCHED_FEAT_FAIR_SLEEPERS*0 |
SCHED_FEAT_SLEEPER_AVG  *0 |
SCHED_FEAT_SLEEPER_LOAD_AVG *1 |
SCHED_FEAT_PRECISE_CPU_LOAD *1 |
-   SCHED_FEAT_START_DEBIT  *1 |
-   SCHED_FEAT_SKIP_INITIAL *0;
+   SCHED_FEAT_START_DEBIT  *0 |
+   SCHED_FEAT_SKIP_INITIAL *1;
 
 extern struct sched_class fair_sched_class;
 
@@ -225,39 +225,15 @@ static struct sched_entity *__pick_next_
  * Calculate the preemption granularity needed to schedule every
  * runnable task once per sysctl_sched_latency amount of time.
  * (down to a sensible low limit on granularity)
- *
- * For example, if there are 2 tasks running and latency is 10 msecs,
- * we switch tasks every 5 msecs. If we have 3 tasks running, we have
- * to switch tasks every 3.33 msecs to get a 10 msecs observed latency
- * for each task. We do finer and finer scheduling up to until we
- * reach the minimum granularity value.
- *
- * To achieve this we use the following dynamic-granularity rule:
- *
- *gran = lat/nr - lat/nr/nr
- *
- * This comes out of the following equations:
- *
- *kA1 + gran = kB1
- *kB2 + gran = kA2
- *kA2 = kA1
- *kB2 = kB1 - d + d/nr
- *lat = d * nr
- *
- * Where 'k' is key, 'A' is task A (waiting), 'B' is task B (running),
- * '1' is start of time, '2' is end of time, 'd' is delay between
- * 1 and 2 (during which task B was running), 'nr' is number of tasks
- * running, 'lat' is the the period of each task. ('lat' is the
- * sched_latency that we aim for.)
  */
-static long
+static unsigned long
 sched_granularity(struct cfs_rq *cfs_rq)
 {
unsigned int gran = sysctl_sched_latency;
unsigned int nr = cfs_rq->nr_running;
 
if (nr > 1) {
-   gran = gran/nr - gran/nr/nr;
+   gran = gran/nr;
gran = max(gran, sysctl_sched_min_granularity);
}
 
@@ -489,6 +465,9 @@ update_stats_wait_end(struct cfs_rq *cfs
 {
unsigned long delta_fair;
 
+   if (unlikely(!se->wait_start_fair))
+   return;
+
delta_fair = (unsigned long)min((u64)(2*sysctl_sched_runtime_limit),
(u64)(cfs_rq->fair_clock - se->wait_start_fair));
 
@@ -668,7 +647,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, st
 /*
  * Preempt the current task with a newly woken task if needed:
  */
-static void
+static int
 __check_preempt_curr_fair(struct cfs_rq *cfs_rq, struct sched_entity *se,
  struct sched_entity *curr, unsigned long granularity)
 {
@@ -679,8 +658,11 @@ __check_preempt_curr_fair(struct cfs_rq 
 * preempt the current task unless the best task has
 * a larger than sched_granularity fairness advantage:
 */
-   if (__delta > niced_granularity(curr, granularity))
+   if (__delta > niced_granularity(curr, granularity)) {
resched_task(rq_of(cfs_rq)->curr);
+   return 

Re: CFS review

2007-08-27 Thread Al Boldi
Ingo Molnar wrote:
> * Al Boldi <[EMAIL PROTECTED]> wrote:
> > > could you send the exact patch that shows what you did?
> >
> > On 2.6.22.5-v20.3 (not v20.4):
> >
> > 340-curr->delta_exec += delta_exec;
> > 341-
> > 342-if (unlikely(curr->delta_exec > sysctl_sched_stat_granularity))
> > { 343://  __update_curr(cfs_rq, curr);
> > 344-curr->delta_exec = 0;
> > 345-}
> > 346-curr->exec_start = rq_of(cfs_rq)->clock;
>
> ouch - this produces a really broken scheduler - with this we dont do
> any run-time accounting (!).

Of course it's broken, and it's not meant as a fix, but this change allows 
you to see the amount of overhead as well as any miscalculations 
__update_curr incurs.

In terms of overhead, __update_curr incurs ~3x slowdown, and in terms of 
run-time accounting it exhibits a ~10sec task-startup miscalculation.

> Could you try the patch below instead, does this make 3x glxgears smooth
> again? (if yes, could you send me your Signed-off-by line as well.)

The task-startup stalling is still there for ~10sec.

Can you see the problem on your machine?


Thanks!

--
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-27 Thread Ingo Molnar

* Al Boldi <[EMAIL PROTECTED]> wrote:

> > could you send the exact patch that shows what you did?
> 
> On 2.6.22.5-v20.3 (not v20.4):
> 
> 340-curr->delta_exec += delta_exec;
> 341-
> 342-if (unlikely(curr->delta_exec > sysctl_sched_stat_granularity)) {
> 343://  __update_curr(cfs_rq, curr);
> 344-curr->delta_exec = 0;
> 345-}
> 346-curr->exec_start = rq_of(cfs_rq)->clock;

ouch - this produces a really broken scheduler - with this we dont do 
any run-time accounting (!).

Could you try the patch below instead, does this make 3x glxgears smooth 
again? (if yes, could you send me your Signed-off-by line as well.)

Ingo

>
Subject: sched: make the scheduler converge to the ideal latency
From: Ingo Molnar <[EMAIL PROTECTED]>

de-HZ-ification of the granularity defaults unearthed a pre-existing
property of CFS: while it correctly converges to the granularity goal,
it does not prevent run-time fluctuations in the range of [-gran ...
+gran].

With the increase of the granularity due to the removal of HZ
dependencies, this becomes visible in chew-max output (with 5 tasks
running):

 out:  28 . 27. 32 | flu:  0 .  0 | ran:9 .   13 | per:   37 .   40
 out:  27 . 27. 32 | flu:  0 .  0 | ran:   17 .   13 | per:   44 .   40
 out:  27 . 27. 32 | flu:  0 .  0 | ran:9 .   13 | per:   36 .   40
 out:  29 . 27. 32 | flu:  2 .  0 | ran:   17 .   13 | per:   46 .   40
 out:  28 . 27. 32 | flu:  0 .  0 | ran:9 .   13 | per:   37 .   40
 out:  29 . 27. 32 | flu:  0 .  0 | ran:   18 .   13 | per:   47 .   40
 out:  28 . 27. 32 | flu:  0 .  0 | ran:9 .   13 | per:   37 .   40

average slice is the ideal 13 msecs and the period is picture-perfect 40
msecs. But the 'ran' field fluctuates around 13.33 msecs and there's no
mechanism in CFS to keep that from happening: it's a perfectly valid
solution that CFS finds.

the solution is to add a granularity/preemption rule that knows about
the "target latency", which makes tasks that run longer than the ideal
latency run a bit less. The simplest approach is to simply decrease the
preemption granularity when a task overruns its ideal latency. For this
we have to track how much the task executed since its last preemption.

( this adds a new field to task_struct, but we can eliminate that
  overhead in 2.6.24 by putting all the scheduler timestamps into an
  anonymous union. )

with this change in place, chew-max output is fluctuation-less all
around:

 out:  28 . 27. 39 | flu:  0 .  2 | ran:   13 .   13 | per:   41 .   40
 out:  28 . 27. 39 | flu:  0 .  2 | ran:   13 .   13 | per:   41 .   40
 out:  28 . 27. 39 | flu:  0 .  2 | ran:   13 .   13 | per:   41 .   40
 out:  28 . 27. 39 | flu:  0 .  2 | ran:   13 .   13 | per:   41 .   40
 out:  28 . 27. 39 | flu:  0 .  1 | ran:   13 .   13 | per:   41 .   40
 out:  28 . 27. 39 | flu:  0 .  1 | ran:   13 .   13 | per:   41 .   40

this patch has no impact on any fastpath or on any globally observable
scheduling property. (unless you have sharp enough eyes to see
millisecond-level ruckles in glxgears smoothness :-)

Also, with this mechanism in place the formula for adaptive granularity
can be simplified down to the obvious "granularity = latency/nr_running"
calculation.

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>
---
 include/linux/sched.h |1 +
 kernel/sched_fair.c   |   43 ++-
 2 files changed, 15 insertions(+), 29 deletions(-)

Index: linux/include/linux/sched.h
===
--- linux.orig/include/linux/sched.h
+++ linux/include/linux/sched.h
@@ -904,6 +904,7 @@ struct sched_entity {
 
u64 exec_start;
u64 sum_exec_runtime;
+   u64 prev_sum_exec_runtime;
u64 wait_start_fair;
u64 sleep_start_fair;
 
Index: linux/kernel/sched_fair.c
===
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -225,30 +225,6 @@ static struct sched_entity *__pick_next_
  * Calculate the preemption granularity needed to schedule every
  * runnable task once per sysctl_sched_latency amount of time.
  * (down to a sensible low limit on granularity)
- *
- * For example, if there are 2 tasks running and latency is 10 msecs,
- * we switch tasks every 5 msecs. If we have 3 tasks running, we have
- * to switch tasks every 3.33 msecs to get a 10 msecs observed latency
- * for each task. We do finer and finer scheduling up to until we
- * reach the minimum granularity value.
- *
- * To achieve this we use the following dynamic-granularity rule:
- *
- *gran = lat/nr - lat/nr/nr
- *
- * This comes out of the following equations:
- *
- *kA1 + gran = kB1
- *kB2 + gran = kA2
- *kA2 = kA1
- *kB2 = kB1 - d + d/nr
- *   
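
(The diff above is cut off in the archive; the rule its changelog describes
boils down to something like the following standalone model -- the field
names follow the changelog, the actual kernel code differs in detail:)

#include <stdio.h>

typedef unsigned long long u64;

/*
 * Model of the new rule: if the current task has already run for more
 * than the ideal latency since it last got the CPU, its effective
 * preemption granularity drops to zero, so a waiting task can preempt
 * it right away.
 */
static u64 effective_granularity(u64 sum_exec_runtime,
				 u64 prev_sum_exec_runtime,
				 u64 ideal_latency, u64 granularity)
{
	u64 ran = sum_exec_runtime - prev_sum_exec_runtime;

	return ran > ideal_latency ? 0 : granularity;
}

int main(void)
{
	/* a 13.33 msec slice against a 13 msec ideal latency (in nsecs) */
	printf("%llu\n",
	       effective_granularity(13330000ULL, 0ULL,
				     13000000ULL, 2000000ULL));
	return 0;
}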

Re: CFS review

2007-08-26 Thread Al Boldi
Ingo Molnar wrote:
> * Al Boldi <[EMAIL PROTECTED]> wrote:
> > > and could you also check 20.4 on 2.6.22.5 perhaps, or very latest
> > > -git? (Peter has experienced smaller spikes with that.)
> >
> > Ok, I tried all your suggestions, but nothing works as smooth as
> > removing __update_curr.
>
> could you send the exact patch that shows what you did?

On 2.6.22.5-v20.3 (not v20.4):

340-curr->delta_exec += delta_exec;
341-
342-if (unlikely(curr->delta_exec > sysctl_sched_stat_granularity)) {
343://  __update_curr(cfs_rq, curr);
344-curr->delta_exec = 0;
345-}
346-curr->exec_start = rq_of(cfs_rq)->clock;

> And could you
> also please describe it exactly which aspect of the workload you call
> 'smooth'. Could it be made quantitative somehow?

The 3x gears test shows the startup problem in a really noticeable way.  With 
v20.4 they startup surging and stalling periodically for about 10sec, then 
they are smooth.  With v20.3 + above patch they startup completely smooth.


Thanks!

--
Al
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-26 Thread Ingo Molnar

* Al Boldi <[EMAIL PROTECTED]> wrote:

> > and could you also check 20.4 on 2.6.22.5 perhaps, or very latest 
> > -git? (Peter has experienced smaller spikes with that.)
> 
> Ok, I tried all your suggestions, but nothing works as smooth as 
> removing __update_curr.

could you send the exact patch that shows what you did? And could you 
also please describe it exactly which aspect of the workload you call 
'smooth'. Could it be made quantitative somehow?

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-26 Thread Al Boldi
Ingo Molnar wrote:
> * Al Boldi <[EMAIL PROTECTED]> wrote:
> > > ok. I think i might finally have found the bug causing this. Could
> > > you try the fix below, does your webserver thread-startup test work
> > > any better?
> >
> > It seems to help somewhat, but the problem is still visible.  Even
> > v20.3 on 2.6.22.5 didn't help.
> >
> > It does look related to ia-boosting, so I turned off __update_curr
> > like Roman mentioned, which had an enormous smoothing effect, but then
> > nice levels completely break down and lockup the system.
>
> you can turn sleeper-fairness off via:
>
>echo 28 > /proc/sys/kernel/sched_features
>
> another thing to try would be:
>
>echo 12 > /proc/sys/kernel/sched_features
>
> (that's the new-task penalty turned off.)
>
> Another thing to try would be to edit this:
>
> if (sysctl_sched_features & SCHED_FEAT_START_DEBIT)
> p->se.wait_runtime = -(sched_granularity(cfs_rq) / 2);
>
> to:
>
> if (sysctl_sched_features & SCHED_FEAT_START_DEBIT)
> p->se.wait_runtime = -(sched_granularity(cfs_rq));
>
> and could you also check 20.4 on 2.6.22.5 perhaps, or very latest -git?
> (Peter has experienced smaller spikes with that.)

Ok, I tried all your suggestions, but nothing works as smooth as removing 
__update_curr.

Does the problem show on your machine with the 3x gears under X-vesa test?


Thanks!

--
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-25 Thread Ingo Molnar

* Al Boldi <[EMAIL PROTECTED]> wrote:

> > ok. I think i might finally have found the bug causing this. Could 
> > you try the fix below, does your webserver thread-startup test work 
> > any better?
> 
> It seems to help somewhat, but the problem is still visible.  Even 
> v20.3 on 2.6.22.5 didn't help.
> 
> It does look related to ia-boosting, so I turned off __update_curr 
> like Roman mentioned, which had an enormous smoothing effect, but then 
> nice levels completely break down and lockup the system.

you can turn sleeper-fairness off via:

   echo 28 > /proc/sys/kernel/sched_features

another thing to try would be:

   echo 12 > /proc/sys/kernel/sched_features

(that's the new-task penalty turned off.)
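
(Where the 28 and 12 come from: assuming the feature bits line up with the
sysctl_sched_features default shown in the combo patch elsewhere in this
thread -- treat the exact numeric values as an inference -- a small
standalone check:)

#include <stdio.h>

enum {
	SCHED_FEAT_FAIR_SLEEPERS	= 1,
	SCHED_FEAT_SLEEPER_AVG		= 2,
	SCHED_FEAT_SLEEPER_LOAD_AVG	= 4,
	SCHED_FEAT_PRECISE_CPU_LOAD	= 8,
	SCHED_FEAT_START_DEBIT		= 16,
	SCHED_FEAT_SKIP_INITIAL		= 32,
};

int main(void)
{
	unsigned int def = SCHED_FEAT_FAIR_SLEEPERS |
			   SCHED_FEAT_SLEEPER_LOAD_AVG |
			   SCHED_FEAT_PRECISE_CPU_LOAD |
			   SCHED_FEAT_START_DEBIT;	/* default: 29 */

	/* 28 clears FAIR_SLEEPERS (sleeper fairness off);
	 * 12 additionally clears START_DEBIT (new-task penalty off) */
	printf("%u %u %u\n", def,
	       def & ~SCHED_FEAT_FAIR_SLEEPERS,
	       def & ~(SCHED_FEAT_FAIR_SLEEPERS | SCHED_FEAT_START_DEBIT));
	return 0;
}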

Another thing to try would be to edit this:

if (sysctl_sched_features & SCHED_FEAT_START_DEBIT)
p->se.wait_runtime = -(sched_granularity(cfs_rq) / 2);

to:

if (sysctl_sched_features & SCHED_FEAT_START_DEBIT)
p->se.wait_runtime = -(sched_granularity(cfs_rq));

and could you also check 20.4 on 2.6.22.5 perhaps, or very latest -git? 
(Peter has experienced smaller spikes with that.)

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-25 Thread Al Boldi
Ingo Molnar wrote:
> * Al Boldi <[EMAIL PROTECTED]> wrote:
> > > > The problem is that consecutive runs don't give consistent results
> > > > and sometimes stalls.  You may want to try that.
> > >
> > > well, there's a natural saturation point after a few hundred tasks
> > > (depending on your CPU's speed), at which point there's no idle time
> > > left. From that point on things get slower progressively (and the
> > > ability of the shell to start new ping tasks is impacted as well),
> > > but that's expected on an overloaded system, isnt it?
> >
> > Of course, things should get slower with higher load, but it should be
> > consistent without stalls.
> >
> > To see this problem, make sure you boot into /bin/sh with the normal
> > VGA console (ie. not fb-console).  Then try each loop a few times to
> > show different behaviour; loops like:
> >
> > # for ((i=0; i<; i++)); do ping 10.1 -A > /dev/null & done
> >
> > # for ((i=0; i<; i++)); do nice -99 ping 10.1 -A > /dev/null & done
> >
> > # { for ((i=0; i<; i++)); do
> > ping 10.1 -A > /dev/null &
> > done } > /dev/null 2>&1
> >
> > Especially the last one sometimes causes a complete console lock-up,
> > while the other two sometimes stall then surge periodically.
>
> ok. I think i might finally have found the bug causing this. Could you
> try the fix below, does your webserver thread-startup test work any
> better?

It seems to help somewhat, but the problem is still visible.  Even v20.3 on 
2.6.22.5 didn't help.

It does look related to ia-boosting, so I turned off __update_curr like Roman 
mentioned, which had an enormous smoothing effect, but then nice levels 
completely break down and lockup the system.

There is another way to show the problem visually under X (vesa-driver), by 
starting 3 gears simultaneously, which after laying them out side-by-side 
need some settling time before smoothing out.  Without __update_curr it's 
absolutely smooth from the start.


Thanks!

--
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-24 Thread Ingo Molnar

* Al Boldi <[EMAIL PROTECTED]> wrote:

> > > The problem is that consecutive runs don't give consistent results 
> > > and sometimes stalls.  You may want to try that.
> >
> > well, there's a natural saturation point after a few hundred tasks 
> > (depending on your CPU's speed), at which point there's no idle time 
> > left. From that point on things get slower progressively (and the 
> > ability of the shell to start new ping tasks is impacted as well), 
> > but that's expected on an overloaded system, isnt it?
> 
> Of course, things should get slower with higher load, but it should be 
> consistent without stalls.
> 
> To see this problem, make sure you boot into /bin/sh with the normal 
> VGA console (ie. not fb-console).  Then try each loop a few times to 
> show different behaviour; loops like:
> 
> # for ((i=0; i<; i++)); do ping 10.1 -A > /dev/null & done
> 
> # for ((i=0; i<; i++)); do nice -99 ping 10.1 -A > /dev/null & done
> 
> # { for ((i=0; i<; i++)); do
> ping 10.1 -A > /dev/null &
> done } > /dev/null 2>&1
> 
> Especially the last one sometimes causes a complete console lock-up, 
> while the other two sometimes stall then surge periodically.

ok. I think i might finally have found the bug causing this. Could you 
try the fix below, does your webserver thread-startup test work any 
better?

Ingo

--->
Subject: sched: fix startup penalty calculation
From: Ingo Molnar <[EMAIL PROTECTED]>

fix task startup penalty miscalculation: sysctl_sched_granularity is
unsigned int and wait_runtime is long so we first have to convert it
to long before turning it negative ...

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 kernel/sched_fair.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/kernel/sched_fair.c
===
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -1048,7 +1048,7 @@ static void task_new_fair(struct rq *rq,
 * -granularity/2, so initialize the task with that:
 */
if (sysctl_sched_features & SCHED_FEAT_START_DEBIT)
-   p->se.wait_runtime = -(sysctl_sched_granularity / 2);
+   p->se.wait_runtime = -((long)sysctl_sched_granularity / 2);
 
__enqueue_entity(cfs_rq, se);
 }
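
(The promotion pitfall the changelog describes is easy to reproduce in
isolation; a standalone demo with a made-up granularity value, assuming a
64-bit long:)

#include <stdio.h>

int main(void)
{
	unsigned int gran = 2000000;	/* say, 2 msec in nsecs */
	long wait_runtime;

	/* the negation happens in unsigned arithmetic, so the value
	 * stored in the long is huge and positive instead of negative: */
	wait_runtime = -(gran / 2);
	printf("buggy: %ld\n", wait_runtime);

	/* convert to long first, then negate: */
	wait_runtime = -((long)gran / 2);
	printf("fixed: %ld\n", wait_runtime);

	return 0;
}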
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-21 Thread Al Boldi
Ingo Molnar wrote:
> * Al Boldi <[EMAIL PROTECTED]> wrote:
> > There is one workload that still isn't performing well; it's a
> > web-server workload that spawns 1K+ client procs.  It can be emulated
> > by using this:
> >
> >   for i in `seq 1 to `; do ping 10.1 -A > /dev/null & done
>
> on bash i did this as:
>
>   for ((i=0; i<; i++)); do ping 10.1 -A > /dev/null & done
>
> and this quickly creates a monster-runqueue with tons of ping tasks
> pending. (i replaced 10.1 with the IP of another box on the same LAN as
> the testbox) Is this what should happen?

Yes, sometimes they start pending and sometimes they run immediately.

> > The problem is that consecutive runs don't give consistent results and
> > sometimes stalls.  You may want to try that.
>
> well, there's a natural saturation point after a few hundred tasks
> (depending on your CPU's speed), at which point there's no idle time
> left. From that point on things get slower progressively (and the
> ability of the shell to start new ping tasks is impacted as well), but
> that's expected on an overloaded system, isnt it?

Of course, things should get slower with higher load, but it should be 
consistent without stalls.

To see this problem, make sure you boot into /bin/sh with the normal VGA 
console (ie. not fb-console).  Then try each loop a few times to show 
different behaviour; loops like:

# for ((i=0; i<; i++)); do ping 10.1 -A > /dev/null & done

# for ((i=0; i<; i++)); do nice -99 ping 10.1 -A > /dev/null & done

# { for ((i=0; i<; i++)); do
ping 10.1 -A > /dev/null &
done } > /dev/null 2>&1

Especially the last one sometimes causes a complete console lock-up, while 
the other two sometimes stall then surge periodically.

BTW, I am also wondering how one might test threading behaviour wrt to 
startup and sync-on-exit with parent thread.  This may not show any problems 
with small number of threads, but how does it scale with 1K+?
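
(One way to exercise that -- a hypothetical sketch, not something posted in
the thread -- is to time creating and joining N threads that exit
immediately:)

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

static void *worker(void *arg)
{
	return arg;	/* start up, then sync with the parent via join */
}

int main(int argc, char **argv)
{
	int i, n = argc > 1 ? atoi(argv[1]) : 1024;
	pthread_t *tid = malloc(n * sizeof(*tid));
	struct timeval t0, t1;

	if (!tid)
		return 1;
	gettimeofday(&t0, NULL);
	for (i = 0; i < n; i++)
		if (pthread_create(&tid[i], NULL, worker, NULL)) {
			perror("pthread_create");
			return 1;
		}
	for (i = 0; i < n; i++)
		pthread_join(tid[i], NULL);
	gettimeofday(&t1, NULL);

	printf("%d threads: %.3f msec\n", n,
	       (t1.tv_sec - t0.tv_sec) * 1e3 +
	       (t1.tv_usec - t0.tv_usec) / 1e3);
	free(tid);
	return 0;
}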


Thanks!

--
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-21 Thread Roman Zippel
Hi,

On Tue, 21 Aug 2007, Mike Galbraith wrote:

> I thought this was history.  With your config, I was finally able to
> reproduce the anomaly (only with your proggy though), and Ingo's patch
> does indeed fix it here.
> 
> Freshly reproduced anomaly and patch verification, running 2.6.23-rc3
> with your config, both with and without Ingo's patch reverted:

I did update to 2.6.23-rc3-git1 first, but I ended up reverting the patch, 
as I didn't notice it had been applied already. Sorry about that.
With this patch the underflows are gone, but there are still the 
overflows, so the questions from the last mail still remain.

bye, Roman
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-21 Thread Ingo Molnar

* Al Boldi <[EMAIL PROTECTED]> wrote:

> There is one workload that still isn't performing well; it's a 
> web-server workload that spawns 1K+ client procs.  It can be emulated 
> by using this:
> 
>   for i in `seq 1 to `; do ping 10.1 -A > /dev/null & done

on bash i did this as:

  for ((i=0; i<; i++)); do ping 10.1 -A > /dev/null & done

and this quickly creates a monster-runqueue with tons of ping tasks 
pending. (i replaced 10.1 with the IP of another box on the same LAN as 
the testbox) Is this what should happen?

> The problem is that consecutive runs don't give consistent results and 
> sometimes stalls.  You may want to try that.

well, there's a natural saturation point after a few hundred tasks 
(depending on your CPU's speed), at which point there's no idle time 
left. From that point on things get slower progressively (and the 
ability of the shell to start new ping tasks is impacted as well), but 
that's expected on an overloaded system, isnt it?

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-21 Thread Ingo Molnar

* Mike Galbraith <[EMAIL PROTECTED]> wrote:

> > It doesn't make much of a difference.
> 
> I thought this was history.  With your config, I was finally able to 
> reproduce the anomaly (only with your proggy though), and Ingo's patch 
> does indeed fix it here.
> 
> Freshly reproduced anomaly and patch verification, running 2.6.23-rc3 
> with your config, both with and without Ingo's patch reverted:
> 
> 6561 root  20   0  1696  492  404 S 32.0  0.0   0:30.83 0 lt
> 6562 root  20   0  1696  336  248 R 32.0  0.0   0:30.79 0 lt
> 6563 root  20   0  1696  336  248 R 32.0  0.0   0:30.80 0 lt
> 6564 root  20   0  2888 1236 1028 R  4.6  0.1   0:05.26 0 sh
> 
> 6507 root  20   0  2888 1236 1028 R 25.8  0.1   0:30.75 0 sh
> 6504 root  20   0  1696  492  404 R 24.4  0.0   0:29.26 0 lt
> 6505 root  20   0  1696  336  248 R 24.4  0.0   0:29.26 0 lt
> 6506 root  20   0  1696  336  248 R 24.4  0.0   0:29.25 0 lt

oh, great! I'm glad we didnt discard this as a pure sched_clock 
resolution artifact.

Roman, a quick & easy request: please send the usual cfs-debug-info.sh 
output captured while your testcase is running. (Preferably try .23-rc3 
or later as Mike did, which has the most recent scheduler code, it 
includes the patch i sent to you already.) I'll reply to your 
sleeper-fairness questions separately, but in any case we need to figure 
out what's happening on your box - if you can still reproduce it with 
.23-rc3. Thanks,

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-21 Thread Mike Galbraith
On Tue, 2007-08-21 at 00:19 +0200, Roman Zippel wrote: 
> Hi,
> 
> On Sat, 11 Aug 2007, Ingo Molnar wrote:
> 
> > the only relevant thing that comes to mind at the moment is that last 
> > week Peter noticed a buggy aspect of sleeper bonuses (in that we do not 
> > rate-limit their output, hence we 'waste' them instead of redistributing 
> > them), and i've got the small patch below in my queue to fix that - 
> > could you give it a try?
> 
> It doesn't make much of a difference.

I thought this was history.  With your config, I was finally able to
reproduce the anomaly (only with your proggy though), and Ingo's patch
does indeed fix it here.

Freshly reproduced anomaly and patch verification, running 2.6.23-rc3
with your config, both with and without Ingo's patch reverted:

6561 root  20   0  1696  492  404 S 32.0  0.0   0:30.83 0 lt
6562 root  20   0  1696  336  248 R 32.0  0.0   0:30.79 0 lt
6563 root  20   0  1696  336  248 R 32.0  0.0   0:30.80 0 lt
6564 root  20   0  2888 1236 1028 R  4.6  0.1   0:05.26 0 sh

6507 root  20   0  2888 1236 1028 R 25.8  0.1   0:30.75 0 sh
6504 root  20   0  1696  492  404 R 24.4  0.0   0:29.26 0 lt
6505 root  20   0  1696  336  248 R 24.4  0.0   0:29.26 0 lt
6506 root  20   0  1696  336  248 R 24.4  0.0   0:29.25 0 lt

-Mike

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-21 Thread Roman Zippel
Hi,

On Tue, 21 Aug 2007, Mike Galbraith wrote:

> I thought this was history.  With your config, I was finally able to
> reproduce the anomaly (only with your proggy though), and Ingo's patch
> does indeed fix it here.
> 
> Freshly reproduced anomaly and patch verification, running 2.6.23-rc3
> with your config, both with and without Ingo's patch reverted:

I did update to 2.6.23-rc3-git1 first, but I ended up reverting the patch, 
as I didn't notice it had been applied already. Sorry about that.
With this patch the underflows are gone, but there are still the 
overflows, so the questions from the last mail still remain.
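
If it helps to correlate: with CONFIG_SCHED_DEBUG enabled, the per-task 
counters can be watched directly while the test runs (the 'lt' name and 
the exact field names exported in /proc/<pid>/sched are assumptions; the 
grep simply matches anything overflow-related):

  for p in $(pidof lt); do
          echo "== $p"
          grep -i 'overrun\|underrun\|wait_runtime' /proc/$p/sched
  done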

bye, Roman
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-21 Thread Al Boldi
Ingo Molnar wrote:
> * Al Boldi <[EMAIL PROTECTED]> wrote:
> > There is one workload that still isn't performing well; it's a
> > web-server workload that spawns 1K+ client procs.  It can be emulated
> > by using this:
> >
> >   for i in `seq 1 to `; do ping 10.1 -A > /dev/null & done
>
> on bash i did this as:
>
>    for ((i=0; i; i++)); do ping 10.1 -A > /dev/null & done
>
> and this quickly creates a monster-runqueue with tons of ping tasks
> pending. (i replaced 10.1 with the IP of another box on the same LAN as
> the testbox) Is this what should happen?

Yes, sometimes they start pending and sometimes they run immediately.

> > The problem is that consecutive runs don't give consistent results and
> > sometimes stalls.  You may want to try that.
>
> well, there's a natural saturation point after a few hundred tasks
> (depending on your CPU's speed), at which point there's no idle time
> left. From that point on things get slower progressively (and the
> ability of the shell to start new ping tasks is impacted as well), but
> that's expected on an overloaded system, isn't it?

Of course, things should get slower with higher load, but it should be 
consistent without stalls.

To see this problem, make sure you boot into /bin/sh with the normal VGA 
console (i.e. not fb-console).  Then try each loop a few times to show 
the different behaviour; loops like:

# for ((i=0; i; i++)); do ping 10.1 -A > /dev/null & done

# for ((i=0; i; i++)); do nice -99 ping 10.1 -A > /dev/null & done

# { for ((i=0; i; i++)); do
ping 10.1 -A > /dev/null &
done } > /dev/null 2>&1

Especially the last one sometimes causes a complete console lock-up, while 
the other two sometimes stall and then surge periodically.
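
For completeness, a self-contained version of the same reproducer (the 
count of 1000 is only an assumption based on the "1K+ client procs" 
figure above; replace 10.1 with the IP of a reachable box on the local 
LAN):

  #!/bin/bash
  # spawn ~1000 adaptive pings in the background, discarding their output
  for ((i = 0; i < 1000; i++)); do
          ping 10.1 -A > /dev/null &
  done
  sleep 60        # let the load build up and watch the console
  killall ping    # clean up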

BTW, I am also wondering how one might test threading behaviour wrt 
startup and sync-on-exit with the parent thread.  This may not show any 
problems with a small number of threads, but how does it scale to 1K+?
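
One way to probe that is a toy program (purely a sketch, nothing from 
this thread) that spawns N threads doing a trivial amount of work, joins 
them all in the parent, and reports the elapsed time for, say, N=10 
versus N=1000:

  /* thread-scale.c -- time creating and joining N threads */
  #include <pthread.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/time.h>

  static void *worker(void *arg)
  {
          volatile long x = 0;
          long i;

          /* trivial work, so mostly startup/exit cost is measured */
          for (i = 0; i < 100000; i++)
                  x += i;
          return NULL;
  }

  int main(int argc, char **argv)
  {
          int i, n = argc > 1 ? atoi(argv[1]) : 1000;
          pthread_t *tid = malloc(n * sizeof(*tid));
          struct timeval t0, t1;

          gettimeofday(&t0, NULL);
          for (i = 0; i < n; i++)
                  pthread_create(&tid[i], NULL, worker, NULL);
          for (i = 0; i < n; i++)
                  pthread_join(tid[i], NULL);
          gettimeofday(&t1, NULL);

          printf("%d threads: %.3f s\n", n,
                 (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6);
          free(tid);
          return 0;
  }

Built and run e.g. as:
gcc -O2 -o thread-scale thread-scale.c -lpthread && ./thread-scale 1000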


Thanks!

--
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: CFS review

2007-08-20 Thread Roman Zippel
Hi,

On Sat, 11 Aug 2007, Ingo Molnar wrote:

> the only relevant thing that comes to mind at the moment is that last 
> week Peter noticed a buggy aspect of sleeper bonuses (in that we do not 
> rate-limit their output, hence we 'waste' them instead of redistributing 
> them), and i've got the small patch below in my queue to fix that - 
> could you give it a try?

It doesn't make much of a difference. OTOH if I disable the sleeper code 
completely in __update_curr(), I get this:

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3139 roman 20   0  1796  344  256 R 21.7  0.3   0:02.68 lt
 3138 roman 20   0  1796  344  256 R 21.7  0.3   0:02.68 lt
 3137 roman 20   0  1796  520  432 R 21.7  0.4   0:02.68 lt
 3136 roman 20   0  1532  268  216 R 34.5  0.2   0:06.82 l

Disabling this code completely via sched_features makes only a minor 
difference:

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3139 roman 20   0  1796  344  256 R 20.4  0.3   0:09.94 lt
 3138 roman 20   0  1796  344  256 R 20.4  0.3   0:09.94 lt
 3137 roman 20   0  1796  520  432 R 20.4  0.4   0:09.94 lt
 3136 roman 20   0  1532  268  216 R 39.1  0.2   0:19.20 l

> this is just a blind stab into the dark - i couldn't see any real impact 
> from that patch in various workloads (and it's not upstream yet), so it 
> might not make a big difference.

Can we please skip to the point where you try to explain the intention a 
little more?
If I had to guess that this is supposed to keep the runtime balance, then 
it would be better to use wait_runtime to adjust fair_clock, from where it 
would be evenly distributed to all tasks (but this would have to be done 
during enqueue and dequeue). OTOH this would then also have consequences 
for the wait queue, as fair_clock is used to calculate fair_key.
IMHO the current wait_runtime should have some influence on calculating 
the sleep bonus, so that wait_runtime doesn't constantly overflow for 
tasks which only run occasionally.
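
To make that idea concrete, here is a toy user-space sketch (NOT kernel 
code; only the names fair_clock and wait_runtime are borrowed from the 
.23-rc CFS, everything else is invented for illustration): on dequeue, 
fold the task's leftover wait_runtime back into the queue clock so the 
credit is shared by the tasks that stay runnable instead of accumulating 
per task.

  /* toy-redistribute.c -- illustration only, not kernel code */
  #include <stdio.h>

  struct cfs_rq_toy {
          long long fair_clock;           /* per-queue virtual clock */
          unsigned int nr_running;
  };

  struct se_toy {
          long long wait_runtime;         /* credit this task is owed */
  };

  /* Hand the unused credit back to the remaining tasks by advancing the
   * queue clock a proportional amount, instead of letting the per-task
   * value grow (and eventually overflow). */
  static void dequeue_and_redistribute(struct cfs_rq_toy *rq,
                                       struct se_toy *se)
  {
          if (rq->nr_running > 1 && se->wait_runtime > 0)
                  rq->fair_clock += se->wait_runtime / (rq->nr_running - 1);
          se->wait_runtime = 0;
          rq->nr_running--;
  }

  int main(void)
  {
          struct cfs_rq_toy rq = { 1000, 4 };
          struct se_toy sleeper = { 300 };

          dequeue_and_redistribute(&rq, &sleeper);
          printf("fair_clock=%lld wait_runtime=%lld nr_running=%u\n",
                 rq.fair_clock, sleeper.wait_runtime, rq.nr_running);
          return 0;
  }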

bye, Roman
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

