Re: Bug in calcru in the 6.2 and 6.3 kernels

2008-07-20 Thread Garrett Cooper
On Sun, Jul 20, 2008 at 7:13 AM, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> Murty, Ravi wrote:
>>
>> Jeremy, thanks. I look forward to switching to ULE in 7.0 and realize
>> that it is a completely new scheduler (I spent some time yesterday
>> looking at it) -- which is why my porting effort is much harder than a
>> simple cut and paste. I just wanted to find out if there was something
>> simple I could look at before I spent weeks porting my changes to the
>> scheduler (also I can justify the move to 7.x). I can't figure out why
>> my 8 app threads run so slowly -- I am booting the kernel in single-user
>> mode with not much else running and my threads do a lot of work and
>> don't really sleep.
>
> Once again, ULE in 6.x is too broken to use, which is why major changes were
> required to get it to a suitable state in 7.0.  It's a shame that you didn't
> read about this before putting in so much work.
>
> Kris

Perhaps an $(error ) (don't know the pmake analog) should be put into
the kernel Makefile to note that ULE is busted?
-Garrett
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Bug in calcru in the 6.2 and 6.3 kernels

2008-07-20 Thread Kris Kennaway

Murty, Ravi wrote:

Jeremy, thanks. I look forward to switching to ULE in 7.0 and realize
that it is a completely new scheduler (I spent some time yesterday
looking at it) -- which is why my porting effort is much harder than a
simple cut and paste. I just wanted to find out if there was something
simple I could look at before I spent weeks porting my changes to the
scheduler (also I can justify the move to 7.x). I can't figure out why
my 8 app threads run so slowly -- I am booting the kernel in single-user
mode with not much else running and my threads do a lot of work and
don't really sleep.


Once again, ULE in 6.x is too broken to use, which is why major changes 
were required to get it to a suitable state in 7.0.  It's a shame that 
you didn't read about this before putting in so much work.


Kris



RE: Bug in calcru in the 6.2 and 6.3 kernels

2008-07-20 Thread Murty, Ravi
Jeremy, thanks. I look forward to switching to ULE in 7.0 and realize
that it is a completely new scheduler (I spent some time yesterday
looking at it) -- which is why my porting effort is much harder than a
simple cut and paste. I just wanted to find out if there was something
simple I could look at before I spent weeks porting my changes to the
scheduler (also I can justify the move to 7.x). I can't figure out why
my 8 app threads run so slowly -- I am booting the kernel in single-user
mode with not much else running and my threads do a lot of work and
don't really sleep.

Thanks
Ravi


-----Original Message-----
From: Jeremy Chadwick [mailto:[EMAIL PROTECTED] 
Sent: Sunday, July 20, 2008 6:59 AM
To: Murty, Ravi
Cc: Kris Kennaway; [EMAIL PROTECTED]; freebsd-hackers@freebsd.org
Subject: Re: Bug in calcru in the 6.2 and 6.3 kernels

On Sun, Jul 20, 2008 at 06:51:22AM -0700, Murty, Ravi wrote:
> Has anyone identified the issue(s) that might be broken in the ULE
> scheduler in 6.2? I am running a rather simple test - creates 8 threads
> and runs it on an 8 CPU system (not a whole lot running on the system).
> When I run it with ULE, it runs slow, very slow sometimes - it's almost
> like the threads aren't picked to run. When I switch to 4BSD, things run
> fine. I was wondering if there is something I could look at? I realize
> it is broken, but I've added lots of stuff to the scheduler (for our
> project) which I'd have to migrate to ULE in 7.0. I'd like to figure out
> what might be going on in 6.2 before I spend the time to migrate to 7.0.

ULE in 7.0 is not the same as in 6.2 -- it was entirely rewritten
before 7.0 was released.  The ULE scheduler in 7.0 is often called "ULE
2.0", to signify that it's not the same ULE scheduler as in previous
FreeBSD releases.

See "New Scheduler: ULE 2.0 / 3.0" here:

http://ivoras.sharanet.org/freebsd/freebsd7.html

Technical details from the author:

http://jeffr-tech.livejournal.com/3729.html

The reason the ULE scheduler in 7.0 is not the default scheduler is
because the community felt more testing needed to be done.  I believe
the plan is to have ULE as the default scheduler in 7.1.

You should really be running 4BSD on 6.x, and ULE on 7.x (unless you
have reason to run 4BSD on 7.x -- and some people do.  And no, I don't
know the reasons why).

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



Re: Bug in calcru in the 6.2 and 6.3 kernels

2008-07-20 Thread Jeremy Chadwick
On Sun, Jul 20, 2008 at 06:51:22AM -0700, Murty, Ravi wrote:
> Has anyone identified the issue(s) that might be broken in the ULE
> scheduler in 6.2? I am running a rather simple test - creates 8 threads
> and runs it on an 8 CPU system (not a whole lot running on the system).
> When I run it with ULE, it runs slow, very slow sometimes - it's almost
> like the threads aren't picked to run. When I switch to 4BSD, things run
> fine. I was wondering if there is something I could look at? I realize
> it is broken, but I've added lots of stuff to the scheduler (for our
> project) which I'd have to migrate to ULE in 7.0. I'd like to figure out
> what might be going on in 6.2 before I spend the time to migrate to 7.0.

ULE in 7.0 is not the same as in 6.2 -- it was entirely rewritten
before 7.0 was released.  The ULE scheduler in 7.0 is often called "ULE
2.0", to signify that it's not the same ULE scheduler as in previous
FreeBSD releases.

See "New Scheduler: ULE 2.0 / 3.0" here:

http://ivoras.sharanet.org/freebsd/freebsd7.html

Technical details from the author:

http://jeffr-tech.livejournal.com/3729.html

The reason the ULE scheduler in 7.0 is not the default scheduler is
because the community felt more testing needed to be done.  I believe
the plan is to have ULE as the default scheduler in 7.1.

You should really be running 4BSD on 6.x, and ULE on 7.x (unless you
have reason to run 4BSD on 7.x -- and some people do.  And no, I don't
know the reasons why).

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



RE: Bug in calcru in the 6.2 and 6.3 kernels

2008-07-20 Thread Murty, Ravi
Has anyone identified the issue(s) that might be broken in the ULE
scheduler in 6.2? I am running a rather simple test - creates 8 threads
and runs it on an 8 CPU system (not a whole lot running on the system).
When I run it with ULE, it runs slow, very slow sometimes - it's almost
like the threads aren't picked to run. When I switch to 4BSD, things run
fine. I was wondering if there is something I could look at? I realize
it is broken, but I've added lots of stuff to the scheduler (for our
project) which I'd have to migrate to ULE in 7.0. I'd like to figure out
what might be going on in 6.2 before I spend the time to migrate to 7.0.

Thanks
Ravi
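
For reference, a test of the kind described above might look roughly
like the sketch below. The actual test program was not posted, so this
is only a guess at its shape: the names run_busy_threads(), spin(), and
struct worker are hypothetical, as is the fixed-iteration busy loop. It
assumes POSIX threads; timing would be done externally (for example
with time(1)) under each scheduler.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 64          /* arbitrary cap for this sketch */

/* Hypothetical per-thread bookkeeping; not from the original test. */
struct worker {
	unsigned long iters;    /* iterations requested */
	unsigned long done;     /* iterations actually completed */
};

/* CPU-bound loop that never sleeps, like the threads described. */
static void *
spin(void *arg)
{
	struct worker *w = arg;
	volatile unsigned long acc = 0; /* volatile keeps the loop alive */
	unsigned long i;

	for (i = 0; i < w->iters; i++) {
		acc += i;
		w->done++;
	}
	return (NULL);
}

/*
 * Start nthreads busy threads, wait for all of them, and return the
 * total number of iterations completed, or -1 on error.
 */
static long
run_busy_threads(int nthreads, unsigned long iters)
{
	pthread_t tids[MAX_THREADS];
	struct worker workers[MAX_THREADS];
	unsigned long total = 0;
	int i, started = 0;

	if (nthreads < 1 || nthreads > MAX_THREADS)
		return (-1);
	for (i = 0; i < nthreads; i++) {
		workers[i].iters = iters;
		workers[i].done = 0;
		if (pthread_create(&tids[i], NULL, spin, &workers[i]) != 0)
			break;
		started++;
	}
	/* Join whatever was started, even if creation failed part way. */
	for (i = 0; i < started; i++) {
		pthread_join(tids[i], NULL);
		total += workers[i].done;
	}
	return (started == nthreads ? (long)total : -1);
}
```

Calling run_busy_threads(8, N) with a large N from main() on an 8 CPU
box, once under SCHED_4BSD and once under SCHED_ULE, would give a rough
wall-clock comparison of the two schedulers for this workload.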


-----Original Message-----
From: Kris Kennaway [mailto:[EMAIL PROTECTED] 
Sent: Monday, July 07, 2008 2:04 PM
To: [EMAIL PROTECTED]
Cc: Murty, Ravi; freebsd-hackers@freebsd.org
Subject: Re: Bug in calcru in the 6.2 and 6.3 kernels

Xin LI wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Kris Kennaway wrote:
> | Murty, Ravi wrote:
> |> Hello everyone,
> |>
> |> Finally found what my last problem was. We were running top in a loop
> |> and running some workloads that called sched_bind() to bind threads to
> |> specific CPUs. The problem was that (and I am using ULE) sched_bind
> |> calls a function to notify another CPU of a thread and then mi_switches
> |> out of it. Since mi_switch sets the "oncpu" field of the thread to NOCPU
> |> and given the thread is still running, calcru would come in and assert
> |> the fact that "If I am running I better not be on NOCPU". It appears
> |> that in other parts of the kernel (e.g. forward_signal) this is
> |> acceptable (i.e. it is okay to be running and oncpu is NOCPU).
> |>
> |> Thanks
> |> Ravi
> |
> | Don't use ULE in 6.x, it's broken and will not be fixed.
> 
> Perhaps we should mark it as broken using #error?  After all, the ULE
> changes in 7.x are amazing and we do not want users to get a bad
> impression from the 6.x versions...
> 
> I am not sure, but an explicit warning message saying "ULE has been
> revamped in FreeBSD 7.x+ and will not be MFC'ed back to 6.x, please use
> SCHED_4BSD or upgrade to 7.x." seems better than making them
> peruse the mailing list archive...

I would agree with this; if you're happy running unstable and broken 
scheduler code, you're surely able to update to 7.0 and run stable and 
working scheduler code :)

We should run it past re@ first since it's a change to a stable branch, 
but it's experimental code so I don't see an issue.

Kris


Re: Bug in calcru in the 6.2 and 6.3 kernels

2008-07-07 Thread Kris Kennaway

Xin LI wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Kris Kennaway wrote:
| Murty, Ravi wrote:
|> Hello everyone,
|>
|>
|>
|> Finally found what my last problem was. We were running top in a loop
|> and running some workloads that called sched_bind() to bind threads to
|> specific CPUs. The problem was that (and I am using ULE) sched_bind
|> calls a function to notify another CPU of a thread and then mi_switches
|> out of it. Since mi_switch sets the "oncpu" field of the thread to NOCPU
|> and given the thread is still running, calcru would come in and assert
|> the fact that "If I am running I better not be on NOCPU". It appears
|> that in other parts of the kernel (e.g. forward_signal) this is
|> acceptable (i.e. it is okay to be running and oncpu is NOCPU).
|>
|>
|> Thanks
|> Ravi
|
| Don't use ULE in 6.x, it's broken and will not be fixed.

Perhaps we should mark it as broken using #error?  After all, the ULE
changes in 7.x are amazing and we do not want users to get a bad
impression from the 6.x versions...

I am not sure, but an explicit warning message saying "ULE has been
revamped in FreeBSD 7.x+ and will not be MFC'ed back to 6.x, please use
SCHED_4BSD or upgrade to 7.x." seems better than making them
peruse the mailing list archive...


I would agree with this; if you're happy running unstable and broken 
scheduler code, you're surely able to update to 7.0 and run stable and 
working scheduler code :)


We should run it past re@ first since it's a change to a stable branch, 
but it's experimental code so I don't see an issue.


Kris


Re: Bug in calcru in the 6.2 and 6.3 kernels

2008-07-07 Thread Xin LI

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Kris Kennaway wrote:
| Murty, Ravi wrote:
|> Hello everyone,
|>
|>
|>
|> Finally found what my last problem was. We were running top in a loop
|> and running some workloads that called sched_bind() to bind threads to
|> specific CPUs. The problem was that (and I am using ULE) sched_bind
|> calls a function to notify another CPU of a thread and then mi_switches
|> out of it. Since mi_switch sets the "oncpu" field of the thread to NOCPU
|> and given the thread is still running, calcru would come in and assert
|> the fact that "If I am running I better not be on NOCPU". It appears
|> that in other parts of the kernel (e.g. forward_signal) this is
|> acceptable (i.e. it is okay to be running and oncpu is NOCPU).
|>
|>
|> Thanks
|> Ravi
|
| Don't use ULE in 6.x, it's broken and will not be fixed.

Perhaps we should mark it as broken using #error?  After all, the ULE
changes in 7.x are amazing and we do not want users to get a bad
impression from the 6.x versions...

I am not sure, but an explicit warning message saying "ULE has been
revamped in FreeBSD 7.x+ and will not be MFC'ed back to 6.x, please use
SCHED_4BSD or upgrade to 7.x." seems better than making them
peruse the mailing list archive...
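
To make the #error idea concrete, here is a rough sketch of what such
a guard could look like. The macro names DEMO_SCHED_ULE and
DEMO_RELEASE_MAJOR and the helper demo_ule_is_rejected() are made up
for illustration; the thread does not say which file the guard would
go in or which symbols it would test.

```c
/* Hypothetical stand-ins for whatever the kernel build would define. */
#define DEMO_RELEASE_MAJOR 6

/* Uncomment the next line to make the build stop at the #error. */
/* #define DEMO_SCHED_ULE 1 */

#if defined(DEMO_SCHED_ULE) && DEMO_RELEASE_MAJOR < 7
#error "ULE has been revamped in FreeBSD 7.x+ and will not be MFC'ed back to 6.x, please use SCHED_4BSD or upgrade to 7.x."
#endif

/*
 * Run-time mirror of the compile-time condition above, so the rule
 * itself can be exercised: ULE is rejected only before release 7.
 */
static int
demo_ule_is_rejected(int release_major, int wants_ule)
{
	return (wants_ule && release_major < 7);
}
```

With the define left commented out, the file compiles normally; with
it enabled, compilation stops and the message is shown to whoever is
building the kernel, which is the behaviour being proposed.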

Cheers,
- --
Xin LI <[EMAIL PROTECTED]>    http://www.delphij.net/
FreeBSD - The Power to Serve!
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (FreeBSD)

iEYEARECAAYFAkhyfjYACgkQi+vbBBjt66CdLQCfet8ls7tfg5jV5I7gSOw8QwhC
maoAn2sBwjfoOBhFt6u5fELK9X6XMp0A
=Bxr3
-----END PGP SIGNATURE-----


Re: Bug in calcru in the 6.2 and 6.3 kernels

2008-07-07 Thread Kris Kennaway

Murty, Ravi wrote:

Hello everyone,

 


Finally found what my last problem was. We were running top in a loop
and running some workloads that called sched_bind() to bind threads to
specific CPUs. The problem was that (and I am using ULE) sched_bind
calls a function to notify another CPU of a thread and then mi_switches
out of it. Since mi_switch sets the "oncpu" field of the thread to NOCPU
and given the thread is still running, calcru would come in and assert
the fact that "If I am running I better not be on NOCPU". It appears
that in other parts of the kernel (e.g. forward_signal) this is
acceptable (i.e. it is okay to be running and oncpu is NOCPU). 

 


Thanks
Ravi


Don't use ULE in 6.x, it's broken and will not be fixed.

Kris
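
The window Ravi describes in his report above can be modelled in user
space as follows. This is a toy model, not kernel code: struct
model_thread, MODEL_NOCPU, and the model_*() functions are hypothetical
names, and the real calcru(), mi_switch(), and sched_bind() do far more
than this. It only shows why a strict "running implies on a CPU" check
trips once the switch path has cleared oncpu, while a tolerant reader
of the kind he mentions for forward_signal does not.

```c
#include <stdbool.h>

#define MODEL_NOCPU (-1)        /* illustrative "not on any CPU" marker */

enum model_state { MODEL_RUNNING, MODEL_RUNNABLE };

/* Only the two fields that matter for the reported assertion. */
struct model_thread {
	enum model_state state;
	int oncpu;
};

/*
 * The invariant calcru is described as asserting: a thread that is
 * marked running must have a real CPU in oncpu.
 */
static bool
model_calcru_invariant(const struct model_thread *td)
{
	return (td->state != MODEL_RUNNING || td->oncpu != MODEL_NOCPU);
}

/*
 * Model of the switch-out step in the report: oncpu is cleared while
 * the state still says the thread is running.
 */
static void
model_switch_out(struct model_thread *td)
{
	td->oncpu = MODEL_NOCPU;
}

/*
 * A tolerant reader: treat "running but no CPU" as "nothing to do"
 * instead of asserting.
 */
static bool
model_has_cpu(const struct model_thread *td)
{
	return (td->state == MODEL_RUNNING && td->oncpu != MODEL_NOCPU);
}
```

Starting from a thread in MODEL_RUNNING with oncpu set to a valid CPU,
model_calcru_invariant() holds; after model_switch_out() it fails,
which corresponds to the assertion firing when top samples the thread
at that moment.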