Re: Bug in calcru in he 6.2 and 6.3 kernels
On Sun, Jul 20, 2008 at 7:13 AM, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> Murty, Ravi wrote:
>>
>> Jeremy, thanks. I look forward to switching to ULE in 7.0 and realize
>> that it is a completely new scheduler (I spent some time yesterday
>> looking at it) -- which is why my porting effort is much harder than a
>> simple cut and paste. I just wanted to find out if there was something
>> simple I could look at before I spent weeks porting my changes to the
>> scheduler (also I can justify the move to 7.x). I can't figure out why
>> my 8 app threads run so slow -- I am booting the kernel in single-user
>> mode with not much else running and my threads do a lot of work and
>> don't really sleep.
>
> Once again, ULE in 6.x is too broken to use, which is why major changes were
> required to get it to a suitable state in 7.0. It's a shame that you didn't
> read about this before putting in so much work.
>
> Kris

Perhaps an $(error ) (don't know the pmake analog) should be put into the
kernel Makefile to note that ULE is busted?

-Garrett
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Bug in calcru in he 6.2 and 6.3 kernels
Murty, Ravi wrote:
> Jeremy, thanks. I look forward to switching to ULE in 7.0 and realize
> that it is a completely new scheduler (I spent some time yesterday
> looking at it) -- which is why my porting effort is much harder than a
> simple cut and paste. I just wanted to find out if there was something
> simple I could look at before I spent weeks porting my changes to the
> scheduler (also I can justify the move to 7.x). I can't figure out why
> my 8 app threads run so slow -- I am booting the kernel in single-user
> mode with not much else running and my threads do a lot of work and
> don't really sleep.

Once again, ULE in 6.x is too broken to use, which is why major changes were
required to get it to a suitable state in 7.0. It's a shame that you didn't
read about this before putting in so much work.

Kris
RE: Bug in calcru in he 6.2 and 6.3 kernels
Jeremy, thanks. I look forward to switching to ULE in 7.0 and realize
that it is a completely new scheduler (I spent some time yesterday
looking at it) -- which is why my porting effort is much harder than a
simple cut and paste. I just wanted to find out if there was something
simple I could look at before I spent weeks porting my changes to the
scheduler (also I can justify the move to 7.x). I can't figure out why
my 8 app threads run so slow -- I am booting the kernel in single-user
mode with not much else running and my threads do a lot of work and
don't really sleep.

Thanks
Ravi

-----Original Message-----
From: Jeremy Chadwick [mailto:[EMAIL PROTECTED]
Sent: Sunday, July 20, 2008 6:59 AM
To: Murty, Ravi
Cc: Kris Kennaway; [EMAIL PROTECTED]; freebsd-hackers@freebsd.org
Subject: Re: Bug in calcru in he 6.2 and 6.3 kernels

On Sun, Jul 20, 2008 at 06:51:22AM -0700, Murty, Ravi wrote:
> Has anyone identified the issue(s) that might be broken in the ULE
> scheduler in 6.2? I am running a rather simple test - it creates 8 threads
> and runs them on an 8-CPU system (not a whole lot running on the system).
> When I run it with ULE, it runs slow, very slow sometimes - it's almost
> like the threads aren't picked to run. When I switch to 4BSD, things run
> fine. I was wondering if there is something I could look at? I realize
> it is broken, but I've added lots of stuff to the scheduler (for our
> project) which I'd have to migrate to ULE in 7.0. I'd like to figure out
> what might be going on in 6.2 before I spend the time to migrate to 7.0.

ULE in 7.0 is not the same as in 6.2 -- it was entirely re-written
before 7.0 was released. The ULE scheduler in 7.0 is often called
"ULE 2.0", to signify that it's not the same ULE scheduler as in
previous FreeBSD releases.

See "New Scheduler: ULE 2.0 / 3.0" here:
http://ivoras.sharanet.org/freebsd/freebsd7.html

Technical details from the author:
http://jeffr-tech.livejournal.com/3729.html

The reason the ULE scheduler in 7.0 is not the default scheduler is
because the community felt more testing needed to be done. I believe
the plan is to have ULE as the default scheduler in 7.1.

You should really be running 4BSD on 6.x, and ULE on 7.x (unless you
have reason to run 4BSD on 7.x -- and some people do. And no, I don't
know the reasons why).

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
Re: Bug in calcru in he 6.2 and 6.3 kernels
On Sun, Jul 20, 2008 at 06:51:22AM -0700, Murty, Ravi wrote:
> Has anyone identified the issue(s) that might be broken in the ULE
> scheduler in 6.2? I am running a rather simple test - it creates 8 threads
> and runs them on an 8-CPU system (not a whole lot running on the system).
> When I run it with ULE, it runs slow, very slow sometimes - it's almost
> like the threads aren't picked to run. When I switch to 4BSD, things run
> fine. I was wondering if there is something I could look at? I realize
> it is broken, but I've added lots of stuff to the scheduler (for our
> project) which I'd have to migrate to ULE in 7.0. I'd like to figure out
> what might be going on in 6.2 before I spend the time to migrate to 7.0.

ULE in 7.0 is not the same as in 6.2 -- it was entirely re-written
before 7.0 was released. The ULE scheduler in 7.0 is often called
"ULE 2.0", to signify that it's not the same ULE scheduler as in
previous FreeBSD releases.

See "New Scheduler: ULE 2.0 / 3.0" here:
http://ivoras.sharanet.org/freebsd/freebsd7.html

Technical details from the author:
http://jeffr-tech.livejournal.com/3729.html

The reason the ULE scheduler in 7.0 is not the default scheduler is
because the community felt more testing needed to be done. I believe
the plan is to have ULE as the default scheduler in 7.1.

You should really be running 4BSD on 6.x, and ULE on 7.x (unless you
have reason to run 4BSD on 7.x -- and some people do. And no, I don't
know the reasons why).

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
RE: Bug in calcru in he 6.2 and 6.3 kernels
Has anyone identified the issue(s) that might be broken in the ULE
scheduler in 6.2? I am running a rather simple test - it creates 8 threads
and runs them on an 8-CPU system (not a whole lot running on the system).
When I run it with ULE, it runs slow, very slow sometimes - it's almost
like the threads aren't picked to run. When I switch to 4BSD, things run
fine. I was wondering if there is something I could look at? I realize
it is broken, but I've added lots of stuff to the scheduler (for our
project) which I'd have to migrate to ULE in 7.0. I'd like to figure out
what might be going on in 6.2 before I spend the time to migrate to 7.0.

Thanks
Ravi

-----Original Message-----
From: Kris Kennaway [mailto:[EMAIL PROTECTED]
Sent: Monday, July 07, 2008 2:04 PM
To: [EMAIL PROTECTED]
Cc: Murty, Ravi; freebsd-hackers@freebsd.org
Subject: Re: Bug in calcru in he 6.2 and 6.3 kernels

Xin LI wrote:
> Kris Kennaway wrote:
> | Murty, Ravi wrote:
> |> Hello everyone,
> |>
> |> Finally found what my last problem was. We were running top in a loop
> |> and running some workloads that called sched_bind() to bind threads to
> |> specific CPUs. The problem was that (and I am using ULE) sched_bind
> |> calls a function to notify another CPU of a thread and then mi_switches
> |> out of it. Since mi_switch sets the "oncpu" field of the thread to NOCPU
> |> and given the thread is still running, calcru would come in and assert
> |> the fact that "If I am running I better not be on NOCPU". It appears
> |> that in other parts of the kernel (e.g. forward_signal) this is
> |> acceptable (i.e. it is okay to be running and oncpu is NOCPU).
> |>
> |> Thanks
> |> Ravi
> |
> | Don't use ULE in 6.x, it's broken and will not be fixed.
>
> Perhaps we should mark it as broken using #error? After all, the ULE
> changes in 7.x are amazing and we do not want users to get a bad
> impression from the 6.x versions...
>
> I am not sure, but an explicit warning message saying "ULE has been
> revamped in FreeBSD 7.x+ and will not be MFC'ed back to 6.x, please use
> SCHED_4BSD or upgrade to 7.x." seems better than having them peruse the
> mailing list archives...

I would agree with this; if you're happy running unstable and broken
scheduler code, you're surely able to update to 7.0 and run stable and
working scheduler code :)

We should run it past re@ first since it's a change to a stable branch,
but it's experimental code so I don't see an issue.

Kris
Re: Bug in calcru in he 6.2 and 6.3 kernels
Xin LI wrote:
> Kris Kennaway wrote:
> | Murty, Ravi wrote:
> |> Hello everyone,
> |>
> |> Finally found what my last problem was. We were running top in a loop
> |> and running some workloads that called sched_bind() to bind threads to
> |> specific CPUs. The problem was that (and I am using ULE) sched_bind
> |> calls a function to notify another CPU of a thread and then mi_switches
> |> out of it. Since mi_switch sets the "oncpu" field of the thread to NOCPU
> |> and given the thread is still running, calcru would come in and assert
> |> the fact that "If I am running I better not be on NOCPU". It appears
> |> that in other parts of the kernel (e.g. forward_signal) this is
> |> acceptable (i.e. it is okay to be running and oncpu is NOCPU).
> |>
> |> Thanks
> |> Ravi
> |
> | Don't use ULE in 6.x, it's broken and will not be fixed.
>
> Perhaps we should mark it as broken using #error? After all, the ULE
> changes in 7.x are amazing and we do not want users to get a bad
> impression from the 6.x versions...
>
> I am not sure, but an explicit warning message saying "ULE has been
> revamped in FreeBSD 7.x+ and will not be MFC'ed back to 6.x, please use
> SCHED_4BSD or upgrade to 7.x." seems better than having them peruse the
> mailing list archives...

I would agree with this; if you're happy running unstable and broken
scheduler code, you're surely able to update to 7.0 and run stable and
working scheduler code :)

We should run it past re@ first since it's a change to a stable branch,
but it's experimental code so I don't see an issue.

Kris
Re: Bug in calcru in he 6.2 and 6.3 kernels
Kris Kennaway wrote:
| Murty, Ravi wrote:
|> Hello everyone,
|>
|> Finally found what my last problem was. We were running top in a loop
|> and running some workloads that called sched_bind() to bind threads to
|> specific CPUs. The problem was that (and I am using ULE) sched_bind
|> calls a function to notify another CPU of a thread and then mi_switches
|> out of it. Since mi_switch sets the "oncpu" field of the thread to NOCPU
|> and given the thread is still running, calcru would come in and assert
|> the fact that "If I am running I better not be on NOCPU". It appears
|> that in other parts of the kernel (e.g. forward_signal) this is
|> acceptable (i.e. it is okay to be running and oncpu is NOCPU).
|>
|> Thanks
|> Ravi
|
| Don't use ULE in 6.x, it's broken and will not be fixed.

Perhaps we should mark it as broken using #error? After all, the ULE
changes in 7.x are amazing and we do not want users to get a bad
impression from the 6.x versions...

I am not sure, but an explicit warning message saying "ULE has been
revamped in FreeBSD 7.x+ and will not be MFC'ed back to 6.x, please use
SCHED_4BSD or upgrade to 7.x." seems better than having them peruse the
mailing list archives...

Cheers,
-- 
Xin LI <[EMAIL PROTECTED]>    http://www.delphij.net/
FreeBSD - The Power to Serve!
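
[Editorial sketch] The "#error" idea in the message above could look
roughly like the fragment below. This is only an illustration, not a
patch that was proposed or committed: TOY_SCHED_ULE and
TOY_ULE_6X_ACKNOWLEDGED are made-up macro names standing in for however
the build would actually select the scheduler, and toy_ule_selected() is
a hypothetical helper added so the fragment has something to compile.
The message text is the wording suggested in the message above.

```c
/*
 * Hypothetical guard: stop the build with an explicit message if the
 * (made-up) TOY_SCHED_ULE option is selected without an explicit
 * acknowledgement macro. Neither macro is a real FreeBSD option.
 */
#if defined(TOY_SCHED_ULE) && !defined(TOY_ULE_6X_ACKNOWLEDGED)
#error "ULE has been revamped in FreeBSD 7.x+ and will not be MFC'ed back to 6.x, please use SCHED_4BSD or upgrade to 7.x."
#endif

/* Returns 1 if the toy ULE option was selected at compile time, else 0. */
static int toy_ule_selected(void)
{
#ifdef TOY_SCHED_ULE
    return 1;
#else
    return 0;
#endif
}
```

Compiled with no extra flags, the fragment builds and the toy option is
off. Compiled with only -DTOY_SCHED_ULE, the preprocessor stops at the
#error line, so the user sees the message at build time instead of
having to find it in the archives.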
Re: Bug in calcru in he 6.2 and 6.3 kernels
Murty, Ravi wrote:
> Hello everyone,
>
> Finally found what my last problem was. We were running top in a loop
> and running some workloads that called sched_bind() to bind threads to
> specific CPUs. The problem was that (and I am using ULE) sched_bind
> calls a function to notify another CPU of a thread and then mi_switches
> out of it. Since mi_switch sets the "oncpu" field of the thread to NOCPU
> and given the thread is still running, calcru would come in and assert
> the fact that "If I am running I better not be on NOCPU". It appears
> that in other parts of the kernel (e.g. forward_signal) this is
> acceptable (i.e. it is okay to be running and oncpu is NOCPU).
>
> Thanks
> Ravi

Don't use ULE in 6.x, it's broken and will not be fixed.

Kris
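
[Editorial sketch] The window described in the original report can be
modelled in a few lines of userland C. This is a toy model under stated
assumptions, not kernel code: struct toy_thread, TOY_NOCPU and the
toy_*() functions are hypothetical names invented here, and they only
mirror the pieces the report mentions (a "running" state, the "oncpu"
field, and the NOCPU marker).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userland model; none of these are the real kernel definitions. */
#define TOY_NOCPU (-1)          /* stand-in for the kernel's NOCPU marker */

enum toy_state { TOY_RUNNING, TOY_RUNQ };

struct toy_thread {
    enum toy_state state;       /* what the scheduler believes the thread is doing */
    int oncpu;                  /* CPU the thread is on, or TOY_NOCPU */
};

/*
 * First half of the switch-out, as described in the report: oncpu is
 * cleared while the state still says "running".
 */
static void toy_begin_switch_out(struct toy_thread *td)
{
    assert(td != NULL);
    td->oncpu = TOY_NOCPU;
}

/* Second half: the thread is no longer marked as running. */
static void toy_finish_switch_out(struct toy_thread *td)
{
    assert(td != NULL);
    td->state = TOY_RUNQ;
}

/*
 * The invariant the report says calcru asserted: a running thread must
 * be on some CPU. Returns true when the invariant holds.
 */
static bool toy_calcru_invariant_holds(const struct toy_thread *td)
{
    assert(td != NULL);
    return td->state != TOY_RUNNING || td->oncpu != TOY_NOCPU;
}
```

In this model the invariant holds before toy_begin_switch_out() and
again after toy_finish_switch_out(), but not in between. An observer
that checks during that window (top calling into calcru, in the report)
would trip the assertion.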