I think there *was* a decision, and it effectively changed how sched_yield() operates; it may not do what we expect any more.
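
For concreteness, here is a rough sketch of the kind of poll-then-yield wait loop this discussion is about. It is not OMPI's actual progress code: the function name `wait_for_flag`, the `SPIN_LIMIT` value, and the `max_yields` bound are all made up for illustration, and the bound exists only so the sketch cannot hang.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical spin budget; not a value taken from Open MPI. */
enum { SPIN_LIMIT = 1000 };

/*
 * Poll *flag until it becomes true.  We spin SPIN_LIMIT times, and only
 * then call sched_yield() before spinning again.
 *
 * Returns the number of sched_yield() calls made before the flag was seen,
 * or -1 if the flag was still false after max_yields yields.
 */
long wait_for_flag(atomic_bool *flag, long max_yields)
{
    long yields = 0;

    assert(flag != NULL);

    for (;;) {
        /* Polling phase: burn CPU checking the flag. */
        for (int i = 0; i < SPIN_LIMIT; i++) {
            if (atomic_load_explicit(flag, memory_order_acquire)) {
                return yields;
            }
        }

        /* Give up so that this illustration always terminates. */
        if (yields >= max_yields) {
            return -1;
        }

        /*
         * Yield phase: offer the CPU to someone else.  Where the caller
         * lands in the run queue afterwards is up to the kernel scheduler.
         */
        sched_yield();
        yields++;
    }
}
```

If nothing else is runnable, each sched_yield() call returns almost immediately and the loop goes straight back to polling, so top would still show the process near 100% CPU. What happens when something else *is* runnable depends on how the kernel treats the yielding process, which is the open question here.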
See this thread (the discussion of Linux/sched_yield() comes in the later messages):

http://www.open-mpi.org/community/lists/users/2010/07/13729.php

I believe there are similar threads in the MPICH mailing list archives; that's why Dave posted on the OMPI list about it.

We briefly discussed replacing OMPI's sched_yield() with a usleep(1), but it was shot down.


On Dec 13, 2010, at 10:47 AM, Ralph Castain wrote:

> Thanks for the link!
>
> Just to clarify for the list, my original statement is essentially correct.
> When calling sched_yield, we give up the remaining portion of our time slice.
>
> The issue in the kernel world centers around where to put you in the
> scheduling cycle once you have called sched_yield. Do you go to the end of
> the schedule for your priority? Do you go to the end of the schedule for all
> priorities? Or...where?
>
> Looks like they decided to not decide, and left several options available.
> Not entirely clear of the default, and they recommend we not use sched_yield
> and release the time by some other method. We'll take this up on the developer
> list to see what (if anything) we want to do about it.
>
> Bottom line for users: the results remain the same. If no other process wants
> time, you'll continue to see near 100% utilization even if we yield because
> we will always poll for some time before deciding to yield.
>
> On Dec 13, 2010, at 7:52 AM, Jeff Squyres wrote:
>
>> See the discussion on kerneltrap:
>>
>> http://kerneltrap.org/Linux/CFS_and_sched_yield
>>
>> Looks like the change came in somewhere around 2.6.23 or so...?
>>
>> On Dec 13, 2010, at 9:38 AM, Ralph Castain wrote:
>>
>>> Could you at least provide a one-line explanation of that statement?
>>>
>>> On Dec 13, 2010, at 7:31 AM, Jeff Squyres wrote:
>>>
>>>> Also note that recent versions of the Linux kernel have changed what
>>>> sched_yield() does -- it no longer does essentially what Ralph describes
>>>> below. Google around to find those discussions.
>>>>
>>>> On Dec 9, 2010, at 4:07 PM, Ralph Castain wrote:
>>>>
>>>>> Sorry for delay - am occupied with my day job.
>>>>>
>>>>> Yes, that is correct to an extent. When you yield the processor, all that
>>>>> happens is that you surrender the rest of your scheduled time slice back
>>>>> to the OS. The OS then cycles thru its scheduler and sequentially assigns
>>>>> the processor to the line of waiting processes. Eventually, the OS will
>>>>> cycle back to your process, and you'll begin cranking again.
>>>>>
>>>>> So if no other process wants or needs attention, then yes - it will cycle
>>>>> back around to you pretty quickly. In cases where only system processes
>>>>> are running (besides my MPI ones, of course), then I'll typically see cpu
>>>>> usage drop a few percentage points - down to like 95% - because most
>>>>> system tools are very courteous and call yield if they don't need to do
>>>>> something. If there is something out there that wants time, or is less
>>>>> courteous, then my cpu usage can change a great deal.
>>>>>
>>>>> Note, though, that top and ps are -very- coarse measuring tools. You'll
>>>>> probably see them reading more like 100% simply because, averaged out
>>>>> over their sampling periods, nobody else is using enough to measure the
>>>>> difference.
>>>>>
>>>>> On Dec 9, 2010, at 1:37 PM, Hicham Mouline wrote:
>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
>>>>>>> Behalf Of Eugene Loh
>>>>>>> Sent: 08 December 2010 16:19
>>>>>>> To: Open MPI Users
>>>>>>> Subject: Re: [OMPI users] curious behavior during wait for broadcast:
>>>>>>> 100% cpu
>>>>>>>
>>>>>>> I wouldn't mind some clarification here. Would CPU usage really
>>>>>>> decrease, or would other processes simply have an easier time getting
>>>>>>> cycles? My impression of yield was that if there were no one to yield
>>>>>>> to, the "yielding" process would still go hard. Conversely, turning on
>>>>>>> "yield" would still show 100% cpu, but it would be easier for other
>>>>>>> processes to get time.
>>>>>>>
>>>>>> Any clarifications?
>>>>>>
>>>>>> _______________________________________________
>>>>>> users mailing list
>>>>>> us...@open-mpi.org
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/