As usual, Dick is much more eloquent than me.  :-)

He also correctly pointed out to me in an off-list mail that in my first reply, I casually used the internal term "blocking progress" and probably sowed some of the initial seeds of confusion in this thread (because "blocking" has specific meaning in MPI parlance). Sorry about that.

What I should have said is that we have on our to-do list to effect a non-polling model of making message passing progress. As has been stated several times on this thread, OMPI currently polls for message passing progress. While you're in MPI_BCAST, it's quite possible/likely that OMPI will poll hard until the BCAST is done. It is possible that a future version of OMPI will use a hybrid polling+non-polling approach for progress, such that if you call MPI_BCAST, we'll poll for a while. And if nothing "interesting" happens after a while (i.e., the BCAST hasn't finished and nothing else seems to be happening), we'll allow OMPI's internal progression engine to block/go to sleep until something interesting happens. We casually refer to this as "blocking progress" in OMPI developer circles, but we mean it in a very different way than the traditional "blocking" meaning for MPI communication.
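
To make that concrete, here is a rough sketch of what such a hybrid loop could look like. This is illustrative only -- it is not Open MPI's actual progression code, and progress_once() / wait_for_event() are made-up stand-ins for "poll the network once" and "sleep until the OS reports activity":

    #include <stdbool.h>

    /* Hypothetical stand-ins, not real Open MPI functions: */
    bool progress_once(void);    /* poll once; true if anything completed */
    void wait_for_event(void);   /* block until the OS/NIC signals activity */

    /* Spin while things are happening; after POLL_LIMIT idle polls in
       a row, fall back to blocking in the OS until something arrives. */
    static void progress_until(bool (*done)(void))
    {
        const int POLL_LIMIT = 100000;   /* tunable spin budget */
        int idle = 0;

        while (!done()) {
            if (progress_once()) {
                idle = 0;                /* progress was made; keep spinning */
            } else if (++idle > POLL_LIMIT) {
                wait_for_event();        /* quiet for a while; go to sleep */
                idle = 0;
            }
        }
    }

In the meantime, if you just want OMPI to poll less aggressively, the mpi_yield_when_idle MCA parameter (e.g., "mpirun --mca mpi_yield_when_idle 1 ...") makes the progression loop yield the processor between polls -- it still polls, but other processes can get the CPU.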

Again, sorry about the confusion -- hopefully all the followups in this thread cleared up the issue.
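
One more footnote: the Isend/Irecv + MPI_Test workaround that Vincent describes further down only stops burning the CPU if the loop actually sleeps between tests. A minimal sketch of that idea -- my function name and the 1 ms poll interval are arbitrary choices, not code from this thread:

    #include <mpi.h>
    #include <stdlib.h>
    #include <unistd.h>   /* usleep() */

    /* Broadcast one int without the non-root ranks spinning at full
       speed.  The flat fan-out from the root is also a simplification. */
    void lazy_bcast(int *data, int root, MPI_Comm comm)
    {
        int rank, size, i, n = 0;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);

        if (rank == root) {
            /* Root sends to everyone individually. */
            MPI_Request *reqs = malloc((size - 1) * sizeof(MPI_Request));
            for (i = 0; i < size; ++i)
                if (i != root)
                    MPI_Isend(data, 1, MPI_INT, i, 0, comm, &reqs[n++]);
            MPI_Waitall(n, reqs, MPI_STATUSES_IGNORE);
            free(reqs);
        } else {
            /* Non-roots poll gently instead of spinning. */
            MPI_Request req;
            int done = 0;
            MPI_Irecv(data, 1, MPI_INT, root, 0, comm, &req);
            while (!done) {
                MPI_Test(&req, &done, MPI_STATUS_IGNORE);
                if (!done)
                    usleep(1000);   /* give up the CPU between polls */
            }
        }
    }

Note that the root still polls inside MPI_Waitall, and the flat fan-out loses the logarithmic tree a real MPI_Bcast would use, so this trades latency and scalability for idle CPUs.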



On Sep 3, 2008, at 7:17 PM, Richard Treumann wrote:

Vincent

1) Assume you are running an MPI program which has 16 tasks in MPI_COMM_WORLD, you have 16 dedicated CPUs, and each task is single threaded. (A task is a distinct process; a process can contain one or more threads.) This is the most common traditional model. In this model, when a task makes a blocking call, the CPU is used to poll the communication layer. With only one thread per task, there is no way the CPU can be given other useful work, because the only thread is in the MPI_Bcast and not available to compute. With nothing else for the CPU to do anyway, it may as well poll, because that is likely to complete the blocking operation in the shortest time. Polling is the right choice. You should not worry that the CPU is being "burned". It will not wear out.

2) Now assume you have the same number of tasks and CPUs, but you have provided a compute thread and a communication thread in each task. At the moment you make an MPI_Bcast call on each task's communication thread, you have unfinished computation that the CPUs could process on the compute threads. In this case you want the CPU to be released by the blocked MPI_Bcast so it can be used by the compute thread. The MPI_Bcast may take longer to complete because it is not burning the CPU, but if useful computation is going forward you come out ahead. A non-polling mode for the blocking MPI_Bcast is the better option. (There is a sketch of this two-thread arrangement after case 3 below.)

3) Take a third case: the CPUs are not dedicated to your MPI job. You have only one thread per task, but when that thread is blocked in an MPI_Bcast you want other processes to be able to run. This is not a common situation in production environments, but it may be common in learning or development situations. Perhaps your MPI homework problem is running at the same time someone else is trying to compile theirs on the same nodes. In this case you really do not need the MPI_Bcast to finish in the shortest possible time, and you do want the people who share the node with you to quit complaining. Again, a non-polling mode that gives up the CPU and lets your neighbor's compilation run is best.
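
Here is the sketch of case 2 promised above (an illustration I am adding, not code from this thread; the dummy loop stands in for real application work, and since only the main thread makes MPI calls, MPI_THREAD_FUNNELED is sufficient):

    #include <mpi.h>
    #include <pthread.h>
    #include <stdio.h>

    /* Stand-in for the application's real computation. */
    static void *compute(void *arg)
    {
        double *acc = (double *)arg;
        long i;
        for (i = 0; i < 100000000L; ++i)
            *acc += 1e-9;
        return NULL;
    }

    int main(int argc, char **argv)
    {
        int provided, data = 42;
        double acc = 0.0;
        pthread_t worker;

        /* FUNNELED is enough: only this thread makes MPI calls. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

        pthread_create(&worker, NULL, compute, &acc);

        /* With polling progress, this call burns a CPU the worker could
           use; with a non-polling mode it would sleep instead. */
        MPI_Bcast(&data, 1, MPI_INT, 0, MPI_COMM_WORLD);

        pthread_join(worker, NULL);
        printf("data=%d acc=%f\n", data, acc);
        MPI_Finalize();
        return 0;
    }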

Which of these is closest to your situation? If it is situation 1, why would you care that the CPU is burning? If it is situation 2 or 3, then you do have reason to care.

Dick

Dick Treumann - MPI Team
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363


users-boun...@open-mpi.org wrote on 09/03/2008 01:11:00 PM:

> Re: [OMPI users] CPU burning in Wait state
>
> Vincent Rotival to Open MPI Users, 09/03/2008 01:15 PM
> Sent by: users-boun...@open-mpi.org
>
> Eugene,
>
> No, what I'd like is that when doing something like
>
> call mpi_bcast(data, 1, MPI_INTEGER, 0, .....)
>
> the program continues AFTER the Bcast is completed (so control is not
> returned to the user until then), but while the processes with rank > 0
> are waiting in the Bcast they are not consuming CPU resources.
>
> I hope this is clearer; I apologize for not being clear in the first place.
>
> Vincent
>
>
>
> Eugene Loh wrote:
> >
> > Vincent Rotival wrote:
> >
> >> The solution I retained was for the main task to Isend the data
> >> separately to each of the other tasks, which use Irecv plus a loop
> >> on MPI_Test to check for completion of the Irecv. It might be dirty
> >> but works much better than using Bcast
> >
> > Thanks for the clarification.
> >
> > But this strikes me more as a question about the MPI standard than
> > about the Open MPI implementation. That is, what you really want is
> > for the MPI API to support a non-blocking form of collectives. You
> > want control to return to the user program before the
> > barrier/bcast/etc. operation has completed.  That's an API change.


--
Jeff Squyres
Cisco Systems
