Hi Ralph,

I am concerned that disabling the btl sm component will degrade message-passing performance, so I need to test how much performance is lost. The code is still far from ready to be integrated into the trunk; I will try to clean it up before sending the changes to you.

On Aug 13, 2010, at 12:00 PM, Ralph Castain <r...@open-mpi.org> wrote:

> This sounds like excellent progress! Jeff and others know much more about
> MTT than I do, so I'll leave that question to them.
>
> You have two approaches to the mmap issue. Easiest for now would be to
> simply disable the shared memory component - you can either turn it off at
> run-time with -mca btl ^sm, or you can direct that it not even be built with
> -enable-mca-no-build=btl-sm when configuring OMPI.
>
> I would think your TCP comm would then allow the two procs sharing a host
> to communicate. Can you give that a try?
>
> I'd be happy to begin reviewing the changes, and can help integrate them
> back into the OMPI trunk, when you feel ready.
>
>
> On Aug 12, 2010, at 9:35 PM, 张晶 wrote:
>
> Hi Ralph, Jeff, and all,
>
> The good news is that I can almost run Open MPI on VxWorks, although
> there are still some bugs. The latest test that passed is: the rank 0
> process, running on host 0, calls mpi_send, and the rank 1 process, running
> on host 1, calls mpi_recv. It works well. Because VxWorks lacks mmap, which
> the btl sm component uses, running two processes on the same host still
> fails.
> The differences between VxWorks and Unix are the real trouble. For
> example, pipe(), fork(), exec(), socketpair(), fcntl(), sshd, and so on are
> not implemented in VxWorks. Replacing these missing pieces with
> corresponding functions is the key work of the port. Once I had a clear
> understanding of what the rsh component does, I wrote a simple daemon and
> client to launch the orted, because VxWorks complains when rlogin() is
> called from user space.
> I think many more tests still need to be run. Maybe I should look
> into MTT.
>
> On Jul 8, 2010, at 9:54 AM, 张晶 <iam.chi...@gmail.com> wrote:
>
>> Thank you, Squyres, it is really useful!
>>
>> On Jul 7, 2010, at 7:22 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
>>
>>> On Jul 6, 2010, at 10:48 PM, 张晶 wrote:
>>>
>>> > 1. If I write a rlogin component,
>>>
>>> Is the command line of rlogin that much different than that of rsh/ssh?
>>> For example, can you just s/rsh/rlogin/ on the overall command line and
>>> have it just work?
>>>
>>> If so, I suspect that tweaking the rsh plm might be far simpler than
>>> having your own component.
>>>
>>> > can I just log in to the node in the cluster and launch the process?
>>> > If so, what role does the odls play?
>>>
>>> ODLS = ORTE Daemon Local Launch Subsystem.
>>>
>>> PLM = Process Lifecycle Management.
>>>
>>> Meaning: the PLM is used to launch orteds (more on this below) across
>>> multiple nodes. The ODLS is used to launch processes locally from the orted
>>> (e.g., via POSIX fork/exec).
>>>
>>> > 2. What is orted? Should the orted exist on every node and function as
>>> > a node process launch proxy?
>>>
>>> Yes. The orted = ORTE daemon. It is almost always the first thing
>>> launched on each node and acts as a proxy for launching, killing, and
>>> monitoring the user's applications on each node. It also does other control
>>> kinds of things, like relay stdout/stderr back up to the HNP (more below),
>>> etc.
>>>
>>> > 3. What is hnp? Does every job have only one hnp, and when I use mpirun,
>>> > is the mpirun process the hnp?
>>>
>>> HNP = head node process, meaning mpirun (or actually, orterun -- mpirun
>>> is a symlink to orterun). The HNP functions as an orted as well, so it can
>>> use the ODLS to launch processes locally, etc.
>>>
>>> Ralph can provide more detail on all of the above, but these are the
>>> basics.
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>
>> --
>> 张晶
>>
>
> --
> 张晶