Hi guys: I'll add to this thread by pointing you all to this article at ClusterMonkey:
http://www.clustermonkey.net//content/view/142/32/

Cheers,
Bernard

________________________________
From: [EMAIL PROTECTED] on behalf of Michael Edwards
Sent: Thu 10/08/2006 20:57
To: oscar-users@lists.sourceforge.net
Subject: Re: [Oscar-users] MPICH2 Follow-Up

In my experience, most people either run off their NFS-mounted home directories or copy any needed files to some local directory (/tmp is popular).

The first way is easy, but I have had occasional problems when the clocks on the nodes get out of sync: the files on a given node will not necessarily be updated when the copy on one of the other nodes changes. This shouldn't be an issue under OSCAR, since it uses ntp to keep the clocks in sync (the system that had this problem had ntp turned off for some reason), but I guess it depends a bit on how often you are hitting your files. NFS also isn't really designed for high-bandwidth I/O.

Copying your files to the local drive is a nice solution if the files are not extremely large (what that means exactly depends a lot on your network). You get the file-transfer overhead out of the way at the beginning, and you are always sure of what files you have, because you put them there yourself. This also avoids the file-locking and write-timing issues that can creep into code made by lazy MPI programmers like me :)

If you have very big data files, or very high I/O bandwidth for some reason, it becomes a very difficult problem. Very large clusters are tricky too.
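To make the copy-to-/tmp approach above concrete, here is a rough, untested sketch of staging an input file at job start: rank 0 reads it off NFS once, broadcasts it, and every rank writes a private copy under /tmp. The path and file names are made up and error handling is minimal, so adjust for your own layout.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank;
    long size = 0;
    char *buf = NULL;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {                          /* one NFS read, total */
        FILE *in = fopen("/home/me/model/input.dat", "rb");
        if (!in)
            MPI_Abort(MPI_COMM_WORLD, 1);
        fseek(in, 0, SEEK_END);
        size = ftell(in);
        rewind(in);
        buf = malloc(size);
        fread(buf, 1, size, in);
        fclose(in);
    }

    MPI_Bcast(&size, 1, MPI_LONG, 0, MPI_COMM_WORLD);
    if (rank != 0)
        buf = malloc(size);
    MPI_Bcast(buf, (int)size, MPI_BYTE, 0, MPI_COMM_WORLD);

    char local[64];
    snprintf(local, sizeof(local), "/tmp/input.%d.dat", rank);
    FILE *out = fopen(local, "wb");           /* private local copy */
    fwrite(buf, 1, size, out);
    fclose(out);
    free(buf);

    /* ... run the model against the local copy ... */

    MPI_Finalize();
    return 0;
}

The point is that the NFS server gets hit exactly once no matter how many processes you start, and all later I/O is local to each node.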
On 8/10/06, Steven Blackburn <[EMAIL PROTECTED]> wrote:
> I am a novice with clusters... but I was planning to
> solve the same problem as you in one of two ways:
>
> a) The home directories are automatically shared by
> OSCAR, so a user could log on to the head node, ssh to
> a client node and see her directories (including any
> executables she has built there). I get the impression
> this is the model used when a cluster is shared by lots
> of people (e.g. in a commercial setting). After all, a
> normal user can probably only write to their home
> directory and /tmp.
>
> b) Parallel file systems exist, such as PVFS, which
> could be used to distribute a volume across several
> nodes. I was considering installing PVFS on all four
> nodes of my 'toy' cluster. The way I was hoping to
> install it would end up with a file system that each
> node could access locally but which would be spread
> across (and shared by) all nodes in the system.
>
> Because the OSCAR PVFS package is not currently
> maintained, I went with the shared home dirs. If I
> get a bit more comfortable with the cluster, I might
> give the package a go and see if I can fix whatever
> might be broken in it.
>
> Remember that with either option, the I/O goes across
> the network, so file access might be inefficient
> (e.g. reading the same file over and over). I was
> thinking of copying any such files to /tmp but, as
> you say, cpush might be useful here. Is there a
> programmatic interface to cpush, or just exec()?
>
> But I am only a novice at this and could have got
> entirely the wrong idea...
>
> Steve.
>
> --- Tyler Cruickshank <[EMAIL PROTECTED]> wrote:
> > Hello.
> >
> > I have 2 items:
> >
> > 1) I believe that I have successfully built,
> > installed, and pushed MPICH2 using the PGI
> > compilers. Once I am sure that it is working,
> > I'll write it up and send it on.
> >
> > 2) I have a question that illustrates my depth of
> > understanding of clusters (lack of depth). I am
> > trying to run a model where the compute nodes need
> > access to the same input/output dirs and
> > executables (perhaps this is always the case?).
> > Right now, when the nodes try to do a job, they
> > can't access the executable that lives on the
> > server. How do I set the nodes up so that they are
> > able to access the server node directories? I can
> > imagine using cpush in some way, or fully mounting
> > the file systems?
> >
> > Thanks for listening/reading.
> >
> > -Tyler
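As for a programmatic interface to cpush: as far as I know the C3 tools are plain command-line scripts, so from inside a program it really is just exec()/system(). An untested sketch follows; it assumes cpush is on the PATH and accepts the usual "cpush source target" form, and push_to_nodes is a made-up helper name:

#include <stdio.h>
#include <stdlib.h>

/* Made-up helper: shells out to cpush to copy a file to every node.
 * Returns 0 on success.  Untested; assumes "cpush source target". */
static int push_to_nodes(const char *src, const char *dest)
{
    char cmd[512];
    snprintf(cmd, sizeof(cmd), "cpush %s %s", src, dest);
    return system(cmd);
}

int main(void)
{
    if (push_to_nodes("/home/me/model/input.dat", "/tmp/input.dat") != 0) {
        fprintf(stderr, "cpush failed\n");
        return 1;
    }
    return 0;
}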
_______________________________________________
Oscar-users mailing list
Oscar-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oscar-users