Jörg Saßmannshausen wrote:
> Dear all,
>
> I was wondering if somebody could help me here a bit.
> For some of the calculations we are running on our cluster we need a
> significant amount of disc space. The last calculation crashed because the
> ~700 GB I made available were not enough. So I want to set up a RAID0 on
> one 8-core node with two 1.5 TB discs. So far, so good.
>
> However, I was wondering whether it makes any sense to somehow 'export'
> that scratch space to the other nodes (4 cores only). The idea is that if
> I need a vast amount of scratch space, I could use the space on the 8-core
> node mentioned above. I could do that with NFS, but I have the feeling it
> will be too slow. Also, I only have gigabit ethernet at hand, so I cannot
> use any other network here. Is there a good way of doing this? Words like
> iSCSI and cluster FS come to mind, but to be honest, I have never really
> worked with them.
>
You could do something crazy like dynamically creating a distributed
filesystem with GlusterFS (or another open-source FS) from the local storage
of each node the job is using. That way the filesystem is dedicated to your
job, shared within your job, and does not impact other jobs. Each node needs
a disk, but that isn't too expensive. You can also skip the RAID part
(unless it is there for performance), because if a disk dies it only affects
that one node.

We tried this for a while. It worked OK (with GlusterFS), but then we got a
good Lustre setup and the performance of the dynamic version didn't justify
the effort and maintenance. However, on a smaller system where I don't have
that many resources, I might try this again.

Craig

> Any ideas?
>
> All the best
>
> Jörg

-- 
Craig Tierney ([email protected])
_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
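For anyone wanting to try the per-job GlusterFS approach Craig describes, a minimal sketch of the job-prologue side might look like the following. Everything here is an assumption: the Torque/PBS variables ($PBS_JOBID, $PBS_NODEFILE), the local brick path, and the mount point all need adapting to the local setup, and the gluster commands assume glusterd is running and the nodes are already peered.

```shell
# Sketch of a job prologue that stitches each node's local scratch disc
# into one per-job GlusterFS volume. Paths and scheduler variables
# ($PBS_JOBID, $PBS_NODEFILE) are assumptions -- adapt to your site.

# Build one "host:/brick/path" entry per unique node in the job.
build_brick_list() {
    jobid=$1
    hostfile=$2
    sort -u "$hostfile" | while read -r host; do
        printf '%s:/local/scratch/job-%s ' "$host" "$jobid"
    done
}

# In the real prologue (commented out here -- needs a live gluster setup):
# bricks=$(build_brick_list "$PBS_JOBID" "$PBS_NODEFILE")
# gluster volume create "scratch-$PBS_JOBID" $bricks
# gluster volume start "scratch-$PBS_JOBID"
# mount -t glusterfs localhost:"/scratch-$PBS_JOBID" /mnt/jobscratch
```

The matching epilogue would unmount, stop and delete the volume, and remove the brick directories. Because the volume only lives as long as the job, a dead disk costs you one node's brick rather than a shared scratch pool — which is why the RAID can be skipped unless it is there for striping performance.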
