Re: [zfs-discuss] ZFS pegging the system
Have each node record results locally, and then merge pair-wise until a single node is left with the final results? If you can do merges that way while reducing the size of the result set, then that's probably going to be the most scalable way to generate overall results.

On Thu, Jul 16, 2009 at 10:51 AM, Jeff Haferman <j...@haferman.com> wrote:
> We have an SGE array task that we wish to run with elements 1-7. Each task generates output and takes roughly 20 seconds to 4 minutes of CPU time. We're doing them on a machine with about 144 8-core nodes, and we've divvied the job up to do about 500 at a time.
>
> So, we have 500 jobs at a time writing to the same ZFS partition.
>
> What is the best way to collect the results of the task? Currently we are having each task write to STDOUT and then are combining the results. This nails our ZFS partition to the wall and kills performance for other users of the system. We tried setting up a MySQL server to receive the results, but it couldn't take 1000 simultaneous inbound connections.
>
> Jeff

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
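A rough sketch of the pair-wise merge suggested in the reply above, for illustration only. The thread never says what the task output looks like, so this assumes each node's partial result can be expressed as counts per key; the names merge_pair and tree_merge are hypothetical, and a real merge step would depend on the actual result format.

```python
from collections import Counter


def merge_pair(a, b):
    """Combine two partial results into one.

    Assumption: results are per-key counts, so merging means summing.
    Replace this with whatever reduction fits the real output.
    """
    merged = Counter(a)
    merged.update(b)  # Counter.update adds counts rather than replacing them
    return merged


def tree_merge(partials):
    """Merge partial results pair-wise, round by round, until one remains."""
    if not partials:
        return Counter()
    level = [Counter(p) for p in partials]
    while len(level) > 1:
        next_level = []
        # Pair up neighbours: (0, 1), (2, 3), ...
        for i in range(0, len(level) - 1, 2):
            next_level.append(merge_pair(level[i], level[i + 1]))
        # An odd one out is carried forward unchanged to the next round.
        if len(level) % 2 == 1:
            next_level.append(level[-1])
        level = next_level
    return level[0]


if __name__ == "__main__":
    # Hypothetical per-node results, standing in for locally recorded output.
    node_results = [{"ok": 3}, {"ok": 2, "failed": 1}, {"ok": 4}]
    print(dict(tree_merge(node_results)))
```

In this sketch each round halves the number of partial results, so roughly 144 nodes would need about eight rounds, and only the last surviving result has to be written to the shared ZFS partition. In a real cluster each merge_pair call would run on one of the two nodes involved rather than in a single process.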
Re: [zfs-discuss] ZFS pegging the system
On Thu, 2009-07-16 at 10:51 -0700, Jeff Haferman wrote:
> We have an SGE array task that we wish to run with elements 1-7. Each task generates output and takes roughly 20 seconds to 4 minutes of CPU time. We're doing them on a machine with about 144 8-core nodes, and we've divvied the job up to do about 500 at a time.
>
> So, we have 500 jobs at a time writing to the same ZFS partition.

Sorry, no answers, just some questions that first came to mind.

Where is your bottleneck? Is it drive I/O or network?

Are all nodes accessing/writing via NFS? Is this an NFS sync issue? Might an SSD ZIL help?

--
Louis-Frédéric Feuillette jeb...@gmail.com
[zfs-discuss] ZFS pegging the system
We have an SGE array task that we wish to run with elements 1-7. Each task generates output and takes roughly 20 seconds to 4 minutes of CPU time. We're doing them on a machine with about 144 8-core nodes, and we've divvied the job up to do about 500 at a time.

So, we have 500 jobs at a time writing to the same ZFS partition.

What is the best way to collect the results of the task? Currently we are having each task write to STDOUT and then are combining the results. This nails our ZFS partition to the wall and kills performance for other users of the system. We tried setting up a MySQL server to receive the results, but it couldn't take 1000 simultaneous inbound connections.

Jeff