Re: Using RAM instead of disk for build servers (was: Re: Build servers offline due to failed SSD)
On Mar 15, 2021, at 09:50, Steven Smith wrote:

> SSD RAID offers speed and fault tolerance.

Sure. That either wasn't available or was not within what I was willing to spend in 2016 when I set up this system.

> Simple options that are tolerant to a single disk failure are:
>
> • Free/one extra SSD: Use macOS Disk Utility to RAID 1 together two smaller, inexpensive SSD drives for 100% redundancy.
> • OWC ThunderBay 4 Mini, $279: Use macOS Disk Utility to RAID 1 together four smaller, inexpensive SSD drives for 100% redundancy and larger capacity.
> • OWC ThunderBay 4 Mini with SoftRAID, $379: Use SoftRAID to RAID 4 together four smaller, inexpensive SSD drives for 100% redundancy and even larger capacity. (Caveats: no encryption, no boot volumes.)

Software RAID is not possible with VMware ESXi, which is what we are using and where the storage needs to be addressable from.*

* The exception is the hard disk RAID that holds the master copy of the rsync server, including packages and distfiles. That is the built-in hardware RAID of one of the Xserves; VMware ESXi itself cannot use it, but it is mapped directly into the single VM that needs to use it.
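For readers unfamiliar with how a host-attached disk can be handed to a single VM like that, a raw device mapping (RDM) is the usual ESXi mechanism. A rough sketch follows; the device ID, datastore, and VM names are placeholders, and this is not necessarily exactly how this particular setup was configured:

    # On the ESXi host, find the device identifier of the hardware RAID volume.
    ls /vmfs/devices/disks/

    # Create a physical-compatibility RDM pointer file in the VM's folder.
    # "naa.XXXX..." and the datastore/VM paths below are placeholders.
    vmkfstools -z /vmfs/devices/disks/naa.XXXXXXXXXXXXXXXX \
        /vmfs/volumes/datastore1/rsync-vm/hwraid-rdm.vmdk

    # Then attach hwraid-rdm.vmdk to the VM as an existing disk.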
Re: Using RAM instead of disk for build servers (was: Re: Build servers offline due to failed SSD)
On Mon, 15 Mar 2021, Daniel J. Luke wrote:

> Thanks for including this information - it's similar to experience I've had with SSDs for $work. I'd be really surprised if we care about builds on the xserves in 8-10 years (given our previous experience with the ppc to x86 transition).

Somewhat related, I record TV shows to USB sticks (mini-SSDs) and I find that they tend to fail after a number of cycles (well, they are cheap). I can usually recover by a complete reformat, but of course that burns up more spare blocks; time to break out the spinning rust that I used to use...

--
Dave
Re: Using RAM instead of disk for build servers (was: Re: Build servers offline due to failed SSD)
On Mar 14, 2021, at 6:38 PM, Ryan Schmidt wrote:

> As far as longevity, the previous set of 3 500 GB SSDs I bought for these servers in 2016 lasted 4-5 years. They were rated for 150 TBW (terabytes written) and actually endured around 450 TBW by the time they failed, or 3 times as long as they were expected to last. The new SSDs are rated for 300 TBW, and if they also last 3 times longer than that, then they might last 8-10 years, by which time we might have completely abandoned Intel-based Macs and be totally switched over to Apple Silicon hardware and will have no use for the Xserves anymore.

Thanks for including this information - it's similar to experience I've had with SSDs for $work. I'd be really surprised if we care about builds on the xserves in 8-10 years (given our previous experience with the ppc to x86 transition).

I haven't looked recently, but I recall xserves being somewhat picky about their internal drives - have you found that specific SSDs work well (vs others that don't)? I'm assuming you've installed them on the internal trays - but maybe that's a bad assumption.

--
Daniel J. Luke
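As an aside, the TBW figures in the quoted text are the kind of thing a drive's SMART data reports. One way to read it on macOS is smartmontools from MacPorts; treat this only as a sketch, since the attribute names and units vary by drive (SATA SSDs often report Total_LBAs_Written, NVMe drives report Data Units Written):

    # Install smartmontools (provides the smartctl tool).
    sudo port install smartmontools

    # Dump SMART data for a drive; the disk identifier is just an example.
    sudo smartctl -a /dev/disk0

    # For a SATA SSD reporting Total_LBAs_Written, multiply the raw value by
    # the logical sector size to estimate lifetime writes, e.g.
    # roughly 8.8e11 LBAs x 512 bytes/LBA ≈ 450 TB written.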
Re: Using RAM instead of disk for build servers (was: Re: Build servers offline due to failed SSD)
SSD RAID offers speed and fault tolerance. Simple options that are tolerant to a single disk failure are:

• Free/one extra SSD: Use macOS Disk Utility to RAID 1 together two smaller, inexpensive SSD drives for 100% redundancy.
• OWC ThunderBay 4 Mini, $279: Use macOS Disk Utility to RAID 1 together four smaller, inexpensive SSD drives for 100% redundancy and larger capacity.
• OWC ThunderBay 4 Mini with SoftRAID, $379: Use SoftRAID to RAID 4 together four smaller, inexpensive SSD drives for 100% redundancy and even larger capacity. (Caveats: no encryption, no boot volumes.)

> On Mar 14, 2021, at 6:38 PM, Ryan Schmidt wrote:
>
> On Mar 14, 2021, at 06:11, Balthasar Indermuehle wrote:
>
>> I used to run mac servers in what now can only be described as the days of yore... when a 32GB RAM bank cost a lot more than a (spinning) disk - and those were expensive then too. SSDs were not here yet. I haven't checked pricing lately, but I'd think you could put 256GB of RAM into a server for probably about the same as a 1TB SSD, and that would offer plenty of build space when used as a RAM drive. And that space will not degrade over time (unlike the SSD). In terms of longevity, for a machine with such a singularly targeted use case, I'd seriously consider taking the expense now, and have a server that lives for another decade.
>
> Some pricing details:
>
> OWC sells 96 GB Xserve RAM for $270 and 48 GB for $160. 96 + 96 + 48 would be 240 GB for $700.
>
> Meanwhile, the 500 GB SSDs I've been putting in cost about $65 each. I've already put in three and still need one more to get rid of the hard drives in the fourth server, though I may get a 1 TB SSD for $120 to have some extra room.
>
> Note that the way our build system (using our mpbb build script) works is that all (or most) ports that exist are kept installed but deactivated. When a build request comes in, we first activate all of that port's dependencies, then we build and install and activate that port, then we deactivate all ports. "Activate" means extract the tbz2 file to disk and note it in the registry. "Deactivate" means delete the extracted files from disk and note it in the registry. So even if we move all port building to a RAM disk, which will undeniably be faster and reduce wear on the disk, it will not eliminate it completely, not by a long shot. Some ports have hundreds of dependencies. Activating and deactivating ports is a considerable portion of what our build machines spend their day doing.
>
> If we wanted to move that from SSD to RAM disk as well, that would mean putting MacPorts itself onto the RAM disk. We wouldn't have room on the RAM disk to keep all ports installed, so it would mean not keeping any ports installed, and instead installing them on demand and then uninstalling them (and maybe we would need to budget even more RAM for the RAM disk to accommodate both space needed for MacPorts and for the dependencies and for the build). That means downloading and verifying each port's tbz2 file from the packages web server for every port build. Even though we do have a local packages server, so that traffic would not have to go over the Internet, the server uses a hard disk based RAID which is not the fastest in the world, so this would cause additional delays, not to mention additional wear and tear on the RAID's disks.
>
> As far as longevity, the previous set of 3 500 GB SSDs I bought for these servers in 2016 lasted 4-5 years. They were rated for 150 TBW (terabytes written) and actually endured around 450 TBW by the time they failed, or 3 times as long as they were expected to last. The new SSDs are rated for 300 TBW, and if they also last 3 times longer than that, then they might last 8-10 years, by which time we might have completely abandoned Intel-based Macs and be totally switched over to Apple Silicon hardware and will have no use for the Xserves anymore.
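For what it's worth, the "free/one extra SSD" mirror option above can also be done from the command line; the following is roughly the equivalent of the Disk Utility steps, where the set name and disk identifiers are only examples:

    # Identify the two member disks first; disk2 and disk3 below are examples.
    diskutil list

    # Create a two-disk RAID 1 (mirror) set named "BuildMirror", formatted JHFS+.
    # WARNING: this erases both member disks.
    sudo diskutil appleRAID create mirror BuildMirror JHFS+ disk2 disk3

    # Verify the mirror's status.
    diskutil appleRAID list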
Re: Using RAM instead of disk for build servers (was: Re: Build servers offline due to failed SSD)
Hi Ryan, thanks for your detailed response. I hadn't thought of some of the build intricacies you mention, let alone the upcoming silicon change and phasing out of x86. Sounds like your approach is a good balance of longevity, performance, and cost.

Cheers

Balthasar

On Mon, 15 Mar 2021 at 09:38, Ryan Schmidt wrote:

> On Mar 14, 2021, at 06:11, Balthasar Indermuehle wrote:
>
>> I used to run mac servers in what now can only be described as the days of yore... when a 32GB RAM bank cost a lot more than a (spinning) disk - and those were expensive then too. SSDs were not here yet. I haven't checked pricing lately, but I'd think you could put 256GB of RAM into a server for probably about the same as a 1TB SSD, and that would offer plenty of build space when used as a RAM drive. And that space will not degrade over time (unlike the SSD). In terms of longevity, for a machine with such a singularly targeted use case, I'd seriously consider taking the expense now, and have a server that lives for another decade.
>
> Some pricing details:
>
> OWC sells 96 GB Xserve RAM for $270 and 48 GB for $160. 96 + 96 + 48 would be 240 GB for $700.
>
> Meanwhile, the 500 GB SSDs I've been putting in cost about $65 each. I've already put in three and still need one more to get rid of the hard drives in the fourth server, though I may get a 1 TB SSD for $120 to have some extra room.
>
> Note that the way our build system (using our mpbb build script) works is that all (or most) ports that exist are kept installed but deactivated. When a build request comes in, we first activate all of that port's dependencies, then we build and install and activate that port, then we deactivate all ports. "Activate" means extract the tbz2 file to disk and note it in the registry. "Deactivate" means delete the extracted files from disk and note it in the registry. So even if we move all port building to a RAM disk, which will undeniably be faster and reduce wear on the disk, it will not eliminate it completely, not by a long shot. Some ports have hundreds of dependencies. Activating and deactivating ports is a considerable portion of what our build machines spend their day doing.
>
> If we wanted to move that from SSD to RAM disk as well, that would mean putting MacPorts itself onto the RAM disk. We wouldn't have room on the RAM disk to keep all ports installed, so it would mean not keeping any ports installed, and instead installing them on demand and then uninstalling them (and maybe we would need to budget even more RAM for the RAM disk to accommodate both space needed for MacPorts and for the dependencies and for the build). That means downloading and verifying each port's tbz2 file from the packages web server for every port build. Even though we do have a local packages server, so that traffic would not have to go over the Internet, the server uses a hard disk based RAID which is not the fastest in the world, so this would cause additional delays, not to mention additional wear and tear on the RAID's disks.
>
> As far as longevity, the previous set of 3 500 GB SSDs I bought for these servers in 2016 lasted 4-5 years. They were rated for 150 TBW (terabytes written) and actually endured around 450 TBW by the time they failed, or 3 times as long as they were expected to last. The new SSDs are rated for 300 TBW, and if they also last 3 times longer than that, then they might last 8-10 years, by which time we might have completely abandoned Intel-based Macs and be totally switched over to Apple Silicon hardware and will have no use for the Xserves anymore.
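For anyone trying to picture the build cycle Ryan describes, it amounts to something like the following in plain port commands. This is only an illustration of the idea, not what the mpbb script literally runs:

    # $PORT is the port that was requested to be built; its dependencies are
    # assumed to be installed already, just deactivated.
    port -v activate rdepof:$PORT    # activate the port's dependencies
    port -vs install $PORT           # build from source, install, and activate
    port -fv deactivate active       # afterwards, deactivate everything again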
Re: Using RAM instead of disk for build servers (was: Re: Build servers offline due to failed SSD)
On Mar 14, 2021, at 06:11, Balthasar Indermuehle wrote:

> I used to run mac servers in what now can only be described as the days of yore... when a 32GB RAM bank cost a lot more than a (spinning) disk - and those were expensive then too. SSDs were not here yet. I haven't checked pricing lately, but I'd think you could put 256GB of RAM into a server for probably about the same as a 1TB SSD, and that would offer plenty of build space when used as a RAM drive. And that space will not degrade over time (unlike the SSD). In terms of longevity, for a machine with such a singularly targeted use case, I'd seriously consider taking the expense now, and have a server that lives for another decade.

Some pricing details:

OWC sells 96 GB Xserve RAM for $270 and 48 GB for $160. 96 + 96 + 48 would be 240 GB for $700.

Meanwhile, the 500 GB SSDs I've been putting in cost about $65 each. I've already put in three and still need one more to get rid of the hard drives in the fourth server, though I may get a 1 TB SSD for $120 to have some extra room.

Note that the way our build system (using our mpbb build script) works is that all (or most) ports that exist are kept installed but deactivated. When a build request comes in, we first activate all of that port's dependencies, then we build and install and activate that port, then we deactivate all ports. "Activate" means extract the tbz2 file to disk and note it in the registry. "Deactivate" means delete the extracted files from disk and note it in the registry. So even if we move all port building to a RAM disk, which will undeniably be faster and reduce wear on the disk, it will not eliminate it completely, not by a long shot. Some ports have hundreds of dependencies. Activating and deactivating ports is a considerable portion of what our build machines spend their day doing.

If we wanted to move that from SSD to RAM disk as well, that would mean putting MacPorts itself onto the RAM disk. We wouldn't have room on the RAM disk to keep all ports installed, so it would mean not keeping any ports installed, and instead installing them on demand and then uninstalling them (and maybe we would need to budget even more RAM for the RAM disk to accommodate both space needed for MacPorts and for the dependencies and for the build). That means downloading and verifying each port's tbz2 file from the packages web server for every port build. Even though we do have a local packages server, so that traffic would not have to go over the Internet, the server uses a hard disk based RAID which is not the fastest in the world, so this would cause additional delays, not to mention additional wear and tear on the RAID's disks.

As far as longevity, the previous set of 3 500 GB SSDs I bought for these servers in 2016 lasted 4-5 years. They were rated for 150 TBW (terabytes written) and actually endured around 450 TBW by the time they failed, or 3 times as long as they were expected to last. The new SSDs are rated for 300 TBW, and if they also last 3 times longer than that, then they might last 8-10 years, by which time we might have completely abandoned Intel-based Macs and be totally switched over to Apple Silicon hardware and will have no use for the Xserves anymore.
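For reference, the RAM-drive idea in the quoted suggestion is straightforward to set up on macOS. A minimal sketch, using an arbitrary 16 GiB size (this is not something the build servers currently do):

    # Allocate a 16 GiB RAM-backed device (ram:// takes 512-byte sectors:
    # 16 * 1024^3 / 512 = 33554432).
    DEV=$(hdiutil attach -nomount ram://33554432)

    # Format and mount it as an HFS+ volume named "Build".
    diskutil erasevolume HFS+ Build $DEV

The catch, as explained above, is that this only moves the build directories into RAM; activating and deactivating dependencies would still hit the SSD unless MacPorts itself also lived on the RAM disk.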