On 05:09PM Wed 01/28/15 -0800, Egan Ford wrote:
> On Wed, Jan 28, 2015 at 11:00 AM, Gavin W. Burris <[email protected]>
> wrote:
> > I guess I would have to ask a few questions of the developer considering
> > docker... WHY do you need to be outside of a self-contained directory?
>
> Given that this is mostly an HPC crowd, this answer may not be 100%
> relevant, but I'll try to answer anyway and throw in a few opinions
> along the way.
>
> I'm finding that more and more developers and open source projects
> have a heavy dependency on other services, libraries, and code. Ask
> any top Python programmer how to start a new project and it will start
> with "virtualenv"--basically a chroot type of solution for Python,
> including different Pythons and their libs/packages.
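
For readers who have not used it, here is a minimal sketch of the per-project isolation described above. It is only an illustration: it uses the standard-library `venv` module rather than the third-party virtualenv tool named in the thread, and the function name `make_project_env` and the directory names are made up for the example.

```python
import tempfile
import venv
from pathlib import Path


def make_project_env(project_dir: Path) -> Path:
    """Create an isolated environment under project_dir/.venv and return its path."""
    env_dir = project_dir / ".venv"
    # with_pip=False keeps this sketch fast and offline; a real project
    # would normally pass with_pip=True so that packages can be installed.
    venv.EnvBuilder(with_pip=False).create(env_dir)
    return env_dir


if __name__ == "__main__":
    # Two hypothetical projects, each with its own private environment.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("project_a", "project_b"):
            env = make_project_env(Path(tmp) / name)
            # pyvenv.cfg marks the directory as a self-contained environment.
            print(name, (env / "pyvenv.cfg").is_file())
```

Each project ends up with its own interpreter directory and its own package set, which is the "contained directory" the thread keeps coming back to.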

Yes, good stuff there. We have largely "standardized" on Python to
minimize the ground we have to cover, and virtualenv was the stuff.
Developers can install any-and-every module at any version per git repo
/ project. This is still a headache, though, when an application makes
the transition from dev to prod. There is no hard answer about how to
best keep things patched and updated in a production environment if
devs can go crazy with a dozen previously unseen modules. It is all in
a contained directory, but it still has the problem of patching and
updating without breaking things. Instead of one central module to
maintain, there are virtualenvs all over the place. I would say the
goal is to centralize those env dependencies on production. BUT, if it
is forever research/dev code, go crazy, in your own contained world.
This seems to be the promise of Docker.

>
> I'm sure Ruby and node.js have something similar.
>
> If the app uses Flask or Unicorn and needs to be frontended with a
> web server, then you have the complexity of supporting many other
> components and trying to get them to play together nicely. Then
> there's the databases and trying to maintain all the different table
> spaces and security, etc...
>
> It's not an impossible problem, sys admins have been dealing with this
> for a very long time. The challenge is that the number of
> environments to deploy applications has exploded. So your admins have
> to know everything or limit what the users can develop. In my DevOps
> env. I have to deal with Ruby, Python, node.js, and Java. Each may
> require different versions.

The researcher/dev vs admin/ops dynamic is definitely at play here. My
stance is still that devs should try to target known modules, and
admins should be flexible to support additional ones with a reasonable
and generous time commitment. This should be true of all apps, not
just Python modules.

>
> VMs solve a lot of that problem, however at a greater cost.
> VMs usually have a static memory footprint (esp. in the cloud). It's
> possible to have 90% of your memory assigned to VMs, but not used by
> the applications in the VMs. Containers are just processes that use
> what they need (and can be limited). In my own experimentation I've
> been able to reduce 20 1GB VMs running 20 services into 20 containers
> on a single 8GB VM. 4GB of my RAM is still unused. Sure I could also
> spend 100s of hours getting all 20 services to play nice on a single
> OS, but one problem with one service can take down the others. I also
> have different admins assigned to different services. With containers
> I have them fenced off.

Thin provisioning goes a long way here for CPU, memory AND storage.
We've been pretty happy thin provisioning our VMs and our NFS shares.

>
> Because I have to pay for my VMs in the cloud, using containers has
> measurably reduced my cost.
>
> Another benefit has been time. I do not have to figure out how to get
> Flask and Unicorn to play nice with Nginx. When I need a new gitlab
> instance I just create a new set of three containers linked together
> and it does not impact my other instances.
>
> Containers are not perfect, but for me they reduce my costs and
> complexity while saving me a lot of time and hassle. Every container
> or cluster of containers is a single app. Makes life really easy.
> Yes, you can do this with VMs, but for me, it's too costly.
>
> Lastly for developer productivity I use Docker on my Mac. It's really
> a VirtualBox VM with Linux with Docker installed. I've got about 3-5
> containers running various services and tools that I will use later in
> upgrading my production environment. I've tried that in VMs before.
> It was slow, painful, and not as easy to automate. Docker is stupid
> simple to use with its Python APIs. You can learn it in 5 min.
>
> Anyway to directly answer your question. Containers are how I put
> complexity into a self-contained directory with no limitations.
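
To put the memory figures quoted above side by side, here is a rough back-of-the-envelope sketch. The inputs are the numbers from the message (20 VMs of 1GB each, one 8GB host, 4GB unused); the function name `consolidation_summary` is made up, and treating "8GB minus 4GB unused" as the memory actually in use is an assumption, since that figure also includes the host OS.

```python
# Figures quoted in the message above.
VM_COUNT = 20        # one service per VM before consolidation
VM_SIZE_GB = 1       # memory assigned to each of those VMs
HOST_SIZE_GB = 8     # the single VM that now hosts all 20 containers
HOST_UNUSED_GB = 4   # memory reported as still unused on that host


def consolidation_summary(vm_count, vm_size_gb, host_size_gb, host_unused_gb):
    """Return (GB provisioned before, GB in use after, average GB per service)."""
    provisioned_before = vm_count * vm_size_gb
    # Assumption: everything not reported as unused counts as "in use".
    in_use_after = host_size_gb - host_unused_gb
    # Upper bound on the average, because in_use_after includes the host OS.
    avg_per_service = in_use_after / vm_count
    return provisioned_before, in_use_after, avg_per_service


if __name__ == "__main__":
    before, in_use, avg = consolidation_summary(
        VM_COUNT, VM_SIZE_GB, HOST_SIZE_GB, HOST_UNUSED_GB
    )
    reduction = 1 - HOST_SIZE_GB / before
    print(f"provisioned before: {before} GB, provisioned after: {HOST_SIZE_GB} GB")
    print(f"in use after: {in_use} GB, about {avg:.1f} GB per service")
    print(f"reduction in provisioned memory: {reduction:.0%}")
```

On these numbers the provisioned memory drops from 20GB to 8GB, and the services average at most about 0.2GB each, which is the gap between "assigned" and "used" that the message describes.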

Docker has been our go-to solution for reproducibility of dev
environments, with virtualenvs inside. Will have to give containers a
hard look in this area, too. Thanks.

>
> Oh, let me close with, developers like to bring their own stack. It's
> not uncommon. In 2003-2004 I worked on the TeraGrid. Every week all
> four of the original sites got on the phone and debated the SW stacks.
> Only if they were the same could applications run across the grid.
> That inspired me to explore stateless provisioning. In 2005 I worked
> with Adaptive Computing and we got Moab talking to xCAT so that we
> could provision any stateless OS/stack on demand on bare-metal.
> Bring-your-own-stack. We call that cloud now. There was demand for
> it then as there is now. Containers make this really easy for both
> the admin and the developer. The admin can provide some constraints
> (it's not the free for all with VMs and BM where you the developer
> have to provide an entire OS image), and the developers get a bit of
> structure, but the freedom to be as lazy and dumb as they want to so
> that they can get results faster. And the admin does not have to be
> bothered with setting up libs, chroot, modules, etc... And if the
> admin has to provide a base, well Docker supports that too and you just
> put it in a registry.

I think this is where I start getting anxious, opening the doors to
support any OS with any stack. I would much rather push it the other
way. The language environment should be cross-platform and well
supported, so that production can support one OS well. My inclination
is to Keep It Simple Stupid, and not add additional layers of
complexity.

>
> If you are a goal oriented admin/developer, then containers are your
> friends. :-)

Noted. Stop making SENSE, Egan.

>
> Cheers,
>
> Egan

Cheers.

-- 
Gavin W. Burris
Senior Project Leader for Research Computing
The Wharton School
University of Pennsylvania
_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
