On Thursday 07 February 2013 at 04:06, « Carsten Haitzler » wrote:
> On Wed, 6 Feb 2013 20:19:56 +0100 Bertrand Jacquin <be...@meleeweb.net> said:
>
> > Hi,
> >
> > On Monday 04 February 2013 at 07:29, « Carsten Haitzler » wrote:
> > > On Wed, 30 Jan 2013 12:32:23 +0100 Bertrand Jacquin <be...@meleeweb.net>
> > > said:
> > >
> > > while this would be nice... we don't even have our "home turf" vm's
> > > all set up yet.
> > >
> > > 1. we need to drop lvm. qcow+file images. it just makes migrating vms
> > > elsewhere a lot harder if you put them on lvm, requiring extra levels
> > > of special setup to use them.
> >
> > I agree that qcow can be easier, but here is the catch. We have two
> > cases: VM migration, and VM duplication for test purposes.
> >
> > ** VM migration:
> >
> > Using libvirt to migrate needs one of Fibre Channel, NFS, GFS or iSCSI
> > between the two hosts. We don't have any of those. I asked OSUOSL what
> > their internal process is, so maybe we can use it in the future.
>
> eh? i literally have just copied a qemu disk image from machine a to
> machine b...

Oh no, I'm talking about live migration here.

> and brought up qemu on it and it works out-of-the-box, zero changes,
> migrating etc. it was "scp" and presto. that's why "nfs" for homedirs i
> think is bad - it's not self-contained in the image file - you have to
> transport your entire setup across, which is complex and not
> self-contained. that's why i think lvm just complicates things too. with
> simple disk files we can easily do delta snapshot files to transport
> "current state relative to a baseline" and have the baseline backed up
> and stored too. you can just bring it up with the qemu cmdline by hand.
> you don't HAVE to use libvirt. it's trivial to do. you can use libvirt
> too.
>
>   qemu-img create -f qcow2 -b base.img delta.img
>
> personally i'm perfectly happy with shutting down the kvm/qemu image once
> per day and doing the above to gen a new daily delta img and backing that
> up (eg scp it home - i can do this trivially - i have the bandwidth and
> space. we can also do it via scp'ing over to e3/e4 too etc.). osuosl plan
> on bringing up a big nfs storage system sometime for backups and data and
> we can use that too when it appears. :)

I'm not closed to qcow images; that was just advice for performance
reasons, to keep things as close to the hardware as possible. I can move
things onto a fresh new FS. I'll start on this tonight.
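For the record, the daily delta routine described above could be as
simple as the following sketch (the image names, dates and backup path
are made up for illustration):

  # 1) shut the guest down cleanly, then chain a new delta on top of
  #    the current top-most image (which becomes a read-only baseline)
  qemu-img create -f qcow2 -b delta-20130206.img delta-20130207.img

  # 2) copy yesterday's (now frozen) delta off-host for safe keeping
  scp delta-20130206.img osuosl:/somewhere/backups/

  # 3) bring the guest back up on the newest delta
  qemu-system-x86_64 -m 2048 -drive file=delta-20130207.img,if=virtio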
> and yes - i know qemu can do LIVE snapshots without shutdown.
>
> http://wiki.qemu.org/Features/Snapshots
>
> also possible. :)
>
> > For the moment there is nothing, so here is the process:
> >
> > = Using LVM:
> >
> >   pv /dev/vg-data/e5-template1-sys | ssh osuosl dd of=/somewhere/e5-t1-sys.img
> >   <repeat for each disk>
> >
> > = Using qcow:
> >
> >   scp /somewhere/e5-template1-sys.img osuosl:/somewhere
> >   <repeat for each disk>
> >
> > So it's about the same, and not much more complicated one way or the
> > other.
>
> the files are more obvious. if we need bigger base images it's a simple
> re-generate of the img file. lvm is more involved to expand image
> sizes. :)
>
> > ** Dump for test purposes
> >
> > That's already what we did with Tom when Dan needed a fresh VM:
> >
> >   pv /dev/vg-data/e5-template1-sys | xz -f | dd of=/root/e5-template1-sys.img.xz
> >
> > In all cases only the source file differs, so it's not hugely more
> > complicated; it's the same.
> >
> > Using an FS on top of an FS is not really efficient; plus, you have to
> > mind the multiple syncs that happen on the VM FS, on the host FS, on
> > the host RAID, and on the host disks. Using LVM, there is one layer
> > less.
>
> this isn't a "mission critical" system. we will have backups. if we lose
> half a day of stuff because of a disk failure.. so be it.. we lose it. go
> back to the last snapshot/backup. with git we have a massively
> distributed backup/copy of all the src, which is the most important thing
> we have, so we are good imho.
>
> > Also, all the resize features etc. are still available.
> >
> > I do think performance is a criterion, as we can observe on e2 right
> > now. We should avoid reproducing such a situation at the very
> > beginning, so we don't get surprised some time down the road.
>
> i am sitting here with a totally unoptimized ubuntu 12.04 image in qemu.
> i actually don't NOTICE a slowdown compared to the 12.04 host when it
> comes to I/O. in fact... it's FASTER than the host - by a wide margin,
> when it comes to boot for example. once cached it boots in about 7
> seconds all the way to a gui (e17+xorg+etc. with all the ubuntu overhead
> behind it too). the host takes something like 20-30 sec to boot... you
> currently say that it takes about 5 sec to boot your gentoo vm's.. that's
> without e17+xorg+the rest ubuntu slaps on top. this is a simple core i7
> desktop with only 3.5gb of available ram (32bit) and your regular
> spinning hdd... i see no issues here.

I can agree for a single VM with no heavy I/O or CPU usage. That is not
the case when (as right now) e5-buildbotslave1 is using a lot of CPU: you
can feel the effect of it on the other VMs.

> what we need to avoid is things like mysql that hammer the disk with
> syncs. svn+apache are a lethal combination, easily consuming 2gb of ram
> when someone checks out our svn tree via http. and trac in general just
> loves to consume cycles.

Yep, this is why those two applications are split across 2 different VMs.

> remember e2 is a 6+ year old machine, with a raid controller on the
> blink, 2g of ram and a fairly old cpu... e5 is beefy on a bad day. 48g of
> ram is one really nice disk cache.

That also makes more of a difference in case of a crash, but yes, we can
re-import a fresh backup. Also, we can use qemu live snapshots.
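A rough sketch of how such a live snapshot could be driven from the host
(the monitor socket path and snapshot name are invented, and this assumes
qcow2 images plus socat on the host):

  # start the guest with a monitor socket we can script against
  qemu-system-x86_64 -m 2048 -drive file=delta.img,if=virtio \
      -monitor unix:/var/run/e5v1.mon,server,nowait &

  # later, take a live snapshot (disk + RAM state) without shutting down
  echo 'savevm nightly-20130207' | socat - UNIX-CONNECT:/var/run/e5v1.mon

  # inspect or roll back from the same monitor:
  #   info snapshots
  #   loadvm nightly-20130207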
> > Also, yes, we have a lot of RAM and CPU on e5, and that's really fine.
> > But I/O is absolutely not better just because there is a lot of RAM or
> > CPU.
>
> that's why i'd rather not FORCE it to be I/O - let it go to ram. this is
> not "mission critical". if we "lose a transaction" - too bad. someone's
> bug report disappears. for git this is not a problem, as it's a
> distributed system with every client having backups. the person who did
> the push has the data - we didn't lose it. we can push it again in this
> rare event. we can actually minimize this simply by running fsync on the
> host every now and again to ensure syncs to real media - but don't force
> this per transaction.
>
> i understand your point of view. it's one of "this is a mission critical
> system and we must not lose anything ever". but we can afford that - thus
> we can make different trade-offs. ease of administration. obviousness.
> ease of migration of vm's, ease of setting up new things ad-hoc when
> needed, and simplicity in general are worth more imho. the experience of
> 6+ years of e.org tells me to "keep it simple".

Things are 'simple' here: none of this needs any magic. They are basic
pieces (well, maybe many basic pieces) accumulated to make a complete
solution, but it's not complex. Making things different from what you are
used to does not make them 'complex'.

> what happens when you get run over by a bus? we don't have full-time
> admins following your every setup and reading manuals. they look at it
> once and then use it.

I absolutely agree; I don't want to have all this stuff on my shoulders.
I try to comment all the config files and scripts to leave people the
knowledge and the philosophy. At the beginning I wanted to write some
docs, but it seems people don't really want to read things ;) And you can
be sure I'm not ready to handle all this for the next 10 years. Spending
the time to build a simple solution (which, for me, is what this is)
takes really much more time than building a complex one with no global
philosophy and/or architecture (different distros, different web servers,
no common config files...). I think some others can agree with what I'm
saying.

> in 3 years something happens to you (the bus, or you get bored of e, or
> are just on a long holiday for 2 months?) and someone else needs to "fix
> things"... if what they find is incredibly complex, tied together deeply,
> and they haven't spent day in and day out adminning the box, they can't
> figure out what to do and eventually just ad-hoc make a mess trying to
> get things back up at all. you may be back after your holiday only to
> find things are now messed up, or we spend longer "down" than needed etc.
> SIMPLICITY trumps performance here. GIT solves the most important side of
> data integrity by its very nature. the rest is "nice to have, but not
> critical". if we have to roll back 1/2 a day or a day of data... we'll
> live just fine, unless this is happening every few days.. and then we
> have a major serious problem.
>
> > > 2. nfs homedirs are a no-go. homedirs must/should be local to the vm
> > > fs image. making them nfs makes them non-portable. the whole POINT of
> > > using qemu/kvm in the first place is that if we get overloaded, or if
> > > e5 goes down, or whatever, we can TRIVIALLY just move any vm over to
> > > osuosl's vm cluster system, re-point dns and presto... it works. this
> > > is WHY we want a SIMPLE setup. not complex. SIMPLE.
> >
> > Yep! NFS is here to speed things up for now; it's temporary and will
> > not stay. It was never meant to.
>
> cool! :) again as above - speed.. don't care. :) the qemu -fsdev/virtfs
> option is more "sane" imho if we want arbitrary expansion space outside
> the fs image - this means we can have base fs img + delta + a dir of
> "extra files" (tar/rsync that dir as needed)

I mean setup time, not the final usage. NFS is temporary while the new
users script is not complete.
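If/when we go the -fsdev/virtfs route for that extra space, it could look
roughly like this (the export path, mount tag and guest mount point are
assumptions):

  # host side: export a directory into the guest over virtio-9p
  qemu-system-x86_64 -m 2048 -drive file=delta.img,if=virtio \
      -fsdev local,id=homes,path=/srv/vm-homes,security_model=mapped-xattr \
      -device virtio-9p-pci,fsdev=homes,mount_tag=homes

  # guest side: mount the exported directory
  mount -t 9p -o trans=virtio,version=9p2000.L homes /home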
> > > 3. firewall needs to go - NAT is enough.
> >
> > NAT is not firewalling; I hope we agree on this.
>
> i know.. but it provides protection in that "you can't get in through the
> nat unless something explicitly forwards a port". getting in is a reality
> we will have to live with, unless we want to ask for an IP for every
> VM... and that's asking a lot of osu, to give us so many IP's.

So you want me to ask OSUOSL for one IP per VM? Is that it? That is a
rare resource nowadays. Well, we could go for a complete IPv6 solution;
not sure that's the answer here ;)

> > > firewall just wastes people's time debugging services.
> >
> > We should not have to debug anything once things are up; when we deploy
> > something new, sure, I agree, that's normal. And that does not happen
> > so often.
>
> ask daniel. he already spent hours figuring out a problem that was the fw
> blocking OUTGOING traffic. we JUST need a NAT. we all know the NAT will
> prevent incoming connections unless we explicitly route them. we "expect"
> outgoing connections to "just work" (tm) with nothing in the way.

This is now done, but I'm not really a fan of it. Everything is logged in
all cases anyway.

> > > in all the years e1/e2 have been up we have never needed or used a
> > > fw.
> >
> > And for 4 years, MySQL was open to the whole internet. This just
> > prevents mistakes.
>
> we're behind a nat... that already does all the work we need. the host
> system (e5) has nothing running anyway, so it doesn't matter.

And the NAT doesn't happen on e5 either; it is done on e5v1. Traffic for
all the e5v* guests is bridged through to e5v1 with no firewall/NAT on e5
itself; everything is done on e5v1.
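For reference, the NAT on e5v1 boils down to ordinary iptables rules,
along these lines (the interface name, subnet and guest address are
examples only, not our real layout):

  # masquerade outgoing traffic from the VM subnet
  iptables -t nat -A POSTROUTING -o eth0 -s 10.0.0.0/24 -j MASQUERADE

  # explicitly forward one service, e.g. ssh into a guest
  iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 2201 \
      -j DNAT --to-destination 10.0.0.11:22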
--
Beber