Hi again,
I did another install run, using the findings I posted yesterday. The
items I quote below are the ones I used:-
1)
A successful install will have prerequisites. These range from low to
medium complexity and probably have no generally accepted defaults, so
I propose we declare them out of scope for the installation of Tashi
itself. Where appropriate, we should give helpful hints though. My
list:-
a) RPyC-3.1.0 is a prerequisite. The software is easily obtained and
installed.
b) A hypervisor is a prerequisite. Qemu is more tested, Xen less so.
Tutorials I write will use Qemu for now.
c) OS images are prerequisites. They need to be prepared before
deployment.
d) A host's hostname should be set (but need not match DNS).
e) Networking must be engineered beforehand. Tashi will call a script
based on the network ID; the user should provide that script to connect
the host's virtual interface to the appropriate host network. The host
OS must be set up to route traffic if appropriate.
prereqs were applied before the install
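For e), a minimal host-side script might look like the sketch below.
The path /etc/qemu-ifup.0 and the bridge name br0 are taken from my
test setup (they appear in the transcript further down); adapt both to
your site. Qemu invokes the script with the tap interface name as its
first argument.

```shell
#!/bin/sh
# Sketch of /etc/qemu-ifup.0, the script qemu calls for network 0.
# $1 is the tap interface name; bring it up and attach it to the
# host bridge (br0 on my test host -- adjust for your network).
ip link set "$1" up
brctl addif br0 "$1"
```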
2)
The default qemuBin configuration points to what appears to be a
locally compiled version. This should be set to /usr/bin/kvm. Can
someone confirm that /usr/bin/kvm is also the proper name on Ubuntu?
qemuBin was changed to /usr/bin/kvm
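For reference, the change is a one-line edit to the config file. This
is a sketch based on my copy of TashiDefaults.cfg; the section name may
differ in yours:

```
[Qemu]
qemuBin = /usr/bin/kvm
```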
3)
The NM no longer has use for the infoFile parameter, so it should be
removed.
removed
6)
Users must change Vfs/prefix to point to a (large) location where Tashi
can read and write data. OS images must be located under ./images.
Suspend and resume images must be located under ./suspend.
changed to be /tmp
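With the prefix set to /tmp, the resulting layout is as follows (the
section and option names are how Vfs/prefix appears in my config; the
subdirectory names come from the text above):

```
[Vfs]
prefix = /tmp

# resulting layout:
#   /tmp/images/   - OS images
#   /tmp/suspend/  - suspend and resume images
```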
8)
When installing in a place like /usr/local/tashi, running the Tashi
programs requires being in that directory and calling the programs
like bin/tashi-client.py. This permits the default config to be read,
since ./etc is a search path. I would argue that no relative paths
should be read by default and that ./etc should be changed to
/usr/local/tashi/etc. Better still would be to make the install
location configurable.
changed to be /usr/local/tashi/etc
9)
The programs all end in ".py". Yes, they are Python scripts, but users
probably don't care, and the suffix may carry negative associations
with scripts. I propose that executables in the bin/ directory be
stripped of their suffix.
removed .py suffix
Install was done using a grml live-cd Linux install running under a VM
on a Macbook Pro.
Special notes:-
(XXXstroucki add prereqs here)
(XXXstroucki: the output of createVm is verbose and not generally useful
to the end user)
(XXXstroucki: the error output here is harmless, but should not happen
(and does not happen in my cluster). I will address next.)
(XXXstroucki: the leaked stuff comes from lvm2 and can be suppressed by
setting the environment value LVM_SUPPRESS_FD_WARNINGS. Current attitude
is for lvm2 to no longer display the messages by default.)
(XXXstroucki: The lvremove output should be quieted.)
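As a workaround until lvm2 changes its default, the variable can be
exported before starting the node manager, so that the lvm2 child
processes inherit it (setting it to 1 is my assumption of the
conventional value):

```shell
# Suppress lvm2 "File descriptor ... leaked" warnings in lvremove
# output by exporting this before starting the node manager.
export LVM_SUPPRESS_FD_WARNINGS=1
```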
Here's the script:-
Prerequisites from the previous message to [email protected] must be met.
(XXXstroucki add prereqs here)
Script started on Wed 25 Jan 2012 07:47:56 AM CET
root@grml:~# mkdir /tmp/tashi
root@grml:~# cd /tmp/tashi
root@grml:/tmp/tashi# svn co
http://svn.apache.org/repos/asf/incubator/tashi/branches/stroucki-accounting
./tashi
A tashi/NOTICE
A tashi/LICENSE
A tashi/doc
A tashi/doc/DEVELOPMENT
A tashi/src
...
A tashi/DISCLAIMER
A tashi/Makefile
A tashi/README
U tashi
Checked out revision 1235865.
Move the distribution files to their destination (here: /usr/local/tashi)
root@grml:/tmp/tashi# mv tashi /usr/local/tashi
root@grml:/tmp/tashi# cd /usr/local/tashi
root@grml:/usr/local/tashi# ls
DISCLAIMER doc etc LICENSE Makefile NOTICE README src
Create daemons and clients
root@grml:/usr/local/tashi# make
Symlinking in clustermanager...
Symlinking in nodemanager...
Symlinking in tashi-client...
Symlinking in primitive...
Symlinking in zoni-cli...
Symlinking in Accounting server...
Done
root@grml:/usr/local/tashi# ls bin
accounting clustermanager nmd nodemanager primitive tashi-client
zoni-cli
If the Tashi source directory is not part of the system python search
path, you must add it.
root@grml:/usr/local/tashi# export PYTHONPATH=$PYTHONPATH:`pwd`/src
Start the cluster manager in debug mode to add a host (in this case, itself)
root@grml:/usr/local/tashi# cd bin
root@grml:/usr/local/tashi/bin# DEBUG=1 ./clustermanager
2012-01-25 07:50:50,285 [./clustermanager:INFO] Using configuration
file(s) ['/usr/local/tashi/etc/TashiDefaults.cfg']
2012-01-25 07:50:50,286 [./clustermanager:INFO] Starting cluster manager
**********************************************************************
Welcome to IPython. I will try to create a personal configuration directory
where you can customize many aspects of IPython's functionality in:
/root/.ipython
Initializing from configuration:
/usr/lib/python2.6/dist-packages/IPython/UserConfig
Successful installation!
Please read the sections 'Initial Configuration' and 'Quick Tips' in the
IPython manual (there are both HTML and PDF versions supplied with the
distribution) to make sure that your system environment is properly
configured
to take advantage of IPython's features.
Important note: the configuration system has changed! The old system is
still in place, but its setting may be partly overridden by the settings in
"~/.ipython/ipy_user_conf.py" config file. Please take a look at the file
if some of the new settings bother you.
Please press <RETURN> to start IPython.
**********************************************************************
from tashi.rpycservices.rpyctypes import Host, HostState, Network
In [1]: from tashi.rpycservices.rpyctypes import Host, HostState, Network
In [2]: data.baseDataObject.hosts[1] =
Host(d={'id':1,'name':'grml','state': HostState.Normal,'up':False})
In [3]:
data.baseDataObject.networks[1]=Network(d={'id':272,'name':'default'})
In [4]: data.baseDataObject.save()
In [4]: %Exit
Run the cluster manager in the background:
root@grml:/usr/local/tashi/bin# ./clustermanager &
[1] 4289
root@grml:/usr/local/tashi/bin# 2012-01-25 07:53:43,177
[./clustermanager:INFO] Using configuration file(s)
['/usr/local/tashi/etc/TashiDefaults.cfg']
2012-01-25 07:53:43,177 [./clustermanager:INFO] Starting cluster manager
Run the node manager in the background. Note that the hostname must be
registered with the cluster manager, as shown above.
root@grml:/usr/local/tashi/bin# ./nodemanager &
[2] 4293
root@grml:/usr/local/tashi/bin# 2012-01-25 07:53:59,348 [__main__:INFO]
Using configuration file(s) ['/usr/local/tashi/etc/TashiDefaults.cfg',
'/usr/local/tashi/etc/NodeManager.cfg']
2012-01-25 07:53:59,392
[/usr/local/tashi/src/tashi/nodemanager/vmcontrol/qemu.py:INFO] No VM
information found in /var/tmp/VmControlQemu/
2012-01-25 07:53:59,404
[/usr/local/tashi/src/tashi/nodemanager/vmcontrol/qemu.py:INFO] Waiting
for NM initialization
Verify that the node is shown as being "Up".
root@grml:/usr/local/tashi/bin# ./tashi-client gethosts
id reserved name decayed up state version memory cores notes
----------------------------------------------------------------
1 [] grml True True Normal HEAD 233 1 None
Start the primitive scheduling agent:
root@grml:/usr/local/tashi/bin# ./primitive &
[3] 4312
Verify that the cluster manager has full communication with the host.
When this has happened, decayed is False.
root@grml:/usr/local/tashi/bin# tashi-client gethosts
id reserved name decayed up state version memory cores notes
----------------------------------------------------------------
1 [] grml False True Normal HEAD 233 1 None
Check the presence of a disk image:
root@grml:/usr/local/tashi/bin# ls /tmp/images/
debian-wheezy-amd64.qcow2
root@grml:/usr/local/tashi/bin# ./tashi-client getimages
id imageName imageSize
---------------------------------------
0 debian-wheezy-amd64.qcow2 1.74G
Create a VM with 1 core and 128 MB of memory using our disk image in
non-persistent mode:
(XXXstroucki: the output of createVm is verbose and not generally useful
to the end user)
(XXXstroucki: the error output here is harmless, but should not happen
(and does not happen in my cluster). I will address next.)
root@grml:/usr/local/tashi/bin# ./tashi-client createVm --cores 1
--memory 128 --name wheezy --disks debian-wheezy-amd64.qcow2
{
hostId: None
name: wheezy
vmId: None
decayed: False
disks: [
{'uri': 'debian-wheezy-amd64.qcow2', 'persistent': False}
]
userId: 0
groupName: None
state: Pending
nics: [
{'ip': None, 'mac': '52:54:00:8d:f0:36', 'network': 0}
]
memory: 128
cores: 1
id: 1
hints: {}
}
root@grml:/usr/local/tashi/bin# 2012-01-25 07:56:58,670
[./primitive:INFO] Scheduling instance wheezy (128 mem, 1 cores, 0 uid)
on host grml
2012-01-25 07:56:58,679
[/usr/local/tashi/src/tashi/nodemanager/vmcontrol/qemu.py:INFO]
Executing command: /usr/bin/kvm -clock dynticks -drive
file=/tmp/images/debian-wheezy-amd64.qcow2,if=ide,index=0,cache=off,snapshot=on
-net nic,macaddr=52:54:00:8d:f0:36,model=virtio,vlan=0 -net
tap,ifname=tashi1.0,vlan=0,script=/etc/qemu-ifup.0 -m 128 -smp 1
-serial null -vnc none -monitor pty
2012-01-25 07:56:58,688
[/usr/local/tashi/src/tashi/nodemanager/vmcontrol/qemu.py:INFO] Adding
vmId 4370
2012-01-25 07:56:59,104
[/usr/local/tashi/src/tashi/nodemanager/vmcontrol/qemu.py:ERROR]
vmStateChange failed for VM wheezy
Traceback (most recent call last):
File "/usr/local/tashi/src/tashi/nodemanager/vmcontrol/qemu.py", line
227, in __matchHostPids
self.nm.vmStateChange(vmId, None, InstanceState.Running)
File "/usr/local/tashi/src/tashi/nodemanager/nodemanagerservice.py",
line 237, in vmStateChange
instance = self.__getInstance(vmId)
File "/usr/local/tashi/src/tashi/nodemanager/nodemanagerservice.py",
line 228, in __getInstance
raise TashiException(d={'errno':Errors.NoSuchVmId,'msg':"There is no
vmId %d on this host" % (vmId)})
TashiException: {'msg': 'There is no vmId 4370 on this host', 'errno': 3}
2012-01-25 07:57:00,110
[/usr/local/tashi/src/tashi/nodemanager/vmcontrol/qemu.py:ERROR]
vmStateChange failed for VM wheezy
Traceback (most recent call last):
File "/usr/local/tashi/src/tashi/nodemanager/vmcontrol/qemu.py", line
227, in __matchHostPids
self.nm.vmStateChange(vmId, None, InstanceState.Running)
File "/usr/local/tashi/src/tashi/nodemanager/nodemanagerservice.py",
line 237, in vmStateChange
instance = self.__getInstance(vmId)
File "/usr/local/tashi/src/tashi/nodemanager/nodemanagerservice.py",
line 228, in __getInstance
raise TashiException(d={'errno':Errors.NoSuchVmId,'msg':"There is no
vmId %d on this host" % (vmId)})
TashiException: {'msg': 'There is no vmId 4370 on this host', 'errno': 3}
Verify the machine is running:
root@grml:/usr/local/tashi/bin# ./tashi-client getinstances
id hostId name user state disk memory cores
---------------------------------------------------------------------
1 1 wheezy root Running debian-wheezy-amd64.qcow2 128 1
After the machine had a chance to boot, find out what address it got. If
you have a DHCP server on your network, search the pool of addresses:
root@grml:/usr/local/tashi/bin# ifconfig br0
br0 Link encap:Ethernet HWaddr 00:0c:29:62:b3:76
inet addr:192.168.244.131 Bcast:192.168.244.255
Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe62:b376/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2622 errors:0 dropped:0 overruns:0 frame:0
TX packets:1598 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:730925 (713.7 KiB) TX bytes:226530 (221.2 KiB)
root@grml:/usr/local/tashi/bin# arp-scan -I br0 192.168.244.0/24
Interface: br0, datalink type: EN10MB (Ethernet)
Starting arp-scan 1.6 with 256 hosts
(http://www.nta-monitor.com/tools/arp-scan/)
192.168.244.1 00:50:56:c0:00:08 VMWare, Inc.
192.168.244.2 00:50:56:e6:2e:0e VMWare, Inc.
192.168.244.136 52:54:00:8d:f0:36 QEMU
192.168.244.254 00:50:56:fc:50:42 VMWare, Inc.
4 packets received by filter, 0 packets dropped by kernel
Ending arp-scan 1.6: 256 hosts scanned in 1.493 seconds (171.47
hosts/sec). 4 responded
Log into the VM:
root@grml:/usr/local/tashi/bin# ssh [email protected]
The authenticity of host '192.168.244.136 (192.168.244.136)' can't be
established.
RSA key fingerprint is af:f2:1a:3a:2b:7c:c3:3b:6a:04:4f:37:bb:75:16:58.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.244.136' (RSA) to the list of known
hosts.
[email protected]'s password:
Linux debian 3.1.0-1-amd64 #1 SMP Tue Jan 10 05:01:58 UTC 2012 x86_64
The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Thu Jan 19 15:06:22 2012 from login.cirrus.pdl.cmu.local
debian:~#
debian:~# uname -a
Linux debian 3.1.0-1-amd64 #1 SMP Tue Jan 10 05:01:58 UTC 2012 x86_64
GNU/Linux
debian:~# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 2
model name : QEMU Virtual CPU version 0.14.0
stepping : 3
cpu MHz : 2193.593
cache size : 512 KB
fpu : yes
fpu_exception : yes
cpuid level : 4
wp : yes
flags : fpu pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush mmx fxsr sse sse2 syscall nx lm up nopl pni cx16
popcnt hypervisor lahf_lm svm abm sse4a
bogomips : 4387.18
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
debian:~# echo "my new vm!"
my new vm
debian:~# halt
Broadcast message from root@debian (pts/0) (Wed Jan 25 02:01:43 2012):
The system is going down for system halt NOW!
debian:~# Connection to 192.168.244.136 closed by remote host.
Connection to 192.168.244.136 closed.
(XXXstroucki: the leaked stuff comes from lvm2 and can be suppressed by
setting the environment value LVM_SUPPRESS_FD_WARNINGS. Current attitude
is for lvm2 to no longer display the messages by default.)
(XXXstroucki: The lvremove output should be quieted.)
root@grml:/usr/local/tashi/bin# 2012-01-25 08:01:56,544
[/usr/local/tashi/src/tashi/nodemanager/vmcontrol/qemu.py:INFO] Removing
vmId 4370 because it is no longer running
2012-01-25 08:01:56,545
[/usr/local/tashi/src/tashi/nodemanager/vmcontrol/qemu.py:INFO] Removing
any scratch for wheezy
File descriptor 3 (socket:[18519]) leaked on lvremove invocation. Parent
PID 4293: /usr/bin/python
File descriptor 5 (socket:[18526]) leaked on lvremove invocation. Parent
PID 4293: /usr/bin/python
File descriptor 7 (pipe:[18747]) leaked on lvremove invocation. Parent
PID 4293: /usr/bin/python
File descriptor 8 (/dev/pts/5 (deleted)) leaked on lvremove invocation.
Parent PID 4293: /usr/bin/python
Volume group "vgscratch" not found
Skipping volume group vgscratch
2012-01-25 08:01:57,245 [./primitive:INFO] VM exited: wheezy
Verify the VM is no longer running:
root@grml:/usr/local/tashi/bin# ./tashi-client getinstances
id hostId name user state disk memory cores
--------------------------------------------
Script done on Wed 25 Jan 2012 08:02:32 AM CET