Re: [one-users] The virtual machine failure migration!
Hi,

Sounds like libvirt/qemu has a problem restarting the VM. Are the config files for those daemons under /etc/libvirt set up identically on all your hosts? Did you make sure that the user that starts the kvm process is able to read and write the files? (In my install I added the user "qemu" to the "oneadmin" group to make that work.)

You should have a log file for this particular VM in /var/log/libvirt/qemu/one-[vm-id].log. What does it say?

You say it happens occasionally, so not always? Does it always fail when you migrate to this particular host, or only occasionally? If occasionally, you have to look for things that change from time to time. Do you have config files maintained by chef/puppet/cfengine? Do you have user accounts maintained by ldap/nis? If this host always fails, it must be a bad config on this host; compare it to the other hosts that do work.

Where does the "unable to read from monitor" come from? Is it opennebula? That is kind of normal: it tries to read the status of the VM, but since it is not up, it fails. But actually, that should not give you a "connection reset"...

What you can do to test your setup when it fails: go to the directory with the files and, before you change anything, do a "virsh create deployment.X" where X is the highest number you can find in that dir. (In your example, it would be 2.) (You will need to be root for this or you will not be able to use the "system" libvirt space.) If that "just works" then you have a really strange problem. If it gives you errors, try to solve them. :) (If you do not know "virsh", you should read up a bit on it.)

Hope this is a bit helpful.

Jhon

On 07/04/2012 12:59 PM, David wrote:

Hi all,

I am using OpenNebula version 3.2.1.
When I run a VM migrate operation, the migration occasionally fails with the following log:

Thu Jun 28 15:16:24 2012 [LCM][I]: New VM state is RUNNING
Thu Jun 28 15:17:09 2012 [LCM][I]: New VM state is SAVE_MIGRATE
Thu Jun 28 15:17:42 2012 [VMM][I]: save: Executed "virsh --connect qemu:///system save one-383 /one_images/383/images/checkpoint".
Thu Jun 28 15:17:42 2012 [VMM][I]: ExitCode: 0
Thu Jun 28 15:17:42 2012 [VMM][I]: Successfully execute virtualization driver operation: save.
Thu Jun 28 15:17:43 2012 [VMM][I]: ExitCode: 0
Thu Jun 28 15:17:43 2012 [VMM][I]: Successfully execute network driver operation: clean.
Thu Jun 28 15:17:43 2012 [LCM][I]: New VM state is PROLOG_MIGRATE
Thu Jun 28 15:56:03 2012 [TM][I]: tm_mv.sh: Moving /one_images/383/images
Thu Jun 28 15:56:03 2012 [TM][I]: tm_mv.sh: Executed "ssh compute-56-5.local mkdir -p /one_images/383".
Thu Jun 28 15:56:03 2012 [TM][I]: tm_mv.sh: Executed "scp -r compute-56-4.local:/one_images/383/images compute-56-5.local:/one_images/383/images".
Thu Jun 28 15:56:03 2012 [TM][I]: tm_mv.sh: Executed "ssh compute-56-4.local rm -rf /one_images/383/images".
Thu Jun 28 15:56:03 2012 [TM][I]: ExitCode: 0
Thu Jun 28 15:56:03 2012 [LCM][I]: New VM state is BOOT
Thu Jun 28 15:56:05 2012 [VMM][I]: ExitCode: 0
Thu Jun 28 15:56:05 2012 [VMM][I]: Successfully execute network driver operation: pre.
Thu Jun 28 15:56:06 2012 [VMM][I]: Command execution fail: /var/tmp/one/vmm/kvm/restore /one_images/383/images/checkpoint compute-56-5.local 383 compute-56-5.local
Thu Jun 28 15:56:06 2012 [VMM][E]: restore: Command "virsh --connect qemu:///system restore /one_images/383/images/checkpoint" failed.
Thu Jun 28 15:56:06 2012 [VMM][E]: restore: error: Failed to restore domain from /one_images/383/images/checkpoint
Thu Jun 28 15:56:06 2012 [VMM][I]: error: internal error process exited while connecting to monitor: qemu-kvm: -drive file=/one_images/383/images/disk.0,if=none,id=drive-virtio-disk0,format=raw: could not open disk image /one_images/383/images/disk.0: Permission denied
Thu Jun 28 15:56:06 2012 [VMM][E]: Could not restore from /one_images/383/images/checkpoint
Thu Jun 28 15:56:06 2012 [VMM][I]: ExitCode: 1
Thu Jun 28 15:56:06 2012 [VMM][I]: Failed to execute virtualization driver operation: restore.
Thu Jun 28 15:56:06 2012 [VMM][E]: Error restoring VM: Could not restore from /one_images/383/images/checkpoint
Thu Jun 28 15:56:06 2012 [DiM][I]: New VM state is FAILED

execute command : chmod +x *
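The log in this thread ends with "could not open disk image ... Permission denied", which fits the question about whether the user running the kvm process can read the files. As a rough illustration of that check, here is a minimal Python sketch of the classic Unix owner/group/other read test. The uid/gid numbers and the 0640 mode are made up for the example, and the sketch deliberately ignores ACLs, SELinux, execute permission on parent directories, and any ownership changes libvirt itself may apply, so treat it as a reasoning aid rather than a diagnosis.

```python
import stat


def can_read(mode: int, owner_uid: int, owner_gid: int,
             uid: int, gids: set) -> bool:
    """Return True if a process with `uid` and group ids `gids` passes the
    plain Unix owner/group/other read check for a file."""
    if uid == 0:
        return True  # root bypasses the mode bits
    if uid == owner_uid:
        return bool(mode & stat.S_IRUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


# Hypothetical ids: oneadmin is uid/gid 1001, qemu is uid/gid 107.
ONEADMIN, QEMU = 1001, 107

# disk.0 copied by scp as oneadmin with an assumed mode of 0640.
print("qemu alone:", can_read(0o640, ONEADMIN, ONEADMIN, QEMU, {QEMU}))
print("qemu in oneadmin group:",
      can_read(0o640, ONEADMIN, ONEADMIN, QEMU, {QEMU, ONEADMIN}))
```

On a real host the inputs would come from `os.stat()` on the disk image and from the group membership of the account that runs qemu-kvm on the destination host.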
Re: [one-users] Sunstone does not load any stats. (Users Digest, Vol 53, Issue 14)
Hi,

If that started to appear on Monday, after Saturday's leap second, then it could be related to http://it.slashdot.org/story/12/07/01/1920217/leap-second-bug-causes-crashes. Our opennebula server also had increased load (to ~20) after that. A reboot helped.

Regards,
Rolandas Naujikas

On 2012-07-05 05:48, Tao Craig wrote:

Hi Hector,

At first, I see the orange spinning balls... then after some time, this is replaced with "undefined".

I also noticed that ruby seems to be fairly stable until I try to load these graphs. Then one ruby script will jump to 100+% CPU usage and pretty much stay there. Sometimes, I will get a "Could not connect..." alert during this time and ruby will return to normal.

Seeing stuff like this when I trace the PID of the ruby script:

rt_sigreturn(0x1a) = 121
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---

sunstone.log:

Wed Jul 04 19:32:04 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:04] "GET /vmtemplate?timeout=true HTTP/1.1" 200 3907 53.3096
Wed Jul 04 19:32:08 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:07] "GET /acl?timeout=true HTTP/1.1" 200 377 33.8135
Wed Jul 04 19:32:12 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:11] "GET /vnet?timeout=true HTTP/1.1" 200 649 39.9651
Wed Jul 04 19:32:22 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:22] "GET /datastore?timeout=true HTTP/1.1" 200 2335 55.5165
Wed Jul 04 19:32:28 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:28] "GET /user?timeout=true HTTP/1.1" 200 1505 44.8972
Wed Jul 04 19:32:41 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:41] "GET /cluster?timeout=true HTTP/1.1" 200 27 26.3875
Wed Jul 04 19:32:44 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:44] "GET /image?timeout=true HTTP/1.1" 200 2957 65.6978
Wed Jul 04 19:32:48 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:48] "GET /host?timeout=true HTTP/1.1" 200 0 139.9275
Wed Jul 04 19:32:51 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:51] "GET /group?timeout=true HTTP/1.1" 200 796 36.0109
Wed Jul 04 19:32:58 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:58] "GET /datastore?timeout=true HTTP/1.1" 200 2335 56.8328

sunstone.error is empty, except for this:

== Sinatra/1.3.2 has taken the stage on 9869 for development with backup from Thin

"gem list":

addressable (2.2.8)
amazon-ec2 (0.9.17)
bcrypt-ruby (3.0.1)
curb (0.8.0)
daemons (1.1.8)
data_mapper (1.2.0)
data_objects (0.10.8)
dm-aggregates (1.2.0)
dm-constraints (1.2.0)
dm-core (1.2.0)
dm-do-adapter (1.2.0)
dm-migrations (1.2.0)
dm-mysql-adapter (1.2.0)
dm-serializer (1.2.1)
dm-sqlite-adapter (1.2.0)
dm-timestamps (1.2.0)
dm-transactions (1.2.0)
dm-types (1.2.1)
dm-validations (1.2.0)
do_mysql (0.10.8)
do_sqlite3 (0.10.8)
eventmachine (0.12.10)
fastercsv (1.5.4)
json (1.7.0, 1.6.7)
json_pure (1.6.7)
multi_json (1.0.4)
mysql (2.8.1)
net-ldap (0.3.1)
nokogiri (1.5.2)
rack (1.4.1)
rack-protection (1.2.0)
rake (0.8.7)
sequel (3.35.0)
sinatra (1.3.2)
sqlite3 (1.3.6)
stringex (1.3.3)
thin (1.3.1)
tilt (1.3.3)
uuidtools (2.1.2)
xml-simple (1.1.1)

I am running sunstone and opennebula on the same box... it does seem to be ruby related, but I never had a problem until Monday. Prior to that, nothing had changed on my end. I just came into the office on Monday and discovered I could not log in to the older, private cloud and the public cloud was very slow. Upgrading the public cloud seemed to help (aside from the issues mentioned above), but I can't upgrade the private cloud just yet and I would rather identify the source of this problem first.

The CLI is very fast and responsive and other tools, such as the VNC console, work fine. I haven't really been using the self-service portal (although I would like to in the future), but when I try to start it I get the following error:

Wed Jul 04 19:43:17 2012 [E]: Error initializing authentication system
Wed Jul 04 19:43:17 2012 [E]: [UserPoolInfo] User couldn't be authenticated, aborting call.

Thanks again for your help.

- Original Message -
From: "Hector Sanjuan"
To: ; "Tao Craig"
Sent: Wednesday, July 04, 2012 4:06 PM
Subject: Re: [one-users] Sunstone does not load any stats.

Hi,

> monitoring graphs on my hosts and virtual machines no longer appear.

Is there an empty graph in place or is there an error message? If you can attach sunstone.log and sunstone.error (if not empty) after trying to see those graphs etc., perhaps I will see something...

It's not normal that the dashboard takes 30 secs to load. I guess the CLI is not so slow when issuing a listing command (onehost list, onevm list etc.), right? And what exactly is the ruby script consuming 100%? (grep the pid from 'ps aux' or press 'c' during the execution of 'top' to find the full command.)

If you have this long-wait problem in two different clouds and ruby is consuming so much cpu, I would think there is an issue with your boxes' configuration, perhaps related to ruby. What's the output of 'gem list'? Are you running sunstone and opennebula on the same box? Have you tried
[one-users] opennebula3.4.1&&vSphere4.0 CPU&&MEM problem
Hi all,

I encountered a strange problem. I used the following template to create a virtual machine:

NAME = BCI00a0
MEMORY = 512
CPU = 1
VCPU = 1.0
RANK = FREECPU
OS = [ BOOT = hd, ARCH = x86_64 ]
DISK = [ IMAGE_ID = 58 ]
INPUT = [ TYPE = tablet, BUS = usb ]
GRAPHICS = [ TYPE = vnc, LISTEN = 0.0.0.0, PORT = -1 ]
NIC = [ MODEL = e1000, NETWORK = 10.150.0.0 ]
CONTEXT = [ VM_UUID = $NAME, HOSTNAME = $NAME, ETH0_IP = "$NIC[IP, NETWORK=\"10.150.0.0\"]", ETH0_NETMASK = 255.255.255.0, ETH0_GATEWAY = 10.150.0.254, ETH0_DNS1 = 8.8.8.8, ETH0_DNS2 = 9.9.9.9, ROOT_PASSWD = "zhanggp@123", FILES = "/opt/nebula/images/init.sh", TARGET = hdc ] ]

Then I used the onevm show command to list the info of this VM:

...
CPU="1"
MEMORY="512"
...
VIRTUAL MACHINE MONITORING
NET_TX : 0
NET_RX : 0
USED MEMORY : 0
USED CPU: 0

But after a few minutes, something strange happened. The info of the VM had changed:

CPU="2" , MEM="1024"
MEMORY="512"
VIRTUAL MACHINE MONITORING
NET_TX : 0
NET_RX : 0
USED MEMORY : 0
USED CPU : 0

Please help me understand this.

张光鹏
tel: 13718913184
mail: zhan...@neusoft.com
Mobile Internet Business Division, Neusoft Corporation
http://www.neusoft.com

___
Users mailing list
Users@lists.opennebula.org
http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
Re: [one-users] Sunstone does not load any stats.
Hi Hector,

At first, I see the orange spinning balls... then after some time, this is replaced with "undefined".

I also noticed that ruby seems to be fairly stable until I try to load these graphs. Then one ruby script will jump to 100+% CPU usage and pretty much stay there. Sometimes, I will get a "Could not connect..." alert during this time and ruby will return to normal.

Seeing stuff like this when I trace the PID of the ruby script:

rt_sigreturn(0x1a) = 121
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---

sunstone.log:

Wed Jul 04 19:32:04 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:04] "GET /vmtemplate?timeout=true HTTP/1.1" 200 3907 53.3096
Wed Jul 04 19:32:08 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:07] "GET /acl?timeout=true HTTP/1.1" 200 377 33.8135
Wed Jul 04 19:32:12 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:11] "GET /vnet?timeout=true HTTP/1.1" 200 649 39.9651
Wed Jul 04 19:32:22 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:22] "GET /datastore?timeout=true HTTP/1.1" 200 2335 55.5165
Wed Jul 04 19:32:28 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:28] "GET /user?timeout=true HTTP/1.1" 200 1505 44.8972
Wed Jul 04 19:32:41 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:41] "GET /cluster?timeout=true HTTP/1.1" 200 27 26.3875
Wed Jul 04 19:32:44 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:44] "GET /image?timeout=true HTTP/1.1" 200 2957 65.6978
Wed Jul 04 19:32:48 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:48] "GET /host?timeout=true HTTP/1.1" 200 0 139.9275
Wed Jul 04 19:32:51 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:51] "GET /group?timeout=true HTTP/1.1" 200 796 36.0109
Wed Jul 04 19:32:58 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:58] "GET /datastore?timeout=true HTTP/1.1" 200 2335 56.8328

sunstone.error is empty, except for this:

== Sinatra/1.3.2 has taken the stage on 9869 for development with backup from Thin

"gem list":

addressable (2.2.8)
amazon-ec2 (0.9.17)
bcrypt-ruby (3.0.1)
curb (0.8.0)
daemons (1.1.8)
data_mapper (1.2.0)
data_objects (0.10.8)
dm-aggregates (1.2.0)
dm-constraints (1.2.0)
dm-core (1.2.0)
dm-do-adapter (1.2.0)
dm-migrations (1.2.0)
dm-mysql-adapter (1.2.0)
dm-serializer (1.2.1)
dm-sqlite-adapter (1.2.0)
dm-timestamps (1.2.0)
dm-transactions (1.2.0)
dm-types (1.2.1)
dm-validations (1.2.0)
do_mysql (0.10.8)
do_sqlite3 (0.10.8)
eventmachine (0.12.10)
fastercsv (1.5.4)
json (1.7.0, 1.6.7)
json_pure (1.6.7)
multi_json (1.0.4)
mysql (2.8.1)
net-ldap (0.3.1)
nokogiri (1.5.2)
rack (1.4.1)
rack-protection (1.2.0)
rake (0.8.7)
sequel (3.35.0)
sinatra (1.3.2)
sqlite3 (1.3.6)
stringex (1.3.3)
thin (1.3.1)
tilt (1.3.3)
uuidtools (2.1.2)
xml-simple (1.1.1)

I am running sunstone and opennebula on the same box... it does seem to be ruby related, but I never had a problem until Monday. Prior to that, nothing had changed on my end. I just came into the office on Monday and discovered I could not log in to the older, private cloud and the public cloud was very slow. Upgrading the public cloud seemed to help (aside from the issues mentioned above), but I can't upgrade the private cloud just yet and I would rather identify the source of this problem first.

The CLI is very fast and responsive and other tools, such as the VNC console, work fine. I haven't really been using the self-service portal (although I would like to in the future), but when I try to start it I get the following error:

Wed Jul 04 19:43:17 2012 [E]: Error initializing authentication system
Wed Jul 04 19:43:17 2012 [E]: [UserPoolInfo] User couldn't be authenticated, aborting call.

Thanks again for your help.

- Original Message -
From: "Hector Sanjuan"
To: ; "Tao Craig"
Sent: Wednesday, July 04, 2012 4:06 PM
Subject: Re: [one-users] Sunstone does not load any stats.

Hi,

> monitoring graphs on my hosts and virtual machines no longer appear.

Is there an empty graph in place or is there an error message? If you can attach sunstone.log and sunstone.error (if not empty) after trying to see those graphs etc., perhaps I will see something...

It's not normal that the dashboard takes 30 secs to load. I guess the CLI is not so slow when issuing a listing command (onehost list, onevm list etc.), right? And what exactly is the ruby script consuming 100%? (grep the pid from 'ps aux' or press 'c' during the execution of 'top' to find the full command.)

If you have this long-wait problem in two different clouds and ruby is consuming so much cpu, I would think there is an issue with your boxes' configuration, perhaps related to ruby. What's the output of 'gem list'? Are you running sunstone and opennebula on the same box? Have you tried the Self-Service interface? Is it as slow as well?

Hector

On Thu, 05 Jul 2012 00:40:19 +0200, Tao Craig wrote:

Hector,

Thanks for the prompt reply. I am ashamed to admit that browser cache was the problem in this case. The dashboard still takes about 30 seconds to load, but at least it
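One way to read the sunstone.log excerpt posted in this thread is to rank the requests by the trailing number on each line. The Python sketch below does that for three of the posted lines. It assumes the last field is the elapsed time of the request (presumably in seconds, as in the common-log style that Rack-based servers tend to emit); nothing in the thread confirms that, so the ranking is only a hint about where to look.

```python
import re

# Matches the request part of a line: "METHOD path HTTP/x.y" status size time
REQUEST = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) (?P<size>\S+) (?P<seconds>\d+(?:\.\d+)?)\s*$'
)


def slowest(lines, top=3):
    """Return up to `top` (elapsed, path) pairs, largest elapsed value first.
    Lines that do not look like request entries are skipped."""
    rows = []
    for line in lines:
        match = REQUEST.search(line)
        if match:
            rows.append((float(match["seconds"]), match["path"]))
    return sorted(rows, reverse=True)[:top]


# Three lines copied from the log posted in the thread.
SAMPLE = """\
Wed Jul 04 19:32:44 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:44] "GET /image?timeout=true HTTP/1.1" 200 2957 65.6978
Wed Jul 04 19:32:48 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:48] "GET /host?timeout=true HTTP/1.1" 200 0 139.9275
Wed Jul 04 19:32:51 2012 [I]: xx.xxx.xxx.xxx - - [04/Jul/2012 19:32:51] "GET /group?timeout=true HTTP/1.1" 200 796 36.0109
""".splitlines()

for elapsed, path in slowest(SAMPLE):
    print(f"{elapsed:9.4f}  {path}")
```

In the posted excerpt the /host request stands out, both for the largest trailing value and for a response size of 0.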
Re: [one-users] Sunstone does not load any stats.
Hi,

> monitoring graphs on my hosts and virtual machines no longer appear.

Is there an empty graph in place or is there an error message? If you can attach sunstone.log and sunstone.error (if not empty) after trying to see those graphs etc., perhaps I will see something...

It's not normal that the dashboard takes 30 secs to load. I guess the CLI is not so slow when issuing a listing command (onehost list, onevm list etc.), right? And what exactly is the ruby script consuming 100%? (grep the pid from 'ps aux' or press 'c' during the execution of 'top' to find the full command.)

If you have this long-wait problem in two different clouds and ruby is consuming so much cpu, I would think there is an issue with your boxes' configuration, perhaps related to ruby. What's the output of 'gem list'? Are you running sunstone and opennebula on the same box? Have you tried the Self-Service interface? Is it as slow as well?

Hector

On Thu, 05 Jul 2012 00:40:19 +0200, Tao Craig wrote:

Hector,

Thanks for the prompt reply. I am ashamed to admit that browser cache was the problem in this case. The dashboard still takes about 30 seconds to load, but at least it is loading now. I noticed a few other minor issues, though, that I cannot track down in my logs. For example, the monitoring graphs on my hosts and virtual machines no longer appear. ... any advice?

Part of the reason I didn't catch the browser cache issue earlier is that I have a second CentOS/KVM cloud running version 3.2.0 and the dashboard recently stopped loading on it as well. This was not fixed by clearing my browser cache. Eventually, I get a "Could not connect..." alert and the page never finishes loading. During this time, there is a ruby script consuming 100+ percent of CPU resources. When I kill this script, the cloud is still functional but Sunstone is no longer running. The logs all appear normal as far as I can tell and all CLI commands work without error. Any suggestions here would be greatly appreciated as well.

Thanks.

- Original Message -
From: "Hector Sanjuan"
To:
Sent: Wednesday, July 04, 2012 4:17 AM
Subject: Re: [one-users] Sunstone does not load any stats.

Hello,

can you try clearing the browser's cache and see if that fixes it?

Hector

On Wed, 04 Jul 2012 02:10:34 +0200, Tao Craig wrote:

Hi everybody,

I recently upgraded my CentOS OpenNebula installation from 3.4 to 3.6 (Lagoon). Prior to the upgrade, I noticed my Sunstone dashboard was loading slowly on login (the page would load fine, but it took a while to load the graphs, number of hosts, etc). I saw there were some improvements to the Sunstone dashboard in this upgrade, so I applied it hoping it would help.

Now, my Sunstone dashboard doesn't load any stats or graphs... I just see those spinning orange dots and the rest of the Sunstone interface does not work either (I'm assuming because this information is never gathered).

There are no errors in my logs anywhere that I can find. The only thing I am noticing is that ruby scripts are consuming a large amount of CPU resources. If it helps, I am currently running 13 virtual machines on 9 hosts and all "one" CLI commands work fine.

Any help would be appreciated. Thanks.

--
Hector Sanjuan
OpenNebula Developer

--
Hector Sanjuan
OpenNebula Developer
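The suggestion in this thread to grep the pid from 'ps aux' to find the full command of the busy ruby process can be scripted. The following Python sketch parses text in the usual `ps aux` column layout and lists the matching processes by CPU share. The sample output, including the process paths and numbers, is invented for illustration and is not taken from the poster's machines.

```python
def busiest(ps_output, needle="ruby", top=3):
    """Return up to `top` (cpu_percent, pid, command) tuples for processes
    whose command contains `needle`, highest CPU first."""
    rows = []
    for line in ps_output.splitlines()[1:]:  # first line is the header
        # USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND...
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        pid, cpu, command = int(fields[1]), float(fields[2]), fields[10]
        if needle in command:
            rows.append((cpu, pid, command))
    return sorted(rows, reverse=True)[:top]


# Hypothetical `ps aux` output; paths and values are placeholders.
SAMPLE = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
oneadmin  2101  0.3  1.2 201344 49120 ?        Sl   Jul03   1:12 ruby /path/to/one_vmm_exec.rb kvm
oneadmin  2188  101  2.9 312876 118004 ?       Rl   Jul03 312:40 ruby /path/to/sunstone-server.rb
root      2301  0.0  0.1  64120  1180 ?        Ss   Jul03   0:00 /usr/sbin/sshd
"""

for cpu, pid, command in busiest(SAMPLE):
    print(f"{cpu:6.1f}%  pid {pid}  {command}")
```

On a live host the text could be obtained with `subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout` and passed to the same function.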
Re: [one-users] Sunstone does not load any stats.
Hector,

Thanks for the prompt reply. I am ashamed to admit that browser cache was the problem in this case. The dashboard still takes about 30 seconds to load, but at least it is loading now. I noticed a few other minor issues, though, that I cannot track down in my logs. For example, the monitoring graphs on my hosts and virtual machines no longer appear. ... any advice?

Part of the reason I didn't catch the browser cache issue earlier is that I have a second CentOS/KVM cloud running version 3.2.0 and the dashboard recently stopped loading on it as well. This was not fixed by clearing my browser cache. Eventually, I get a "Could not connect..." alert and the page never finishes loading. During this time, there is a ruby script consuming 100+ percent of CPU resources. When I kill this script, the cloud is still functional but Sunstone is no longer running. The logs all appear normal as far as I can tell and all CLI commands work without error. Any suggestions here would be greatly appreciated as well.

Thanks.

- Original Message -
From: "Hector Sanjuan"
To:
Sent: Wednesday, July 04, 2012 4:17 AM
Subject: Re: [one-users] Sunstone does not load any stats.

Hello,

can you try clearing the browser's cache and see if that fixes it?

Hector

On Wed, 04 Jul 2012 02:10:34 +0200, Tao Craig wrote:

Hi everybody,

I recently upgraded my CentOS OpenNebula installation from 3.4 to 3.6 (Lagoon). Prior to the upgrade, I noticed my Sunstone dashboard was loading slowly on login (the page would load fine, but it took a while to load the graphs, number of hosts, etc). I saw there were some improvements to the Sunstone dashboard in this upgrade, so I applied it hoping it would help.

Now, my Sunstone dashboard doesn't load any stats or graphs... I just see those spinning orange dots and the rest of the Sunstone interface does not work either (I'm assuming because this information is never gathered).

There are no errors in my logs anywhere that I can find. The only thing I am noticing is that ruby scripts are consuming a large amount of CPU resources. If it helps, I am currently running 13 virtual machines on 9 hosts and all "one" CLI commands work fine.

Any help would be appreciated. Thanks.

--
Hector Sanjuan
OpenNebula Developer
Re: [one-users] Public IPs on an internal private network of an OpenNebula Cluster
Hello Patrizio,

If I understood correctly, the master server is the OpenNebula server and the slaves are the KVM hosts. The master is connected to the Internet, has a public IP address and is also connected to a private LAN; the slaves (KVM servers) are connected only to the private LAN. You also use the master server to masquerade traffic from the VMs to the Internet via iptables.

If that's correct, you have two possible solutions:

1) configure the public IPs as secondary addresses on the master server and set iptables to DNAT traffic from each public IP to the private one
2) install a second ethernet card on each slave server, connect it to the public Internet and create a second bridge (without a real public IP address).

For me the second solution is better.

Bye,
Alberto

On 04/07/2012 16:36, Patrizio Dazzi wrote:

Dear all OpenNebulers,

As a researcher of the HPCLab at ISTI-CNR I am working for the CONTRAIL EU project (http://www.contrail-project.eu/), whose main aim is to conceive and develop a holistic system for building cloud federations that can be managed in an integrated and seamless way.

For the reference implementation of the CONTRAIL system, we decided to use OpenNebula as the provider-level IaaS. As a consequence, CNR and a few other project partners are each setting up an OpenNebula cloud.

Unfortunately, at CNR we are experiencing some issues related to OpenNebula configuration. These are mainly due to the limited availability of public IPs on our side and the consequent decision to reserve them for the VMs rather than assigning a public IP to each physical machine.

Let me describe what's going on on our side.

We have installed the tarball distribution of OpenNebula 3.4.1 for running virtual machines on a (KVM-based) cluster made of 5 computers: a front-end machine and 4 slave machines. Currently, the master has 2 network interfaces configured whereas the slaves have only a single network interface configured each. All the nodes of the cluster are running Ubuntu Server 12.04 64 bit.

The slaves of the cluster are connected to the front-end via a gigabit switch. The front-end uses the second network interface to connect to the Internet. The front-end is the only machine with a public IP; the internal network uses a range of private IPs (192.168.100.X). The front-end iptables has already been properly configured to forward and masquerade the connections from the slaves to the Internet. Indeed, we are able to connect to the Ubuntu update sites directly from the slaves.

I also have a few public IPs that I would like to assign to certain virtual machines that will run on the cluster.

Unfortunately, the slaves are connected to a private network, hence their virtual bridges, as far as I know, can receive only packets sent to IPs having the same network address/mask. As a consequence, assigning them a public IP would be useless because the packets won't be properly routed to the physical machine hosting such a public IP.

Can you help me? Do you have any suggestions?

Best Regards,
-- Patrizio

Dr Patrizio Dazzi, Ph.D.
HPC Lab @ ISTI-CNR, Via Moruzzi, 1 - 56126, Pisa, Italy
Phone: +39 050 315 30 74 -- Fax: +39 050 315 20 40

"Genius is one percent inspiration, ninety-nine percent perspiration" - Thomas Alva Edison

--
Alberto Zuin
via Mare, 36/A 36030 Lugo di Vicenza (VI) Italy
P.I. 04310790284
Tel. +39.0499271575 Fax. +39.0492106654 Cell. +39.3286268626
www.azns.it - albe...@azns.it
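The first option described in this message (public IPs as secondary addresses on the master, plus DNAT to the private address) could look roughly like the commands generated by the Python sketch below. Everything concrete in it is an assumption for illustration: the interface name eth1, the documentation-range public address 203.0.113.10, and the private address 192.168.100.50. The SNAT rule is inserted at the top of POSTROUTING on the assumption that it has to be evaluated before the existing masquerade rule; the FORWARD chain would also need to allow the traffic, which the sketch does not cover.

```python
import ipaddress


def dnat_commands(public_ip, private_ip, wan_if="eth1"):
    """Build the shell commands for a one-to-one NAT of `public_ip` on the
    front-end to a VM at `private_ip`. Raises ValueError on bad addresses."""
    pub = ipaddress.ip_address(public_ip)
    priv = ipaddress.ip_address(private_ip)
    return [
        # Secondary address so the front-end answers for the public IP.
        f"ip addr add {pub}/32 dev {wan_if}",
        # Inbound: rewrite the destination to the VM's private address.
        f"iptables -t nat -A PREROUTING -d {pub} -j DNAT --to-destination {priv}",
        # Outbound: make the VM's traffic leave with its own public address.
        f"iptables -t nat -I POSTROUTING 1 -s {priv} -j SNAT --to-source {pub}",
    ]


# Placeholder addresses, not taken from the thread.
for command in dnat_commands("203.0.113.10", "192.168.100.50"):
    print(command)
```

The function only prints the commands; running them requires root on the front-end and should be tried on a test VM first.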
Re: [one-users] Public IPs on an internal private network of an OpenNebula Cluster
On 7/4/12 4:36 PM, Patrizio Dazzi wrote:
> Dear all OpenNebulers,
>
> As a researcher of the HPCLab at ISTI-CNR I am working for the CONTRAIL EU project (http://www.contrail-project.eu/), whose main aim is to conceive and develop a holistic system for building cloud federations that can be managed in an integrated and seamless way.
>
> For the reference implementation of the CONTRAIL system, we decided to use OpenNebula as the provider-level IaaS. As a consequence, CNR and a few other project partners are each setting up an OpenNebula cloud.
>
> Unfortunately, at CNR we are experiencing some issues related to OpenNebula configuration. These are mainly due to the limited availability of public IPs on our side and the consequent decision to reserve them for the VMs rather than assigning a public IP to each physical machine.
>
> Let me describe what's going on on our side.
>
> We have installed the tarball distribution of OpenNebula 3.4.1 for running virtual machines on a (KVM-based) cluster made of 5 computers: a front-end machine and 4 slave machines. Currently, the master has 2 network interfaces configured whereas the slaves have only a single network interface configured each. All the nodes of the cluster are running Ubuntu Server 12.04 64 bit.
>
> The slaves of the cluster are connected to the front-end via a gigabit switch. The front-end uses the second network interface to connect to the Internet. The front-end is the only machine with a public IP; the internal network uses a range of private IPs (192.168.100.X). The front-end iptables has already been properly configured to forward and masquerade the connections from the slaves to the Internet. Indeed, we are able to connect to the Ubuntu update sites directly from the slaves.
>
> I also have a few public IPs that I would like to assign to certain virtual machines that will run on the cluster.
>
> Unfortunately, the slaves are connected to a private network, hence their virtual bridges, as far as I know, can receive only packets sent to IPs having the same network address/mask. As a consequence, assigning them a public IP would be useless because the packets won't be properly routed to the physical machine hosting such a public IP.
>
> Can you help me? Do you have any suggestions?

Can't you connect the slaves to the "public" network with the interface down by default? Then, in your boot script, if the template assigns a public IP, configure the public interface to use it and bring the interface up (of course, the routing tables should also be modified to get direct access).

Olivier

> Best Regards,
> -- Patrizio
>
> Dr Patrizio Dazzi, Ph.D.
> HPC Lab @ ISTI-CNR, Via Moruzzi, 1 - 56126, Pisa, Italy
> Phone: +39 050 315 30 74 -- Fax: +39 050 315 20 40
>
> "Genius is one percent inspiration, ninety-nine percent perspiration" - Thomas Alva Edison

--
Olivier Sallou
IRISA / University of Rennes 1
Campus de Beaulieu, 35000 RENNES - FRANCE
Tel: 02.99.84.71.95
gpg key id: 4096R/326D8438 (keyring.debian.org)
Key fingerprint = 5FB4 6F83 D3B9 5204 6335 D26D 78DC 68DB 326D 8438
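The boot-script idea in this reply amounts to a small decision: if the assigned address is outside the private LAN, configure and bring up the public interface and adjust the default route. A minimal Python sketch of that decision follows. Only the 192.168.100.0/24 range comes from the thread; the interface name, the /24 prefix on the public side, and the gateway address are hypothetical, and how such a script is hooked into the boot process is left open.

```python
import ipaddress

# Private LAN range mentioned in the thread (192.168.100.X).
PRIVATE_LAN = ipaddress.ip_network("192.168.100.0/24")


def interface_plan(assigned_ip, public_if="eth1", gateway="203.0.113.1"):
    """Return the commands a boot script might run for `assigned_ip`.
    An address on the private LAN needs no extra setup, so the plan is empty."""
    ip = ipaddress.ip_address(assigned_ip)
    if ip in PRIVATE_LAN:
        return []
    return [
        f"ip addr add {ip}/24 dev {public_if}",   # assumed /24 public subnet
        f"ip link set {public_if} up",
        f"ip route replace default via {gateway} dev {public_if}",
    ]


# Placeholder addresses for illustration.
for address in ("192.168.100.7", "203.0.113.10"):
    print(address, "->", interface_plan(address))
```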
[one-users] Public IPs on an internal private network of an OpenNebula Cluster
Dear all OpenNebulers,

As a researcher of the HPCLab at ISTI-CNR I am working for the CONTRAIL EU project (http://www.contrail-project.eu/), whose main aim is to conceive and develop a holistic system for building cloud federations that can be managed in an integrated and seamless way.

For the reference implementation of the CONTRAIL system, we decided to use OpenNebula as the provider-level IaaS. As a consequence, CNR and a few other project partners are each setting up an OpenNebula cloud.

Unfortunately, at CNR we are experiencing some issues related to OpenNebula configuration. These are mainly due to the limited availability of public IPs on our side and the consequent decision to reserve them for the VMs rather than assigning a public IP to each physical machine.

Let me describe what's going on on our side.

We have installed the tarball distribution of OpenNebula 3.4.1 for running virtual machines on a (KVM-based) cluster made of 5 computers: a front-end machine and 4 slave machines. Currently, the master has 2 network interfaces configured whereas the slaves have only a single network interface configured each. All the nodes of the cluster are running Ubuntu Server 12.04 64 bit.

The slaves of the cluster are connected to the front-end via a gigabit switch. The front-end uses the second network interface to connect to the Internet. The front-end is the only machine with a public IP; the internal network uses a range of private IPs (192.168.100.X). The front-end iptables has already been properly configured to forward and masquerade the connections from the slaves to the Internet. Indeed, we are able to connect to the Ubuntu update sites directly from the slaves.

I also have a few public IPs that I would like to assign to certain virtual machines that will run on the cluster.

Unfortunately, the slaves are connected to a private network, hence their virtual bridges, as far as I know, can receive only packets sent to IPs having the same network address/mask. As a consequence, assigning them a public IP would be useless because the packets won't be properly routed to the physical machine hosting such a public IP.

Can you help me? Do you have any suggestions?

Best Regards,
-- Patrizio

Dr Patrizio Dazzi, Ph.D.
HPC Lab @ ISTI-CNR, Via Moruzzi, 1 - 56126, Pisa, Italy
Phone: +39 050 315 30 74 -- Fax: +39 050 315 20 40

"Genius is one percent inspiration, ninety-nine percent perspiration" - Thomas Alva Edison
[one-users] Reminder: call for translations (deadline 06th July 9am CEST)
Dear community,

This is a reminder of the call for translations for our web interfaces: Sunstone and Self-Service. The deadline has been moved a few hours earlier, to Friday 6th of July, 9am CEST, so that we can better accommodate the testing and packaging of the new OpenNebula 3.6 version, to be released on 9th of July.

For instructions on how to translate, check [1].

As corrections and improvements have been carried out during the beta phase, there have been minor changes to some translation strings. These changes are now reflected in the Transifex project page [2]. We therefore kindly ask all translators to complete the translations as much as possible before July 6th, 9am CEST.

We would like to thank all the contributors for the great response and the efforts carried out so far. It is great to have so many languages :)

-- 
Hector Sanjuan
OpenNebula Developer

[1] http://blog.opennebula.org/?p=3124
[2] https://www.transifex.com/projects/p/one
Re: [one-users] Failed to connect to server (code: 1006)
Solved. I forgot to set the LISTEN parameter ...

Thanks, Hector!

Jan

On 04.07.2012 15:34, Hector Sanjuan wrote:
> On Wed, 04 Jul 2012 15:28:18 +0200, Jan Benadik wrote:
>> - no, I cannot connect to running VM via another VNC viewer
>
> Well, then it's a problem with the hypervisor or the connectivity to the VM physical host? You should try to fix that and then check if the problem still persists in Sunstone.
>
> I'm assuming that you defined the GRAPHICS section correctly in your VM template and that the LISTEN param is set to 0.0.0.0?

-- 
Ján Beňadik
Managed Services - Solution Design Architect
+421 46 5151 332
+421 903 691 634
jan.bena...@atos.net
Vinohradnícka 6, 971 01 Prievidza
www.sk.atos.net
Re: [one-users] Failed to connect to server (code: 1006)
On Wed, 04 Jul 2012 15:28:18 +0200, Jan Benadik wrote:
> - no, I cannot connect to running VM via another VNC viewer

Well, then it's a problem with the hypervisor or the connectivity to the VM physical host? You should try to fix that and then check if the problem still persists in Sunstone.

I'm assuming that you defined the GRAPHICS section correctly in your VM template and that the LISTEN param is set to 0.0.0.0?

-- 
Hector Sanjuan
OpenNebula Developer
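As a rough way to test the connectivity question raised above without a VNC viewer, the sketch below simply tries to open a TCP connection to the VNC port. It is a minimal illustration using Python's standard socket module, not part of OpenNebula; the function name tcp_port_open and the default timeout are assumptions made for this example.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper for illustration. It only checks reachability of
    the port; it does not speak the VNC protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Run from the Sunstone machine, a call such as tcp_port_open("myto", 5901) would use the host and port that appear in the websockify line of the sunstone.log excerpt in this thread. If the VNC server only listens on 127.0.0.1 instead of 0.0.0.0, a connection attempt from another machine is refused and the function returns False.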
Re: [one-users] Failed to connect to server (code: 1006)
- /var/log/one/sunstone.error is empty
- websockify is not running
- I already ran /usr/share/one/install_novnc.sh at the end of the installation; the config files in /etc/one were modified, and I restarted one, occi-server and sunstone-server after that
- no, I cannot connect to the running VM via another VNC viewer
- I installed OpenNebula 3.6 from scratch (fresh installation, no upgrade)
- yes, the line "starting VNC proxy" appears in sunstone.log (see below)

-- Server configuration --
{:marketplace_url=>"http://marketplace.c12g.com/", :vnc_proxy_cert=>nil, :core_auth=>"cipher", :auth=>"sunstone", :vnc_proxy_support_wss=>false, :vnc_proxy_base_port=>29876, :debug_level=>3, :tmpdir=>"/var/tmp/one", :vnc_proxy_key=>nil, :host=>"0.0.0.0", :port=>9869, :_one_xmlrpc_=>"http://localhost:2633/RPC2", :lang=>"en_US", :vnc_proxy_path=>"/usr/share/one/noVNC/utils/websockify"}
. . .
Wed Jul 04 17:47:57 2012 [I]: Starting vnc proxy: /usr/share/one/noVNC/utils/websockify 35777 myto:5901
Wed Jul 04 17:47:57 2012 [I]: 10.0.3.51 - - [04/Jul/2012 17:47:57] "POST /vm/1/startvnc HTTP/1.1" 200 30 0.0518
Wed Jul 04 17:48:04 2012 [I]: 10.0.3.51 - - [04/Jul/2012 17:48:04] "GET /vnet?timeout=true HTTP/1.1" 200 790 0.0139

The next lines in the log appear after closing the window with the error message:

Wed Jul 04 17:48:06 2012 [I]: 10.0.3.51 - - [04/Jul/2012 17:48:06] "POST /vm/1/stopvnc HTTP/1.1" 200 - 0.0028
Wed Jul 04 17:48:06 2012 [I]: 10.0.3.51 - - [04/Jul/2012 17:48:06] "GET /vmtemplate?timeout=true HTTP/1.1" 200 948 0.0159
Wed Jul 04 17:48:07 2012 [I]: 10.0.3.51 - - [04/Jul/2012 17:48:07] "GET /image?timeout=true HTTP/1.1" 200 901 0.0142
Wed Jul 04 17:48:30 2012 [I]: 10.0.3.51 - - [04/Jul/2012 17:48:30] "GET /acl?timeout=true HTTP/1.1" 200 377 0.1223
Wed Jul 04 17:48:34 2012 [I]: 10.0.3.51 - - [04/Jul/2012 17:48:34] "GET /datastore?timeout=true HTTP/1.1" 200 2312 0.0284
Wed Jul 04 17:48:34 2012 [I]: 10.0.3.51 - - [04/Jul/2012 17:48:34] "GET /group?timeout=true HTTP/1.1" 200 605 0.0105
Wed Jul 04 17:48:35 2012 [I]: 10.0.3.51 - - [04/Jul/2012 17:48:35] "GET /cluster?timeout=true HTTP/1.1" 200 27 0.0072
Wed Jul 04 17:48:39 2012 [I]: 10.0.3.51 - - [04/Jul/2012 17:48:39] "GET /vm?timeout=true HTTP/1.1" 200 2254 0.0278

Jan

On 04.07.2012 15:11, Hector Sanjuan wrote:
> Hi, let's see:
>
> - Check sunstone.error too.
> - Check that no websockify from a previous session is running (ps aux | grep websockify).
> - Check that you have run ./install_novnc.sh after your upgrade.
> - Check that you can still connect to the machine's VNC directly with a different VNC client.
> - Is the "starting vnc proxy" line showing up in sunstone.log?
> - From which version were you upgrading?
>
> You can attach sunstone.log so I can check if I see something that you missed.
>
> Hector
>
> On Wed, 04 Jul 2012 14:56:14 +0200, Jan Benadik wrote:
>> Hi all,
>>
>> I've installed the newest version of OpenNebula (3.6 beta) on Ubuntu Server 12.04 (management node and host too). When I click on the VNC icon of a running VM in the Sunstone UI, a window opens with this error message:
>>
>> Failed to connect to server (code: 1006)
>>
>> The behaviour is the same in the newest Chromium and in Firefox 13 ...
>>
>> There is nothing in /var/log/syslog (on either the host or the management node), /var/log/one/oned.log, or /var/log/one/sunstone.log (management node).
>>
>> Where could the issue be?
>>
>> Jan
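The "Starting vnc proxy" line in the log above shows which local port websockify listens on and which host and port it forwards to. The sketch below pulls those three values out of such a line with a regular expression. The pattern is written against the single log line quoted in this thread; the names ProxyStart and parse_proxy_start are made up for this example, and other Sunstone versions may format the line differently.

```python
import re
from typing import NamedTuple, Optional


class ProxyStart(NamedTuple):
    listen_port: int   # local port websockify listens on
    target_host: str   # host running the VM's VNC server
    target_port: int   # VNC port on that host


# Assumed layout: "Starting vnc proxy: <websockify path> <port> <host>:<port>"
_PROXY_LINE = re.compile(r"Starting vnc proxy: \S+ (\d+) ([^\s:]+):(\d+)")


def parse_proxy_start(line: str) -> Optional[ProxyStart]:
    """Extract proxy port and VNC target from a sunstone.log line, if present."""
    match = _PROXY_LINE.search(line)
    if match is None:
        return None
    return ProxyStart(int(match.group(1)), match.group(2), int(match.group(3)))


line = ("Wed Jul 04 17:47:57 2012 [I]: Starting vnc proxy: "
        "/usr/share/one/noVNC/utils/websockify 35777 myto:5901")
print(parse_proxy_start(line))
# ProxyStart(listen_port=35777, target_host='myto', target_port=5901)
```

The target host and port are the ones the proxy must be able to reach, so they are the natural values to test when the browser reports that it failed to connect.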
Re: [one-users] Failed to connect to server (code: 1006)
Hi, let's see:

- Check sunstone.error too.
- Check that no websockify from a previous session is running (ps aux | grep websockify).
- Check that you have run ./install_novnc.sh after your upgrade.
- Check that you can still connect to the machine's VNC directly with a different VNC client.
- Is the "starting vnc proxy" line showing up in sunstone.log?
- From which version were you upgrading?

You can attach sunstone.log so I can check if I see something that you missed.

Hector

On Wed, 04 Jul 2012 14:56:14 +0200, Jan Benadik wrote:
> Hi all,
>
> I've installed the newest version of OpenNebula (3.6 beta) on Ubuntu Server 12.04 (management node and host too). When I click on the VNC icon of a running VM in the Sunstone UI, a window opens with this error message:
>
> Failed to connect to server (code: 1006)
>
> The behaviour is the same in the newest Chromium and in Firefox 13 ...
>
> There is nothing in /var/log/syslog (on either the host or the management node), /var/log/one/oned.log, or /var/log/one/sunstone.log (management node).
>
> Where could the issue be?
>
> Jan

-- 
Hector Sanjuan
OpenNebula Developer
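The check for a leftover websockify process (ps aux | grep websockify) can also be scripted. The following is a minimal sketch, assuming a Unix-like system where ps aux is available; the function names are made up for this example. The filtering is kept separate from the call to ps so that it can be tried on captured output.

```python
import subprocess
from typing import List


def find_processes(ps_output: str, name: str) -> List[str]:
    """Return the lines of `ps aux` output that mention `name`.

    The first line of the output is the column header and is skipped.
    """
    lines = ps_output.splitlines()
    return [line for line in lines[1:] if name in line]


def running_websockify() -> List[str]:
    """Run `ps aux` and return the lines that mention websockify."""
    result = subprocess.run(
        ["ps", "aux"], capture_output=True, text=True, check=True
    )
    return find_processes(result.stdout, "websockify")
```

An empty list from running_websockify() means no proxy from a previous session is still running on that machine.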
[one-users] Failed to connect to server (code: 1006)
Hi all,

I've installed the newest version of OpenNebula (3.6 beta) on Ubuntu Server 12.04 (management node and host too). When I click on the VNC icon of a running VM in the Sunstone UI, a window opens with this error message:

Failed to connect to server (code: 1006)

The behaviour is the same in the newest Chromium and in Firefox 13 ...

There is nothing in /var/log/syslog (on either the host or the management node), /var/log/one/oned.log, or /var/log/one/sunstone.log (management node).

Where could the issue be?

Jan
Re: [one-users] Sunstone does not load any stats.
Hello,

can you try clearing the browser's cache and see if that fixes it?

Hector

On Wed, 04 Jul 2012 02:10:34 +0200, Tao Craig wrote:
> Hi everybody,
>
> I recently upgraded my CentOS OpenNebula installation from 3.4 to 3.6 (Lagoon). Prior to the upgrade, I noticed my Sunstone dashboard was loading slowly on login (the page would load fine, but it took a while to load the graphs, number of hosts, etc.). I saw there were some improvements to the Sunstone dashboard in this upgrade, so I applied it hoping it would help.
>
> Now my Sunstone dashboard doesn't load any stats or graphs... I just see those spinning orange dots, and the rest of the Sunstone interface does not work either (I'm assuming because this information is never gathered).
>
> There are no errors in my logs anywhere that I can find. The only thing I am noticing is that Ruby scripts are consuming a large amount of CPU resources.
>
> If it helps, I am currently running 13 virtual machines on 9 hosts, and all "one" CLI commands work fine.
>
> Any help would be appreciated. Thanks.

-- 
Hector Sanjuan
OpenNebula Developer
[one-users] The virtual machine failure migration!
Hi, all,

I use OpenNebula version 3.2.1. When I execute a VM migrate operation, the migration occasionally fails, with the following log:

Thu Jun 28 15:16:24 2012 [LCM][I]: New VM state is RUNNING
Thu Jun 28 15:17:09 2012 [LCM][I]: New VM state is SAVE_MIGRATE
Thu Jun 28 15:17:42 2012 [VMM][I]: save: Executed "virsh --connect qemu:///system save one-383 /one_images/383/images/checkpoint".
Thu Jun 28 15:17:42 2012 [VMM][I]: ExitCode: 0
Thu Jun 28 15:17:42 2012 [VMM][I]: Successfully execute virtualization driver operation: save.
Thu Jun 28 15:17:43 2012 [VMM][I]: ExitCode: 0
Thu Jun 28 15:17:43 2012 [VMM][I]: Successfully execute network driver operation: clean.
Thu Jun 28 15:17:43 2012 [LCM][I]: New VM state is PROLOG_MIGRATE
Thu Jun 28 15:56:03 2012 [TM][I]: tm_mv.sh: Moving /one_images/383/images
Thu Jun 28 15:56:03 2012 [TM][I]: tm_mv.sh: Executed "ssh compute-56-5.local mkdir -p /one_images/383".
Thu Jun 28 15:56:03 2012 [TM][I]: tm_mv.sh: Executed "scp -r compute-56-4.local:/one_images/383/images compute-56-5.local:/one_images/383/images".
Thu Jun 28 15:56:03 2012 [TM][I]: tm_mv.sh: Executed "ssh compute-56-4.local rm -rf /one_images/383/images".
Thu Jun 28 15:56:03 2012 [TM][I]: ExitCode: 0
Thu Jun 28 15:56:03 2012 [LCM][I]: New VM state is BOOT
Thu Jun 28 15:56:05 2012 [VMM][I]: ExitCode: 0
Thu Jun 28 15:56:05 2012 [VMM][I]: Successfully execute network driver operation: pre.
Thu Jun 28 15:56:06 2012 [VMM][I]: Command execution fail: /var/tmp/one/vmm/kvm/restore /one_images/383/images/checkpoint compute-56-5.local 383 compute-56-5.local
Thu Jun 28 15:56:06 2012 [VMM][E]: restore: Command "virsh --connect qemu:///system restore /one_images/383/images/checkpoint" failed.
Thu Jun 28 15:56:06 2012 [VMM][E]: restore: error: Failed to restore domain from /one_images/383/images/checkpoint
Thu Jun 28 15:56:06 2012 [VMM][I]: error: internal error process exited while connecting to monitor: qemu-kvm: -drive file=/one_images/383/images/disk.0,if=none,id=drive-virtio-disk0,format=raw: could not open disk image /one_images/383/images/disk.0: Permission denied
Thu Jun 28 15:56:06 2012 [VMM][E]: Could not restore from /one_images/383/images/checkpoint
Thu Jun 28 15:56:06 2012 [VMM][I]: ExitCode: 1
Thu Jun 28 15:56:06 2012 [VMM][I]: Failed to execute virtualization driver operation: restore.
Thu Jun 28 15:56:06 2012 [VMM][E]: Error restoring VM: Could not restore from /one_images/383/images/checkpoint
Thu Jun 28 15:56:06 2012 [DiM][I]: New VM state is FAILED

I then executed the command:

chmod +x *

but received the following error message:

error: Unable to read from monitor: Connection reset by peer

What is causing this problem?

Thanks! I hope to hear from you.

Regards,
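The "Permission denied" line in the log above points at the access rights on disk.0 as seen by the user that runs the qemu-kvm process. The sketch below is one rough way to check this from Python, under stated assumptions: it only looks at the classic owner/group/other mode bits of the file itself and ignores ACLs, SELinux or AppArmor policies, and the permissions of the parent directories. The function name can_read_write is made up for this example, and the uid and group ids of the hypervisor user have to be supplied by the caller.

```python
import os
import stat
from typing import Iterable


def can_read_write(path: str, uid: int, gids: Iterable[int]) -> bool:
    """Approximate whether a user may read and write `path`.

    Hypothetical helper: considers only the file's owner/group/other
    mode bits, not ACLs, security modules or directory permissions.
    """
    info = os.stat(path)
    if uid == 0:
        # root bypasses the mode bits for reading and writing
        return True
    if info.st_uid == uid:
        needed = stat.S_IRUSR | stat.S_IWUSR
    elif info.st_gid in set(gids):
        needed = stat.S_IRGRP | stat.S_IWGRP
    else:
        needed = stat.S_IROTH | stat.S_IWOTH
    return (info.st_mode & needed) == needed
```

Note that chmod +x only adds the execute bits, which are not what this check looks at; a disk image needs to be readable and writable by the hypervisor's user.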