Hi,

Something went wrong in the migration process. The HOST/VMS element you mention should have been added by this file [1], and your VM XML is missing the USER_TEMPLATE element, which is added here [2].
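A quick way to tell whether a given body document went through those two migrators is to look for the elements directly. This is a minimal Python sketch, not part of onedb: the function name and the REQUIRED_CHILDREN table are made up for illustration, and the table only covers the two elements mentioned above, not the full 4.2 schema.

```python
import xml.etree.ElementTree as ET

# Assumption: only the two elements named in this thread are checked.
# Root tag -> direct children the migrators should have added.
REQUIRED_CHILDREN = {
    "HOST": ["VMS"],
    "VM": ["USER_TEMPLATE"],
}


def missing_elements(body_xml):
    """Return the required direct children that are absent from a body document."""
    root = ET.fromstring(body_xml)
    required = REQUIRED_CHILDREN.get(root.tag, [])
    # Compare against None: an empty element such as <VMS/> is falsy in ElementTree.
    return [name for name in required if root.find(name) is None]


if __name__ == "__main__":
    # Trimmed-down stand-ins for the "onehost show -x" / "onevm show -x" output.
    print(missing_elements("<HOST><ID>15</ID><VMS/></HOST>"))
    print(missing_elements("<VM><ID>324</ID><TEMPLATE/></VM>"))
```

An empty list means both migrator steps left their mark on that document; any name in the list points at the migrator that probably did not run.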
Can you compare the contents of your migrator files in /usr/lib/one/ruby/onedb/ to the repo [3]?

Regards

[1] http://dev.opennebula.org/projects/opennebula/repository/revisions/master/entry/src/onedb/3.8.0_to_3.8.1.rb#L92
[2] http://dev.opennebula.org/projects/opennebula/repository/revisions/master/entry/src/onedb/3.8.4_to_3.9.80.rb#L410
[3] http://dev.opennebula.org/projects/opennebula/repository/revisions/master/show/src/onedb

--
Join us at OpenNebulaConf2013 <http://opennebulaconf.com> in Berlin, 24-26 September, 2013
--
Carlos Martín, MSc
Project Engineer
OpenNebula - The Open-source Solution for Data Center Virtualization
www.OpenNebula.org | cmar...@opennebula.org | @OpenNebula<http://twitter.com/opennebula> <cmar...@opennebula.org>

On Mon, Aug 26, 2013 at 2:53 PM, Federico Zani <federico.z...@roma2.infn.it> wrote:

> Hi Carlos,
> the problem is that I can't even get the xml of the vms.
> It seems it's something related to how the xml in the "body" column (for
> both hosts and vms) of the database is structured.
>
> Looking deeply in the migrations scripts, I solved the hosts problem by
> adding the <vms> node (even without child) under the <host> tag of the body
> column in "host_pool" table, but for the vms I still have to find a
> solution.
>
> Now with hosts access I'm able to submit and control new vm instances, but
> I have dozens of running vms that I'm not even able to destroy (not even
> with the force switch turned on).
>
> This is the xml of one of my hosts, as returned by onehost show -x (relevant
> names are remmed out via the "[...]" string) :
>
> <HOST>
>   <ID>15</ID>
>   <NAME>[...]</NAME>
>   <STATE>2</STATE>
>   <IM_MAD>im_kvm</IM_MAD>
>   <VM_MAD>vmm_kvm</VM_MAD>
>   <VN_MAD>dummy</VN_MAD>
>   <LAST_MON_TIME>1377520947</LAST_MON_TIME>
>   <CLUSTER_ID>101</CLUSTER_ID>
>   <CLUSTER>[...]</CLUSTER>
>   <HOST_SHARE>
>     <DISK_USAGE>0</DISK_USAGE>
>     <MEM_USAGE>20971520</MEM_USAGE>
>     <CPU_USAGE>1800</CPU_USAGE>
>     <MAX_DISK>0</MAX_DISK>
>     <MAX_MEM>24596936</MAX_MEM>
>     <MAX_CPU>2400</MAX_CPU>
>     <FREE_DISK>0</FREE_DISK>
>     <FREE_MEM>5558100</FREE_MEM>
>     <FREE_CPU>2323</FREE_CPU>
>     <USED_DISK>0</USED_DISK>
>     <USED_MEM>19038836</USED_MEM>
>     <USED_CPU>76</USED_CPU>
>     <RUNNING_VMS>6</RUNNING_VMS>
>   </HOST_SHARE>
>   <VMS>
>     <ID>326</ID>
>   </VMS>
>   <TEMPLATE>
>     <ARCH><![CDATA[x86_64]]></ARCH>
>     <CPUSPEED><![CDATA[1600]]></CPUSPEED>
>     <FREECPU><![CDATA[2323.2]]></FREECPU>
>     <FREEMEMORY><![CDATA[5558100]]></FREEMEMORY>
>     <HOSTNAME><![CDATA[[...]]]></HOSTNAME>
>     <HYPERVISOR><![CDATA[kvm]]></HYPERVISOR>
>     <MODELNAME><![CDATA[Intel(R) Xeon(R) CPU E5645 @ 2.40GHz]]></MODELNAME>
>     <NETRX><![CDATA[16007208117863]]></NETRX>
>     <NETTX><![CDATA[1185926401588]]></NETTX>
>     <TOTALCPU><![CDATA[2400]]></TOTALCPU>
>     <TOTALMEMORY><![CDATA[24596936]]></TOTALMEMORY>
>     <TOTAL_ZOMBIES><![CDATA[5]]></TOTAL_ZOMBIES>
>     <USEDCPU><![CDATA[76.8000000000002]]></USEDCPU>
>     <USEDMEMORY><![CDATA[19038836]]></USEDMEMORY>
>     <ZOMBIES><![CDATA[one-324, one-283, one-314, one-317, one-304]]></ZOMBIES>
>   </TEMPLATE>
> </HOST>
>
> As you can see, every host now recognizes the connected vms as "zombies",
> probably because it can't query the vms.
>
> I'm also sending you the xml contained in the "body" column of the vm_pool
> table of a vm I can't query with onevm show :
>
> <VM>
>   <ID>324</ID>
>   <UID>0</UID>
>   <GID>0</GID>
>   <UNAME>oneadmin</UNAME>
>   <GNAME>oneadmin</GNAME>
>   <NAME>[...]</NAME>
>   <PERMISSIONS>
>     <OWNER_U>1</OWNER_U>
>     <OWNER_M>1</OWNER_M>
>     <OWNER_A>0</OWNER_A>
>     <GROUP_U>0</GROUP_U>
>     <GROUP_M>0</GROUP_M>
>     <GROUP_A>0</GROUP_A>
>     <OTHER_U>0</OTHER_U>
>     <OTHER_M>0</OTHER_M>
>     <OTHER_A>0</OTHER_A>
>   </PERMISSIONS>
>   <LAST_POLL>1375778872</LAST_POLL>
>   <STATE>3</STATE>
>   <LCM_STATE>3</LCM_STATE>
>   <RESCHED>0</RESCHED>
>   <STIME>1375457045</STIME>
>   <ETIME>0</ETIME>
>   <DEPLOY_ID>one-324</DEPLOY_ID>
>   <MEMORY>4194304</MEMORY>
>   <CPU>9</CPU>
>   <NET_TX>432290511</NET_TX>
>   <NET_RX>2072231827</NET_RX>
>   <TEMPLATE>
>     <CONTEXT>
>       <ETH0_DNS><![CDATA[[...]]]></ETH0_DNS>
>       <ETH0_GATEWAY><![CDATA[[...]]]></ETH0_GATEWAY>
>       <ETH0_IP><![CDATA[[...]]]></ETH0_IP>
>       <ETH0_MASK><![CDATA[[...]]]></ETH0_MASK>
>       <FILES><![CDATA[[...]]]></FILES>
>       <HOSTNAME><![CDATA[[...]]]></HOSTNAME>
>       <TARGET><![CDATA[hdb]]></TARGET>
>     </CONTEXT>
>     <CPU><![CDATA[4]]></CPU>
>     <DISK>
>       <CLONE><![CDATA[YES]]></CLONE>
>       <CLUSTER_ID><![CDATA[101]]></CLUSTER_ID>
>       <DATASTORE><![CDATA[nonshared_ds]]></DATASTORE>
>       <DATASTORE_ID><![CDATA[101]]></DATASTORE_ID>
>       <DEV_PREFIX><![CDATA[hd]]></DEV_PREFIX>
>       <DISK_ID><![CDATA[0]]></DISK_ID>
>       <IMAGE><![CDATA[[...]]]></IMAGE>
>       <IMAGE_ID><![CDATA[119]]></IMAGE_ID>
>       <IMAGE_UNAME><![CDATA[oneadmin]]></IMAGE_UNAME>
>       <READONLY><![CDATA[NO]]></READONLY>
>       <SAVE><![CDATA[NO]]></SAVE>
>       <SOURCE><![CDATA[/var/lib/one/datastores/101/3860dfcd1bec39ce672ba855564b44ca]]></SOURCE>
>       <TARGET><![CDATA[hda]]></TARGET>
>       <TM_MAD><![CDATA[ssh]]></TM_MAD>
>       <TYPE><![CDATA[FILE]]></TYPE>
>     </DISK>
>     <DISK>
>       <DEV_PREFIX><![CDATA[hd]]></DEV_PREFIX>
>       <DISK_ID><![CDATA[1]]></DISK_ID>
>       <FORMAT><![CDATA[ext3]]></FORMAT>
>       <SIZE><![CDATA[26000]]></SIZE>
>       <TARGET><![CDATA[hdc]]></TARGET>
>       <TYPE><![CDATA[fs]]></TYPE>
>     </DISK>
>     <DISK>
>       <DEV_PREFIX><![CDATA[hd]]></DEV_PREFIX>
>       <DISK_ID><![CDATA[2]]></DISK_ID>
>       <SIZE><![CDATA[8192]]></SIZE>
>       <TARGET><![CDATA[hdd]]></TARGET>
>       <TYPE><![CDATA[swap]]></TYPE>
>     </DISK>
>     <FEATURES>
>       <ACPI><![CDATA[yes]]></ACPI>
>     </FEATURES>
>     <GRAPHICS>
>       <KEYMAP><![CDATA[it]]></KEYMAP>
>       <LISTEN><![CDATA[0.0.0.0]]></LISTEN>
>       <PORT><![CDATA[6224]]></PORT>
>       <TYPE><![CDATA[vnc]]></TYPE>
>     </GRAPHICS>
>     <MEMORY><![CDATA[4096]]></MEMORY>
>     <NAME><![CDATA[[...]]]></NAME>
>     <NIC>
>       <BRIDGE><![CDATA[br1]]></BRIDGE>
>       <CLUSTER_ID><![CDATA[101]]></CLUSTER_ID>
>       <IP><![CDATA[[...]]]></IP>
>       <MAC><![CDATA[02:00:c0:a8:1e:02]]></MAC>
>       <MODEL><![CDATA[virtio]]></MODEL>
>       <NETWORK><![CDATA[[...]]]></NETWORK>
>       <NETWORK_ID><![CDATA[9]]></NETWORK_ID>
>       <NETWORK_UNAME><![CDATA[oneadmin]]></NETWORK_UNAME>
>       <VLAN><![CDATA[NO]]></VLAN>
>     </NIC>
>     <OS>
>       <ARCH><![CDATA[x86_64]]></ARCH>
>       <BOOT><![CDATA[hd]]></BOOT>
>     </OS>
>     <RAW>
>       <TYPE><![CDATA[kvm]]></TYPE>
>     </RAW>
>     <REQUIREMENTS><![CDATA[CLUSTER_ID = 101]]></REQUIREMENTS>
>     <TEMPLATE_ID><![CDATA[38]]></TEMPLATE_ID>
>     <VCPU><![CDATA[4]]></VCPU>
>     <VMID><![CDATA[324]]></VMID>
>   </TEMPLATE>
>   <HISTORY_RECORDS>
>     <HISTORY>
>       <OID>324</OID>
>       <SEQ>0</SEQ>
>       <HOSTNAME>[...]</HOSTNAME>
>       <HID>15</HID>
>       <STIME>1375457063</STIME>
>       <ETIME>0</ETIME>
>       <VMMMAD>vmm_kvm</VMMMAD>
>       <VNMMAD>dummy</VNMMAD>
>       <TMMAD>ssh</TMMAD>
>       <DS_LOCATION>/var/datastore</DS_LOCATION>
>       <DS_ID>102</DS_ID>
>       <PSTIME>1375457063</PSTIME>
>       <PETIME>1375457263</PETIME>
>       <RSTIME>1375457263</RSTIME>
>       <RETIME>0</RETIME>
>       <ESTIME>0</ESTIME>
>       <EETIME>0</EETIME>
>       <REASON>0</REASON>
>     </HISTORY>
>   </HISTORY_RECORDS>
> </VM>
>
> I think it would be a great help for me to have the updated XSD files for
> all the body columns in the database: I'd be able to validate the xml
> structure of all the tables to highlight migration problems.
>
> Thanks! :)
>
> F.
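Short of full XSD files, a rough version of that check can be run directly against the body columns. The following is a minimal Python sketch under stated assumptions: it assumes host_pool and vm_pool each expose oid and body columns (as the SELECT statements in oned.log suggest), it only looks for the two elements named earlier in this thread, and scan_bodies and CHECKS are illustrative names, not part of OpenNebula.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Assumption: table -> direct children of the body root that the migrators
# discussed in this thread should have added. This is not a full schema.
CHECKS = {
    "host_pool": ["VMS"],
    "vm_pool": ["USER_TEMPLATE"],
}


def scan_bodies(conn):
    """Yield (table, oid, problem) for every body that fails a check."""
    for table, required in CHECKS.items():
        # Table names come from the constant above, never from user input.
        for oid, body in conn.execute(f"SELECT oid, body FROM {table}"):
            try:
                root = ET.fromstring(body)
            except ET.ParseError as exc:
                yield table, oid, f"unparsable XML: {exc}"
                continue
            for name in required:
                if root.find(name) is None:
                    yield table, oid, f"missing <{name}>"


if __name__ == "__main__":
    # Self-contained demo on an in-memory database with made-up rows.
    demo = sqlite3.connect(":memory:")
    for table in CHECKS:
        demo.execute(f"CREATE TABLE {table} (oid INTEGER PRIMARY KEY, body TEXT)")
    demo.execute("INSERT INTO host_pool VALUES (30, '<HOST><ID>30</ID></HOST>')")
    demo.execute("INSERT INTO vm_pool VALUES (312, '<VM><ID>312</ID></VM>')")
    for table, oid, problem in scan_bodies(demo):
        print(f"{table} oid={oid}: {problem}")
```

To use it on real data, pass `sqlite3.connect(...)` a copy of one.db rather than the live file, so that oned is never competing with the script for the database.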
>
>
> On 21/08/2013 12:13, Carlos Martín Sánchez wrote:
>
> Hi,
>
> Could you send us the xml of some of the failing vms and hosts? You can
> get it with the -x flag in onevm/host list.
>
> Send them off-list if you prefer.
>
> Regards
>
> --
> Join us at OpenNebulaConf2013 <http://opennebulaconf.com> in Berlin,
> 24-26 September, 2013
> --
> Carlos Martín, MSc
> Project Engineer
> OpenNebula - The Open-source Solution for Data Center Virtualization
> www.OpenNebula.org | cmar...@opennebula.org |
> @OpenNebula<http://twitter.com/opennebula>
>
>
> On Thu, Aug 8, 2013 at 11:29 AM, Federico Zani <
> federico.z...@roma2.infn.it> wrote:
>
>> Hi,
>> I am experiencing some issues after the update from 3.7 to 4.2
>> (frontend on a CentOS 6.4 and hosts with KVM virt manager), this is what I
>> did :
>>
>> - Stopped one and sunstone and backed up /etc/one
>> - yum localinstall opennebula-4.2.0-1.x86_64.rpm
>> opennebula-java-4.2.0-1.x86_64.rpm opennebula-ruby-4.2.0-1.x86_64.rpm
>> opennebula-server-4.2.0-1.x86_64.rpm opennebula-sunstone-4.2.0-1.x86_64.rpm
>> - duplicated im and vmm for kvm mads as specified here
>> http://opennebula.org/documentation:archives:rel4.0:upgrade#driver_names
>> - checked for other mismatches in one.conf but actually I found nothing to
>> be fixed
>> - onedb upgrade -v --sqlite /var/lib/one/one.db (no errors, just a few
>> warnings about manual fixes needed - that I did)
>> - moved vm description files from */var/lib/one/*[0-9]* to
>> */var/lib/one/vms/*
>>
>> Then I tried to fsck the sqlite db but got the following error :
>> --------------
>> onedb fsck -f -v -s /var/lib/one/one.db
>> Version read:
>> 4.2.0 : Database migrated from 3.7.80 to 4.2.0 (OpenNebula 4.2.0) by
>> onedb command.
>>
>> Sqlite database backup stored in /var/lib/one/one.db.bck
>> Use 'onedb restore' or copy the file back to restore the DB.
>> > Running fsck
>>
>> Datastore 0 is missing fom Cluster 101 datastore id list
>> Image 127 is missing fom Datastore 101 image id list
>> undefined method `elements' for nil:NilClass
>> Error running fsck version 4.2.0
>> The database will be restored
>> Sqlite database backup restored in /var/lib/one/one.db
>> -----------
>>
>> I also tried to reinstall ruby gems with /usr/share/one/install_gems but
>> still got the same issue.
>>
>> After some searching, I tried to start one and sunstone-server anyway,
>> and this is the result :
>> - I can do "onevm list" and "onehost list" correctly
>> - When I do a "onevm show" on a terminated vm it shows me the correct
>> information
>> - When I do a "onevm show" (on a running vm) or "onehost show", it
>> returns "[VirtualMachineInfo] Error getting virtual machine [312]." or
>> "[HostInfo] Error getting host [30]."
>>
>> In the log file (/var/log/oned.log) I can see the following errors when
>> issuing those commands :
>> ----------
>> Tue Aug 6 12:49:40 2013 [ONE][E]: SQL command was: SELECT body FROM host_pool WHERE oid = 30, error: callback requested query abort
>> Tue Aug 6 12:49:40 2013 [ONE][E]: SQL command was: SELECT body FROM vm_pool WHERE oid = 312, error: callback requested query abort
>> ------------
>>
>> I am still able to see datastore information and the overall situation
>> of my private cloud through the sunstone dashboard, but it seems I cannot
>> access information related to running vms and hosts: it leads to an
>> unusable private cloud (can't stop vms, can't run a new one, etc...)
>>
>> Any clues?
>>
>> Federico.
>>
>> _______________________________________________
>> Users mailing list
>> Users@lists.opennebula.org
>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org