Re: cannot start system VMs: disaster after maintenance followup

2019-04-02 Thread Riepl, Gregor (SWISS TXT)
> I have a problem with my cloud-management web UI. It just stopped > accepting connections and gives an Apache error. > > Is there any way I could get some help from you? Is this related to Jevgeni's issue or a completely new one? You shouldn't simply hit reply on a mail thread, as that will make t…

RE: cannot start system VMs: disaster after maintenance followup

2019-03-29 Thread Sam Ceylani
Stick to 4.11.2 - 4.12 should be released within a few days officially. As for qemu-kvm-ev - yes, it's supposed to work - make sure to test new versions, obviously. Did you get your new installation running fine? On Thu, 21 Mar 2019 at …

Re: cannot start system VMs: disaster after maintenance followup

2019-03-26 Thread Riepl, Gregor (SWISS TXT)
Hi Jevgeni, > (1) client -> VM1:80/app -> VM2:8080/app > (2) client -> VM1:80/data -> VM3:8080/data > > This was working fine before the reinstallation. > We found that it works, if we stop iptables. > > But with iptables ON, (1) works, but (2) does not work - it gives > connection refused. > Ho
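One way to chase the "connection refused" described above is to look through the host's iptables rules for the backend port. The sketch below greps a captured `iptables-save` dump; the chain names and sample rules are made-up assumptions, not taken from this thread.

```shell
#!/bin/sh
# Sketch: grep a captured iptables-save dump for rules accepting the
# backend port (8080). If nothing matches, the security-group chain is
# a likely cause of the refused connection. The dump below is invented.
dump='-A i-2-12-VM -p tcp -m tcp --dport 80 -j ACCEPT
-A i-2-13-VM -p tcp -m tcp --dport 22 -j ACCEPT'
echo "$dump" | grep -- '--dport 8080' || echo "no ACCEPT rule for port 8080"
```

On a live host one would pipe the real ruleset instead, e.g. `iptables-save | grep 8080`.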

Re: cannot start system VMs: disaster after maintenance followup

2019-03-25 Thread Jevgeni Zolotarjov
This saga is to be continued. "Security groups" was the correct keyword to resolve my problem. Now all is in order and all VMs run. One observation: the guide suggests configuring /etc/libvirt/libvirtd.conf and /etc/sysconfig/libvirtd under "Libvirt Configuration", but these files get overw…

Re: cannot start system VMs: disaster after maintenance followup

2019-03-22 Thread Dag Sonstebo
Jevgeni - you've not provided any network troubleshooting findings - but this is all down to security groups, so check these are in place and working. Regards, Dag Sonstebo Cloud Architect ShapeBlue On 21/03/2019, 19:47, "Jevgeni Zolotarjov" wrote: …

RE: cannot start system VMs: disaster after maintenance followup

2019-03-22 Thread Piotr Pisz
Andrija, the qemu-ev repo adds QEMU 2.x to CentOS 7: yum install centos-release-qemu-ev Regards, Piotr -Original Message- From: Andrija Panic Sent: Thursday, March 21, 2019 6:19 PM To: users Subject: Re: cannot start system VMs: disaster after maintenance followup Jevgeni, qemu-kvm 1.5.3 is …

Re: cannot start system VMs: disaster after maintenance followup

2019-03-21 Thread Jevgeni Zolotarjov
(preview consists only of text quoted from Andrija's reply; no new content)

Re: cannot start system VMs: disaster after maintenance followup

2019-03-21 Thread Andrija Panic
Stick to 4.11.2 - 4.12 should be released within a few days officially. As for qemu-kvm-ev - yes, it's supposed to work - make sure to test new versions, obviously. Did you get your new installation running fine? On Thu, 21 Mar 2019 at 19:26, Jevgeni Zolotarjov wrote: > Andrija, > > I asked her…

Re: cannot start system VMs: disaster after maintenance followup

2019-03-21 Thread Jevgeni Zolotarjov
Andrija, I asked here in the group if it's safe to try the new version of KVM and got a reply that it works. It was back in September. So we installed it with yum install centos-release-qemu-ev yum install qemu-kvm-ev It worked fine ever since. But with the new maintenance (yum update) apparently some brea…

Re: cannot start system VMs: disaster after maintenance followup

2019-03-21 Thread Andrija Panic
Jevgeni, qemu-kvm 1.5.3 is the latest official one for CentOS 7.6.XXX (latest) which I'm running atm in my lab (just checked for updates) - how did you manage to go to 2.0 (custom repo?) On Thu, 21 Mar 2019 at 18:13, Ivan Kudryavtsev wrote: > Jevgeniy, simplest and the most obvious way is to fl…

Re: cannot start system VMs: disaster after maintenance followup

2019-03-21 Thread Ivan Kudryavtsev
Jevgeniy, the simplest and most obvious way is to flatten their images with "qemu-img convert", then import them as templates and recreate the VMs from those templates. Thu, 21 Mar 2019 at 13:05, Jevgeni Zolotarjov: > What happened in the end was: qemu-kvm got updated to version 2.0 during > the m…
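Ivan's recovery path can be sketched as a small helper that prints a `qemu-img convert` command for every qcow2 image in a directory. The directory layout is an assumption (the default libvirt pool), and the commands are echoed rather than executed so they can be reviewed first.

```shell
#!/bin/sh
# Print (not run) one flatten command per qcow2 disk in a directory.
# "qemu-img convert -O qcow2" collapses the backing-file chain into a
# self-contained image that can then be imported as a template.
flatten_cmds() {
  src_dir=$1; out_dir=$2
  for disk in "$src_dir"/*.qcow2; do
    [ -e "$disk" ] || continue            # no images found: print nothing
    name=$(basename "$disk" .qcow2)
    echo qemu-img convert -O qcow2 "$disk" "$out_dir/$name-flat.qcow2"
  done
}
# Assumed default libvirt image directory; adjust to your primary storage.
flatten_cmds /var/lib/libvirt/images /backup/flat
```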

Re: cannot start system VMs: disaster after maintenance followup

2019-03-21 Thread Jevgeni Zolotarjov
What happened in the end was: qemu-kvm got updated to version 2.0 during the maintenance. We could not manage to make this KVM version work with CloudStack, so we rolled back to version 1.5.3. And now we have a clean CloudStack, fully operational. We can create new VMs and it works. I am almost happy. N…
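The rollback Jevgeni describes can be captured as a short script. The exact package set (`qemu-img-ev` in particular) is an assumption about what the qemu-ev repo installed on this host, so the steps are printed for review rather than executed.

```shell
#!/bin/sh
# Print the rollback steps: drop the qemu-ev 2.x packages, return to the
# stock CentOS 7 qemu-kvm 1.5.3, then restart libvirtd to pick it up.
cat <<'EOF'
yum remove -y qemu-kvm-ev qemu-img-ev centos-release-qemu-ev
yum install -y qemu-kvm qemu-img
systemctl restart libvirtd
EOF
```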

Re: cannot start system VMs: disaster after maintenance followup

2019-03-21 Thread Andrija Panic
Just replace the URL for systemVM template from 4.11.1 with 4.11.2 (there is a PR for this now). On Thu, 21 Mar 2019 at 16:53, Andrija Panic wrote: > Please use the one, updated specifically for CentOS 7 - > https://github.com/apache/cloudstack-documentation/blob/master/source/quickinstallationg

Re: cannot start system VMs: disaster after maintenance followup

2019-03-21 Thread Andrija Panic
Please use the one updated specifically for CentOS 7 - https://github.com/apache/cloudstack-documentation/blob/master/source/quickinstallationg And please avoid co-locating KVM and MGMT on the same server (especially in any production-like system). Please let me know if the guide above gi…

Re: cannot start system VMs: disaster after maintenance followup

2019-03-21 Thread Jevgeni Zolotarjov
OS management - CentOS 7 (1810) OS hypervisor - CentOS 7 (1810) Basic zone - yes I am following this guide http://docs.cloudstack.apache.org/en/4.11.2.0/quickinstallationguide/qig.html Right now from scratch - management and hypervisor on the same machine qemu - version 1.5.3 libvirt - libvirt ve…

Re: cannot start system VMs: disaster after maintenance followup

2019-03-21 Thread Andrija Panic
Hey Jevgeni, what OS mgmt, what OS hypervisor, what qemu/libvirt versions - still in Basic Zone, SG ? Andrija On Thu, 21 Mar 2019 at 13:06, Jevgeni Zolotarjov wrote: > I reinstalled cloudstack from scratch - everything > > But looks like I hit the same wall now > > In the last step of installa

cannot start system VMs: disaster after maintenance followup

2019-03-21 Thread Jevgeni Zolotarjov
I reinstalled CloudStack from scratch - everything. But it looks like I hit the same wall now: in the last step of installation it cannot create system VMs. service libvirtd status -l gives me ● libvirtd.service - Virtualization daemon Loaded: loaded (/usr/lib/…
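When `service libvirtd status -l` output is long, the `Active:` line is the part that matters. A tiny filter like the one below, run against captured status output (the sample text here is illustrative), shows the daemon state at a glance.

```shell
#!/bin/sh
# Extract the "Active:" line from captured systemctl/service status output.
active_state() {
  printf '%s\n' "$1" | sed -n 's/^ *Active: *//p'
}
sample='libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled)
   Active: failed (Result: exit-code)'
active_state "$sample"   # prints: failed (Result: exit-code)
```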

Re: Disaster after maintenance

2019-03-20 Thread Sergey Levitskiy
+1 on the advice to start from scratch. Provisioning is failing because it can’t spin up either the SSVM or the proxy due to not enough capacity. The reason might be: * Not enough capacity, either CPU or RAM; increasing overprovisioning factors or reducing disable thresholds might help. * Host…
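The knobs Sergey mentions map to a handful of CloudStack global/cluster settings; the sketch below just prints CloudMonkey queries for them. The `cmk` invocation style is an assumption based on the standard CloudMonkey CLI - verify against your version before running anything.

```shell
#!/bin/sh
# Print one cmk query per capacity-related setting mentioned above.
for s in \
  cpu.overprovisioning.factor \
  mem.overprovisioning.factor \
  cluster.cpu.allocated.capacity.disablethreshold \
  cluster.memory.allocated.capacity.disablethreshold
do
  echo "cmk list configurations name=$s"
done
```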

Re: Disaster after maintenance

2019-03-20 Thread Andrija Panic
Hi Jevgeni, I would perhaps suggest you continue with plan B from your separate email thread (root volumes --> create snapshots, convert the snapshots to templates, download the templates somewhere safe; for DATA volumes, also create snapshots, then convert to volume and download it (or simply directly downlo…

Re: Disaster after maintenance

2019-03-20 Thread Jevgeni Zolotarjov
It started with 4.10 and was then gradually upgraded with all stops, when new releases became available. >>> Why do you have 3 zones in this installation - what is the setup? >>> SSVM and CPVM (for whatever zone) are failing to be created... It's a result of attempts to create a new zone and somehow move…

Re: Disaster after maintenance

2019-03-20 Thread Andrija Panic
Hi, 2019-03-20 06:41:50,446 INFO [c.c.u.DatabaseUpgradeChecker] (main:null) (logid:) DB version = 4.10.0.0 Code Version = 4.11.2.0 2019-03-20 06:41:50,447 DEBUG [c.c.u.DatabaseUpgradeChecker] (main:null) (logid:) Running upgrade Upgrade41000to41100 to upgrade from 4.10.0.0-4.11.0.0 to 4.11.0.0 fa

Re: Disaster after maintenance

2019-03-20 Thread Jevgeni Zolotarjov
Basic Zone - Yes. Router has been actually started/created on KVM side - not created, not started. That's the main problem, I guess. agent.log https://drive.google.com/open?id=1rATxHKqgNKo2kD23BtlrZy_9gFXC-Bq- management log https://drive.google.com/open?id=1H2jI0roeiWxtzReB8qV6QxDkNpaki99A >> Can…

Re: Disaster after maintenance

2019-03-20 Thread Dag Sonstebo
Jevgeni, Can you also explain your infrastructure - you said you have two hosts only, where does CloudStack management run? Reason I'm asking is when checking your logs from yesterday the IP address 192.168.1.14 seems to be used for management, NFS and a KVM host? Is this the case, do you co-h

Re: Disaster after maintenance

2019-03-20 Thread Andrija Panic
Just to confirm, you are using Basic Zone in CloudStack, right ? Can you confirm that router has been actually started/created on KVM side, again, as requested please post logs (mgmt and agent - and note the time around which you tried to start VR last time it partially succeeded) - we can't guess

Re: Disaster after maintenance

2019-03-20 Thread Jevgeni Zolotarjov
After a dozen attempts, the Virtual Router could finally be recreated. But it's stuck in an eternal Starting status, and the console reports that it requires an upgrade and the version is UNKNOWN. It does not resolve the problem; I cannot move further from this point. Any hints? Or am I condemned to reinstall cloudstac…

Re: Disaster after maintenance

2019-03-20 Thread Jevgeni Zolotarjov

Re: Disaster after maintenance

2019-03-20 Thread Andrija Panic
(quoted stack trace: sun.reflect.NativeMethodAccessorImpl.invoke(Unknown …); no new content in preview)

Re: Disaster after maintenance

2019-03-20 Thread Jevgeni Zolotarjov

Re: Disaster after maintenance

2019-03-20 Thread Andrija Panic
(quoted stack trace: org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:197); no new content in preview)

Re: Disaster after maintenance

2019-03-19 Thread Sergey Levitskiy

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov

Re: Disaster after maintenance

2019-03-19 Thread Sergey Levitskiy
(quoted stack trace: java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source), java.lang.Thread.run(Unknown …); no new content in preview)

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov

Re: Disaster after maintenance

2019-03-19 Thread Sergey Levitskiy

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
(quoted stack trace: org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:197); no new content in preview)

Re: Disaster after maintenance

2019-03-19 Thread Andrija Panic

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
(quoted stack trace: org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.…); no new content in preview)

Re: Disaster after maintenance

2019-03-19 Thread Andrija Panic
(quoted stack trace: com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:51); no new content in preview)

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
(quoted stack trace: org.springfram…; no new content in preview)

Re: Disaster after maintenance

2019-03-19 Thread Andrija Panic
(quoted stack trace: org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185); no new content in preview)

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
(quoted stack trace: org.apache.cloudstack.api.command.user.network.RestartNetworkCmd.execute(RestartNe…); no new content in preview)

Re: Disaster after maintenance

2019-03-19 Thread Andrija Panic
(quoted stack trace: com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:108), org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$…; no new content in preview)

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
(quoted stack trace: org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56); no new content in preview)

Re: Disaster after maintenance

2019-03-19 Thread Andrija Panic
(quoted stack trace: …callWithContext(DefaultManagedContext.java:103), org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53); no new content in preview)

Re: Disaster after maintenance

2019-03-19 Thread Andrija Panic
(quoted stack trace: java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source), java.util.concurrent.FutureTask.run(Unknown Source), java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source); no new content in preview)

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
>> No (quoted systemd status follows: libvirtd.service - Virtualization daemon; Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled); Activ…)

Re: Disaster after maintenance

2019-03-19 Thread Andrija Panic
(quoted log: …65a6099) Complete async job-5093, jobStatus: FAILED, resultCode: 530, result: org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":530,"errortext":"Resource [DataCenter:1] is un…)

Re: Disaster after maintenance

2019-03-19 Thread Boris Stoyanov
On Tue, Mar 19, 2019 at 4:19 PM Andrija Panic wrote: >>> Your network can't be deleted due to "Can't delete the network, not all >>> user vms are expunged. Vm >>> VM[User|i-2-11-VM] i…

RE: Disaster after maintenance

2019-03-19 Thread Paul Angus
From: Jevgeni Zolotarjov Sent: 19 March 2019 17:29 To: users@cloudstack.apache.org Subject: Re: Disaster after maintenance. Guys, please help with it. What can be done here? There is too much valuable data. On Tue, Mar 19, 2019 at 4:21 PM Jevgeni Zolotarjov wrote: > Tried that just now and got er…

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
vms are expunged. Vm >> VM[User|i-2-11-VM] is in Stopped state" - which is fine. >> >> You should be able to just start the user VM - but if you have actually >> deleted the VR itself, then just do a Network restart with "cleanup" and it >> will recreate a new VR, …

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
(preview contains only the tail of Andrija's advice, "…start the VM.", followed by his signature and quoted message headers; no new content)

RE: Disaster after maintenance

2019-03-19 Thread Andrija Panic
Sent: 19 March 2019 15:10 To: users@cloudstack.apache.org Subject: Re: Disaster after maintenance. I mean I cannot delete the network. In the management server log I see: == 2019-03-19 14:06:36,316 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Ex…

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
(quoted systemd log: …systemd[1]: Failed to start Virtualization daemon. Mar 19 13:03:07 mtl1-apphst04.mt.pbt.com.mt systemd[1]: Unit libvirtd.service entered failed state. Mar 19 13:03:07 mtl1-apphst04.mt.pbt.com.mt systemd[1]: libvirtd.service …)

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
unning on the host and the > > > agent logs (usual logs directory) > > > Also worth checking that libvirt has started ok. Do you have some NUMA > > > constraints or anything which requires particular RAM configuration? > > > > > > paul.an...@shapeblue.com > > > www.shapeblu

RE: Disaster after maintenance

2019-03-19 Thread Paul Angus
apache.org Subject: Re: Disaster after maintenance That's it - libvirtd failed to start on second host. Tried restarting, but it does not start. >> Do you have some NUMA constraints or anything which requires >> particular RAM configuration? No libvirtd.service - Virtualizati

Re: Disaster after maintenance

2019-03-19 Thread Ivan Kudryavtsev

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
> From: Jevgeni Zolotarjov > Sent: 19 March 2019 14:49 > To: users@cloudstack.apache.org > Subject: Re: Disaster after maintenance > > Can you try migrating a VM to the server that you changed the RAM amount? > > Also: > What is the hypervisor version? > KVM > Q

RE: Disaster after maintenance

2019-03-19 Thread Paul Angus
Amadeus House, Floral Street, London WC2E 9DPUK @shapeblue -Original Message- From: Jevgeni Zolotarjov Sent: 19 March 2019 14:49 To: users@cloudstack.apache.org Subject: Re: Disaster after maintenance Can you try migrating a VM to the server that you changed the RAM amount? Also: What

Re: Disaster after maintenance

2019-03-19 Thread Rafael Weingärtner
…that is why nothing deploys there. You need to connect this host to ACS; otherwise, it will just be ignored. Did you check the log files in the agent (on the host)? And, of course, in ACS? On Tue, Mar 19, 2019 at 9:49 AM Jevgeni Zolotarjov wrote: > Can you try migrating a VM to the server that y…

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
Can you try migrating a VM to the server where you changed the RAM amount? Also: What is the hypervisor version? KVM, QEMU version 2.0.0, release 1.el7.6. Host status in ACS? 1st server: Unsecure; 2nd server: Disconnected. Did you try to force a VM to start/deploy on this server where you…

Re: Disaster after maintenance

2019-03-19 Thread Rafael Weingärtner
Can you try migrating a VM to the server that you changed the RAM amount? Also: What is the hypervisor version? Host status in ACS? Did you try to force a VM to start/deploy in this server where you changed the RAM? On Tue, Mar 19, 2019 at 9:39 AM Jevgeni Zolotarjov wrote: > We have Cloudstack

Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
We have a CloudStack 4.11.2 setup that has been running fine for a few months (>4). The setup is very simple: 2 hosts. We decided to do maintenance to increase RAM on both servers. For this we put the first server into maintenance. All VMs moved to the second host after a while. Then the first server was shut down, RAM increased,