[jira] [Commented] (CLOUDSTACK-9428) Fix for CLOUDSTACK-9211 - Improve performance

2016-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15426928#comment-15426928
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9428:


Github user nvazquez commented on the issue:

https://github.com/apache/cloudstack/pull/1605
  
Thanks @sateesh-chodapuneedi! 
You mentioned earlier that you would test this in your environment. Could you 
please share your test results?


> Fix for CLOUDSTACK-9211 - Improve performance
> -
>
> Key: CLOUDSTACK-9428
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9428
> Project: CloudStack
>  Issue Type: Improvement
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>  Components: VMware
>Reporter: Nicolas Vazquez
>Assignee: Nicolas Vazquez
>
> h3. Introduction
> [CLOUDSTACK-9211|https://issues.apache.org/jira/browse/CLOUDSTACK-9211] 
> addressed passing the vRAM size to support 3D GPU on VMware. That change 
> can be improved: it makes an extra API call that can be avoided, as 
> described below.
> h3. Improvement
> On VMware, {{VmwareResource}} manages execution of {{StartCommand}}. Before 
> sending the power-on command to the ESXi hypervisor, the VM is configured by 
> calling the {{reconfigVMTask}} web method on vSphere's client 
> {{VimPortType}} web service.
> When a vRAM size was passed, this method was called twice: a new VM config 
> spec was created that edited only the video card settings, and an extra call 
> to {{reconfigVMTask}} was made to apply it.
> We propose eliminating the extra web service call by adjusting the VM's 
> existing config spec. This way the video card is configured (when a vRAM size 
> is passed) in the same configure call, which improves performance.
> h3. Use case (passing vRAM size)
> # Deploy a new VM, let its id be X
> # Stop VM
> # Execute SQL, where X is vm's id and Z is vRAM size (in kB): {code:sql}
> INSERT INTO cloud.user_vm_details (vm_id, name, value) VALUES (X, 
> 'mks.enable3d', 'true');
> INSERT INTO cloud.user_vm_details (vm_id, name, value) VALUES (X, 
> 'mks.use3dRenderer', 'automatic');
> INSERT INTO cloud.user_vm_details (vm_id, name, value) VALUES (X, 
> 'svga.autodetect', 'false');
> INSERT INTO cloud.user_vm_details (vm_id, name, value) VALUES (X, 
> 'svga.vramSize', Z);
> {code}
> # Start VM
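
The idea of folding the video card setting into the single configure call can be illustrated with a toy model. This is only a sketch in Python, not the actual Java change in {{VmwareResource}}: `FakeVimPort`, `reconfig_vm`, and both `configure_*` functions are hypothetical names, and the VM is modelled as a plain dictionary rather than a vSphere config spec.

```python
class FakeVimPort:
    """Hypothetical stand-in for the vSphere web service; it only counts calls."""

    def __init__(self):
        self.reconfig_calls = 0

    def reconfig_vm(self, vm, spec):
        # Each call models one round trip to vSphere.
        self.reconfig_calls += 1
        vm.update(spec)


def configure_in_two_calls(port, vm, base_spec, vram_kb):
    """Old flow: a second spec that touches only the video card."""
    port.reconfig_vm(vm, base_spec)
    port.reconfig_vm(vm, {"svga.vramSize": vram_kb})


def configure_in_one_call(port, vm, base_spec, vram_kb):
    """Proposed flow: fold the video card setting into the main spec."""
    spec = dict(base_spec)  # copy so the caller's spec is not modified
    if vram_kb is not None:
        spec["svga.vramSize"] = vram_kb
    port.reconfig_vm(vm, spec)
```

Both functions leave the modelled VM with the same settings; the second simply reaches that state with one call instead of two.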



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9428) Fix for CLOUDSTACK-9211 - Improve performance

2016-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15426871#comment-15426871
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9428:


Github user sateesh-chodapuneedi commented on the issue:

https://github.com/apache/cloudstack/pull/1605
  
LGTM  




[jira] [Assigned] (CLOUDSTACK-9461) When RBD backup snapshot to secondary occurs, secondary storage image format is RAW

2016-08-18 Thread Simon Weller (JIRA)

 [ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Weller reassigned CLOUDSTACK-9461:


Assignee: Simon Weller

> When RBD backup snapshot to secondary occurs, secondary storage image format 
> is RAW
> ---
>
> Key: CLOUDSTACK-9461
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9461
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>  Components: Storage Controller
>Affects Versions: 4.7.0, 4.8.0, 4.9.0
>Reporter: Simon Weller
>Assignee: Simon Weller
>Priority: Minor
> Fix For: Future
>
>
> RBD snapshot backup to secondary storage should use QCOW2, not RAW.





[jira] [Created] (CLOUDSTACK-9461) When RBD backup snapshot to secondary occurs, secondary storage image format is RAW

2016-08-18 Thread Simon Weller (JIRA)
Simon Weller created CLOUDSTACK-9461:


 Summary: When RBD backup snapshot to secondary occurs, secondary 
storage image format is RAW
 Key: CLOUDSTACK-9461
 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9461
 Project: CloudStack
  Issue Type: Bug
  Security Level: Public (Anyone can view this level - this is the default.)
  Components: Storage Controller
Affects Versions: 4.9.0, 4.8.0, 4.7.0
Reporter: Simon Weller
Priority: Minor
 Fix For: Future


RBD snapshot backup to secondary storage should use QCOW2, not RAW.





[jira] [Commented] (CLOUDSTACK-9461) When RBD backup snapshot to secondary occurs, secondary storage image format is RAW

2016-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15426779#comment-15426779
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9461:


Github user kiwiflyer commented on the issue:

https://github.com/apache/cloudstack/pull/1645
  
CLOUDSTACK-9461




[jira] [Commented] (CLOUDSTACK-9447) Fix systemvm template build failure

2016-08-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15426202#comment-15426202
 ] 

ASF subversion and git services commented on CLOUDSTACK-9447:
-

Commit bdc409c7a2cc0e8119fd37261458c62ba3fe4539 in cloudstack's branch 
refs/heads/4.9-bountycastle-daan from [~rajanik]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=bdc409c ]

Merge pull request #1626 from shapeblue/systemvmtemplate-4dot9

[blocker] Fix systemvm template build

Previous PR: https://github.com/apache/cloudstack/pull/1531

Fixes failing systemvmtemplate build.

* pr/1626:
  CLOUDSTACK-9447: fix build and upgrade to debian 7.11 iso

Signed-off-by: Rajani Karuturi 


> Fix systemvm template build failure
> ---
>
> Key: CLOUDSTACK-9447
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9447
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
>
> The SystemVM template build code has not been maintained in recent months, 
> and the ISO URL no longer works. In addition, a certain openswan package 
> version is no longer available and has been replaced by a package with the 
> same version but a newer package name. The fix should:
> - Bump the base Debian ISO to version 7.11
> - Upgrade the Ruby version to 2.3.0 (latest/stable)
> - Fix the Gemfile
> - Update the README





[jira] [Commented] (CLOUDSTACK-9447) Fix systemvm template build failure

2016-08-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15426203#comment-15426203
 ] 

ASF subversion and git services commented on CLOUDSTACK-9447:
-

Commit 56a352650271f09b6bf2d50e6e9cb377997539e8 in cloudstack's branch 
refs/heads/4.9-bountycastle-daan from [~rajanik]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=56a3526 ]

Merge release branch 4.9 to master

* 4.9:
  CLOUDSTACK-9447: fix build and upgrade to debian 7.11 iso




[jira] [Commented] (CLOUDSTACK-9449) Dynamic roles default user description typo and column issue

2016-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15426186#comment-15426186
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9449:


GitHub user rhtyd opened a pull request:

https://github.com/apache/cloudstack/pull/1646

[4.9/LTS] Add upgrade path from 4.9.0 to 4.9.1

This adds db upgrade path from 4.9.0 to 4.9.1 and fixes a typo in default 
user role description (CLOUDSTACK-9449)

/cc @karuturi @jburwell  -- this will cause conflicts when forward-merged to 
master; I can do the forward merge if you would prefer not to resolve the 
conflicts yourself

@blueorangutan package

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shapeblue/cloudstack 4.9-491upgradepath

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/cloudstack/pull/1646.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1646


commit c2512b675463cb978995912978a3e687e1bb8acb
Author: Rohit Yadav 
Date:   2016-08-18T09:40:09Z

cloudstack: upgrade path from 4.9.0 to 4.9.1

- Adds db upgrade path from 4.9.0 to 4.9.1
- CLOUDSTACK-9449: Fix typo in default user role description

Signed-off-by: Rohit Yadav 

commit 72aaf07fc6fc66fbb067afa569806e0a17649d9f
Author: Rohit Yadav 
Date:   2016-08-18T09:45:24Z

Updating pom.xml version numbers for release 4.9.1-SNAPSHOT

Signed-off-by: Rohit Yadav 




> Dynamic roles default user description typo and column issue
> 
>
> Key: CLOUDSTACK-9449
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9449
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
>
> Based on the issue reported here:
> https://github.com/apache/cloudstack/commit/4347776ac6ef9ae86fb016862f4a6b2376f8319a#commitcomment-18546909
> Fix typo in description and possible missing field issue (if reproducible)





[jira] [Commented] (CLOUDSTACK-9449) Dynamic roles default user description typo and column issue

2016-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15426165#comment-15426165
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9449:


Github user rhtyd closed the pull request at:

https://github.com/apache/cloudstack/pull/1629




[jira] [Commented] (CLOUDSTACK-9452) CentOS6 kvm hosts stop working after upgrade

2016-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15426155#comment-15426155
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9452:


Github user rhtyd commented on the issue:

https://github.com/apache/cloudstack/pull/1634
  
@karuturi @jburwell can we merge this? It is only a packaging-related fix. 
Installing the above packages confirmed that argparse gets installed on 
el6/el7 KVM hosts.


> CentOS6 kvm hosts stop working after upgrade
> 
>
> Key: CLOUDSTACK-9452
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9452
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
>Priority: Blocker
>
> The patchviasocket script was recently rewritten from Perl to Python, but it 
> uses argparse, which causes failures on CentOS 6 hosts running Python 2.6, 
> where argparse may not be pre-installed. The fix would be to use optparse 
> (or similar) instead of argparse, since argparse was only introduced in 
> Python 2.7.
> Error log:
> DEBUG [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-3 (logid:005ad4fd) Executing: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.py -n r-148-VM -p %template=domP%name=r-148-VM%eth2ip=10.1.34.18%eth2mask=255.255.224.0%gateway=10.1.63.254%eth0ip=10.1.1.1%eth0mask=255.255.255.0%domain=cs2cloud.internal%cidrsize=24%dhcprange=10.1.1.1%eth1ip=169.254.1.214%eth1mask=255.255.0.0%type=router%disable_rp_filter=true%dns1=8.8.8.8%dns2=8.8.4.4%baremetalnotificationsecuritykey=4XFrHwfZgVr6DrJhgoBuNgbc5Vk7ACm90TW3GgYk9-O7TgNY9LXn_FNcm9SdcIEnwSTktEx3K_a7ng2K4fpyUg%baremetalnotificationapikey=k3ja_d3xCTT78-ow30eah6TCvqYB3IIXYtKeaDJ4_TMdD7BbZbHhp07dVKXPiM5ee3xFn2wqSIxuX5LsObQYDg%host=10.2.3.61%port=8080
> DEBUG [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-3 (logid:005ad4fd) Exit value is 1
> DEBUG [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-3 (logid:005ad4fd) Traceback (most recent call last):
>   File "/usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.py", line 25, in <module>
>     import argparse
> ImportError: No module named argparse
> DEBUG [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-3 (logid:005ad4fd) passcmd failed:Traceback (most recent call last):
>   File "/usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.py", line 25, in <module>
>     import argparse
> ImportError: No module named argparse
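
As a rough illustration of the suggested fix, option parsing for the two flags seen in the log (`-n` and `-p`) could be done with the standard-library `optparse` module, which is available on Python 2.6. This is a minimal sketch, not the actual patch: the function name `parse_args`, the long option names `--name` and `--payload`, and the help texts are assumptions made for the example.

```python
from optparse import OptionParser


def parse_args(argv):
    """Parse the -n and -p options without relying on argparse."""
    parser = OptionParser(usage="%prog -n <vm-name> -p <payload>")
    # The long option names are hypothetical; only -n and -p appear in the log.
    parser.add_option("-n", "--name", dest="name", help="name of the VM")
    parser.add_option("-p", "--payload", dest="payload",
                      help="argument string to pass to the VM")
    options, _leftover = parser.parse_args(argv)
    if not options.name or not options.payload:
        # parser.error() prints the usage text and exits with status 2.
        parser.error("both -n and -p are required")
    return options.name, options.payload


if __name__ == "__main__":
    import sys
    print(parse_args(sys.argv[1:]))
```

Unlike argparse, optparse has no `required=True` flag, so the check for missing options is done by hand after parsing.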





[jira] [Commented] (CLOUDSTACK-9458) Some VMs are being stopped when agent is reconnecting

2016-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15426116#comment-15426116
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9458:


Github user marcaurele commented on the issue:

https://github.com/apache/cloudstack/pull/1640
  
I understand your point about the release, but we're not in an ideal world 
where everyone runs the latest version. I do my best to look at the current 
code in CS to find possible fixes for any bug or problem we encounter, or for 
changes we want to make in our version. I want us to get back to the master 
version, but that's not the topic here, nor is it going to happen in the next 
few weeks.

Point 2 does not make sense to me. If the management server cannot 
determine the state of the VMs, it could mark them as stopped (*even though I 
don't think it should*). But it should not create a StopVM job, because that 
might trigger an actual stop of the VM if the agent reconnects while the job 
is being picked up by the async job workers.
If the VM is really down because the host has crashed, then the command is 
pointless, and from a customer's point of view it makes no difference. If 
the host is still up and fine but we have a network glitch, then requesting a 
stop of the VM is really bad from a customer's point of view. By doing 
nothing, i.e. not requesting a stop, we would end up in a better situation.

Coming back to which state should be set on the VM when the management 
server cannot determine it: assuming that the VM is stopped because the 
management server cannot reach the agent is just as incorrect as leaving it 
as it is (running, migrating, creating...). I'd rather create a new state 
`UNKNOWN` for this special case, when the management server really does not 
know. From a management point of view, it would also be easier to see that 
there are *ghost* VMs somewhere for which the management server cannot 
determine the exact state, and that a proper (*manual*) investigation should 
be done if the state stays like this, including for billing purposes.


> Some VMs are being stopped when agent is reconnecting
> -
>
> Key: CLOUDSTACK-9458
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9458
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Marc-Aurèle Brothier
>Assignee: Marc-Aurèle Brothier
>
> If you lose communication between the management server and one of the 
> agents for a few minutes, the HighAvailabilityManager kicks in and starts 
> scheduling VM restarts even though HA mode is not active. Those tasks are 
> inserted as async jobs in the DB, and if the agent comes back online while 
> the jobs are still in the async table, they are pushed to the agent and shut 
> down the VMs. Then, since HA is not active, the VMs are not restarted.
> The expected behavior, in my opinion, is that the VMs should not be 
> restarted at all if HA mode is not active on them, and that the agent should 
> update the VM state with the power report.
> The bug lies in 
> {{HighAvailabilityManagerImpl.scheduleRestartForVmsOnHost(final HostVO host, 
> boolean investigate)}}, PR will follow.





[jira] [Commented] (CLOUDSTACK-9458) Some VMs are being stopped when agent is reconnecting

2016-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425993#comment-15425993
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9458:


Github user koushik-das commented on the issue:

https://github.com/apache/cloudstack/pull/1640
  
Please use a proper ACS release for reporting bugs. In your case you may 
have to do some additional cherry-picks.

"Schedule restart" does multiple tasks. There is a method by this name in 
the code; the name may not be the most appropriate, so don't be confused by 
it. It does the following:
1. Tries to find out whether the VM is alive
2. If it cannot determine conclusively whether the VM is alive, it tries to 
fence off the VM
3. After successful fencing, HA-enabled VMs are restarted on another host 
and non-HA VMs are marked as Stopped

So, as you can see, non-HA VMs are simply marked as stopped when the host is 
determined to be down; they are not restarted. It makes sense to mark them as 
stopped so that subsequent operations can be performed on the VMs, e.g. 
selected VMs may be explicitly started on another host. If a host is down, 
power sync won't happen for that host and the VM states on that host won't 
get updated.
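
The three steps above can be sketched as a small decision function. This is an illustrative Python model only, not the CloudStack implementation (which is Java): `Outcome`, `schedule_restart`, and the `fence` callable are hypothetical names. How a VM that is known to be alive, a VM that is conclusively dead, or a failed fencing attempt is handled is not stated in the comment, so those branches are assumptions.

```python
from enum import Enum


class Outcome(Enum):
    LEAVE_AS_IS = "leave as is"
    RETRY_LATER = "retry later"
    RESTART_ELSEWHERE = "restart on another host"
    MARK_STOPPED = "mark as Stopped"


def schedule_restart(alive, fence, ha_enabled):
    """Decide what happens to one VM on a host reported as down.

    alive: True, False, or None when the check was inconclusive.
    fence: zero-argument callable returning True if fencing succeeded.
    """
    # Step 1: the VM is known to be alive (assumption: nothing to do).
    if alive:
        return Outcome.LEAVE_AS_IS
    # Step 2: the state could not be determined, so try to fence the VM off.
    if alive is None and not fence():
        # Assumption: a failed fencing attempt is retried, not acted upon.
        return Outcome.RETRY_LATER
    # Step 3: HA-enabled VMs are restarted elsewhere, others marked Stopped.
    # Assumption: a conclusively dead VM (alive is False) skips fencing.
    return Outcome.RESTART_ELSEWHERE if ha_enabled else Outcome.MARK_STOPPED
```

For example, `schedule_restart(None, lambda: True, False)` models a non-HA VM whose state is unknown and whose fencing succeeds, which ends up marked as Stopped.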




[jira] [Commented] (CLOUDSTACK-9458) Some VMs are being stopped when agent is reconnecting

2016-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425976#comment-15425976
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9458:


Github user marcaurele commented on the issue:

https://github.com/apache/cloudstack/pull/1640
  
@koushik-das We are running a fork based on 4.4.2 with many cherry-picks.

But even if the host is down, why would you want to schedule a restart if 
the VMs are not HA-enabled?




[jira] [Commented] (CLOUDSTACK-9458) Some VMs are being stopped when agent is reconnecting

2016-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425966#comment-15425966
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9458:


Github user koushik-das commented on the issue:

https://github.com/apache/cloudstack/pull/1640
  
@marcaurele Based on the first few lines of the logs, the agent went to the 
Alert state.

srv02 2016-08-08 11:56:03,895 DEBUG [agent.manager.AgentManagerImpl] (AgentTaskPool-16:ctx-8b5b6956) The next status of agent 44692is Alert, current status is Up
srv02 2016-08-08 11:56:03,896 DEBUG [agent.manager.AgentManagerImpl] (AgentTaskPool-16:ctx-8b5b6956) Deregistering link for 44692 with state Alert

As per the latest ACS code (4.9/master), restarts of VMs on a host are 
scheduled only if the host state is determined to be Down. In the Alert 
case, nothing is done.

Which version of CS are you seeing this issue on?




[jira] [Commented] (CLOUDSTACK-9458) Some VMs are being stopped when agent is reconnecting

2016-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425956#comment-15425956
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9458:


Github user marcaurele commented on the issue:

https://github.com/apache/cloudstack/pull/1640
  
> Would you mind adding these notes to the bug ticket?

@jburwell All PR comments are copied automatically into the JIRA ticket's 
comments thanks to the ID matching (I think).




[jira] [Commented] (CLOUDSTACK-9458) Some VMs are being stopped when agent is reconnecting

2016-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425955#comment-15425955
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9458:


Github user marcaurele commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1640#discussion_r75252604
  
--- Diff: server/src/com/cloud/ha/HighAvailabilityManagerImpl.java ---
@@ -497,7 +498,7 @@ protected Long restart(final HaWorkVO work) {
 
         boolean fenced = false;
         if (alive == null) {
-            s_logger.debug("Fencing off VM that we don't know the state of");
+            s_logger.debug("Fencing off VM " + vm + " that we don't know the state of");
--- End diff --

Converting all of those debug log lines to lambda expressions once we 
require Java 8 will be easier if they are not wrapped in an `if`:
```
s_logger.debug(() -> "Fencing off VM " + vm + " that we don't know the state of");
```
with
```
default void debug(Supplier<String> stringSupplier) {
    if (isDebugEnabled()) {
        debug(stringSupplier.get());
    }
}
```
The concatenation will then be done only when debug is enabled.

You're right, I'll change it to `warn`.

