Re: Dissimilar host OS within the same cluster - Not allowed?

2022-10-13 Thread Eric Green
I just stood up an Ubuntu cluster and started adding hosts to it while 
decommissioning hosts from the CentOS cluster and restarting virtual 
machines onto it. The whole point of a cluster is that it is homogeneous 
so that things like virtual machine migration can be assured to work. 
There's no way that CentOS and Ubuntu are going to ship identical KVM 
versions, never mind all the other environmental details that need to be 
identical for virtual machine migration to actually work.


In short, you can do it using the workarounds others have provided, but 
the more reliable way is to stand up a new cluster with the new OS, 
manually shut down the VMs, and then start them up on the new cluster. 
Painful if you don't have the compute resources to do the move all in 
one service outage, but (shrug). It's the reliable way to do things.


On 10/13/2022 7:28 AM, S.Fuller wrote:

I am working to transition the host OS for my Cloudstack 4.11.3 hosts from
CentOS to Ubuntu. I was able to successfully bring up a new Ubuntu host with
Cloudstack and wanted to have it be part of an existing cluster, but after
attempting to add the server I'm noting the following warning in the agent
log file:

"Can't add host: XX.XX.XX.XX with hostOS: Ubuntu into a cluster,in which
there are CentOS hosts added"

Is this really the case? I did not see anything obvious in the
documentation about this.  I was able to successfully add the new Ubuntu
server into a new cluster within the same POD, and have it see storage,
networking, etc, so the host itself appears to be configured correctly.



Re: Which linux system is recommended

2022-09-20 Thread Eric Green
We have used both CentOS and Ubuntu. Currently we have standardized on 
Ubuntu 20.04 LTS due to Red Hat shenanigans with CentOS and for 
reliability reasons (Red Hat often broke Cloudstack with bug fixes for 
CentOS; Ubuntu has never done so). Once a version of Cloudstack is 
released that has been validated with Ubuntu 22.04 LTS, and we have 
validated it ourselves, we will upgrade to that.


We currently have one old cluster running CentOS 7 (soon to be retired) 
and a new cluster running Ubuntu 20.04 LTS. Both are operating in the 
same zone and pod without a problem, and Cloudstack allocates virtual 
machines across both clusters without issue.


On 9/20/2022 8:05 AM, Mariusz Wojtarek wrote:


Hi,

Which Linux system is recommended for CloudStack and KVM? Debian? Ubuntu? 
Fedora? CentOS?


Support Online





*Mariusz Wojtarek*

Administrator iT

*P: *22 335 28 00

*E: *mariusz.wojta...@support-online.pl 



www.support-online.pl 

Poleczki 23 | 02-822 Warszawa






Re: keycloak saml

2022-03-08 Thread Eric Green
I tried to configure SAML on my Cloudstack a while back and never got it 
to work either, though I must admit I wasn't trying too hard. So if you get 
it to work, please enlighten us!


On 3/8/2022 10:59 AM, Piotr Pisz wrote:

Hi,

  


I am trying to configure Cloudstack with Keycloak SSO, with no positive
result. Has anyone got this configured?

I'm not sure what to put in the fields: saml2.default.idpid (cs) and client
ID (keycloak).

  


Regards,

Piotr




Re: CloudStack vs. vCloud Director

2021-10-26 Thread Eric Green

Not currently using the VMware integration, but it's on our road map.

The main reason for using Cloudstack for us was that it worked well with 
Linux KVM hosts without needing to purchase expensive add-ons or 
licenses. This allowed us to spin up our cluster rapidly using existing 
machines in our racks without having to get budget approvals and such. 
However, now that we have budget for an expansion of our internal cloud, 
managing the existing VMware hosts and their virtual machines with 
Cloudstack is a natural way to extend our on-premises cloud 
infrastructure to cover more of our environment.


In short, the ability to manage deployments to both KVM and ESXi hosts is 
one reason to use Cloudstack rather than vCloud Director. They'll be 
separate zones, but with a common way of deploying and managing virtual 
machines. That said, if we were a vSphere-only site, vCloud Director 
would be something I'd look at. Pricing might be an issue, however -- 
Cloudstack is of course "free" (lol, not when you consider the time 
implementing it), while I have no idea what vCloud Director's pricing is. 
Our business is price sensitive; yours may be less so.


On 10/26/2021 3:35 AM, Ivet Petrova wrote:

Hi all,

I am working on an idea and wanted to get some community feedback. I know that 
CloudStack has a VMware integration and you can orchestrate your VMware 
environment with it.
But on the other hand, the natural choice for all VMware users is vCloud 
Director.

Can somebody who is using such a setup share why you have chosen CloudStack vs. 
vCloud Director? What is the difference and what is the CloudStack advantage?

Kind regards,


  





Re: kvm ovs vm with trunk

2021-10-07 Thread Eric Green
On KVM, Cloudstack relies on the underlying Linux OS to do the base 
network configuration. Linux "port groups" are called "bonds" and 
virtual switches are called "bridges". In the Linux OS you set up 
bond0 with all of the ports that will be part of the port group, with 
whatever parameters you wish for how the port balancing will 
work, and then the bond0.X / brX VLAN and bridge for the management 
network (where X is the VLAN number of the management network). Then 
Cloudstack will handle the rest, creating additional VLANs and bridges 
on the base bond as needed. For directions on how to set up the bond0.X 
/ brX pairs, consult your Linux distribution's documentation. (It is 
*radically* different between CentOS and Ubuntu, with CentOS using 
multiple configuration files under /etc/sysconfig/network-scripts and 
Ubuntu using YAML files under /etc/netplan/.)


Basically, a "bond" in Linux takes the place of an actual physical 
network port in all of the configuration for the networking. VLANs work 
the same with a "bond" in Linux as they work with actual physical 
network ports in Linux. As with an actual physical network port in 
Linux, the base bond is basically VLAN 0. All other VLANs must be 
configured on top of it as a sub-interface e.g. bond0.100 is VLAN 100 on 
the bond0 network interface.
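
To give a flavor of the Ubuntu side, a netplan file for this layout looks 
roughly like the sketch below. Treat it as illustrative only: the NIC names 
(eno1/eno2), the 802.3ad bonding mode, VLAN 100 and the 10.100.x.x addresses 
are all assumptions you will need to adapt to your own environment.

# /etc/netplan/01-cloudstack.yaml  (illustrative sketch)
network:
  version: 2
  ethernets:
    eno1:
      dhcp4: false
    eno2:
      dhcp4: false
  bonds:
    bond0:
      interfaces: [eno1, eno2]
      parameters:
        mode: 802.3ad        # or whatever balancing mode your switch supports
  vlans:
    bond0.100:
      id: 100
      link: bond0
  bridges:
    br100:
      interfaces: [bond0.100]
      addresses: [10.100.0.11/16]
      gateway4: 10.100.0.1

Apply it with 'netplan apply' and let Cloudstack create the rest of the 
bridges on top of bond0.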




Hi,

In vSphere it is possible to create a port group with VLAN range 1-4094; can
we do the same on KVM with an L2 network (on openvswitch or bridge)?
Can we use a VM with a VLAN trunk in Cloudstack?

Regards,
Piotr




Re: Will Cloudstack work with existing KVM server?

2021-08-12 Thread Eric Green
Cloudstack will not, however, manage existing KVM virtual machines, 
which is what Chris wants to do. While that is theoretically possible, 
there's currently no practical way to populate the Cloudstack MySQL 
database with the information needed to make it happen. It appears his 
desire is to manage his current virtual machines with CloudStack, and 
currently the only way to do that is to upload their disk images to 
Cloudstack as templates, then deploy them on Cloudstack as new 
virtual machines.


On 8/12/2021 10:43 PM, Suresh Anaparti wrote:

Hi Chris,

Yes, CloudStack works with existing KVM, you can check the server requirements 
and installation details here: 
http://docs.cloudstack.apache.org/en/latest/installguide/hypervisor/kvm.html#host-kvm-installation
  
Regards,

Suresh

On 13/08/21, 11:04 AM, "Chris Jefferies"  wrote:

 I have an existing server running Ubuntu 16.04 and it's been running KVM 
for quite a while.  Mostly I use the virt-manager GUI over X windows to manage my 
VMs.  Looking now for a web UI that can work with that.

 Can Cloudstack be used as a remote manager, perhaps even running in a 
separate VM instance, which is configured to manage the parent libvirt/qemu?

 Can it discover the existing setup and be retrofitted with the existing 
configs, storage pools (mostly LVM), etc.?

 If not, does anyone have suggestions?  Thanks.



  



Re: CPU Core Count Incorrect

2021-07-29 Thread Eric Green

On 7/29/2021 3:48 AM, Andrija Panic wrote:

AND, the "insufficient capacity" has, wait one 99% of the case NOTHING
to do with not having enough capacity here or there, it's the stupid,
generic message on failure.


Speaking of which -- a bit off-topic here, I know -- I dug through the 
source code a bit trying to figure out if there's a way we can get 
better error messages, because 95% of the time, what finally makes it out 
to the GUI after going through all the various layers from agent to task 
runner to API to GUI just isn't very informative. I shouldn't have to be 
digging through logs to know why my new instance didn't run; that error 
should be turned into a plain English (or other language) message that 
gives me actual information and propagated up through the layers until 
it reaches me. I came to the conclusion that it wasn't going to be an 
easy task, because whoever architected this thing just didn't make 
provisions for propagating errors in a consistent way, and it was going 
to require a bit of refactoring here and there to make it happen. Has 
there been any talk of doing that work, or has it been lost behind the 
constant struggle to keep Cloudstack up to date and compatible with 
recent hypervisor and OS changes?


BTW, I am already a maintainer of another massive pile of Java code with 
a similar architecture and similar issues (we *mostly* do a good job of 
telling you why a task failed, but not 100% of the time; the agent is 
supposed to send us an event giving the reason a task failed so we can 
put it in the task state, but sometimes it just falls flat on its face 
and all we can tell you is that the task failed -- though at least we 
don't give a misleading excuse for why it failed), so alas I lack the 
cycles to contribute to Cloudstack.




RE: number of cores

2018-11-19 Thread Eric Green
I am with KVM.

I am sure it’s the core count preventing me from starting VM’s because when I 
hack the database to tell it I have 48 cores rather than 24 cores on my hosts, 
I can then start the VM.

The only thing the logs say is that I can’t create a new VM due to lack of 
resources. Then it quits saying that when I hack the database. Note that under 
4.9.2 (what I reverted back to), Memory is at 49%, CPU is at 41%, Primary 
Storage is at 5%, and Secondary Storage is at 5%. All other resources aren’t 
even 1% used (I set up # of vlans, shared network IP’s, etc. fairly large 
because I expect to grow the cluster in the future).  4.9 doesn’t list CPU 
cores. Under 4.11.1 those measures were the same.  

I am running KVM under Centos 7. It may be that the KVM allocator works 
different from the VMware allocator? 
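
For the record, the database "hack" is nothing fancier than something like 
this (illustrative only -- the host id is a placeholder for your environment, 
and you should stop the management server and back up the cloud database 
before touching it):

# stop the management server and back up the 'cloud' database first
mysql -u root -p cloud <<'SQL'
-- what CloudStack currently thinks each hypervisor host has
SELECT id, name, cpus, speed FROM host WHERE type = 'Routing';
-- pretend host id 3 has 48 cores instead of 24 (the id is a placeholder)
UPDATE host SET cpus = 48 WHERE id = 3;
SQL
systemctl restart cloudstack-management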

From: Dag Sonstebo
Sent: Monday, November 19, 2018 9:47 AM
To: users@cloudstack.apache.org
Subject: Re: number of cores

Andrija - not sure about your 3.4GHz cores - must be a simplified lookup 
somewhere making assumptions.

Eric - have just tried your scenario in my 4.11.2RC5 lab (admittedly with 
VMware, not KVM) - and I can see my core allocation keeps going up, e.g. at the 
moment it sits at 166% - 10 out of 6 cores used. However it doesn't stop me 
starting new VMs (only using 30-40% CPU and memory). 
Are you sure it's the core count preventing you from starting VMs? What do the 
logs say? (Also keep in mind your system VMs are now using more resources than 
before).

Regards,
Dag Sonstebo
Cloud Architect
ShapeBlue
 

On 19/11/2018, 17:15, "Eric Lee Green"  wrote:


On 11/19/18 03:56, Andrija Panic wrote:
> Hi Ugo,
>
> Why would you want to do this, just curious ?
>
> I believe it's not possible, but anyway (at least with KVM, probably the same
> for other hypervisors) it doesn't even make sense, since when
> deploying a VM, ACS queries the host's free/unused number of MHz (GHz), so
> it's not even relevant for ACS - the number of cores is not relevant in ACS
> calculations during VM deployment.


I think you are misunderstanding the question. I have 72 cores in my 
cluster. Each of my hosts has 24 cores. With 4.9.2, I can provision 10 
virtual machines, each of which is programmed with 8 cores, meaning 80 
cores total. They on average are using only 25% of the CPU available to 
them (they need to be able to burst) and my compute servers on average 
are only 40% CPU used so that is not a problem.

When I tried upgrading to 4.11.1,  the dashboard showed a new value "# 
of CPU Cores" in red and showed that I had more cores provisioned for 
virtual machines than available in the cluster (80 versus 72 available). 
Cloudstack would not launch new virtual machines. I shut down two 
virtual machines, and now I can launch one, but not the second because I 
would need 80 cores total in my cluster. I cannot launch all 10 virtual 
machines because I would need 80 cores total. I know this because I 
tried it. I then used MySQL to tell Cloudstack that each of my hosts has 
48 cores (144 total), and suddenly I can launch all of my virtual machines.

Is this a bug in 4.11.1? Or is this expected behavior? If expected 
behavior, is there a way to over-provision "total # of cores used" other 
than to go into MySQL and tell it that my hosts have more cores than 
they in fact have? (Note that my service offerings are limited to 8 
cores max, so there's no way to launch a single VM with more cores than 
exists on a physical host, since all my hosts have 24 cores).


> On Mon, Nov 19, 2018, 11:31 Ugo Vasi 
>> Hi all,
>> in the dashboard of an ACS installation version 4.11.1.0 (Ubuntu 16.04
>> with KVM hypervisor), the new entry "# of CPU Cores" appears.
>> Is it possible to over-provision like for MHz or storage?
>>
>> Thanks
>>
>>
>> --
>>
>> *Ugo Vasi* / System Administrator
>> ugo.v...@procne.it 
>>
>>
>>
>>
>> *Procne S.r.l.*
>> +39 0432 486 523
>> via Cotonificio, 45
>> 33010 Tavagnacco (UD)
>> www.procne.it 
>>
>>

Re: kvm live volume migration

2018-01-19 Thread Eric Green
KVM is able to live migrate entire virtual machines complete with local volumes 
(see 'man virsh') but does require nbd (Network Block Device) to be installed 
on the destination host to do so. It may need installation of later libvirt / 
qemu packages from OpenStack repositories on Centos 6, I'm not sure, but just 
works on Centos 7. In any event, I have used this functionality to move virtual 
machines between virtualization hosts on my home network. It works.
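
For reference, the whole-VM form is roughly the following (a sketch -- 'myvm' 
and 'desthost' are placeholders, and the destination needs a matching storage 
path plus the nbd support mentioned above):

# live-migrate the domain together with its local disks to another KVM host
virsh migrate --live --persistent --undefinesource --copy-storage-all \
    myvm qemu+ssh://desthost/system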

What is missing is the ability to live-migrate a disk from one shared storage 
to another. The functionality built into virsh live-migrates the volume ***to 
the exact same location on the new host***, so obviously is useless for 
migrating the disk to a new location on shared storage. I looked everywhere for 
the ability of KVM to live migrate a disk from point A to point B all by 
itself, and found no such thing. libvirt/qemu has the raw capabilities needed 
to do this, but it is not currently exposed as a single API via the qemu 
console or virsh. It can be emulated via scripting however:

1. Pause the virtual machine.
2. Take a qcow2 snapshot.
3. Detach the base disk, attach the qcow2 snapshot.
4. Unpause the virtual machine.
5. Copy the qcow2 base file to the new location.
6. Pause the virtual machine.
7. Detach the snapshot.
8. Merge (un-snapshot) the qcow2 snapshot at its new location.
9. Attach the new base at the new location.
10. Unpause the virtual machine.

Thing is, if that entire process is not built into the underlying 
kvm/qemu/libvirt infrastructure as tested functionality with a defined API, 
there's no guarantee that it will work seamlessly and will continue working 
with the next release of the underlying infrastructure. This is using multiple 
different tools to manipulate the qcow2 file and attach/detach base disks to 
the running (but paused) kvm domain, and would have to be tested against all 
variations of those tools on all supported Cloudstack KVM host platforms. The 
test matrix looks pretty grim. 

By contrast, the migrate-with-local-storage process is built into virsh and is 
tested by the distribution vendor and the set of tools provided with the 
distribution is guaranteed to work with the virsh / libvirt/ qemu distributed 
by the distribution vendor. That makes the test matrix for 
move-with-local-storage look a lot simpler -- "is this functionality supported 
by that version of virsh on that distribution? Yes? Enable it. No? Don't enable 
it." 

I'd love to have live migration of disks on shared storage with Cloudstack KVM, 
but not at the expense of reliability. Shutting down a virtual machine in order 
to migrate one of its disks from one shared datastore to another is not ideal, 
but at least it's guaranteed reliable.


> On Jan 19, 2018, at 04:54, Rafael Weingärtner <rafaelweingart...@gmail.com> 
> wrote:
> 
> Hey Marc,
> It is very interesting that you are going to pick this up for KVM. I am
> working in a related issue for XenServer [1].
> If you can confirm that KVM is able to live migrate local volumes to other
> local storage or shared storage I could make the feature I am working on
> available to KVM as well.
> 
> 
> [1] https://issues.apache.org/jira/browse/CLOUDSTACK-10240
> 
> On Thu, Jan 18, 2018 at 11:35 AM, Marc-Aurèle Brothier <ma...@exoscale.ch>
> wrote:
> 
>> There's a PR waiting to be fixed about live migration with local volume for
>> KVM. So it will come at some point. I'm the one who made this PR but I'm
>> not using the upstream release so it's hard for me to debug the problem.
>> You can add yourself to the PR to get notify when things are moving on it.
>> 
>> https://github.com/apache/cloudstack/pull/1709
>> 
>> On Wed, Jan 17, 2018 at 10:56 AM, Eric Green <eric.lee.gr...@gmail.com>
>> wrote:
>> 
>>> Theoretically on Centos 7 as the host KVM OS it could be done with a
>>> couple of pauses and the snapshotting mechanism built into qcow2, but
>> there
>>> is no simple way to do it directly via virsh, the libvirtd/qemu control
>>> program that is used to manage virtualization. It's not as with issuing a
>>> simple vmotion 'migrate volume' call in Vmware.
>>> 
>>> I scripted out how it would work without that direct support in
>>> libvirt/virsh and after looking at all the points where things could go
>>> wrong, honestly, I think we need to wait until there is support in
>>> libvirt/virsh to do this. virsh clearly has the capability internally to
>> do
>>> live migration of storage, since it does this for live domain migration
>> of
>>> local storage between machines when migrating KVM domains from one host
>> to
>>> another, but that capability is not currently exposed in a way Cloudstack
>>> could use, at least not on Centos 7.
>>> 
>>> 

Re: kvm live volume migration

2018-01-17 Thread Eric Green
Theoretically on Centos 7 as the host KVM OS it could be done with a couple of 
pauses and the snapshotting mechanism built into qcow2, but there is no simple 
way to do it directly via virsh, the libvirtd/qemu control program that is used 
to manage virtualization. It's not as with issuing a simple vmotion 'migrate 
volume' call in Vmware. 

I scripted out how it would work without that direct support in libvirt/virsh 
and after looking at all the points where things could go wrong, honestly, I 
think we need to wait until there is support in libvirt/virsh to do this. virsh 
clearly has the capability internally to do live migration of storage, since it 
does this for live domain migration of local storage between machines when 
migrating KVM domains from one host to another, but that capability is not 
currently exposed in a way Cloudstack could use, at least not on Centos 7.


> On Jan 17, 2018, at 01:05, Piotr Pisz  wrote:
> 
> Hello,
> 
> Is there a chance that one day it will be possible to migrate volume (root 
> disk) of a live VM in KVM between storage pools (in CloudStack)?
> Like a storage vMotion in Vmware.
> 
> Best regards,
> Piotr
> 



Re: [PROPOSE] EOL for supported OSes & Hypervisors

2018-01-12 Thread Eric Green
Official EOL for Centos 6 / RHEL 6 as declared by Red Hat Software is 
11/30/2020. Jumping the gun a bit there, padme. 

People on Centos 6 should certainly be working on a migration strategy right 
now, but the end is not here *yet*. Furthermore, the install documentation is 
still written for Centos 6 rather than Centos 7. That needs to be fixed before 
discontinuing support for Centos 6, eh?

> On Jan 12, 2018, at 04:35, Rohit Yadav  wrote:
> 
> +1 I've updated the page with upcoming Ubuntu 18.04 LTS.
> 
> 
> After 4.11, I think 4.12 (assuming it releases by mid-2018) should remove 
> "declared" support for the following (they might still work with 4.12+, but 
> in the docs and as a project we should no longer officially support them):
> 
> 
> a. Hypervisor:
> 
> XenServer - 6.2, 6.5,
> 
> KVM - CentOS6, RHEL6, Ubuntu12.04 (I think this is already removed, packages 
> don't work I think?)
> 
> vSphere/Vmware - 4.x, 5.0, 5.1, 5.5
> 
> 
> b. Remove packaging for CentOS6.x, RHEL 6.x (the el6 packages), and Ubuntu 
> 12.04 (any non-systemd debian distro).
> 
> 
> Thoughts, comments?
> 



Re: Recover VM after KVM host down (and HA not working) ?

2017-12-23 Thread Eric Green
If all else fails, change its state to the correct state in the MySQL
database and restart the management service. Sadly that is the only way I
could do it when my Cloudstack got confused and stuck an instance in an
intermediate state where I couldn't do anything with it.
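
Concretely, the kind of surgery I mean looks roughly like this (a sketch only 
-- the VM name and id are placeholders; stop the management server and take a 
database backup before editing state by hand):

# stop the management server and back up the 'cloud' database first
mysql -u root -p cloud <<'SQL'
-- find the stuck instance and see what state CloudStack thinks it is in
SELECT id, name, instance_name, state, host_id FROM vm_instance
 WHERE name = 'myvm';
-- force it back to a sane state (the id is a placeholder)
UPDATE vm_instance SET state = 'Stopped' WHERE id = 42;
SQL
systemctl restart cloudstack-management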

On Dec 22, 2017 at 9:09 AM, >
wrote:

Good morning,

New to ACS and doing a POC with 4.10 on Centos 7 and KVM.

I'm trying to recover VMs after a host failure (powered off from OOB).

Primary storage is NFS and IPMI is configured for the KVM hosts. The zone is
in advanced mode with VLAN separation, and I created a shared network with no
services since I wish to use an external DHCP server.

First, say I don't have a compute offering with HA enabled and a KVM host
goes down... I can't put it in maintenance mode while it is down, and
disabling it has no effect on the state of the lost VMs. The VM stays in the
Running state according to the manager. What should I do to force a restart
on the remaining healthy hosts?

Then I enabled IPMI on all KVM hosts and attempted the same experiment
with a compute offering with HA enabled. Same result. The manager does see
the host as disconnected and powered off but takes no action. I'm certainly
missing something here. Please help!

Regards,

Jean-Francois


Re: Bandwith limit on guests

2017-10-30 Thread Eric Green
Did you try the same test from the exact same physical host that one of the 
guests are running on? There may be congestion between the Cloudstack network 
and the NFS network.

I just tested this by creating a compute offering that had the 200Mbit limit 
and assigning it to an instance. I mounted an NFS directory and dd'ed a large 
file in that directory. I got the 23 Mbyte/sec throughput I expected. I then 
shut the instance down, reassigned it to another compute offering without the 
limit, started it back up, and dd'ed that large file again. I got the 
200 Mbyte/sec throughput that I expected from this specific NFS server.
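
In case you want to reproduce the test, it was essentially this (server name 
and paths are made up; oflag=direct is there to keep the page cache from 
flattering the numbers):

mkdir -p /mnt/nfstest
mount -t nfs nfsserver:/export/test /mnt/nfstest
# write ~2 GB and let dd report the throughput
dd if=/dev/zero of=/mnt/nfstest/ddtest bs=1M count=2048 oflag=direct
rm /mnt/nfstest/ddtest
umount /mnt/nfstest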

How exactly are you setting network and VM throttling? Are you talking about in 
the Global Settings? If so, note that any limit set here (even infinite -- 
i.e., zero) is overridden by the values in the service offering if the service 
offering's values are smaller. Please check your service offerings to make sure 
they don't have throttling values in them. Also make sure that you put both 
network.throttling.rate and vm.network.throttling.rate back down to zero. 

Note -- I am running Cloudstack 4.9.2. But it should work same as 4.8 here.
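
If you have cloudmonkey configured against your management server, a quick 
way to check whether an offering carries its own limit is something like this 
(the filter list is from memory, so treat it as a sketch):

# network rate is reported in Mbit/s; empty or 0 means no per-offering throttle
cloudmonkey list serviceofferings filter=id,name,networkrate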


> On Oct 30, 2017, at 10:56, Imran Ahmed  wrote:
> 
> Hi All,
> 
> We are facing a bandwidth problem on our cloudstack guests (cloudstack 4.8
> with KVM on CentOS).
> 
> The network and VM throttling was set at 200 Mbit/s, and we're seeing a max on
> the guests of 25 MB/sec (just slightly over the throttle).  I set the values
> to 0, restarted the management server and stopped/started the virtual
> router.  However, the guests are still only seeing 25 MB/sec to an NFS share.
> I performed the same test to the same NFS share on a physical machine and it
> reached the full gigabit network speed (just over 100 MB/sec).
> 
> 
> Any ideas please?
> 
> Kind regards,
> 
> Imran
> 



Re: Rsize / Wsize configuration for NFS share

2017-10-20 Thread Eric Green
Okay. So:

1) Don't use EXT4 with LVM/RAID, it performs terribly with QCOW2. Use XFS. 
2) I didn't do anything to my NFS mount options and they came out fine:

10.100.255.3:/storage/primary3 on /mnt/0ab13de9-2310-334c-b438-94dfb0b8ec84 
type nfs4 
(rw,relatime,sync,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,hard,noac,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.100.255.2,local_lock=none,addr=10.100.255.3)

3) ZFS works relatively well, but if you are using it with SSDs you *must* set 
the alignment to match the SSD alignment, and *must* set the ZFS block size to 
match the SSD internal block size, else performance is terrible. (A sketch of 
the knobs I mean follows after point 5.) For performance, use LVM/XFS/RAID. 
*EXCEPT* it is really hard to make that perform on SSDs too; if you have a 
block size and alignment mismatch, performance will be *terrible* with 
LVM/XFS/RAID.
4) Hardware RAID with a battery backup tends to result in much faster writes 
*if writing to rotating storage* and only if using Linux LVM/XFS. ZFS does its 
own redundancy better, so don't use BBU with ZFS. If writing to SSD's, you will 
get better performance with the Linux MD RAID or ZFS, but note you must be 
*very* careful about block sizes and alignment. Don't use a BBU hardware raid 
with SSD's, performance will be terrible compared to Linux MD RAID or ZFS for a 
number of reasons.

5) Needless to say the ZFS NFS shares work fine, *if* you've done your homework 
and set them up well with proper alignment and block size for your hardware. 
However, for rotational storage the LVM/XFS/RAID will be faster, especially 
with the hardware raid and BBU. 
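
To make point 3 concrete, the ZFS-side knobs I'm talking about are roughly 
these (a sketch only -- pool and device names are placeholders, ashift=12 
assumes 4K flash pages, and you should check your actual SSD geometry and 
workload before copying any of it):

# create the pool with explicit 4K alignment (ashift=12 means 2^12-byte sectors)
zpool create -o ashift=12 tank mirror sda sdb mirror sdc sdd
# match the record size to the workload / flash page size, and trim overhead
zfs set recordsize=16K tank/primary
zfs set atime=off tank/primary
zfs set xattr=sa tank/primary
# export it to the KVM hosts over NFS
zfs set sharenfs=on tank/primary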

My own CloudStack implementation has two storage servers that use LVM/XFS/RAID 
for storage and one storage server that uses ZFS for storage. The ZFS server 
has two 12-disk RAID groups, one made up of SSD's for a database, the other 
made up of large rotational storage drives. It also has a NVMe card that is 
used for log and cache for the rotational storage. I spent a *lot* of time 
trying to get the SSD's to perform under RAID/LVM/XFS and just couldn't get 
everything to agree on alignment. That was when I said foo on that and put ZFS 
on there, and since I was using ZFS for one RAID group, it made sense to use it 
on the other too. I'm using RAID10 on the SSD RAID group, and RAIDZ2 on the 
rotational storage (which is there for bulk storage where performance isn't a 
priority). 

Storage is not a limitation on my cloud, especially since I have four other 
storage servers that I can throw at it if necessary. RAM and CPU are, so that's 
my next task -- get more compute servers into the Cloudstack cluster.


> On Oct 19, 2017, at 22:16, Ivan Kudryavtsev  wrote:
> 
> Hello, friends.
> 
> I'm planning to deploy new cluster with KVM and shared NFS storage soon.
> Right now I already have such deploys that operate fine. Currently, I'm
> trying to compare different storage options for new cluster between
> - LVM+EXT4+HWRAID+BBU,
> - LVM+EXT4+MDADM and
> - ZFS.
> 
> HW setup is:
> - 5 x Micron 5100PRO 1.9TB
> - LSI9300-8i HBA or LSI9266-8i+BBU
> - Cisco UCS server w/ 24x2.5" / 2xE5-2650 / 64GB RAM
> 
> I have got competitive performance results between them locally already,
> but now I need to test over NFS. I'm pretty sure the first two options
> will operate fine with the ACS default NFS mount args (because I already have
> such cases in prod), but ZFS is quite a smart thing, so I started to
> investigate how to change the NFS client mount options and unfortunately
> haven't succeeded in finding the place where the cloudstack agent determines
> how to mount the share and what args to use. I read a lot of ZFS-related
> articles and people write that rsize/wsize matter quite a lot, so I wonder
> how to instruct the cloudstack agent to use specific rsize/wsize args to
> mount primary storage.
> 
> Also, I haven't found mentions of ZFS NFS shares in the ACS archives, so
> maybe it's a bad case for ACS because of the qcow2 image format, but I think
> it could be a good one, so I want to test it personally.
> 
> Any suggestions are welcome.
> 
> -- 
> With best regards, Ivan Kudryavtsev
> Bitworks Software, Ltd.
> Cell: +7-923-414-1515
> WWW: http://bitworks.software/ 



Re: Migrating VMs from AWS to CloudStack

2017-09-07 Thread Eric Green
Not happening unless your instances on Amazon are Centos or some other 
"standard" Linux distribution, not standard Amazon Linux. Amazon Linux is its 
own thing and won't run outside the Amazon ecosphere, and Windows instances on 
AWS don't react well at all to having their hypervisor yanked out from under 
them and a new hypervisor slid under them.

When I did this recently, I instead set up a Puppet server with a clone of the 
puppetry that set up my AWS instances (my full puppet configuration tree was in 
a git repository so that was not a big deal), modified it to set up the same 
software on Centos rather than Amazon Linux (took some slight mods because the 
standard OS packages are slightly different between the two), then set up some 
fresh instances using a template that had puppet pre-installed on it and the 
userdata scripts from my CloudFormation templates minus the AWS-specific bits 
(the userdata scripts are responsible for configuring the instances to connect 
to the puppet server with the proper instance type in order to do the final 
configuration). 

If you are not using a configuration management system like Puppet, Chef, or 
Ansible, you're doing things wrong. You've *been* doing things wrong. Managing 
a fleet of virtual machines by hand is not the way to do things in the 21st 
Century.


> On Sep 7, 2017, at 05:08, Imran Ahmed  wrote:
> 
> Hi All,
> I have got a task to migrate VMs from AWS to CloudStack  (private cloud) .
> Any ideas to get this done efficiently?
> 
> Regards,
> Imran
> 



Re: Secondary storage is not secondary properly

2017-08-18 Thread Eric Green

> On Aug 18, 2017, at 03:22, Asanka Gunasekara  wrote:
> 
> Hi Eric, 
> 
> The SSVM can access my NFS and I can manually mount it :(
> 
> This "s-397-VM:/# grep com.cloud.agent.api.SecStorageSetupCommand 
> /var/log/cloud.log" did not produce any output, but I found the error below
> 
> From the VM's /var/log/cloud.log:
> ERROR [cloud.agent.AgentShell] (main:null) Unable to start agent: Resource 
> class not found: com.cloud.storage.resource.PremiumSecondaryStorageResource 
> due to: java.lang.ClassNotFoundException: 
> com.cloud.storage.resource.PremiumSecondaryStorageResource


Hmm. That doesn't look good. So the agent is never even able to start because 
of that exception. This looks like a mismatch between your SSVM template and 
your version of Cloudstack. It looks like you're using a version of Cloudstack 
that has been compiled with premium features that is expecting a template that 
supports premium features. 

Someone else will have to tell you how to change the SSVM template, I don't 
know that. Or since this is a zone that has never been operational, you may 
choose to simply wipe the current install entirely and start over again from 
scratch with a Cloudstack and SSVM template all from the same source. Remember 
to drop and recreate the database as part of that process, as well as remove 
all the contents of the secondary store and follow the directions again to 
reinitialize with the initial template.

For the record, I got my Cloudstack from this source:

[cloudstack]
name=cloudstack
baseurl=http://cloudstack.apt-get.eu/centos/$releasever/4.9/
enabled=1
gpgcheck=0

My template similarly came from that source (but the 4.6 version, as you 
specified).

Once I got my networking sorted out, which you seem to have done, it Just 
Worked.



Re: Secondary storage is not secondary properly

2017-08-08 Thread Eric Green

> On Aug 7, 2017, at 23:44, Asanka Gunasekara  wrote:
> NFS is running on a different server; I can manually mount this share as NFS
> and SMB.
> Cloudstack - 4.9
> OS is CentOS 7 (64-bit)

* Make sure that it's accessible from the *storage* network (the network that 
you configured as storage when you created the zone, assuming you selected 
advanced networking). 
* Is the secondary storage virtual machine up and running? Check your 
Infrastructure tab.
* If the secondary storage virtual machine is up and running, open its console 
and log in as root / password. Then check 'ip addr list' to make sure that it 
has IP addresses.
* If it has IP addresses, try pinging your secondary storage NFS server (still 
within the SSVM).
* If you can ping your secondary storage NFS server, try mounting the NFS share 
at some random place in your filesystem to make sure you can mount it from the 
SSVM. e.g., 'mkdir /tmp/t; mount myserver:/export/secstorage /tmp/t' 
* Make sure you're using the NFS server's *storage* network IP address when you 
make this attempt.

Is it possible that your NFS server has a firewall configured? But from my 
experiments, the secondary storage VM not providing secondary storage is 
usually a networking problem -- things not set up properly in your zone's 
networking, so that the secondary storage VM can't reach the secondary 
storage. Are you using advanced networking, or basic networking?



Re: KVM qcow2 perfomance

2017-08-06 Thread Eric Green

> On Aug 5, 2017, at 21:03, Ivan Kudryavtsev  wrote:
> 
> Hi, I think Eric's comments are too tough. E.g. I have 11 x 1TB SSDs with
> Linux soft RAID 5 and ext4 and it works like a charm without special
> tuning.
> 
> Qcow2 also not so bad. LVM2 does it better of course (if not being
> snapshotted). Our users have different workloads and nobody claims disk
> performance is a problem. Read/write 100 MB/sec over 10G connection is not
> a problem at all for the setup specified above.

100 MB/sec is the speed of a single vintage 2010 5200 RPM SATA-2 drive. For 
many people, that is not a problem. For some, it is. For example, I have a 
12x-SSD RAID10 for a database. This RAID10 is on a SAS2 bus with 4 channels 
thus capable of 2.4 gigaBYTES per second raw throughput. Yes, I have validated 
that the SAS2 bus is the limit on throughput for my SSD array. If I provided a 
qcow2 volume to the database instance that only managed 100MB/sec, my database 
people would howl.

I have many virtual machines that run quite happily with thin qcow2 volumes on 
12-disk RAID6 XFS datastores (spinning storage) with no problem, because they 
don't care about disk throughput, they are there to process data, or provide 
services like DNS or a Wiki knowledge base, or otherwise do things that aren't 
particularly time-critical in our environment. So it's all about your customer 
and his needs. For maximum throughput, qcow2 on a ext4 soft RAID capable of 
doing 100Mb/sec is very... 2010 spinning storage... and people who need more 
than that, like database people, will be extremely dissatisfied. 

Thus my suggestions of ways to improve performance via providing a custom disk 
offering for those cases where disk performance and specifically write 
performance is a problem -- switching to 'sparse' rather than 'thin' as the 
provisioning mechanism (which greatly speeds writes since now only the 
filesystem block allocation mechanisms get invoked, rather than qcow2's block 
allocation mechanisms, and qcow2 now only has a single allocation zone which 
greatly speeds its own lookups), using a different underlying filesystem that 
has proven to have more consistent performance (xfs isn't much faster than ext4 
under most scenarios but doesn't have the lengthy dropouts in performance that 
come with lots of writes on ext4), and possibly flipping on async caching in 
the disk offering if data integrity isn't a problem (for example, for an 
Elasticsearch instance, the data is all replicated across multiple nodes on 
multiple datastores anyhow, so if I lose an Elasticsearch node's data so what? 
I just destroy that instance and create a new one to join to the cluster!). And 
of course there's always the option of simply avoiding qcow2 altogether and 
providing the data via iSCSI or NFS directly to the instance, which may be what 
you need to do for something like a database that has some very specific 
performance and throughput requirements.




Re: KVM qcow2 perfomance

2017-08-05 Thread Eric Green
qcow2 performance has been historically bad regardless of the underlying 
storage (it is an absolutely terrible storage format), which is why most 
OpenStack Kilo and later installations instead usually use managed LVM and 
present LVM volumes as iSCSI volumes to QEMU, because using raw LVM volumes 
directly works quite a bit better (especially since you can do "thick" volumes, 
which get you the best performance, without having to zero out a large file on 
disk). But Cloudstack doesn't use that paradigm. Still, you can get much better 
performance with qcow2 regardless:

1) Create a disk offering that creates 'sparse' qcow2 volumes (the 'sparse' 
provisioning type). Otherwise every write is actually multiple writes -- one to 
extend the previous qcow2 file, one to update the inode with the new file size, 
and one to update the qcow2 file's own notion of how long it is and what all of 
its sections are, and one to write the actual data. And these are all *small* 
random writes, which SSD's have historically been bad at due to write zones. 
Note that if you look at a freshly provisioned 'sparse' file in the actual data 
store, it might look like it's taking up 2tb of space, but it's actually taking 
up only a few blocks. 

2) In that disk offering, if you care more about performance than about 
reliability, set the caching mode to 'writeback'. (The default is 'none'). This 
will result in larger writes to the SSD, which it'll do at higher rates of 
speeds than small writes. The downside is that your hardware and OS better be 
*ultra* reliable with battery backup and clean shutdown in case of power 
failure and etc., or the data in question is toast if something crashes or the 
power goes out. So consider how important the data is before selecting this 
option.

3) If you have a lot of time and want to pre-provision your disks in full, in 
that disk offering set the provisioning type to 'fat'. This will pre-zero a 
qcow2 file of the full size that you selected. Be aware that Cloudstack does 
this zeroing of a volume commissioned with this offering type *when you attach 
it to a virtual machine*, not when you create it. So attach it to a "trash" 
virtual machine first before you attach it to your "real" virtual machine, 
unless you want a lot of downtime waiting for it to zero. But assuming you have 
a host filesystem that properly allocates files on a per-extent basis, and the 
extents match up with the underlying SSD write block size well, you should be 
able to get within 5% of hardware performance with 'fat' qcow2. (With 'thin' 
you can still come within 10% of that, which is why 'thin' might be the best 
for most workloads that require performance, and 'thin' doesn't waste space on 
blocks that have never been written and doesn't tie up your storage system for 
hours zeroing out a 2TB qcow2 file, so consider that if thinking 'fat'). A 
sketch of how these provisioning types map onto qemu-img preallocation modes 
follows after point 5.

4) USE XFS AS THE HOST FILESYSTEM FOR THE DATASTORE. ext4 will be *terrible*. 
I'm not sure what causes the bad will between ext4 on the storage host and 
qcow2, but I've seen it multiple times in my own testing of raw libvirt (no 
CloudStack). As for btrfs, btrfs will be terrible with regular 'thin' qcow2. 
There is an interaction between its write cycles and qcow2's write patterns 
that, as with ext4, causes very slow performance. I have not tested sparse 
qcow2 with btrfs because I don't trust btrfs, it has many design decisions 
reminiscent of ReiserFS, which ate many Linux filesystems back during the day. 
I have not tested ZFS. The ZFS on Linux implementation generally has good but 
not great performance, it was written for reliability, not performance, so it 
seemed a waste of my time to test it. I may do that this weekend however just 
to see. I inherited a PCIe M2.e SSD, you see, and want to see what having that 
as the write cache device will do for performance

5) For the guest filesystem it really depends on your workload and the guest 
OS. I love ext4 for reliability inside a virtual machine, because you can't 
just lose an entire ext4 filesystem (it's based on ext2/ext3, which in turn 
were created when hardware was much less reliable than today and thus has a lot 
of features to keep you from losing an entire filesystem just because a few 
blocks went AWOL), but it's not a very fast filesystem. Xfs in my testing has 
the best performance for virtually all workloads. Generally, I use ext4 for 
root volumes, and make decisions for data volumes based upon how important the 
performance versus reliability equation works out for me. I have a lot of ext4 
filesystems hanging around for data that basically sits there in place without 
many writes but which I don't want to lose.
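
For reference, the three provisioning types map -- if I remember the 
CloudStack/KVM code right, so treat the mapping as an assumption -- onto 
qcow2 preallocation modes, which you can play with directly via qemu-img to 
see the difference on your own datastore:

# roughly 'thin': qcow2 allocates metadata and data as it goes
qemu-img create -f qcow2 -o preallocation=off thin.qcow2 10G
# roughly 'sparse': qcow2 metadata is preallocated, only data writes grow the file
qemu-img create -f qcow2 -o preallocation=metadata sparse.qcow2 10G
# roughly 'fat': the whole 10G is allocated (and zeroed) up front
qemu-img create -f qcow2 -o preallocation=full fat.qcow2 10G
# compare the logical size against the blocks actually used on the datastore
qemu-img info sparse.qcow2
du -h *.qcow2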

For best performance of all, manage this SSD storage *outside* of Cloudstack as 
a bunch of LVM volumes which are exported to virtual machine guests via LIO 
(iSCSI). Even 'sparse' LVM volumes perform better than qcow2 'thin' volumes. If 
you choose to do that, there's some LIO settings 

Some things I found out installing on Centos 7

2017-08-02 Thread Eric Green
First, about me -- I've been administering Linux systems since 1995. No, that's 
not a typo -- that's 22 years. I've also worked for a firewall manufacturer in 
the past, I designed the layer 2 VLAN support for a firewall vendor, so I know 
VLAN's and such. I run a fairly complex production network with multiple 
VLAN's, multiple networks, etc. already, and speak fluent Cisco CLI. In short, 
I'm not an amateur at this networking stuff, but figuring out how Cloudstack 
wanted my CentOS 7 networking to be configured, and doing all the gymnastics to 
make it happen, consumed nearly a week because the documentation simply isn't 
up to date, thorough, or accurate, at least for Centos 7. 

So anyhow, my configuration:

Cloudstack 4.9.2.0 from the RPM repository at cloudstack.apt-get.eu

Centos 7 servers with:

2 10gbit Ethernet ports -> bond0 

A handful of VLANS:

100 -- from my top of rack switch is sent to my core backbone switch layer 3 
routed to my local network as 10.100.x.x and thru the NAT border firewall and 
router to the Internet. Management.
101 -- same but for 10.101.x.x  -- public.
102 -- same but for 10.102.x.x  -- guest public (see below).
192 -- A video surveillance camera network that is not routed to anywhere, via 
a drop from the core video surveillance POE switch to an access mode port on my 
top of rack switch. Not routed.
200 -- 10 gig drop over to my production racks to my storage network there for 
accessing legacy storage. Not routed. (Legacy storage is not used for 
Cloudstack instance or secondary storage but can be accessed by virtual 
machines being migrated to this rack).
1000-2000 -- VLAN's that exist in my top of rack switch on the Cloudstack rack 
and assigned to my trunk ports to the cloud servers but routed nowhere else, 
for VPC's and such. 

Stuck with VLAN's rather than one of the SDN modules like VXNET because a) it's 
the oldest and most likely to be stable, b) compatible with my already-existing 
network hardware and networks (wouldn't have to somehow map a VLAN to a SDN 
virtual network to reach 192 or 200 or create a public 102), and c) least 
complex to set up and configure given my existing top-of-rack switch that does 
VLANs just fine.

Okay, here's how I had to configure Centos 7 to make it work: 

enp4s[01] -> bond0 -> bond0.100 -> br100 -- I had to create two interface files, 
add them to the bond0 bond, then create a bond0.100 VLAN interface, then a br100 
bridge, for my management network. In
/etc/sysconfig/network-scripts: 

# ls ifcfg-*
ifcfg-bond0 ifcfg-bond0.100 ifcfg-br100 ifcfg-enp4s0 ifcfg-enp4s1

(where 4s0 and 4s1 are my 10 gigabit Ethernets).
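
To make that concrete, the VLAN and bridge halves of the set look roughly like 
this (a sketch -- the 10.100.0.11/16 address is an example, and I'm leaving out 
MTU, bonding options, DNS and so on):

# ifcfg-bond0.100  -- the tagged VLAN interface on top of the bond
DEVICE=bond0.100
ONBOOT=yes
BOOTPROTO=none
VLAN=yes
BRIDGE=br100

# ifcfg-br100  -- the bridge Cloudstack uses for management/storage traffic
DEVICE=br100
TYPE=Bridge
ONBOOT=yes
BOOTPROTO=none
IPADDR=10.100.0.11
PREFIX=16
GATEWAY=10.100.0.1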

Don't create anything else. You'll just confuse Cloudstack. Any other 
configuration of the network simply fails to work. In particular, creating 
br101 etc. fails because CloudStack wants to create its own VLANs and  bridges 
and if you traffic label it as br101 it'll try making vlan br101.101 (doesn't 
work, duh). Yes, I know this contradicts every single piece of advice I've seen 
on this list. All I know is that this is what works, while every other piece of 
advice I've seen for labeling the public and private guest networking fails. 

When creating the networks in the GUI under Advanced networking, set bond0 as 
your physical network and br100 as the KVM traffic label for the Management 
network and Storage network and give them addresses with VLAN 100 (assuming 
you're using the same network for both management and storage networks, which 
is what makes sense with my single 10gbit pipe), but do *not* set up anything 
as a traffic label for Guest or Public networks. You will confuse the agent 
greatly. Let it use the default labels. It'll work. It'll set up its on 
bond0. VLAN interface and brbond0- as needed. This violates every 
other piece of advice I've seen for labeling, but this is what actually works 
with this version of Cloudstack and this version of Centos when you're sending 
everything through a VLAN-tagged bond0.

A very important configuration option *not* documented in the installation 
documents:

secstorage.allowed.internal.sites=10.100.0.0/16

(for my particular network). 

Otherwise I couldn't upload ISO files to the server from my nginx server that's 
pointing at the NFS directory full of ISO files.

---

Very important guest VM image prep *NOT* in the docs:

Be sure to install / enable / run acpid on Linux guests, otherwise "clean" 
shutdowns can't happen. Turns out Cloudstack on KVM uses the ACPI shutdown 
functionality of qemu-kvm. Probably does that on other hypervisors too.
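
On a CentOS 7 guest that boils down to the following (Debian/Ubuntu guests are 
the same idea with apt):

yum install -y acpid
systemctl enable --now acpid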

---

Now on for that mysterious VLAN 102:

I created a "public" shared network on the 102 vlan for stuff I don't care is 
out in the open. This is a QA lab environment, not a public cloud. So I 
assigned a subnet and a VLAN, ran a VLAN drop over to my main backbone layer 3 
switch (and bopped up to my border firewall and told it about the new subnet 
too so that we could get out to the Internet as needed), and