Re: some button labels not displaying correctly in the UI 4.19

2024-03-29 Thread Marcus Torres
I'm embarrassed to say that worked. Sorry for wasting your time, lol.
Thank you, and thank you guys for all the work on this project.

On Fri, Mar 29, 2024 at 1:19 PM Wei ZHOU  wrote:

> It seems like an issue with the browser cache.
> Try incognito mode, or clear the browser cache.
>
> -Wei
>
> On Fri, Mar 29, 2024 at 5:55 PM Marcus Torres  wrote:
> >
> > Hi!
> > First off, I would like to tip my hat to everyone involved in the release
> > of 4.19.0. The major new features and changes just put CloudStack over the
> > top in terms of functionality and feature sets compared to competitive
> > platforms. It's quite amazing.
> >
> > After upgrading to 4.19.0 (management server first, usage second,
> > hypervisors third, then a restart of all services), I noticed that some of
> > the new buttons in the UI display 'label.x' instead of a proper title. Or
> > is it like that intentionally?
> > For instance, drilling down to a cluster view, the DRS tab shows
> > 'label.drs', and if I click that tab, the button to generate a plan is
> > labeled 'label.drs.generate.plan' instead of 'Generate DRS Plan' as the
> > release screenshots show. I'm seeing the same thing with label.bucks and
> > label.object.storage.
> >
> > The buttons work; it just seems to be a display issue. I have screenshots
> > if needed.
>


some button labels not displaying correctly in the UI 4.19

2024-03-29 Thread Marcus Torres
Hi!
First off, I would like to tip my hat to everyone involved in the release of
4.19.0. The major new features and changes just put CloudStack over the top
in terms of functionality and feature sets compared to competitive
platforms. It's quite amazing.

After upgrading to 4.19.0 (management server first, usage second,
hypervisors third, then a restart of all services), I noticed that some of the
new buttons in the UI display 'label.x' instead of a proper title. Or is it
like that intentionally?
For instance, drilling down to a cluster view, the DRS tab shows
'label.drs', and if I click that tab, the button to generate a plan is
labeled 'label.drs.generate.plan' instead of 'Generate DRS Plan' as the
release screenshots show. I'm seeing the same thing with label.bucks and
label.object.storage.

The buttons work; it just seems to be a display issue. I have screenshots if
needed.


Re: Method to edit global settings from command line

2024-01-30 Thread Marcus Torres
Hello Suresh

1. CloudStack version: 4.18.1.0
2. Management server: Rocky Linux 8.5
3. Hypervisors: Rocky Linux 8.5
4. The only change was enabling SAML in the global config in the UI.
5. I saw some entries in the log regarding SAML and the 'admin' user not
being able to authenticate against SAML. Not sure if it's related.

I've sent the management log to your Gmail address, if that's OK. It's
pretty large, and I've scrubbed it of any sensitive data.

Thanks Suresh.

On Tue, Jan 30, 2024 at 11:52 AM Suresh Kumar Anaparti <
sureshkumar.anapa...@gmail.com> wrote:

> Hi Marcus,
>
> Thanks for the update.
>
> Maybe some issue after enabling SAML, can you share the cloudstack version,
> and error log from the management server?
>
> Regards,
> Suresh
>
> On Tue, Jan 30, 2024 at 9:21 PM Marcus Torres  wrote:
>
> > @SureshKumarAnaparti
> >
> > That worked! After a restart of the management service, I'm able to hit
> > the UI on port 8080 now! Thank you for that tip!!
> >
> > It's peculiar that simply enabling SAML in the global config and having a
> > faulty SAML config would stop the UI from opening port 8080 to serve the
> > webpage.
> >
> > Thanks again!
> >
> > On Mon, Jan 29, 2024 at 11:32 PM Suresh Kumar Anaparti <
> > sureshkumar.anapa...@gmail.com> wrote:
> >
> > > Hi Marcus,
> > >
> > > You can revert the config (disable SAML) using the UPDATE SQL query
> > > below.
> > >
> > > UPDATE cloud.configuration SET value = 'false' WHERE name =
> > > 'saml2.enabled';
> > >
> > > Regards,
> > > Suresh
> > >
> > > On Tue, Jan 30, 2024 at 5:41 AM Marcus Torres 
> wrote:
> > >
> > > > Hi!
> > > > I recently enabled SAML in the global config settings in the UI, and
> > > > upon a restart of the management service, the cloudstack-management
> > > > process starts successfully and I'm seeing activity and traffic to and
> > > > from the hypervisors. It looks like the management server is working,
> > > > but the UI is not reachable at all on port 8080.
> > > >
> > > >   *   I do not have SSL/HTTPS enabled
> > > >   *   SELinux is disabled
> > > >   *   iptables is disabled
> > > >   *   I don't see port 8080 open/listening in netstat
> > > >   *   port 9090 is open and listening
> > > >   *   MySQL is up and running fine
> > > >   *   CloudMonkey can no longer connect to the API since 8080 is down
> > > >
> > > > The SAML config could in fact be a red herring and unrelated, but
> > > > that's the last change besides adding a new isolated VLAN guest network.
> > > >
> > > > Is it possible to revert or edit global config settings from the
> > > > command line that were originally made in the UI?
> > > >
> > > > Thanks for your time!
> > > >
> > >
> >
>


Re: Method to edit global settings from command line

2024-01-30 Thread Marcus Torres
@SureshKumarAnaparti

That worked! After a restart of the management service, I'm able to hit the
UI on port 8080 now! Thank you for that tip!!

It's peculiar that simply enabling SAML in the global config and having a
faulty SAML config would stop the UI from opening port 8080 to serve the
webpage.

Thanks again!

On Mon, Jan 29, 2024 at 11:32 PM Suresh Kumar Anaparti <
sureshkumar.anapa...@gmail.com> wrote:

> Hi Marcus,
>
> You can revert the config (disable saml) using the update sql query below.
>
> UPDATE cloud.configuration SET value = 'false' WHERE name =
> 'saml2.enabled';
>
> Regards,
> Suresh
>
> On Tue, Jan 30, 2024 at 5:41 AM Marcus Torres  wrote:
>
> > Hi!
> > I recently enabled SAML in the global config settings in the UI, and upon
> > a restart of the management service, the cloudstack-management process
> > starts successfully and I'm seeing activity and traffic to and from the
> > hypervisors. It looks like the management server is working, but the UI is
> > not reachable at all on port 8080.
> >
> >   *   I do not have SSL/HTTPS enabled
> >   *   SELinux is disabled
> >   *   iptables is disabled
> >   *   I don't see port 8080 open/listening in netstat
> >   *   port 9090 is open and listening
> >   *   MySQL is up and running fine
> >   *   CloudMonkey can no longer connect to the API since 8080 is down
> >
> > The SAML config could in fact be a red herring and unrelated, but that's
> > the last change besides adding a new isolated VLAN guest network.
> >
> > Is it possible to revert or edit global config settings from the
> > command line that were originally made in the UI?
> >
> > Thanks for your time!
> >
>
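
The UPDATE quoted above can be tried out before touching a production database. The sketch below runs the same statement against an in-memory SQLite table that only mimics the two columns used in the query (name, value). The real cloud.configuration table lives in MySQL and has more columns, so this illustrates the statement, not CloudStack's schema; the helper names are made up for the example. On a real system the UPDATE would be run in the MySQL client, followed by a restart of the management service as described above.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a stand-in table with only the two columns the quoted query uses."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE configuration (name TEXT PRIMARY KEY, value TEXT)")
    conn.execute("INSERT INTO configuration VALUES ('saml2.enabled', 'true')")
    conn.commit()
    return conn


def disable_setting(conn: sqlite3.Connection, name: str) -> int:
    """Set the named setting to 'false'; return the number of rows changed."""
    cur = conn.execute(
        "UPDATE configuration SET value = 'false' WHERE name = ?", (name,)
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    demo = make_demo_db()
    changed = disable_setting(demo, "saml2.enabled")
    value = demo.execute(
        "SELECT value FROM configuration WHERE name = 'saml2.enabled'"
    ).fetchone()[0]
    print(f"rows changed: {changed}, saml2.enabled is now {value}")
```

Checking the returned row count is a cheap way to catch a mistyped setting name, since an UPDATE that matches nothing still succeeds silently.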


Method to edit global settings from command line

2024-01-29 Thread Marcus Torres
Hi!
I recently enabled SAML in the global config settings in the UI, and upon a
restart of the management service, the cloudstack-management process starts
successfully and I'm seeing activity and traffic to and from the hypervisors.
It looks like the management server is working, but the UI is not reachable at
all on port 8080.


  *   I do not have SSL/HTTPS enabled
  *   SELinux is disabled
  *   iptables is disabled
  *   I don't see port 8080 open/listening in netstat
  *   port 9090 is open and listening
  *   MySQL is up and running fine
  *   CloudMonkey can no longer connect to the API since 8080 is down

The SAML config could in fact be a red herring and unrelated, but that's the
last change besides adding a new isolated VLAN guest network.

Is it possible to revert or edit global config settings from the command
line that were originally made in the UI?

Thanks for your time!
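
The netstat check in the list above can also be scripted. This is a minimal sketch, assuming the management server is probed on 127.0.0.1 and that 8080 and 9090 are the ports of interest, as in the message; the helper name port_is_listening is made up for illustration.

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8080 and 9090 are the two ports mentioned in the list above.
    for port in (8080, 9090):
        state = "listening" if port_is_listening("127.0.0.1", port) else "not reachable"
        print(f"port {port}: {state}")
```

A connection test like this only shows whether something accepts connections on the port; it does not say which process owns it, which netstat or ss can still answer.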


Re: Root disk resizing

2021-10-11 Thread Marcus
Cloud-init is always fun to debug :-). It will probably require some
playing with to get a pattern down.

There is perhaps a way to get it to re-check and grow on every reboot if you
adjust/override the module frequency, delete the module semaphore in
/var/lib/cloud/sem, or, worst case, clear the metadata via 'cloud-init
clean' or by deleting /var/lib/cloud.
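
As a rough illustration of the semaphore idea, the sketch below deletes any semaphore file whose name contains "growpart" from a given directory. The directory is passed in rather than hard-coded, and the demo runs against a temporary directory with made-up file names. Exact semaphore file names and locations vary between cloud-init versions, so the substring match is an assumption and this is best tried on a test VM first.

```python
import tempfile
from pathlib import Path
from typing import List


def clear_module_semaphores(sem_dir: str, module: str = "growpart") -> List[str]:
    """Delete semaphore files mentioning `module`; return the removed file names."""
    removed: List[str] = []
    directory = Path(sem_dir)
    if not directory.is_dir():
        return removed
    for entry in sorted(directory.iterdir()):
        # Assumption: the module name appears somewhere in the semaphore file name.
        if entry.is_file() and module in entry.name:
            entry.unlink()
            removed.append(entry.name)
    return removed


if __name__ == "__main__":
    # Demo on a throwaway directory; on a VM the path above would be used (as root).
    with tempfile.TemporaryDirectory() as demo_dir:
        for name in ("config_growpart", "config_resizefs"):
            (Path(demo_dir) / name).touch()
        print(clear_module_semaphores(demo_dir))
```

Removing only the one module's semaphore is less disruptive than wiping all cloud-init state, which would also re-run unrelated first-boot modules.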

On Mon, Oct 11, 2021 at 3:07 AM Wido den Hollander  wrote:

>
>
> On 10/10/21 10:35 AM, Ranjit Jadhav wrote:
> > Hello folks,
> >
> > I have implemented CloudStack with a XenServer host. The template was
> > made from a VM running basic CentOS 7 with the following packages
> > installed:
> >
> > sudo yum -y install cloud-init
> > sudo yum -y install cloud-utils-growpart
> > sudo yum -y install gdisk
> > 
> >
> > After creating a new VM from this template, the root disk is created with
> > the size specified in the template, or we are able to increase it at the
> > time of creation.
> >
> > But later, when we try to increase the root disk again, the disk size
> > increases but the "/" partition does not get resized automatically.
> >
>
> As far as I know it only grows the partition once, i.e., upon first boot.
> It won't do it again afterwards.
>
> Wido
>
> >
> > Following parameters were passed in userdata
> > 
> > #cloud-config
> > growpart:
> >   mode: auto
> >   devices: ["/"]
> >   ignore_growroot_disabled: true
> > 
> >
> > Thanks & Regards,
> > Ranjit
> >
>


Re: [DISCUSS] Using qemu-kvm vs qemu-kvm-ev with CloudStack

2021-04-10 Thread Marcus
Yes, +1 on EV. It is more current, better maintained and I think it is
generally considered the go-to for EL based hypervisors (largely due to the
oVirt use).


On Saturday, April 10, 2021, Rohit Yadav  wrote:

> Great, thanks for sharing Simon. If we've consensus and there are no
> objections I would propose we update the docs around CentOS/KVM to use the
> -ev packages.
>
> Let's hear from others.
>
>
> Thanks and regards.
>
> 
> From: Simon Weller 
> Sent: Friday, April 9, 2021 19:09
> To: d...@cloudstack.apache.org ;
> users@cloudstack.apache.org 
> Subject: Re: [DISCUSS] Using qemu-kvm vs qemu-kvm-ev with CloudStack
>
> Hi Rohit,
>
> We've been using ev exclusively for a few years now. Our main reason was
> in order to support features we upstreamed around KVM iop limits a couple
> of years back.
> Short of one challenge that was addressed on the ACS side a while ago
> related to the patchviasocket integration, it has worked very well and has
> been very stable.
>
> -Si
>
> 
> From: Rohit Yadav 
> Sent: Friday, April 9, 2021 2:26 AM
> To: d...@cloudstack.apache.org ;
> users@cloudstack.apache.org 
> Subject: [DISCUSS] Using qemu-kvm vs qemu-kvm-ev with CloudStack
>
> All,
>
> We've recently seen some tests around live VM with storage failing on
> CentOS7 which is addressed in this PR:
> https://github.com/apache/cloudstack/pull/4801
>
> Some users have added on the original issue ticket that it works with
> qemu-kvm-ev on CentOS:
> https://github.com/apache/cloudstack/issues/4757#issuecomment-812595973
>
> I also see many other IaaS platforms notably oVirt using qemu-kvm-ev, is
> there any interest and argument in saying we test and update our docs to
> advise users to use qemu-kvm-ev on CentOS? Are there any CloudStack users
> who want to share their experience with it who may be using it already?
>
> The installation steps don't require configuring any 3rd party repository
> manually and are usually done with:
>
> yum install centos-release-qemu-ev
> yum install qemu-kvm-ev
>
> Additional references:
> https://lists.centos.org/pipermail/centos-virt/2015-October/004717.html
> (what is qemu-kvm vs qemu-kvm-ev)
> https://wiki.centos.org/SpecialInterestGroup/Virtualization (the SIG that
> is behind the qemu-kvm-ev repository)
>
> Thanks and regards.
>
> rohit.ya...@shapeblue.com
> www.shapeblue.com
> 3 London Bridge Street,  3rd floor, News Building, London  SE1 9SGUK
> @shapeblue


Re: [VOTE] Apache CloudStack 4.14.0.0 RC3

2020-05-20 Thread Marcus
 recovery, we've managed to recover vCenter and Cloudstack
> > > > after reboots of the vCenter machine and the Cloudstack management
> > > > service. There's no exact points to recover for now, but restart seems
> > > > to work. By graceful failure I mean, cloudstack erroring out the
> > > > deployment and VM finished in ERROR state, meanwhile connection and
> > > > operability with vCenter cluster remains the same.
> > > >
> > > > We're currently exploring options to fix this, one could be to disable
> > > > the feature for VMWare and work to introduce more sustainable fix in
> > > > next release. Other is to look for more guarding code when installing a
> > > > template, since VMware doesn’t actually allow you install that
> > > > particular template but cloudstack does. We'll keep you posted.
> > > >
> > > > Thanks,
> > > > Bobby.
> > > >
> > > > On 18.05.20, 23:01, "Marcus"  wrote:
> > > >
> > > > The issue sounds severe enough that a release note probably won't
> > > > suffice - unless there's a documented way to recover we'd never want to
> > > > leave a system susceptible to being unrecoverable, even if it's rarely
> > > > triggered.
> > > >
> > > > What's involved in "failing gracefully"? Is this a small fix, or an
> > > > overhaul?  Perhaps the new feature could be disabled for VMware, or
> > > > disabled altogether until a fix is made in a patch release.
> > > >
> > > > Does it only affect new templates, or is there a risk that an existing
> > > > template out in vSphere could suddenly cause problems?
> > > >
> > > > On Mon, May 18, 2020 at 12:49 AM Boris Stoyanov <
> > > > boris.stoya...@shapeblue.com> wrote:
> > > >
> > > > > Hi guys,
> > > > >
> > > > > A little further info on this, it appears when we use a corrupted
> > > > > template and UEFI/Legacy mode when deploy a VM, it breaks the
> > > > > connection between cloudstack and vCenter.
> > > > >
> > > > > All hosts become unreachable and basically the cluster is not
> > > > > functional, have not investigated a way to recover this but seems like
> > > > > a huge mess..
> > > > > Please note that user is not able to register such template in vCenter
> > > > > directly, but cloudstack allows using it.
> > > > >
> > > > > Open to discuss if we'll fix this, since it's expected users to use
> > > > > working templates, I think we should be failing gracefully and such
> > > > > action should not be able to create downtime on such a large scale.
> > > > >
> > > > > I believe the boot type feature is new one and it's not available in
> > > > > older releases, so this issue should be limited to 4.14/current master.
> > > > >
> > > > > Thanks,
> > > > > Bobby.
> > > > >
> > > > > On 15.05.20, 17:07, "Boris Stoyanov" <boris.stoya...@shapeblue.com>
> > > > > wrote:
> > > > >
> > > > > I'll have to -1 RC3, we've discovered details about an issue which is
> > > > > causing severe consequences with a particular hypervisor in the
> > > > > afternoon. We'll need more time to investigate before disclosing.
> > > > >
> > > > > Bobby.
> > > > >
> > > > > On 15.05.20, 9:12, "Boris Stoyanov" <boris.stoya...@shapeblue.com>
> > > > > wrote:
> > > > >
> > > > > +1 (binding)
> > > > >
> > > > > I've executed upgrade tests with the following configurations:
> > > > >
> > > > > 4.13.1 with KVM on CentOS7 hosts
> > > > > 4.13 with VMware6.5 hosts
> > > > > 4.11.3 with KVM on CentOS7 hosts
> > > > > 4.11.2 with XenServer7 hosts
> > > > 

Re: [VOTE] Apache CloudStack 4.14.0.0 RC3

2020-05-18 Thread Marcus
The issue sounds severe enough that a release note probably won't suffice -
unless there's a documented way to recover we'd never want to leave a
system susceptible to being unrecoverable, even if it's rarely triggered.

What's involved in "failing gracefully"? Is this a small fix, or an
overhaul?  Perhaps the new feature could be disabled for VMware, or
disabled altogether until a fix is made in a patch release.

Does it only affect new templates, or is there a risk that an existing
template out in vSphere could suddenly cause problems?

On Mon, May 18, 2020 at 12:49 AM Boris Stoyanov <
boris.stoya...@shapeblue.com> wrote:

> Hi guys,
>
> A little further info on this, it appears when we use a corrupted template
> and UEFI/Legacy mode when deploy a VM, it breaks the connection between
> cloudstack and vCenter.
>
> All hosts become unreachable and basically the cluster is not functional,
> have not investigated a way to recover this but seems like a huge mess..
> Please note that user is not able to register such template in vCenter
> directly, but cloudstack allows using it.
>
> Open to discuss if we'll fix this, since it's expected users to use
> working templates, I think we should be failing gracefully and such action
> should not be able to create downtime on such a large scale.
>
> I believe the boot type feature is new one and it's not available in older
> releases, so this issue should be limited to 4.14/current master.
>
> Thanks,
> Bobby.
>
> On 15.05.20, 17:07, "Boris Stoyanov" 
> wrote:
>
> I'll have to -1 RC3, we've discovered details about an issue which is
> causing severe consequences with a particular hypervisor in the afternoon.
> We'll need more time to investigate before disclosing.
>
> Bobby.
>
> On 15.05.20, 9:12, "Boris Stoyanov" 
> wrote:
>
> +1 (binding)
>
> I've executed upgrade tests with the following configurations:
>
> 4.13.1 with KVM on CentOS7 hosts
> 4.13 with VMware6.5 hosts
> 4.11.3 with KVM on CentOS7 hosts
> 4.11.2 with XenServer7 hosts
> 4.11.1 with VMware 6.7
> 4.9.3 with XenServer 7 hosts
> 4.9.2 with KVM on CentOS 7 hosts
>
> Also I've run basic lifecycle operations on the following
> components:
> VMs
> Volumes
> Infra (zones, pod, clusters, hosts)
> Networks
> and more
>
> I did not come across any problems during this testing.
>
> Thanks,
> Bobby.
>
>
> On 11.05.20, 18:21, "Andrija Panic" 
> wrote:
>
> Hi All,
>
> I've created a 4.14.0.0 release (RC3), with the following
> artefacts up for
> testing and a vote:
>
> Git Branch and Commit SH:
>
> https://gitbox.apache.org/repos/asf?p=cloudstack.git;a=shortlog;h=refs/heads/4.14.0.0-RC20200511T1503
> Commit: 6f96b3b2b391a9b7d085f76bcafa3989d9832b4e
>
> Source release (checksums and signatures are available at the
> same
> location):
> https://dist.apache.org/repos/dist/dev/cloudstack/4.14.0.0/
>
> PGP release keys (signed using 3DC01AE8):
> https://dist.apache.org/repos/dist/release/cloudstack/KEYS
>
> The vote will be open until 14th May 2020, 17.00 CET (72h).
>
> For sanity in tallying the vote, can PMC members please be
> sure to indicate
> "(binding)" with their vote?
>
> [ ] +1 approve
> [ ] +0 no opinion
> [ ] -1 disapprove (and reason why)
>
> Additional information:
>
> For users' convenience, I've built packages from
> 6f96b3b2b391a9b7d085f76bcafa3989d9832b4e and published RC3
> repository here:
> http://packages.shapeblue.com/testing/41400rc3/  (CentOS 7 and
> Debian/generic, both with noredist support)
> and here
>
> https://download.cloudstack.org/testing/4.14.0.0-RC20200506T2028/ubuntu/bionic/
>  (Ubuntu 18.04 specific, no noredist support - thanks to
> Gabriel):
>
> The release notes are still work-in-progress, but for the
> upgrade
> instructions (including the new systemVM templates) you may
> refer to the
> following URL:
>
> https://acs-www.shapeblue.com/docs/WIP-PROOFING/pr112/upgrading/index.html
>
> 4.14.0.0 systemVM templates are available from here:
> http://download.cloudstack.org/systemvm/4.14/
>
> NOTES on issues fixed in this RC3 release:
>
> (this one does *NOT* require a full retest if you were testing
> RC1/RC2
> already - just if you were affected this issue):
> - https://github.com/apache/cloudstack/pull/4064 - affects
> hostnames when
> attaching a VM to additional networks
>
> Regards,
>
>
> Andrija Panić
>
>
>
>
>
> boris.stoya...@shapeblue.com
> www.shapeblue.com
> 3 London Bridge Street,  3rd 

Re: [DISCUSS] Use of Github packages for hosting version specific maven packages?

2019-11-19 Thread Marcus
I've played with this a little, and it shows promise. I did run into two
issues:

1) My maven build began downloading packages and stopped at 'cloud-engine'.
Artifact seems to be missing - "Failed to read artifact descriptor for
org.apache.cloudstack:cloud-engine-api:jar:4.13.0.0: Could not find
artifact org.apache.cloudstack:cloud-engine:pom:4.13.0.0"

2) Github Packages seems to require authentication. While this isn't the
end of the world, it complicates setup slightly. I'm not sure if this is a
github setting or if it's just not possible to have public artifacts with
Github.

On Mon, Nov 18, 2019 at 10:24 AM Marcus  wrote:

> I'll try to find time to see if I can point my plugins archetype generator
> at that. It would be extremely useful and simplify building plugin packages.
>
> On Fri, Nov 15, 2019 at 5:10 AM Rohit Yadav 
> wrote:
>
>> All,
>>
>> This has come up a few times in the past when someone wants to
>> build/extend CloudStack and for that they would need to extract and use
>> version specific jars from deb/rpm packages for their apps. I was
>> experimenting with the new Github packages feature against the recent
>> 4.13.0.0 release and could easily publish the jar artifacts here:
>> https://github.com/apache/cloudstack/packages
>>
>> Thoughts if that's a good way to proceed or find any suitable way of
>> version specific jar artifact/packages publication and hosting?
>>
>>
>> Regards,
>>
>> Rohit Yadav
>>
>> Software Architect, ShapeBlue
>>
>> https://www.shapeblue.com
>>
>> rohit.ya...@shapeblue.com
>> www.shapeblue.com
>> Amadeus House, Floral Street, London  WC2E 9DPUK
>> @shapeblue
>>
>>
>>
>>


Re: [DISCUSS] Use of Github packages for hosting version specific maven packages?

2019-11-18 Thread Marcus
I'll try to find time to see if I can point my plugins archetype generator
at that. It would be extremely useful and simplify building plugin packages.

On Fri, Nov 15, 2019 at 5:10 AM Rohit Yadav 
wrote:

> All,
>
> This has come up a few times in the past when someone wants to
> build/extend CloudStack and for that they would need to extract and use
> version specific jars from deb/rpm packages for their apps. I was
> experimenting with the new Github packages feature against the recent
> 4.13.0.0 release and could easily publish the jar artifacts here:
> https://github.com/apache/cloudstack/packages
>
> Thoughts if that's a good way to proceed or find any suitable way of
> version specific jar artifact/packages publication and hosting?
>
>
> Regards,
>
> Rohit Yadav
>
> Software Architect, ShapeBlue
>
> https://www.shapeblue.com
>
> rohit.ya...@shapeblue.com
> www.shapeblue.com
> Amadeus House, Floral Street, London  WC2E 9DPUK
> @shapeblue
>
>
>
>


Re: Hackathon @apachecon

2019-09-07 Thread Marcus
I think some direction may come out of what we see at the conference.

UX - UI, API, CLI
KVM agent communication model

On Friday, September 6, 2019, Paul Angus  wrote:

> Hi All,
>
> We have a large room for the day on Wednesday for a hackathon.  I think it
> might be a good idea if we marshal some thoughts around what we'd like to
> do with the time.
> I guess that we'll end up with some splinter groups who want to work on
> something very specific together as well, I can't see that being a problem.
>
> Some thing that I'd like to put out there as a discussion is the
> networking models - there has been talk of rationalising and dropping
> 'basic' networks as a separate model and using advanced networks with
> security groups instead.  Also the combining of the VR and VPC code to make
> an isolated network a single tier VPC.   I'd like to have a group
> discussion around what everyone would like to do and how we might do it.
>
> Are there any other topics that people think that would benefit from a
> group discussion ?
>
>
> Cheers
>
>
> Paul Angus
>
>
> paul.an...@shapeblue.com
> www.shapeblue.com
> Amadeus House, Floral Street, London  WC2E 9DPUK
> @shapeblue
>
>
>
>


Re: Runing cloudstack failed in docker

2018-11-06 Thread Marcus
Caused by: java.net.SocketException: Protocol family unavailable

This is a common Java issue when containerizing; I believe you can search
for it and find a generic answer. It basically has to do with whether Java
is trying to use IPv4 or IPv6 and what your Docker setup supports.
If I recall, there is a 'bind.interface' field in
/etc/cloudstack/management/server.properties
that can determine whether it binds to IPv4 or IPv6.
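
A quick way to see which address families the container actually supports is to try binding a socket of each kind. This is a minimal sketch, run from inside the container; the helper name can_bind and the use of the loopback addresses are assumptions for illustration. If the IPv6 bind fails while IPv4 works, a commonly suggested workaround for this exception is starting the JVM with -Djava.net.preferIPv4Stack=true; where that option belongs for cloudstack-management depends on the packaging and is not covered here.

```python
import socket


def can_bind(family: int, address: str) -> bool:
    """Return True if a TCP socket of this family can bind to address on a free port."""
    try:
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            # Port 0 asks the OS for any free port.
            sock.bind((address, 0))
            return True
    except OSError:
        # Raised when the family is unsupported or the address is unavailable.
        return False


if __name__ == "__main__":
    print("IPv4 loopback bind:", can_bind(socket.AF_INET, "127.0.0.1"))
    print("IPv6 loopback bind:", can_bind(socket.AF_INET6, "::1"))
```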

On Mon, Nov 5, 2018 at 2:37 AM li li  wrote:

> Hi ALL
>
>
> I'm trying to encapsulate cloudstack 4.11 into docker. After build is
> successful, cloudstack-management cannot function properly.
>
> Can someone help me? Thank you very much.
>
>
> Dockerfile:
>
>
> https://github.com/apache/cloudstack/blob/4.11/tools/docker/Dockerfile.centos6
>
>
> from cloudstack-management.err Error:
>
> 05/11/2018 09:07:39 128 jsvc.exec error: Cannot start daemon
> 05/11/2018 09:07:39 126 jsvc.exec error: Service exit with a return value
> of 5
> java.lang.reflect.InvocationTargetException
>at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>at java.lang.reflect.Method.invoke(Method.java:498)
>at
> org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:241)
> Caused by: java.net.SocketException: Protocol family unavailable
>at sun.nio.ch.Net.bind0(Native Method)
>at sun.nio.ch.Net.bind(Net.java:433)
>at sun.nio.ch.Net.bind(Net.java:425)
>at sun.nio.ch
> .ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>at
> org.eclipse.jetty.server.ServerConnector.openAcceptChannel(ServerConnector.java:334)
>at
> org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:302)
>at
> org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:80)
>at
> org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:238)
>at
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
>at org.eclipse.jetty.server.Server.doStart(Server.java:397)
>at
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
>at org.apache.cloudstack.ServerDaemon.start(ServerDaemon.java:200)
>... 5 more
> OpenJDK 64-Bit Server VM warning: ignoring option PermSize=512M; support
> was removed in 8.0
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=800m;
> support was removed in 8.0
>
>
> From management-server.log:
>
> 2018-11-05 09:24:14,967 INFO  [o.a.c.s.m.m.i.DefaultModuleDefinitionSet]
> (main:null) (logid:) Starting module [outofbandmanagement]
> 2018-11-05 09:24:14,967 INFO  [o.a.c.s.l.CloudStackExtendedLifeCycle]
> (main:null) (logid:) Starting CloudStack Components
> 2018-11-05 09:24:14,967 INFO  [o.a.c.s.l.CloudStackExtendedLifeCycle]
> (main:null) (logid:) Done Starting CloudStack Components
> 2018-11-05 09:24:14,967 INFO  [o.a.c.s.m.m.i.DefaultModuleDefinitionSet]
> (main:null) (logid:) Starting module [ipmitool]
> 2018-11-05 09:24:14,967 INFO  [o.a.c.s.l.CloudStackExtendedLifeCycle]
> (main:null) (logid:) Starting CloudStack Components
> 2018-11-05 09:24:14,977 DEBUG
> [o.a.c.o.d.i.IpmitoolOutOfBandManagementDriver] (main:null) (logid:)
> OutOfBandManagementDriver ipmitool initialized: ipmitool version 1.8.15
>
> 2018-11-05 09:24:14,977 INFO  [o.a.c.s.l.CloudStackExtendedLifeCycle]
> (main:null) (logid:) Done Starting CloudStack Components
> 2018-11-05 09:24:14,977 INFO  [o.a.c.s.m.m.i.DefaultModuleDefinitionSet]
> (main:null) (logid:) Starting module [nested-cloudstack]
> 2018-11-05 09:24:14,977 INFO  [o.a.c.s.l.CloudStackExtendedLifeCycle]
> (main:null) (logid:) Starting CloudStack Components
> 2018-11-05 09:24:14,977 INFO  [o.a.c.s.l.CloudStackExtendedLifeCycle]
> (main:null) (logid:) Done Starting CloudStack Components
> 2018-11-05 09:24:14,978 INFO  [o.e.j.s.h.C.client] (main:null) (logid:)
> Initializing Spring root WebApplicationContext
> 2018-11-05 09:24:15,043 INFO  [o.a.c.s.l.CloudStackExtendedLifeCycle]
> (main:null) (logid:) Configuring CloudStack Components
> 2018-11-05 09:24:15,043 INFO  [o.a.c.s.l.CloudStackExtendedLifeCycle]
> (main:null) (logid:) Done Configuring CloudStack Components
> 2018-11-05 09:24:15,056 INFO  [c.c.u.LogUtils] (main:null) (logid:) log4j
> configuration found at /etc/cloudstack/management/log4j-cloud.xml
> 2018-11-05 09:24:15,072 INFO  [o.e.j.s.h.ContextHandler] (main:null)
> (logid:) Started o.e.j.w.WebAppContext@6b1274d2
> {/client,file:///usr/share/cloudstack-management/webapp/,AVAILABLE}{/usr/share/cloudstack-management/webapp}
> 2018-11-05 09:24:15,073 INFO  [o.e.j.s.h.ContextHandler] (main:null)
> (logid:) Started 

Re: [RFC] Metrics views for CloudStack UI

2015-11-12 Thread Marcus
Hi Nux,
The thing about GHz is that it is the unit of capacity for CPU: VMs are
allocated to hosts according to the number of "cycles" the host has. As a
customer, I agree, core count is more important. As an admin, if you have a
single host in a cluster that is using much more CPU than the others and you
want to try to balance, the GHz number for the VM can tell you 1) which VMs
on a host are the 'biggest' when cgroup throttling kicks in, that is, how
much of the host CPU share a VM will get, and 2) whether that VM will fit on
another host. The old UI helps you know which hosts don't have capacity
for a migration, but it doesn't tell you how full each host is and doesn't
give you the data to know how full you'll make a host if you migrate.
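
To make the capacity arithmetic concrete, here is a small sketch. It assumes a host's CPU capacity is simply cores times clock speed in MHz and ignores any overprovisioning factor; the class, the function name and all the numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Host:
    cores: int
    mhz_per_core: int
    allocated_mhz: int  # MHz already allocated to VMs on this host

    @property
    def capacity_mhz(self) -> int:
        # Assumption: capacity is cores x clock speed, no overprovisioning.
        return self.cores * self.mhz_per_core


def fill_after_migration(host: Host, vm_cores: int, vm_mhz_per_core: int) -> Tuple[bool, float]:
    """Return (fits, fraction of host capacity allocated after adding the VM)."""
    vm_mhz = vm_cores * vm_mhz_per_core
    new_allocated = host.allocated_mhz + vm_mhz
    return new_allocated <= host.capacity_mhz, new_allocated / host.capacity_mhz


if __name__ == "__main__":
    target = Host(cores=16, mhz_per_core=2000, allocated_mhz=24000)
    fits, fill = fill_after_migration(target, vm_cores=4, vm_mhz_per_core=1000)
    print(f"fits: {fits}, host would be {fill:.1%} allocated")
```

With these made-up numbers the target host has 32000 MHz of capacity, so a 4000 MHz VM fits and leaves the host 87.5% allocated, which is the kind of figure a per-host metrics view makes visible before a migration.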

Many people will want these metrics to go into a time series system and
use something like the graphite publisher instead, as that will give better
visibility into what's going on over time, but this seems like a good
out-of-the-box solution to expose the data we already have buried in the UI.

On Thu, Nov 5, 2015 at 9:35 AM, Nux!  wrote:

> Great work Rohit,
>
> What I'd like to see:
> - vCPU list/count for instance metrics (GHz is meaningless to me)
> - can we make the whole thing wider so we can fit more columns there
> without that ugly horizontal scroll bar? So much wasted screen space
>
> --
> Sent from the Delta quadrant using Borg technology!
>
> Nux!
> www.nux.ro
>
> - Original Message -
> > From: "Rohit Yadav" 
> > To: d...@cloudstack.apache.org
> > Cc: users@cloudstack.apache.org
> > Sent: Thursday, 5 November, 2015 14:09:14
> > Subject: [RFC] Metrics views for CloudStack UI
>
> > Hi all,
> >
> > The present CloudStack UI hides most of the metrics data such as cpu,
> > memory, disk, network usage in inner detail views. Such information is
> > critical to find issues in one’s cloud, for example finding clusters where
> > hosts are failing, or finding storage pools where disk space has depleted
> > beyond configured global or cluster thresholds.
> >
> > The metrics views for CloudStack UI are an attempt to solve those problems.
> > They bring in several UI enhancements such as sortable tables, new status
> > icons, methods to control breadcrumb navigation, making the UI’s global
> > list* API pagesize dynamic, and a new table widget based on the listView
> > widget that is both horizontally and vertically scrollable and supports
> > cell/threshold coloring, collapsible columns, navigation from one view to
> > another, and quick-view actions. For example, the currently supported
> > navigation paths are: Zone to Cluster to Host to Instance to Volumes, and
> > Storage Pool to Volumes.
> >
> > The current version implements six resource views for zone, cluster, host,
> > instance, volume and storage pool (primary storage). The metrics framework
> > (based on the listView widget) would allow developers to write more such
> > views where information can be densely packed.
> >
> > Please check out the FS (with some screenshots) and the PR:
> >
> > FS: https://issues.apache.org/jira/browse/CLOUDSTACK-9020
> > JIRA: https://issues.apache.org/jira/browse/CLOUDSTACK-9020
> > PR: https://github.com/apache/cloudstack/pull/1038
> >
> > Comments and suggestions?
> >
> > Regards,
> > Rohit Yadav
> > Software Architect, ShapeBlue
> >
> >
> > M. +91 88 262 30892 |
> > rohit.ya...@shapeblue.com
> > Blog: bhaisaab.org | Twitter: @_bhaisaab
> > ShapeBlue Ltd, 53 Chandos Place, Covent Garden, London, WC2N 4HS
> >

Re: ACS 4.5.1 mgmt DB HA - not working - invalid load balancing strategy

2015-06-11 Thread Marcus
When I build CloudStack RPMs on 4.5 branch I get mysql ha RPMs. Looking at
the specfile:

%if %{_ossnoss} == noredist
%package mysql-ha
Summary: Apache CloudStack Balancing Strategy for MySQL
Requires: mysql-connector-java
Requires: %{_tomcatversion}
Group: System Environmnet/Libraries
%description mysql-ha
Apache CloudStack Balancing Strategy for MySQL

%endif


Looks like you need to run ./package.sh -p noredist when packaging. I'm not
sure what the equivalent is for .deb packaging. That means you also have to
be set up with the non-OSS dependencies. If you're not familiar with that,
let us know.
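
Before enabling DB HA, it may help to confirm that the MySQL HA plugin jar actually made it into the management server's library directory. The following is only a sketch: the directory and jar naming are taken from the 4.3.2 path quoted below, and it is an assumption that a 4.5.x install uses the same location and naming. `find_mysqlha_jars` is a hypothetical helper, not part of CloudStack.

```python
from pathlib import Path

# Assumption: the lib directory and jar naming follow the 4.3.2 path quoted
# in this thread; adjust both for your own install.
DEFAULT_LIB_DIR = "/usr/share/cloudstack-management/webapps/client/WEB-INF/lib"
JAR_PATTERN = "cloud-plugin-database-mysqlha-*.jar"


def find_mysqlha_jars(lib_dir=DEFAULT_LIB_DIR):
    """Return the sorted names of MySQL HA plugin jars found in lib_dir."""
    directory = Path(lib_dir)
    if not directory.is_dir():
        # Missing directory: report "nothing found" instead of raising.
        return []
    return sorted(p.name for p in directory.glob(JAR_PATTERN))


if __name__ == "__main__":
    jars = find_mysqlha_jars()
    if jars:
        print("MySQL HA plugin present:", ", ".join(jars))
    else:
        print("No MySQL HA plugin jar found; the packages may have been "
              "built without -p noredist")
```

An empty result on the management server would be consistent with the "Invalid load balancing strategy" error, since the strategy class could not be loaded.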

On Thu, Jun 11, 2015 at 7:06 AM, Andrija Panic andrija.pa...@gmail.com
wrote:

 Actually, on another ACS installation, this file exists:

 /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-plugin-database-mysqlha-4.3.2.jar
 (ACS 4.3.2 :) )

 But on this 4.5.1 installation there is no such file.

 Is it possible that we didn't compile 4.5.1 the appropriate way?

 Thanks,
 Andrija

 On 11 June 2015 at 15:14, Andrija Panic andrija.pa...@gmail.com wrote:

  Hi,
 
  I'm trying the DB HA setup by changing 3 lines in the db.properties file
  (enable HA, define slaves for the cloud DB, define slaves for the usage DB).

  After a restart, the mgmt server doesn't start and fails with the following
  error; it seems the load balancing strategy variable is invalid...

  Any clues on this?

  The MySQL setup is a 3-node Galera cluster, with the 1st node being used as
  master in db.properties, and only the second node being used as slave
  (galera2 node). I confirmed I can telnet, login etc... also mysql login from
  mgmt to these master/slaves works of course...
 
 
 
  2015-06-11 14:25:03,149 INFO  [c.c.u.d.T.Transaction] (main:null) Is Data Base High Availiability enabled? Ans : true
  2015-06-11 14:25:03,191 INFO  [c.c.u.d.T.Transaction] (main:null) The slaves configured for Cloud Data base is/are : 10.20.10.6
  2015-06-11 14:25:03,269 ERROR [c.c.u.d.Merovingian2] (main:null) Unable to get a new db connection
  java.sql.SQLException: Invalid load balancing strategy 'com.cloud.utils.db.StaticStrategy'.
      at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:927)
      at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:924)
      at com.mysql.jdbc.Util.loadExtensions(Util.java:602)
      at com.mysql.jdbc.LoadBalancingConnectionProxy.init(LoadBalancingConnectionProxy.java:285)
      at com.mysql.jdbc.FailoverConnectionProxy.init(FailoverConnectionProxy.java:67)
      at com.mysql.jdbc.NonRegisteringDriver.connectFailover(NonRegisteringDriver.java:430)
      at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:343)
      at java.sql.DriverManager.getConnection(DriverManager.java:571)
      at java.sql.DriverManager.getConnection(DriverManager.java:215)
      at org.apache.commons.dbcp.DriverManagerConnectionFactory.createConnection(DriverManagerConnectionFactory.java:75)
      at org.apache.commons.dbcp.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:582)
      at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1188)
      at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
      at com.cloud.utils.db.TransactionLegacy.getStandaloneConnectionWithException(TransactionLegacy.java:203)
      at com.cloud.utils.db.Merovingian2.init(Merovingian2.java:68)
      at com.cloud.utils.db.Merovingian2.createLockMaster(Merovingian2.java:80)
      at com.cloud.server.LockMasterListener.init(LockMasterListener.java:33)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
      at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
      at org.springframework.beans.BeanUtils.instantiateClass(BeanUtils.java:148)
      at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:121)
      at org.springframework.beans.factory.support.ConstructorResolver.autowireConstructor(ConstructorResolver.java:280)
      at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.autowireConstructor(AbstractAutowireCapableBeanFactory.java:1045)
      at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:949)
      at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:487)
      at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:458)
  

Re: Problem Upload Windows volume to ACS 4.5.1

2015-06-09 Thread Marcus
... and this is the relevant portion of the logs indicating the failure
reason:

Template
content is unsupported, or mismatch between selected format and template
content. Found  : x86 boot sector; partition 1

On Tue, Jun 9, 2015 at 4:53 PM, Marcus shadow...@gmail.com wrote:

 Looks like it is seeing a raw disk, when you specified it was QCOW2.
 That's what the 'file' command is doing. We used to just trust the name of
 the file, but this was enhanced to inspect the first 1MB of the download
 and validate that you are supplying the image format that CS expects.

 Because the first 1MB shows that this image has a boot sector, I'm
 assuming it is a raw disk image, instead of a qcow2. Note that CloudStack
 4.5 should support raw format for KVM images, but you have to specify it on
 the API.

 [root@devcloud-kvm7 tmp]# qemu-img create -f qcow2 img.qcow2 1G

 Formatting 'img.qcow2', fmt=qcow2 size=1073741824 encryption=off
 cluster_size=65536 lazy_refcounts=off


 [root@devcloud-kvm7 tmp]# qemu-img create -f raw img.raw 1G

 Formatting 'img.raw', fmt=raw size=1073741824


 [root@devcloud-kvm7 tmp]# file img.qcow2

 img.qcow2: QEMU QCOW Image (v3), 1073741824 bytes


 [root@devcloud-kvm7 tmp]# parted img.raw mklabel

 New disk label type? msdos


 [root@devcloud-kvm7 tmp]# file img.raw

 img.raw: x86 boot sector, code offset 0xb8

 On Tue, Jun 9, 2015 at 3:01 AM, Jochim, Ingo ingo.joc...@bautzen-it.de
 wrote:

 Hi Andrija,

 Have you tried converting it to RAW (qemu-img convert) and uploading it
 then? Ceph uses RAW devices. The conversion takes place on storage
 migration, but does it also happen on upload?

 Regards,
 Ingo

 - Original Message -
 From: Andrija Panic [mailto:andrija.pa...@gmail.com]
 Sent: Tuesday, 9 June 2015 10:24
 To: d...@cloudstack.apache.org; users@cloudstack.apache.org
 Subject: Problem Upload Windows volume to ACS 4.5.1

 Hi guys,

 we are trying to move some volumes from one ACS installation to another
 (from 4.3.2 to 4.5.1).

 Since we are using Ceph, and volume extract/download doesn't work at the
 moment, we use a workaround: we snapshot the Windows DATA volume, convert
 the snapshot to a template, and then extract the URL / download it.

 Then we use this URL to Upload Volume to ACS 4.5.1, but it fails almost
 immediately with an error inside the SSVM (nothing useful in the management
 log), and the volume is deleted from ACS.

 I see the disk is being inspected with the file command... any thoughts on
 why this is failing?

 This is the log for the source Windows DATA disk, btw:

 2015-06-09 07:58:47,811 DEBUG [cloud.agent.Agent]
 (agentRequest-Handler-10:null) Request:Seq 81-4133460032995983410:  { Cmd
 ,
 MgmtId: 90520741174948, via: 81, Ver: v1, Flags: 100011,
 [{org.apache.cloudstack.storage.command.DownloadCommand:{hvm:false,maxDownloadSizeInBytes:5497558138880,id:468,resourceType:VOLUME,installPath:volumes/2/468,_store:{com.cloud.agent.api.to.NfsTO:{_url:nfs://
 10.13.2.1/data/tank/secondary,_role:Image}},url:
 http://46.232.180.244/userdata/6f2280e7-86d6-4fc7-abe8-e2bbfdeed442.qcow2
 ,format:QCOW2,accountId:2,name:andrija2,wait:0}}]
 }
 2015-06-09 07:58:47,814 DEBUG [cloud.agent.Agent]
 (agentRequest-Handler-10:null) Processing command:
 org.apache.cloudstack.storage.command.DownloadCommand
 2015-06-09 07:58:47,815 INFO
  [storage.resource.NfsSecondaryStorageResource]
 (agentRequest-Handler-10:null) Determined host 10.13.2.1 corresponds to IP
 10.13.2.1
 2015-06-09 07:58:47,873 INFO  [storage.template.HttpTemplateDownloader]
 (agentRequest-Handler-10:null) No credentials configured for host=
 46.232.180.244:80
 2015-06-09 07:58:47,895 INFO  [storage.template.HttpTemplateDownloader]
 (pool-1-thread-3:null) Starting download from
 http://46.232.180.244/userdata/6f2280e7-86d6-4fc7-abe8-e2bbfdeed442.qcow2
 to

 /mnt/SecStorage/dd40674b-575d-3bd2-87b3-7f1e1db4c02e/volumes/2/468/dnld855093017660470tmp_
 remoteSize=21474836480 , max size=5497558138880
 2015-06-09 07:58:47,909 DEBUG [utils.script.Script] (pool-1-thread-3:null)
 Executing: /bin/bash -c file

 /mnt/SecStorage/dd40674b-575d-3bd2-87b3-7f1e1db4c02e/volumes/2/468/dnld855093017660470tmp_
 | cut -d: -f2
 2015-06-09 07:58:47,936 DEBUG [utils.script.Script]
 (pool-1-thread-3:null) Execution is successful.
 2015-06-09 07:58:47,941 INFO  [storage.template.DownloadManagerImpl]
 (pool-1-thread-3:null) Download Completion for jobId:
 c95b3d60-f904-4ecc-ab9f-7ed1c36eba20, status=UNRECOVERABLE_ERROR
 2015-06-09 07:58:47,942 INFO  [storage.template.DownloadManagerImpl]
 (pool-1-thread-3:null) local:

 /mnt/SecStorage/dd40674b-575d-3bd2-87b3-7f1e1db4c02e/volumes/2/468/dnld855093017660470tmp_,
 bytes=1053854, error=Template content is unsupported, or mismatch between
 selected format and template content. Found  : x86 boot sector; partition
 1, pct=0
 2015-06-09 07:58:50,876 DEBUG [cloud.agent.Agent]
 (agentRequest-Handler-10:null) Seq 81-4133460032995983410:  { Ans: ,
 MgmtId: 90520741174948, via: 81, Ver: v1, Flags: 10,
 [{com.cloud.agent.api.storage.DownloadAnswer:{jobId:c95b3d60

Re: Problem Upload Windows volume to ACS 4.5.1

2015-06-09 Thread Marcus
Looks like it is seeing a raw disk, when you specified it was QCOW2. That's
what the 'file' command is doing. We used to just trust the name of the
file, but this was enhanced to inspect the first 1MB of the download and
validate that you are supplying the image format that CS expects.

Because the first 1MB shows that this image has a boot sector, I'm assuming
it is a raw disk image, instead of a qcow2. Note that CloudStack 4.5 should
support raw format for KVM images, but you have to specify it on the API.

[root@devcloud-kvm7 tmp]# qemu-img create -f qcow2 img.qcow2 1G

Formatting 'img.qcow2', fmt=qcow2 size=1073741824 encryption=off
cluster_size=65536 lazy_refcounts=off


[root@devcloud-kvm7 tmp]# qemu-img create -f raw img.raw 1G

Formatting 'img.raw', fmt=raw size=1073741824


[root@devcloud-kvm7 tmp]# file img.qcow2

img.qcow2: QEMU QCOW Image (v3), 1073741824 bytes


[root@devcloud-kvm7 tmp]# parted img.raw mklabel

New disk label type? msdos


[root@devcloud-kvm7 tmp]# file img.raw

img.raw: x86 boot sector, code offset 0xb8
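
The distinction shown by the `file` output above can be approximated by looking at a few well-known bytes. This is a rough sketch only, not how CloudStack does it (per the logs, the SSVM runs the `file` command): a qcow2 image starts with the magic bytes `QFI\xfb`, while a raw disk with an MBR carries the boot signature `0x55 0xAA` at byte offset 510. The function names are made up for illustration.

```python
QCOW2_MAGIC = b"QFI\xfb"      # first four bytes of a qcow2 image
MBR_SIGNATURE = b"\x55\xaa"   # boot signature at byte offset 510 of an MBR


def sniff_image_format(header: bytes) -> str:
    """Guess the image format from the first bytes of a file."""
    if header[:4] == QCOW2_MAGIC:
        return "qcow2"
    if len(header) >= 512 and header[510:512] == MBR_SIGNATURE:
        # A boot sector at offset 0 means we are reading raw disk contents.
        return "raw (boot sector)"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first sector of a file and classify it."""
    with open(path, "rb") as f:
        return sniff_image_format(f.read(512))


if __name__ == "__main__":
    import sys
    for name in sys.argv[1:]:
        print(name, "->", sniff_file(name))
```

Running it against the downloaded file would show whether the content is really qcow2 before registering it with format QCOW2.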

On Tue, Jun 9, 2015 at 3:01 AM, Jochim, Ingo ingo.joc...@bautzen-it.de
wrote:

 Hi Andrija,

 Have you tried converting it to RAW (qemu-img convert) and uploading it
 then? Ceph uses RAW devices. The conversion takes place on storage
 migration, but does it also happen on upload?

 Regards,
 Ingo

 - Original Message -
 From: Andrija Panic [mailto:andrija.pa...@gmail.com]
 Sent: Tuesday, 9 June 2015 10:24
 To: d...@cloudstack.apache.org; users@cloudstack.apache.org
 Subject: Problem Upload Windows volume to ACS 4.5.1

 Hi guys,

 we are trying to move some volumes from one ACS installation to another
 (from 4.3.2 to 4.5.1).

 Since we are using Ceph, and volume extract/download doesn't work at the
 moment, we use a workaround: we snapshot the Windows DATA volume, convert
 the snapshot to a template, and then extract the URL / download it.

 Then we use this URL to Upload Volume to ACS 4.5.1, but it fails almost
 immediately with an error inside the SSVM (nothing useful in the management
 log), and the volume is deleted from ACS.

 I see the disk is being inspected with the file command... any thoughts on
 why this is failing?

 This is the log for the source Windows DATA disk, btw:

 2015-06-09 07:58:47,811 DEBUG [cloud.agent.Agent]
 (agentRequest-Handler-10:null) Request:Seq 81-4133460032995983410:  { Cmd ,
 MgmtId: 90520741174948, via: 81, Ver: v1, Flags: 100011,
 [{org.apache.cloudstack.storage.command.DownloadCommand:{hvm:false,maxDownloadSizeInBytes:5497558138880,id:468,resourceType:VOLUME,installPath:volumes/2/468,_store:{com.cloud.agent.api.to.NfsTO:{_url:nfs://
 10.13.2.1/data/tank/secondary,_role:Image}},url:
 http://46.232.180.244/userdata/6f2280e7-86d6-4fc7-abe8-e2bbfdeed442.qcow2
 ,format:QCOW2,accountId:2,name:andrija2,wait:0}}]
 }
 2015-06-09 07:58:47,814 DEBUG [cloud.agent.Agent]
 (agentRequest-Handler-10:null) Processing command:
 org.apache.cloudstack.storage.command.DownloadCommand
 2015-06-09 07:58:47,815 INFO
  [storage.resource.NfsSecondaryStorageResource]
 (agentRequest-Handler-10:null) Determined host 10.13.2.1 corresponds to IP
 10.13.2.1
 2015-06-09 07:58:47,873 INFO  [storage.template.HttpTemplateDownloader]
 (agentRequest-Handler-10:null) No credentials configured for host=
 46.232.180.244:80
 2015-06-09 07:58:47,895 INFO  [storage.template.HttpTemplateDownloader]
 (pool-1-thread-3:null) Starting download from
 http://46.232.180.244/userdata/6f2280e7-86d6-4fc7-abe8-e2bbfdeed442.qcow2
 to

 /mnt/SecStorage/dd40674b-575d-3bd2-87b3-7f1e1db4c02e/volumes/2/468/dnld855093017660470tmp_
 remoteSize=21474836480 , max size=5497558138880
 2015-06-09 07:58:47,909 DEBUG [utils.script.Script] (pool-1-thread-3:null)
 Executing: /bin/bash -c file

 /mnt/SecStorage/dd40674b-575d-3bd2-87b3-7f1e1db4c02e/volumes/2/468/dnld855093017660470tmp_
 | cut -d: -f2
 2015-06-09 07:58:47,936 DEBUG [utils.script.Script] (pool-1-thread-3:null)
 Execution is successful.
 2015-06-09 07:58:47,941 INFO  [storage.template.DownloadManagerImpl]
 (pool-1-thread-3:null) Download Completion for jobId:
 c95b3d60-f904-4ecc-ab9f-7ed1c36eba20, status=UNRECOVERABLE_ERROR
 2015-06-09 07:58:47,942 INFO  [storage.template.DownloadManagerImpl]
 (pool-1-thread-3:null) local:

 /mnt/SecStorage/dd40674b-575d-3bd2-87b3-7f1e1db4c02e/volumes/2/468/dnld855093017660470tmp_,
 bytes=1053854, error=Template content is unsupported, or mismatch between
 selected format and template content. Found  : x86 boot sector; partition
 1, pct=0
 2015-06-09 07:58:50,876 DEBUG [cloud.agent.Agent]
 (agentRequest-Handler-10:null) Seq 81-4133460032995983410:  { Ans: ,
 MgmtId: 90520741174948, via: 81, Ver: v1, Flags: 10,
 [{com.cloud.agent.api.storage.DownloadAnswer:{jobId:c95b3d60-f904-4ecc-ab9f-7ed1c36eba20,downloadPct:0,errorString:Template
 content is unsupported, or mismatch between selected format and template
 content. Found  : x86 boot sector; partition
 

Re: ACS 4.5.1 KVM live migration problem

2015-05-15 Thread Marcus
Hmmm, this seems like an unrelated issue, though the culprits are the same
fields.  It has me wondering if there's a bug in the vm sync or network
persistence. It would be interesting to know:

1) If the null values are somehow reproducible

2) If stopping a VM with null values is possible

3) If starting a VM with null values fixes them

Are the networks these belong to marked as persistent? Network ids can be
dynamic in certain situations: if a network is not used it gives back its
vlan id, then gets a new one when you spin up VMs again. This means these
fields on the nic also need to be updated to reflect that, and I'm
wondering if there's some issue there.
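
To see how widespread the NULL values are, one could list the NICs of running user VMs whose URI fields are empty. The sketch below reuses the query quoted further down in this thread and adds a NULL filter. It runs against an in-memory SQLite database purely for illustration: the real data lives in the MySQL `cloud` database, the two tables are reduced to the columns the query touches, and the VM names are invented demo data.

```python
import sqlite3

# Query from the thread, narrowed to rows where either URI is NULL.
NULL_URI_QUERY = """
SELECT instance_id, isolation_uri, broadcast_uri
FROM nics
WHERE (isolation_uri IS NULL OR broadcast_uri IS NULL)
  AND instance_id IN (
      SELECT id FROM vm_instance
      WHERE state = 'Running'
        AND name NOT LIKE 'r-%'
        AND name NOT LIKE 'v-%'
        AND name NOT LIKE 's-%')
ORDER BY instance_id
"""


def find_null_uri_nics(conn):
    """Return (instance_id, isolation_uri, broadcast_uri) rows with NULL URIs."""
    return conn.execute(NULL_URI_QUERY).fetchall()


def make_demo_db():
    """Build a tiny stand-in for the two tables (hypothetical demo rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE vm_instance (id INTEGER PRIMARY KEY, name TEXT, state TEXT);
        CREATE TABLE nics (instance_id INTEGER, isolation_uri TEXT,
                           broadcast_uri TEXT);
    """)
    conn.executemany("INSERT INTO vm_instance VALUES (?, ?, ?)", [
        (564, "i-2-564-VM", "Running"),
        (664, "i-2-664-VM", "Running"),
        (700, "r-700-VM", "Running"),    # router, excluded by name
        (701, "i-2-701-VM", "Stopped"),  # not running, excluded by state
    ])
    conn.executemany("INSERT INTO nics VALUES (?, ?, ?)", [
        (564, "vlan://96", "vlan://96"),
        (664, None, None),
        (700, None, None),
        (701, None, None),
    ])
    return conn


if __name__ == "__main__":
    for row in find_null_uri_nics(make_demo_db()):
        print(row)
```

With the demo rows, only instance 664 is reported, which mirrors the VM that failed to migrate in the thread.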

On Fri, May 15, 2015 at 6:01 AM, Andrija Panic andrija.pa...@gmail.com
wrote:

 Ok, but since they are guest NICs, it confuses me - this is an advanced zone
 with VLANs, right? Then my understanding is that all NICs (of a user VM) need
 to have some isolation method...

 Anyway - I'm running an advanced zone + VLANs, and all VMs (VMs behind a VPC
 and VMs on the internet/public network - but that's still a Guest network)
 have some vlan://x value.

 For VR, SSVM, CPVM - there are NICs on the ACS public network that don't use
 a VLAN - they have vlan://untagged, and NULL is only used for LinkLocal
 (169.x) NICs, and for the mgmt/sec-storage NIC for SSVM/CPVM in my case.



 On 15 May 2015 at 13:47, Andrei Mikhailovsky and...@arhont.com wrote:

  Andrija,
 
  I've run the command and it showed me a bunch of running VMs with NULLs. I
  would roughly say about 20% of my total running VMs have NULL under the
  isolation and broadcast URIs.

  All of these VMs are working perfectly well (in terms of network
  connectivity) and there is nothing special about them. They all have at
  least one guest NIC.
 
  Andrei
  - Original Message -
 
  From: Andrija Panic andrija.pa...@gmail.com
  To: d...@cloudstack.apache.org
  Cc: users@cloudstack.apache.org
  Sent: Friday, 15 May, 2015 12:34:24 PM
  Subject: Re: ACS 4.5.1 KVM live migration problem
 
  Andrei,
 
  select instance_id,isolation_uri,broadcast_uri from nics where instance_id
  in (select id from vm_instance where state='Running' and name not like
  'r-%' and name not like 'v-%' and name not like 's-%') order by
  instance_id;

  This gives me every NIC that does not belong to a router, SSVM or CPVM. I
  always have vlan values - since these are all Guest NICs, they must have a
  vlan ID...
  NULL values are only present when a VM is deleted/stopped in my case...

  Can you check your VM 664 - what is so specific about it?
  All NICs (in my understanding, if this is an advanced zone) must have some
  vlan, and cannot be NULL or untagged?
 
  On 15 May 2015 at 12:58, Andrei Mikhailovsky and...@arhont.com wrote:
 
  
  
   Hi Andrija, Marcus,
  
   Thanks for your comments and suggestions. I've checked the cloud.nics
  table
  
   mysql> select instance_id,isolation_uri,broadcast_uri from nics where
   instance_id=564 or instance_id=664 or instance_id=;
   +-------------+---------------+---------------+
   | instance_id | isolation_uri | broadcast_uri |
   +-------------+---------------+---------------+
   | 564         | vlan://96     | vlan://96     |
   | 664         | NULL          | NULL          |
   |             | vlan://1127   | vlan://1127   |
   +-------------+---------------+---------------+
  
  
   From my tests, instance_ids 564 and  are migrating correctly, but
   instance 664 is not, and it shows an NPE similar to the one I've given.


   Is this what is causing the migration issues? If so, should I change all
   isolation_uri and broadcast_uri values to the corresponding network vlan
   ids?
  
   Thanks
  
   Andrei
  
   - Original Message -
  
   From: Andrija Panic andrija.pa...@gmail.com
   To: d...@cloudstack.apache.org
   Sent: Thursday, 14 May, 2015 4:00:07 PM
   Subject: Re: Fwd: ACS 4.5.1 KVM live migration problem
  
   That would probably be a bug that I had...but we updated main VLAN
 table
   with change URI or something... Marcus saved me that time :)
   Andrei, please provide more info and the info Marcus said, I will try
 to
   compare my values with yours if of any help.
  
   On 14 May 2015 at 16:56, Marcus shadow...@gmail.com wrote:
  
So, I vaguely remember an issue introduced a little over a year ago where
the broadcast domain value of the nic was changed from a URI to just a vlan
ID, which worked for vlans but broke vxlan and some other things. If I
remember correctly, there would be a small set of installs during this
period that wouldn't have created their nics with the correct broadcast
domain value. I don't remember which versions were doing this but I do know
there's a JIRA ticket and a paper trail on how people were fixing it. The
code that broke the URI was backed out. VMs created with the bad code would
not be compatible with the new or the old versions of code.

I was under the impression at the time that there was some SQL provided

Re: venom/CVE-2015-345 Update your KVM folks

2015-05-14 Thread Marcus
Yes, and follow the best practice of running qemu as a non-root user that
has no privileges and a restricted shell! Change the user and group in
/etc/libvirt/qemu.conf.
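
A small check along these lines can confirm what qemu.conf currently says. This is a sketch under assumptions: it only understands the simple `user = "name"` / `group = "name"` form, it treats commented-out lines as unset, and what libvirt uses when the settings are unset depends on how the distribution built it. The `qemu` account name in the sample is an example, not a recommendation from the thread.

```python
import re

# Matches uncommented lines such as:  user = "qemu"
_SETTING = re.compile(r'^\s*(user|group)\s*=\s*"([^"]*)"')


def qemu_run_as(conf_text):
    """Return the explicitly configured user and group (None when unset)."""
    result = {"user": None, "group": None}
    for line in conf_text.splitlines():
        match = _SETTING.match(line)
        if match:
            # Later occurrences override earlier ones.
            result[match.group(1)] = match.group(2)
    return result


def runs_as_root(conf_text):
    """True when user or group is explicitly set to root (by name or id 0)."""
    settings = qemu_run_as(conf_text)
    return any(value in ("root", "0", "+0") for value in settings.values())


if __name__ == "__main__":
    with open("/etc/libvirt/qemu.conf") as conf:
        text = conf.read()
    print(qemu_run_as(text))
    if runs_as_root(text):
        print("qemu is configured to run as root; consider a dedicated user")
```

libvirtd needs to be restarted after editing the file for a change to take effect on newly started guests.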

On Wed, May 13, 2015 at 7:23 AM, Nux! n...@li.nux.ro wrote:

 https://access.redhat.com/articles/1444903

 People running KVM might want to update their stuff.

 --
 Sent from the Delta quadrant using Borg technology!

 Nux!
 www.nux.ro



Re: Cloudstack and KVM clusters,

2015-03-31 Thread Marcus
Don't forget SharedMountPoint. This (in theory, haven't tried it
recently) allows you to use any clustered filesystem that has a
consistent mountpoint across all KVM hosts in a CS cluster, e.g. mount
an OCFS2 to /vmstore1 then register /vmstore1 as a SharedMountPoint.

The Ceph support is in the form of RBD, by the way. You could use
CephFS if you wished via SharedMountPoint.

On Tue, Mar 31, 2015 at 2:09 PM, Simon Weller swel...@ena.com wrote:
 The hosts need to be part of the same Cloudstack cluster, and depending on 
 the underlying storage technology, you may need a clustered file system as 
 well.

 A Cloudstack cluster is basically a group of physical hosts.

 For example:

 You build a new Zone in Cloudstack. Under the zone you have a pod. Within the
 pod, you build a new cluster (just a group of hosts). Then you assign 4
 servers (hosts) to that cluster. You will be able to live migrate between
 the 4 hosts assuming the originally mentioned criteria are met.

 - Si

 
 From: Rafael Weingartner rafaelweingart...@gmail.com
 Sent: Tuesday, March 31, 2015 4:02 PM
 To: d...@cloudstack.apache.org
 Cc: users@cloudstack.apache.org
 Subject: Re: Cloudstack and KVM clusters,

 Thanks Simon,


 I think I got it.

 So, the hosts do not need to be in a cluster to perform the live migration.

 On Tue, Mar 31, 2015 at 5:59 PM, Simon Weller swel...@ena.com wrote:

 Rafael,

 KVM live migration really relies on whether the underlying shared storage
 (and file system) supports the ability to provide data consistency during a
 migration. You never ever want a situation where 2 hosts are able  to mount
 and write to the same volume concurrently.

 You can live migrate in KVM today using the following underlying file
 systems/methods:

 1. NFS
 2. CEPH
 3. Clustered Logical Volume Management (CLVM) on top of SAN exposed
 storage via iSCSI,FC or FCOE.

 It's also possible to build your own storage driver and set a LUN to read
 only on a particular host using your SANs API.

 Solidfire, Nexenta and Cloudbyte have also added storage drivers more
 recently that may provide support for live migration, but as I'm not
 personally familiar with these storage platforms, I'll leave it up to
 others to comment if they wish.

 - Si





 
 From: Rafael Weingartner rafaelweingart...@gmail.com
 Sent: Tuesday, March 31, 2015 3:36 PM
 To: users@cloudstack.apache.org; d...@cloudstack.apache.org
 Subject: Cloudstack and KVM clusters,

 Hi folks,

 I was looking at a Cloudstack compatibility matrix at
 http://pt.slideshare.net/TimMackey/hypervisor-31754727,

 Slide 25 seemed to show that we cannot have clusters of KVM in CS. Is that
 true? Is it possible to live migrate VMs between KVM hosts that are not
 clustered in CS?


 --
 Rafael Weingärtner




 --
 Rafael Weingärtner


Re: [URGENT - HELP NEEDED]

2015-03-16 Thread Marcus
Good to hear. Glad I could lend another pair of eyes.

On Mon, Mar 16, 2015 at 9:19 AM, Andrija Panic andrija.pa...@gmail.com wrote:
 FIXED !!!

 Thanks a lot Marcus, this is the second time you saved me from the deep
 $$it..

 There was only 1 VM that had just 1 NIC which was not set as default in the
 DB. After changing it to default_nic=1, I destroyed the VR and a new one was
 created.

 Thanks a lot for the help!


 On 16 March 2015 at 16:52, Andrija Panic andrija.pa...@gmail.com wrote:

 I will, thanks a lot Marcus for the hints...

 On 16 March 2015 at 16:49, Marcus shadow...@gmail.com wrote:

 Ok, just watch for those createdhcpentry mgmt server logs. Perhaps
 they're just triggered by you trying to fix the situation by
 migrating, but the original issue was something else entirely.

 On Mon, Mar 16, 2015 at 8:44 AM, Andrija Panic andrija.pa...@gmail.com
 wrote:
  I did migrate VMs and also changed accounts, unsuccessfully, so there are
  definitely some bugs, or it's my specific setup...

  Thanks, I'm fixing this now and will let you know.
 
  On 16 March 2015 at 16:42, Marcus shadow...@gmail.com wrote:
 
  Yes, each VM should have at least one default nic, so if there's only
  one nic it should be set to default. Take a db backup first, of
  course, before messing with it. Any idea how it may have happened? Do
  you migrate VMs between networks ever?
 
  On Mon, Mar 16, 2015 at 8:39 AM, Andrija Panic 
 andrija.pa...@gmail.com
  wrote:
   Ok, so if the VM has only 1 NIC - and default_nic=0, then I need to change
   all of them to default_nic=1... ?
  
  
   On 16 March 2015 at 16:38, Marcus shadow...@gmail.com wrote:
  
   VMs can have multiple nics and be on multiple networks. If you set a
   nic as default, it becomes the network that the vm has its default
   route on. Every VM should have a default nic, and if it doesn't I
   wonder how it might have happened (maybe a specific combination of
   add/delete nic triggered a bug?). You should set a default nic for
   every VM that might be missing one, and see if that gets your router
   up.
  
   On Mon, Mar 16, 2015 at 8:34 AM, Andrija Panic 
 andrija.pa...@gmail.com
  
   wrote:
Hi Marcus,
   
Thanks a lot for the hint.

True, I have 0 as the value in the database for some reason, for a couple of
NICs:
select * from nics where ip4_address like '46.232%' and broadcast_uri =
'vlan://500' and default_nic = 0;

results: http://pastebin.com/rDAe2RY9

or down there...

This Techvee-FileServer server is already running (still not dead) and I
can see 1 NIC from the UI...

Should I reset all of these to 1?
What is the purpose of the default_nic field?

vlan://500 in my case limits the results to the network of the VR that is
having problems...

Any suggestions?
   
id uuid instance_id mac_address ip4_address netmask
  gateway
ip_type broadcast_uri network_id mode state strategy
reserver_name reservation_id device_id update_time
   isolation_uri
ip6_address default_nic vm_type created removed
  ip6_gateway
ip6_cidr secondary_ip display_nic
2816 5066bc3a-dbec-4789-aa42-3b9eb8f50bb4 1795
  06:70:0a:00:00:ac
46.232.180.101 255.255.255.0 46.232.180.1 Ip4 vlan://500
  212
Dhcp Deallocating Create DirectNetworkGuru \N 1
 2015-02-04
23:06:23 vlan://500 \N 0 User 2015-02-04 20:41:05
 2015-02-04
22:06:23 \N \N 0 1
3132 c8a5f98e-5663-40e3-ac03-1ac3545eaa83 1958
  06:fc:c2:00:00:ad
46.232.180.102 255.255.255.0 46.232.180.1 Ip4 vlan://500
  212
Dhcp Deallocating Create DirectNetworkGuru \N 1
 2015-03-03
15:45:47 vlan://500 \N 0 User 2015-02-18 15:50:35
 2015-03-03
14:45:47 \N \N 0 1
3139 f5a41229-2267-4615-9128-63fbce69bb01 1962
  06:d7:ac:00:00:ae
46.232.180.103 255.255.255.0 46.232.180.1 Ip4 vlan://500
  212
Dhcp Deallocating Create DirectNetworkGuru \N 1
 2015-02-19
03:10:45 vlan://500 \N 0 User 2015-02-19 00:09:02
 2015-02-19
02:10:45 \N \N 0 1
707 99afa70a-39d5-4685-8fc0-9857fdc77c90 511
 06:b5:72:00:00:72
46.232.180.144 255.255.255.0 46.232.180.1 Ip4 vlan://500
  212
Dhcp Deallocating Create DirectNetworkGuru \N 0
 2014-01-27
14:38:52 vlan://500 \N 0 User 2014-01-27 11:29:08
 2014-01-27
13:38:52 \N \N 0 1
1580 bf56315e-b4c3-4338-88d9-3013ab2e2c37 1088
  06:1d:90:00:00:72
46.232.180.144 255.255.255.0 46.232.180.1 Ip4 vlan://500
  212
Dhcp Deallocating Create DirectNetworkGuru \N 1
 2014-07-23
10:15:18 vlan://500 \N 0 User 2014-07-17 19:14:06
 2014-07-23
08:15:18 \N \N 0 1
3799 712cbcb6-097f-4555-a73b-e8c2a5bd557f 2306
  06:33:ac:00:00:77
46.232.180.149 255.255.255.0 46.232.180.1 Ip4 vlan://500
  212
Dhcp Deallocating Create DirectNetworkGuru \N 1
 2015-03-16
12:41:11 vlan://500 \N 0 User 2015-03-16 09:50:25
 2015-03-16
11:41:11 \N \N 0 1
3817 3599d144-6cdc-488b-8c9d-5837c7f612ac 2311
  06:8a:ac:00:00:77
46.232.180.149 255.255.255.0 46.232.180.1 Ip4 vlan://500
  212

Re: [URGENT - HELP NEEDED]

2015-03-16 Thread Marcus
FWIW, if your 4.3.2 is the same as what's in the source tree for the
router code, the null pointer indicates that there's no default nic
for one of your guests Techvee-FileServer. I'd guess that if you
delete/move just that host you may be able to start the router, or at
least get past this. You can also look into the db and see if you can
find its nics in the cloud.nics table and see if there is in fact one
marked default.
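
One way to look for such guests is to group the active NICs by VM and keep the VMs where none is marked default. The sketch below is an illustration, not a CloudStack tool: it uses an in-memory SQLite table in place of the MySQL `cloud.nics` table, keeps only the `instance_id`, `default_nic` and `removed` columns (names as they appear in the table dump in this thread), and the rows are invented demo data.

```python
import sqlite3

# VMs whose non-removed NICs all have default_nic = 0.
NO_DEFAULT_NIC_QUERY = """
SELECT instance_id
FROM nics
WHERE removed IS NULL
GROUP BY instance_id
HAVING SUM(default_nic) = 0
ORDER BY instance_id
"""


def vms_without_default_nic(conn):
    """Return instance ids that have active NICs but no default NIC."""
    return [row[0] for row in conn.execute(NO_DEFAULT_NIC_QUERY)]


def make_nics_demo_db():
    """Build a reduced nics table with hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nics (id INTEGER PRIMARY KEY, instance_id INTEGER, "
        "default_nic INTEGER, removed TEXT)"
    )
    conn.executemany("INSERT INTO nics VALUES (?, ?, ?, ?)", [
        (1, 100, 1, None),                    # VM 100: has a default NIC
        (2, 100, 0, None),
        (3, 200, 0, None),                    # VM 200: single NIC, not default
        (4, 300, 0, "2015-02-04 22:06:23"),   # VM 300: NIC already removed
    ])
    return conn


if __name__ == "__main__":
    print(vms_without_default_nic(make_nics_demo_db()))
```

As with any direct look at the database, take a backup before changing anything based on the result.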

On Mon, Mar 16, 2015 at 8:05 AM, Marcus shadow...@gmail.com wrote:
 Looks like the issue is that null pointer in CreateDhcpEntry for
 either Techvee-FileServer or the DHCP entry immediately after that.
 It would suggest some inconsistent/unexpected data when creating a
 DHCP entry for one of the guests serviced by this router. It's too bad
 that one bad entry is fatal for the whole router.


 On Mon, Mar 16, 2015 at 7:16 AM, Andrija Panic andrija.pa...@gmail.com 
 wrote:
 Not really - we are painfully migrating stopped VMs from the VPS network
 (Guest Shared network) to VPCs...

 The MGMT server sends the STOP command to the agent even though the VM was
 never started, BUT the storage provisioning from template to volume is done...

 We are also looking into some external help as we speak...

 On 16 March 2015 at 14:52, Nux! n...@li.nux.ro wrote:

 Hi,

 Have you managed to get to the bottom of this?

 --
 Sent from the Delta quadrant using Borg technology!

 Nux!
 www.nux.ro

 - Original Message -
  From: Andrija Panic andrija.pa...@gmail.com
  To: users@cloudstack.apache.org, d...@cloudstack.apache.org
  Sent: Sunday, 15 March, 2015 16:19:07
  Subject: [URGENT - HELP NEEDED]

  Hi guys,
 
  we have updated CloudStack from 4.3.0 to 4.3.2 (the OS was updated right
  before that, from CentOS 6.5 to CentOS 6.6).

  And now I cannot start the SYSTEM VR that is used for the SHARED GUEST
  network anymore.
  And some VMs are down and can't be started because they depend on this
  VR...

  VPC VRs are created fine, so a new VR for a VPC is created fine, but this
  one for the Guest network fails to start:

  Here you can see that after the agent copies the template from secondary
  storage to primary local storage, it creates the base image and backing
  file, so the storage setup seems completed.

  Then all of a sudden we have errors:




 --

 Andrija Panić


Re: [URGENT - HELP NEEDED]

2015-03-16 Thread Marcus
Ok, just watch the mgmt server logs for those CreateDhcpEntry errors. Perhaps
they're just triggered by you trying to fix the situation by
migrating, but the original issue was something else entirely.

On Mon, Mar 16, 2015 at 8:44 AM, Andrija Panic andrija.pa...@gmail.com wrote:
 I did migrate and also changed accounts, unsuccessfully, so it is
 definitely some bugs or my specific setup...

 Thanks, I'm fixing this now and will let you know.

 On 16 March 2015 at 16:42, Marcus shadow...@gmail.com wrote:

 Yes, each VM should have at least one default nic, so if there's only
 one nic it should be set to default. Take a db backup first, of
 course, before messing with it. Any idea how it may have happened? Do
 you migrate VMs between networks ever?

 On Mon, Mar 16, 2015 at 8:39 AM, Andrija Panic andrija.pa...@gmail.com
 wrote:
  Ok, so if the VM has only 1 NIC - and default_nic=0, then I need to change
  all of them to default_nic=1... ?
 
 
  On 16 March 2015 at 16:38, Marcus shadow...@gmail.com wrote:
 
  VMs can have multiple nics and be on multiple networks. If you set a
  nic as default, it becomes the network that the vm has its default
  route on. Every VM should have a default nic, and if it doesn't I
  wonder how it might have happened (maybe a specific combination of
  add/delete nic triggered a bug?). You should set a default nic for
  every VM that might be missing one, and see if that gets your router
  up.
 
  On Mon, Mar 16, 2015 at 8:34 AM, Andrija Panic andrija.pa...@gmail.com
 
  wrote:
   Hi Marcus,
  
   Thanks a lot for the hint
  
   True, I have 0 as the value for some reason in the database, for a couple of
   NICs:
   select * from nics where ip4_address like '46.232%' and broadcast_uri =
   'vlan://500' and default_nic = 0;
  
   results: http://pastebin.com/rDAe2RY9
  
   or down there...
  
   This Techvee-FileServer server is already running (still not dead)
 and I
   can see 1 NIC from UI...
  
   Should I reset all of these to 1 ?
   What is the purpose of this field default_nic = 0.
  
   vlan://500 in my case limits results only to the network for this VR
 that
   is having problems...
  
   Any suggestions ?
  
   id uuid instance_id mac_address ip4_address netmask
 gateway
   ip_type broadcast_uri network_id mode state strategy
   reserver_name reservation_id device_id update_time
  isolation_uri
   ip6_address default_nic vm_type created removed
 ip6_gateway
   ip6_cidr secondary_ip display_nic
   2816 5066bc3a-dbec-4789-aa42-3b9eb8f50bb4 1795
 06:70:0a:00:00:ac
   46.232.180.101 255.255.255.0 46.232.180.1 Ip4 vlan://500
 212
   Dhcp Deallocating Create DirectNetworkGuru \N 1 2015-02-04
   23:06:23 vlan://500 \N 0 User 2015-02-04 20:41:05 2015-02-04
   22:06:23 \N \N 0 1
   3132 c8a5f98e-5663-40e3-ac03-1ac3545eaa83 1958
 06:fc:c2:00:00:ad
   46.232.180.102 255.255.255.0 46.232.180.1 Ip4 vlan://500
 212
   Dhcp Deallocating Create DirectNetworkGuru \N 1 2015-03-03
   15:45:47 vlan://500 \N 0 User 2015-02-18 15:50:35 2015-03-03
   14:45:47 \N \N 0 1
   3139 f5a41229-2267-4615-9128-63fbce69bb01 1962
 06:d7:ac:00:00:ae
   46.232.180.103 255.255.255.0 46.232.180.1 Ip4 vlan://500
 212
   Dhcp Deallocating Create DirectNetworkGuru \N 1 2015-02-19
   03:10:45 vlan://500 \N 0 User 2015-02-19 00:09:02 2015-02-19
   02:10:45 \N \N 0 1
   707 99afa70a-39d5-4685-8fc0-9857fdc77c90 511 06:b5:72:00:00:72
   46.232.180.144 255.255.255.0 46.232.180.1 Ip4 vlan://500
 212
   Dhcp Deallocating Create DirectNetworkGuru \N 0 2014-01-27
   14:38:52 vlan://500 \N 0 User 2014-01-27 11:29:08 2014-01-27
   13:38:52 \N \N 0 1
   1580 bf56315e-b4c3-4338-88d9-3013ab2e2c37 1088
 06:1d:90:00:00:72
   46.232.180.144 255.255.255.0 46.232.180.1 Ip4 vlan://500
 212
   Dhcp Deallocating Create DirectNetworkGuru \N 1 2014-07-23
   10:15:18 vlan://500 \N 0 User 2014-07-17 19:14:06 2014-07-23
   08:15:18 \N \N 0 1
   3799 712cbcb6-097f-4555-a73b-e8c2a5bd557f 2306
 06:33:ac:00:00:77
   46.232.180.149 255.255.255.0 46.232.180.1 Ip4 vlan://500
 212
   Dhcp Deallocating Create DirectNetworkGuru \N 1 2015-03-16
   12:41:11 vlan://500 \N 0 User 2015-03-16 09:50:25 2015-03-16
   11:41:11 \N \N 0 1
   3817 3599d144-6cdc-488b-8c9d-5837c7f612ac 2311
 06:8a:ac:00:00:77
   46.232.180.149 255.255.255.0 46.232.180.1 Ip4 vlan://500
 212
   Dhcp Deallocating Create DirectNetworkGuru \N 1 2015-03-16
   13:02:59 vlan://500 \N 0 User 2015-03-16 11:49:31 2015-03-16
   12:02:59 \N \N 0 1
   1581 47cfe113-218c-4a60-a108-584d64cd16ed 1089
 06:c2:c4:00:00:7a
   46.232.180.152 255.255.255.0 46.232.180.1 Ip4 vlan://500
 212
   Dhcp Deallocating Create DirectNetworkGuru \N 1 2014-07-23
   10:16:12 vlan://500 \N 0 User 2014-07-17 19:14:43 2014-07-23
   08:16:12 \N \N 0 1
   1582 dae4b3cf-e2b2-4cbc-960e-77d2e854fffa 1090
 06:1c:78:00:00:7c
   46.232.180.154 255.255.255.0 46.232.180.1 Ip4 vlan://500
 212
   Dhcp Deallocating Create DirectNetworkGuru \N 1 2014-07-23
   10:16:32 vlan://500 \N 0 User 2014-07-17 19:30:44 2014-07-23
   08:16:32 \N \N 0 1
   435

Re: [URGENT - HELP NEEDED]

2015-03-16 Thread Marcus
VMs can have multiple nics and be on multiple networks. If you set a
nic as default, it becomes the network that the vm has its default
route on. Every VM should have a default nic, and if it doesn't I
wonder how it might have happened (maybe a specific combination of
add/delete nic triggered a bug?). You should set a default nic for
every VM that might be missing one, and see if that gets your router
up.

On Mon, Mar 16, 2015 at 8:34 AM, Andrija Panic andrija.pa...@gmail.com wrote:
 Hi Marcus,

 Thanks a lot for the hint

 True, I have 0 as the value for some reason in the database, for a couple of
 NICs:
 select * from nics where ip4_address like '46.232%' and broadcast_uri =
 'vlan://500' and default_nic = 0;

 results: http://pastebin.com/rDAe2RY9

 or down there...

 This Techvee-FileServer server is already running (still not dead) and I
 can see 1 NIC from UI...

 Should I reset all of these to 1 ?
 What is the purpose of this field default_nic = 0.

 vlan://500 in my case limits results only to the network for this VR that
 is having problems...

 Any suggestions ?

 id uuid instance_id mac_address ip4_address netmask gateway
 ip_type broadcast_uri network_id mode state strategy
 reserver_name reservation_id device_id update_time isolation_uri
 ip6_address default_nic vm_type created removed ip6_gateway
 ip6_cidr secondary_ip display_nic
 2816 5066bc3a-dbec-4789-aa42-3b9eb8f50bb4 1795 06:70:0a:00:00:ac
 46.232.180.101 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
 Dhcp Deallocating Create DirectNetworkGuru \N 1 2015-02-04
 23:06:23 vlan://500 \N 0 User 2015-02-04 20:41:05 2015-02-04
 22:06:23 \N \N 0 1
 3132 c8a5f98e-5663-40e3-ac03-1ac3545eaa83 1958 06:fc:c2:00:00:ad
 46.232.180.102 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
 Dhcp Deallocating Create DirectNetworkGuru \N 1 2015-03-03
 15:45:47 vlan://500 \N 0 User 2015-02-18 15:50:35 2015-03-03
 14:45:47 \N \N 0 1
 3139 f5a41229-2267-4615-9128-63fbce69bb01 1962 06:d7:ac:00:00:ae
 46.232.180.103 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
 Dhcp Deallocating Create DirectNetworkGuru \N 1 2015-02-19
 03:10:45 vlan://500 \N 0 User 2015-02-19 00:09:02 2015-02-19
 02:10:45 \N \N 0 1
 707 99afa70a-39d5-4685-8fc0-9857fdc77c90 511 06:b5:72:00:00:72
 46.232.180.144 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
 Dhcp Deallocating Create DirectNetworkGuru \N 0 2014-01-27
 14:38:52 vlan://500 \N 0 User 2014-01-27 11:29:08 2014-01-27
 13:38:52 \N \N 0 1
 1580 bf56315e-b4c3-4338-88d9-3013ab2e2c37 1088 06:1d:90:00:00:72
 46.232.180.144 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
 Dhcp Deallocating Create DirectNetworkGuru \N 1 2014-07-23
 10:15:18 vlan://500 \N 0 User 2014-07-17 19:14:06 2014-07-23
 08:15:18 \N \N 0 1
 3799 712cbcb6-097f-4555-a73b-e8c2a5bd557f 2306 06:33:ac:00:00:77
 46.232.180.149 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
 Dhcp Deallocating Create DirectNetworkGuru \N 1 2015-03-16
 12:41:11 vlan://500 \N 0 User 2015-03-16 09:50:25 2015-03-16
 11:41:11 \N \N 0 1
 3817 3599d144-6cdc-488b-8c9d-5837c7f612ac 2311 06:8a:ac:00:00:77
 46.232.180.149 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
 Dhcp Deallocating Create DirectNetworkGuru \N 1 2015-03-16
 13:02:59 vlan://500 \N 0 User 2015-03-16 11:49:31 2015-03-16
 12:02:59 \N \N 0 1
 1581 47cfe113-218c-4a60-a108-584d64cd16ed 1089 06:c2:c4:00:00:7a
 46.232.180.152 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
 Dhcp Deallocating Create DirectNetworkGuru \N 1 2014-07-23
 10:16:12 vlan://500 \N 0 User 2014-07-17 19:14:43 2014-07-23
 08:16:12 \N \N 0 1
 1582 dae4b3cf-e2b2-4cbc-960e-77d2e854fffa 1090 06:1c:78:00:00:7c
 46.232.180.154 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
 Dhcp Deallocating Create DirectNetworkGuru \N 1 2014-07-23
 10:16:32 vlan://500 \N 0 User 2014-07-17 19:30:44 2014-07-23
 08:16:32 \N \N 0 1
 435 5c370395-2b70-4c8c-b710-eab4151b14ab 252 06:96:4e:00:00:8e
 46.232.180.172 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
 Dhcp Deallocating Create DirectNetworkGuru \N 1 2014-01-13
 23:25:44 vlan://500 \N 0 User 2013-10-24 23:16:11 2014-01-13
 22:25:44 \N \N 0 1
 2820 feee7c33-c723-41e3-8583-bcbfc4e257b1 1797 06:be:24:00:00:8e
 46.232.180.172 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
 Dhcp Reserved Create DirectNetworkGuru \N 1 2015-02-04 22:54:29
 vlan://500 \N 0 User 2015-02-04 21:54:29 \N \N \N 0 1
 1813 df725af8-12ed-4c27-b521-51d1d465326d 1213 06:cd:d6:00:00:97
 46.232.180.181 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
 Dhcp Deallocating Create DirectNetworkGuru \N 1 2014-09-02
 11:06:56 vlan://500 \N 0 User 2014-09-02 09:03:11 2014-09-02
 09:06:56 \N \N 0 1
 2897 aab5d282-e9d5-4c4f-91c9-bfae88f404d6 1304 06:6e:0c:00:00:9c
 46.232.180.186 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
 Dhcp Reserved Create DirectNetworkGuru \N 0 2015-02-11 14:32:00
 vlan://500 \N 0 User 2015-02-11 13:31:59 \N \N \N 0 1
 2235 30865e2e-083e-4e8a-bccf-c24bf403dece 1488 06:b8:b4:00:00:a1
 46.232.180.191 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
 Dhcp Deallocating Create DirectNetworkGuru \N

Re: [URGENT - HELP NEEDED]

2015-03-16 Thread Marcus
Yes, each VM should have at least one default nic, so if there's only
one nic it should be set to default. Take a db backup first, of
course, before messing with it. Any idea how it may have happened? Do
you migrate VMs between networks ever?
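
A minimal sketch of the check described above: list the instances that have
NICs but none flagged as default. It uses an in-memory SQLite table as a
stand-in for the real cloud.nics table, and only the column names that appear
in this thread (instance_id, default_nic, removed). Treating rows with a
non-NULL "removed" value as already-deleted NICs is an assumption, and the
function and sample rows are hypothetical. Run anything like this read-only
first, and only after the db backup.

```python
import sqlite3


def instances_without_default_nic(conn):
    """Return instance_ids whose live NICs include none with default_nic = 1.

    Assumption: a NIC row with a non-NULL `removed` value is no longer in use.
    """
    rows = conn.execute(
        """
        SELECT instance_id
        FROM nics
        WHERE removed IS NULL
        GROUP BY instance_id
        HAVING SUM(default_nic) = 0
        ORDER BY instance_id
        """
    ).fetchall()
    return [row[0] for row in rows]


def demo():
    """Build a tiny stand-in table and run the check against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nics ("
        "id INTEGER PRIMARY KEY, instance_id INTEGER, "
        "default_nic INTEGER, removed TEXT)"
    )
    conn.executemany(
        "INSERT INTO nics VALUES (?, ?, ?, ?)",
        [
            (1, 100, 1, None),                   # has a default nic
            (2, 100, 0, None),                   # second nic, not default
            (3, 200, 0, None),                   # only nic, not default
            (4, 300, 0, "2015-02-04 22:06:23"),  # removed nic, ignored
            (5, 300, 1, None),                   # live default nic
        ],
    )
    return instances_without_default_nic(conn)


if __name__ == "__main__":
    print(demo())
```

Only instances returned by the check would be candidates for setting
default_nic = 1 on their single remaining nic.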

On Mon, Mar 16, 2015 at 8:39 AM, Andrija Panic andrija.pa...@gmail.com wrote:
 Ok, so if the VM has only 1 NIC - and default_nic=0, then I need to change
 all of them to default_nic=1... ?


 On 16 March 2015 at 16:38, Marcus shadow...@gmail.com wrote:

 VMs can have multiple nics and be on multiple networks. If you set a
 nic as default, it becomes the network that the vm has its default
 route on. Every VM should have a default nic, and if it doesn't I
 wonder how it might have happened (maybe a specific combination of
 add/delete nic triggered a bug?). You should set a default nic for
 every VM that might be missing one, and see if that gets your router
 up.

 On Mon, Mar 16, 2015 at 8:34 AM, Andrija Panic andrija.pa...@gmail.com
 wrote:
  Hi Marcus,
 
  Thanks a lot for the hint
 
  True, I have 0 as the value for some reason in the database, for a couple of
  NICs:
  select * from nics where ip4_address like '46.232%' and broadcast_uri =
  'vlan://500' and default_nic = 0;
 
  results: http://pastebin.com/rDAe2RY9
 
  or down there...
 
  This Techvee-FileServer server is already running (still not dead) and I
  can see 1 NIC from UI...
 
  Should I reset all of these to 1 ?
  What is the purpose of this field default_nic = 0.
 
  vlan://500 in my case limits results only to the network for this VR that
  is having problems...
 
  Any suggestions ?
 
  id uuid instance_id mac_address ip4_address netmask gateway
  ip_type broadcast_uri network_id mode state strategy
  reserver_name reservation_id device_id update_time
 isolation_uri
  ip6_address default_nic vm_type created removed ip6_gateway
  ip6_cidr secondary_ip display_nic
  2816 5066bc3a-dbec-4789-aa42-3b9eb8f50bb4 1795 06:70:0a:00:00:ac
  46.232.180.101 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
  Dhcp Deallocating Create DirectNetworkGuru \N 1 2015-02-04
  23:06:23 vlan://500 \N 0 User 2015-02-04 20:41:05 2015-02-04
  22:06:23 \N \N 0 1
  3132 c8a5f98e-5663-40e3-ac03-1ac3545eaa83 1958 06:fc:c2:00:00:ad
  46.232.180.102 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
  Dhcp Deallocating Create DirectNetworkGuru \N 1 2015-03-03
  15:45:47 vlan://500 \N 0 User 2015-02-18 15:50:35 2015-03-03
  14:45:47 \N \N 0 1
  3139 f5a41229-2267-4615-9128-63fbce69bb01 1962 06:d7:ac:00:00:ae
  46.232.180.103 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
  Dhcp Deallocating Create DirectNetworkGuru \N 1 2015-02-19
  03:10:45 vlan://500 \N 0 User 2015-02-19 00:09:02 2015-02-19
  02:10:45 \N \N 0 1
  707 99afa70a-39d5-4685-8fc0-9857fdc77c90 511 06:b5:72:00:00:72
  46.232.180.144 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
  Dhcp Deallocating Create DirectNetworkGuru \N 0 2014-01-27
  14:38:52 vlan://500 \N 0 User 2014-01-27 11:29:08 2014-01-27
  13:38:52 \N \N 0 1
  1580 bf56315e-b4c3-4338-88d9-3013ab2e2c37 1088 06:1d:90:00:00:72
  46.232.180.144 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
  Dhcp Deallocating Create DirectNetworkGuru \N 1 2014-07-23
  10:15:18 vlan://500 \N 0 User 2014-07-17 19:14:06 2014-07-23
  08:15:18 \N \N 0 1
  3799 712cbcb6-097f-4555-a73b-e8c2a5bd557f 2306 06:33:ac:00:00:77
  46.232.180.149 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
  Dhcp Deallocating Create DirectNetworkGuru \N 1 2015-03-16
  12:41:11 vlan://500 \N 0 User 2015-03-16 09:50:25 2015-03-16
  11:41:11 \N \N 0 1
  3817 3599d144-6cdc-488b-8c9d-5837c7f612ac 2311 06:8a:ac:00:00:77
  46.232.180.149 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
  Dhcp Deallocating Create DirectNetworkGuru \N 1 2015-03-16
  13:02:59 vlan://500 \N 0 User 2015-03-16 11:49:31 2015-03-16
  12:02:59 \N \N 0 1
  1581 47cfe113-218c-4a60-a108-584d64cd16ed 1089 06:c2:c4:00:00:7a
  46.232.180.152 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
  Dhcp Deallocating Create DirectNetworkGuru \N 1 2014-07-23
  10:16:12 vlan://500 \N 0 User 2014-07-17 19:14:43 2014-07-23
  08:16:12 \N \N 0 1
  1582 dae4b3cf-e2b2-4cbc-960e-77d2e854fffa 1090 06:1c:78:00:00:7c
  46.232.180.154 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
  Dhcp Deallocating Create DirectNetworkGuru \N 1 2014-07-23
  10:16:32 vlan://500 \N 0 User 2014-07-17 19:30:44 2014-07-23
  08:16:32 \N \N 0 1
  435 5c370395-2b70-4c8c-b710-eab4151b14ab 252 06:96:4e:00:00:8e
  46.232.180.172 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
  Dhcp Deallocating Create DirectNetworkGuru \N 1 2014-01-13
  23:25:44 vlan://500 \N 0 User 2013-10-24 23:16:11 2014-01-13
  22:25:44 \N \N 0 1
  2820 feee7c33-c723-41e3-8583-bcbfc4e257b1 1797 06:be:24:00:00:8e
  46.232.180.172 255.255.255.0 46.232.180.1 Ip4 vlan://500 212
  Dhcp Reserved Create DirectNetworkGuru \N 1 2015-02-04
 22:54:29
  vlan://500 \N 0 User 2015-02-04 21:54:29 \N \N \N 0 1
  1813 df725af8-12ed-4c27-b521-51d1d465326d 1213 06:cd:d6:00:00:97
  46.232.180.181 255.255.255.0

Re: [URGENT - HELP NEEDED]

2015-03-16 Thread Marcus
Looks like the issue is that null pointer in CreateDhcpEntry for
either Techvee-FileServer or the DHCP entry immediately after that.
It would suggest some inconsistent/unexpected data when creating a
DHCP entry for one of the guests serviced by this router. It's too bad
that one bad entry is fatal for the whole router.


On Mon, Mar 16, 2015 at 7:16 AM, Andrija Panic andrija.pa...@gmail.com wrote:
 Not really - we are painfully migrating stopped VMs, from VPS network
 (Guest Shared network) to VPCs...

 MGMT server sends the STOP command to agent, even though the VM was never
 started, BUT the storage provisioning from template to volume is done...

 We are also looking into some external help as we speak...

 On 16 March 2015 at 14:52, Nux! n...@li.nux.ro wrote:

 Hi,

 Have you managed to get to the bottom of this?

 --
 Sent from the Delta quadrant using Borg technology!

 Nux!
 www.nux.ro

 - Original Message -
  From: Andrija Panic andrija.pa...@gmail.com
  To: users@cloudstack.apache.org, d...@cloudstack.apache.org
  Sent: Sunday, 15 March, 2015 16:19:07
  Subject: [URGENT - HELP NEEDED]

  Hi guys,
 
  we have updated the cloudstack from 4.3.0 to 4.3.2 (OS updated right
 before
  that, from CentOS 6.5 to CentOS 6.6)
 
  And now I can not start SYSTEM VR - that is used for SHARED GUEST network
  anymore.
  And some VMs are down - and can't be started because they depend on this
  VR...
 
  VPC VRs are created fine, so new VRs for VPCs are created fine, but this one
  for the Guest network fails to start:
 
  Here you can see, after agent copies template from secondary storage, to
  primary local storage, it created base image, and backing file - so
 storage
  setup seems completed.
 
  Then all of a sudden we have errors:




 --

 Andrija Panić


Re: Replace systemvm template 4.3.0 with recent one during ACS upgrade

2015-03-14 Thread Marcus
It should pull the highest/last entry with the name 4.3 when
redeploying the routers, but I'm not sure if it will detect that the
router needs upgrade without a minor version change. I imagine it
would fetch the highest entry, see that the template id doesn't exist,
and install it, but you may want to test first.
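
To make the "highest/last entry" idea concrete, here is a rough sketch of that
selection rule. It is not CloudStack's code: the function name, the dict
fields, and the sample template names are all hypothetical, and the real
lookup may differ, which is why testing first is still the safe route.

```python
def pick_systemvm_template(templates, version):
    """Return the non-removed template with the highest id whose name
    mentions `version`, or None if there is no such template.

    `templates` is a list of dicts with hypothetical keys:
    "id", "name" and optionally "removed".
    """
    candidates = [
        t for t in templates
        if version in t["name"] and not t.get("removed")
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda t: t["id"])


if __name__ == "__main__":
    sample = [
        {"id": 3, "name": "systemvm-kvm-4.3"},     # original template
        {"id": 150, "name": "systemvm-kvm-4.2"},   # other release, ignored
        {"id": 202, "name": "systemvm-kvm-4.3"},   # newer registration
    ]
    print(pick_systemvm_template(sample, "4.3")["id"])
```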

On Sat, Mar 14, 2015 at 4:08 AM, Andrija Panic andrija.pa...@gmail.com wrote:
 Hi guys,

 I'm wondering, since I'm upgrading ACS 4.3.0 with original systemvm (from ~
 24.05.2014), to ACS 4.3.2 - am I required to also register new systemVM
 template (i.e. from UI like when you upgrade from 4.3 to 4.4..) - or should
 I just upgrade ACS and ACS would somehow update systemVM template from
 original one (4.3.0) to newer one 4.3.2 (there is one on
 http://cloudstack.apt-get.eu/ from 24.09.2014 and there is 15.01.2015 on
 shapeblue site also)

 I'm trying (with the upgrade) to mitigate some of the recent SSL security
 risks, and solve some Port Forwarding / Static NAT
 issues, where the remote IP is not really seen...

 So basically what is the systemvm template upgrade/replace procedure for the
 same release version of ACS (4.3.0 - 4.3.x) ?


 Thanks,

 --

 Andrija Panić


Re: Updating VR template, without ACS update

2015-03-11 Thread Marcus
I've never used the official script to upgrade. I always set the
global setting to recreate systemvms on reboot; it has been more
robust for me to do it the cloudy way and get a fresh vm on every
boot. With various issues that have arisen in the past (file system
filling up, fsck required on unclean shutdown, etc) it's just nice to
know that you're always a reboot away from getting a pristine config.
I was surprised when I heard about the patchviasocket issue as I've
run thousands of routers and upgraded them multiple times, never once
having an issue. Nor in my nested vm dev environment. Perhaps it was
just our fast storage or something. I think someone added in some
retries or something like that.

Keep in mind that you usually don't need to drop in a new template for
a bugfix release, and it's sufficient to reboot. The exception to this
is if the bugfix release specifically indicates a new template, say
for a security fix on software in the OS of the template. Either way,
CloudStack will go through the full reprogramming process, and
stop/start the router to attach a new ISO with the new code and
install it on the router template, whether it images a fresh template
or uses an existing one.

On Wed, Mar 11, 2015 at 3:59 AM, Andrija Panic andrija.pa...@gmail.com wrote:
 Thanks Markus.

 So anyway, I need to make some time to upgrade to 4.3.2.

 Can I manually reboot VR/s one by one after the upgrade is done (instead of
 using the script for rebooting ssvm, cpvm, and 66 VRs...)
 And is this really a reboot inside the OS - not destroying and recreating VRs ???

 Or would you still recommend rebooting VRs via script - I understand that
 it reboots VRs one by one...

 I would not like to recreate VR, and then hit a bug with VR creation, that
 I'm having right now... :(

 Thanks




 On 10 March 2015 at 20:14, Marcus shadow...@gmail.com wrote:

 Hi,
   It's impossible to know without looking at the changes in 4.3.1,
 4.3.2.  Your routers will be running old code, and will probably work,
 but might not, e.g. if a router script is called with parameters that
 don't exist in the version of the script that the router runs. If you
 don't plan on making any changes (add ACLs, spin up new VMs, etc) to
 these VPCs they'll most likely run just fine as-is, but any changes
 are a big ?

  As far as your question about replacing the template, I believe
 CloudStack looks for the latest of a specific version, so if you
 retire your existing template and install a new one per the 4.3
 upgrade instructions it should choose that. Note that for routers
 specifically there's a global option 'router.template.kvm' that can be
 pointed to a specific template name to use for routers.

 On Tue, Mar 10, 2015 at 7:46 AM, Andrija Panic andrija.pa...@gmail.com
 wrote:
  Hi,
 
  I was wondering is it possible to update/replace the VR template somehow
  without actually updating the ACS.
 
  I'm running ACS 4.3.0, and having some issues with remote IP not being
  really shown during Port Forwarding and Static NAT (VR also does SNAT
  beside the DNAT)
 
  I know question is a little bit weird - but...
 
  Another Q: I can see that after ACS is upgraded, there is restart of each
  System VM needed - we have over 50-60 VPCs - this also means that I need
 to
  wait for 60 VRs to reboot.
  Is there any drawback of running existing VRs after ACS 4.3.0 is updated
 to
  4.3.2 and then later manually reboot each VR from Infrastructure/Virtual
  Routers ?
 
 
 
  --
 
  Andrija Panić




 --

 Andrija Panić


Re: Updating VR template, without ACS update

2015-03-10 Thread Marcus
Hi,
  It's impossible to know without looking at the changes in 4.3.1,
4.3.2.  Your routers will be running old code, and will probably work,
but might not, e.g. if a router script is called with parameters that
don't exist in the version of the script that the router runs. If you
don't plan on making any changes (add ACLs, spin up new VMs, etc) to
these VPCs they'll most likely run just fine as-is, but any changes
are a big ?

 As far as your question about replacing the template, I believe
CloudStack looks for the latest of a specific version, so if you
retire your existing template and install a new one per the 4.3
upgrade instructions it should choose that. Note that for routers
specifically there's a global option 'router.template.kvm' that can be
pointed to a specific template name to use for routers.

On Tue, Mar 10, 2015 at 7:46 AM, Andrija Panic andrija.pa...@gmail.com wrote:
 Hi,

 I was wondering is it possible to update/replace the VR template somehow
 without actually updating the ACS.

 I'm running ACS 4.3.0, and having some issues with remote IP not being
 really shown during Port Forwarding and Static NAT (VR also does SNAT
 beside the DNAT)

 I know question is a little bit weird - but...

 Another Q: I can see that after ACS is upgraded, there is restart of each
 System VM needed - we have over 50-60 VPCs - this also means that I need to
 wait for 60 VRs to reboot.
 Is there any drawback of running existing VRs after ACS 4.3.0 is updated to
 4.3.2 and then later manually reboot each VR from Infrastructure/Virtual
 Routers ?



 --

 Andrija Panić


Re: Agent deleting/stoping main bridge on Linux ???

2015-03-04 Thread Marcus
Er, problems for the hypervisor, that is. And an admin probably
doesn't want to deal with configuring all of those, even if it can be
scripted, so CloudStack does the creation/deletion.

On Wed, Mar 4, 2015 at 8:18 AM, Marcus shadow...@gmail.com wrote:
 As for why, it's a scalability issue. There are people who are using
 (or have defined) 10,000+ guest networks. If CloudStack left bridges
 around that weren't being used it could cause problems for the guest,
 so it keeps things tidy by only keeping bridges that are used.

 On Wed, Mar 4, 2015 at 8:16 AM, Marcus shadow...@gmail.com wrote:
 I don't think anyone has ever tested what would happen if the admin
 has manually defined the same guest bridges that CloudStack wants to
 use. CloudStack creates them on the fly and deletes them when the last
 VM has been removed. I assume you're using these bridges on the host
 for something out of band that CloudStack isn't aware of?

 On Wed, Mar 4, 2015 at 8:09 AM, Andrija Panic andrija.pa...@gmail.com 
 wrote:
 I forgot to add - this is ACS 4.3.0 and CentOS 6.x... breth1-500 is the
 main bridge used for Shared Network (yes, I know, somewhat confusing name
 for the bridge...)

 On 4 March 2015 at 17:07, Andrija Panic andrija.pa...@gmail.com wrote:

 Hi people.

 on physical host, I was having breth1-500(bridge) with eth1(joined to this
 bridge) - all defined manually in Centos network config files.
 (when you boot physical host - this bridge is active of course)

 When I deploy new VM with Shared Network with vlan 500, new device is
 created eth1.500 and joined to this bridge - which is fine, and then vnet0
 device from VM is also joined to the bridge...



 When I stop the last VM that is using this Shared Network, CloudStack
 (agent?) removes eth1.500 from bridge (fine with me), and ***then removed
 eth1 from bridge (which I manually configured !!!) and later stopped/removed
 the whole bridge***

 If you try to start new VM again (joined to Shared Network with vlan 500)
 - then no bridge is available, VM is started, but no vnet device, no
 breth1-500 bridge up, and no eth1.500 up.
 No bridge was available - but also bridge was not created on the fly...


 Is there any explanation - why the heck would my manually configured
 bridge get deleted?

 --

 Andrija Panić




 --

 Andrija Panić


Re: Agent deleting/stoping main bridge on Linux ???

2015-03-04 Thread Marcus
As for why, it's a scalability issue. There are people who are using
(or have defined) 10,000+ guest networks. If CloudStack left bridges
around that weren't being used it could cause problems for the guest,
so it keeps things tidy by only keeping bridges that are used.
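
The behaviour discussed in this thread (a bridge is created when the first VM
needs it and removed once the last VM is gone) can be modelled as simple
reference counting. The sketch below is only an illustration of that idea, not
the agent's implementation; the class and method names are hypothetical. It
also shows why a manually defined bridge is not spared: the tracker only knows
which VMs use a bridge, not who created it.

```python
class BridgeTracker:
    """Toy model: a bridge exists only while at least one VM uses it."""

    def __init__(self):
        self._users = {}  # bridge name -> set of VM names attached to it

    def attach(self, bridge, vm):
        """Attach a VM; return True if the bridge had to be created."""
        created = bridge not in self._users
        self._users.setdefault(bridge, set()).add(vm)
        return created

    def detach(self, bridge, vm):
        """Detach a VM; return True if the bridge was removed as a result."""
        users = self._users.get(bridge)
        if users is None:
            return False
        users.discard(vm)
        if not users:
            # Last user is gone, so the bridge is torn down.
            del self._users[bridge]
            return True
        return False

    def bridges(self):
        """Return the names of bridges that currently exist, sorted."""
        return sorted(self._users)


if __name__ == "__main__":
    tracker = BridgeTracker()
    tracker.attach("breth1-500", "vm1")
    tracker.attach("breth1-500", "vm2")
    tracker.detach("breth1-500", "vm1")
    print(tracker.detach("breth1-500", "vm2"), tracker.bridges())
```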

On Wed, Mar 4, 2015 at 8:16 AM, Marcus shadow...@gmail.com wrote:
 I don't think anyone has ever tested what would happen if the admin
 has manually defined the same guest bridges that CloudStack wants to
 use. CloudStack creates them on the fly and deletes them when the last
 VM has been removed. I assume you're using these bridges on the host
 for something out of band that CloudStack isn't aware of?

 On Wed, Mar 4, 2015 at 8:09 AM, Andrija Panic andrija.pa...@gmail.com wrote:
 I forgot to add - this is ACS 4.3.0 and CentOS 6.x... breth1-500 is the
 main bridge used for Shared Network (yes, I know, somewhat confusing name
 for the bridge...)

 On 4 March 2015 at 17:07, Andrija Panic andrija.pa...@gmail.com wrote:

 Hi people.

 on physical host, I was having breth1-500(bridge) with eth1(joined to this
 bridge) - all defined manually in Centos network config files.
 (when you boot physical host - this bridge is active of course)

 When I deploy new VM with Shared Network with vlan 500, new device is
 created eth1.500 and joined to this bridge - which is fine, and then vnet0
 device from VM is also joined to the bridge...



 When I stop the last VM that is using this Shared Network, CloudStack
 (agent?) removes eth1.500 from bridge (fine with me), and ***then removed
 eth1 from bridge (which I manually configured !!!) and later stopped/removed
 the whole bridge***

 If you try to start new VM again (joined to Shared Network with vlan 500)
 - then no bridge is available, VM is started, but no vnet device, no
 breth1-500 bridge up, and no eth1.500 up.
 No bridge was available - but also bridge was not created on the fly...


 Is there any explanation - why the heck would my manually configured
 bridge get deleted?

 --

 Andrija Panić




 --

 Andrija Panić


Re: Agent deleting/stoping main bridge on Linux ???

2015-03-04 Thread Marcus
Yeah, sorry. You could request an enhancement for an agent tunable
that keeps the bridge from being removed, but it will not help you
now.

If you want to get hacky, you can edit the
   /usr/share/cloudstack-common/scripts/vm/network/vnet/modifyvlan.sh
script on your agent, commenting out deleteVlan(). That *might* do
what you want.

On Wed, Mar 4, 2015 at 8:19 AM, Andrija Panic andrija.pa...@gmail.com wrote:
 Thanks Markus - yes don't ask me why I did name bridge like this... so this
 is obviously an unsupported scenario, that I did... crap...

 Thx again.

 On 4 March 2015 at 17:16, Marcus shadow...@gmail.com wrote:

 I don't think anyone has ever tested what would happen if the admin
 has manually defined the same guest bridges that CloudStack wants to
 use. CloudStack creates them on the fly and deletes them when the last
 VM has been removed. I assume you're using these bridges on the host
 for something out of band that CloudStack isn't aware of?

 On Wed, Mar 4, 2015 at 8:09 AM, Andrija Panic andrija.pa...@gmail.com
 wrote:
  I forgot to add - this is ACS 4.3.0 and CentOS 6.x... breth1-500 is the
  main bridge used for Shared Network (yes, I know, somewhat confusing name
  for the bridge...)
 
  On 4 March 2015 at 17:07, Andrija Panic andrija.pa...@gmail.com wrote:
 
  Hi people.
 
  on physical host, I was having breth1-500(bridge) with eth1(joined to
 this
  bridge) - all defined manually in Centos network config files.
  (when you boot physical host - this bridge is active of course)
 
  When I deploy new VM with Shared Network with vlan 500, new device is
  created eth1.500 and joined to this bridge - which is fine, and then
 vnet0
  device from VM is also joined to the bridge...
 
 
 
  When I stop the last VM that is using this Shared Network, CloudStack
  (agent?) removes eth1.500 from bridge (fine with me), and ***then
 removed
  eth1 from bridge (which I manually configured !!!) and later
 stopped/removed
  the whole bridge***
 
  If you try to start new VM again (joined to Shared Network with vlan
 500)
  - then no bridge is available, VM is started, but no vnet device, no
  breth1-500 bridge up, and no eth1.500 up.
  No bridge was available - but also bridge was not created on the fly...
 
 
  Is there any explanation - why the heck would my manually configured
  bridge get deleted?
 
  --
 
  Andrija Panić
 
 
 
 
  --
 
  Andrija Panić




 --

 Andrija Panić


Re: Uploading of Volume - how does what here ?

2015-02-26 Thread Marcus
The volume is downloaded by the SSVM into secondary storage under the
volumes directory. It will sit there until you choose to attach it
somewhere, at which point a CopyCommand will be sent to a hypervisor
that has access to the primary storage for the cluster on which the
target VM is running to copy from secondary to primary. This will be
handled by the appropriate StorageAdaptor for the primary storage type
(most likely LibvirtStorageAdaptor, which will qemu-img it, converting
to RAW format for RBD/LVM, or just a plain cp last I checked for QCOW2
to QCOW2). Then an AttachCommand will be sent to the hypervisor on
which the target vm is running (if it is running) and it will be
hotplugged.
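
As a rough illustration of the secondary-to-primary copy step described above,
the sketch below builds the command a host might run depending on the primary
storage type: a qemu-img convert to RAW for RBD/LVM, or a plain cp for QCOW2 to
QCOW2. This is not the LibvirtStorageAdaptor code; the function name, the pool
type labels and the example paths are assumptions made for illustration.

```python
def copy_command(src_path, src_format, dest_pool_type, dest_path):
    """Build an argv list for copying a volume from secondary to primary.

    dest_pool_type uses hypothetical labels: "RBD" or "LVM" for pools that
    store RAW images, anything else for file-based pools.
    """
    if dest_pool_type.upper() in {"RBD", "LVM"}:
        # Block/object pools hold RAW data, so convert while copying.
        return ["qemu-img", "convert", "-f", src_format, "-O", "raw",
                src_path, dest_path]
    if src_format == "qcow2":
        # Same format on both sides: a plain file copy is enough.
        return ["cp", src_path, dest_path]
    raise ValueError("case not covered by this sketch")


if __name__ == "__main__":
    print(copy_command("/mnt/secondary/volumes/vol.qcow2", "qcow2",
                       "RBD", "rbd:cloudstack/vol"))
```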

On Thu, Feb 26, 2015 at 1:05 AM, Andrija Panic andrija.pa...@gmail.com wrote:
 Thx Lucian,

 that was my guessing, but would like some confirmation if anyone familiar
 with this...

 Thanks

 On 25 February 2015 at 17:57, Nux! n...@li.nux.ro wrote:

 Not a CEPH user, but what I believe happens is your HV mounts the NFS
 storage and then does something like qemu-img convert to move it into
 CEPH.

 HTH
 Lucian

 --
 Sent from the Delta quadrant using Borg technology!

 Nux!
 www.nux.ro

 - Original Message -
  From: Andrija Panic andrija.pa...@gmail.com
  To: d...@cloudstack.apache.org, users@cloudstack.apache.org
  Sent: Wednesday, 25 February, 2015 12:09:59
  Subject: Uploading of Volume - how does what here ?

  Hi guys,
 
  I'm just uploading a Volume, and I guess it will end up on the CEPH storage
  that we are using.
 
  So my question would be: the Volume will end up on the primary storage in
  the end, but right now I can see that the SSVM is actually downloading the
  Volume from the internet and writing it to Secondary Storage.
 
  What happens next? I know that neither the SSVM nor, of course, my NFS
  server can write/talk to CEPH at all.
 
  Does maybe some randomly chosen host mount the Secondary Storage NFS, read
  the Volume, and upload/write it to CEPH? (Since only hypervisor hosts can
  actually talk to CEPH)
 
  Thanks in advance for clarification...
 
  --
 
  Andrija Panić




 --

 Andrija Panić


Re: Agent dies every night/morning.... memory violation

2015-02-23 Thread Marcus
It doesn't really sound like an agent problem, but some other root
problem that is causing issues for the agent. Perhaps it is specific
to the host simply because there is a particular VM that always runs
on that host and the VM itself is triggering the issue. Perhaps a
heavy logrotate or cron job on the vm causes issues for librados. Just
grasping at straws here. From the output provided it does seem that
the libvirt bindings that include ceph code are terminating the agent
execution.  My guess is that if you focus on "why this host" as
opposed to "what's going on", you'll find the answer to both. Sorry, I
know that's not much help.
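
One way to chase the "why this host, why this time of day" angle is to pull the crash timestamps out of the audit log and line them up against cron and logrotate schedules. The sketch below parses ANOM_ABEND records like the one quoted further down in this thread. The names parse_abend and Abend are made up for illustration, and the field layout is assumed from that single sample line. For what it's worth, sig=6 is SIGABRT on Linux, which fits a process aborting on a failed assertion.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple, Optional

# Matches the audit record header, e.g. msg=audit(1424321463.930:430678)
_HEADER = re.compile(r"type=ANOM_ABEND msg=audit\((\d+)\.\d+:\d+\)")
_PID = re.compile(r"\bpid=(\d+)")
_COMM = re.compile(r'\bcomm="?([^"\s]+)"?')
_SIG = re.compile(r"\bsig=(\d+)")


class Abend(NamedTuple):
    when: datetime
    pid: int
    comm: str
    sig: int


def parse_abend(line: str) -> Optional[Abend]:
    """Parse one ANOM_ABEND audit line; return None for any other line."""
    header = _HEADER.search(line)
    pid, comm, sig = _PID.search(line), _COMM.search(line), _SIG.search(line)
    if not (header and pid and comm and sig):
        return None
    # The audit timestamp is seconds since the epoch; keep it in UTC.
    when = datetime.fromtimestamp(int(header.group(1)), tz=timezone.utc)
    return Abend(when, int(pid.group(1)), comm.group(1), int(sig.group(1)))


if __name__ == "__main__":
    sample = ("type=ANOM_ABEND msg=audit(1424321463.930:430678): auid=0 uid=0 gid=0 "
              "ses=68891 pid=10831 comm=jsvc reason=memory violation sig=6")
    event = parse_abend(sample)
    print(event.when.isoformat(), event.comm, "signal", event.sig)
```

Running it over the whole log and looking at the hour and minute of each hit should show quickly whether the crashes line up with a scheduled job.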

On Mon, Feb 23, 2015 at 7:29 AM, Andrija Panic andrija.pa...@gmail.com wrote:
 Anybody?, before I start to cry :(

 On 21 February 2015 at 21:18, Andrija Panic andrija.pa...@gmail.com wrote:

 HI Simon,

 selinux is disabled, I have just double checked.

 BTW, this is what I can see in the cloudstack-agent.err log - seems like
 some CEPH related issues, but not sure why the agent would die...
 If I recall correctly, this might have been happening since the CEPH update
 from 0.80.3? to 0.87 - and this seems like some crash in librados


 libust[1907/2046]: Warning: HOME environment variable not set. Disabling
 LTTng-UST per-user tracing. (in setup_local_apps() at lttng-ust-comm.c:305)
 libvirt:  error : name in virDomainLookupByName must not be NULL
 libvirt:  error : name in virDomainLookupByName must not be NULL
 libvirt:  error : name in virDomainLookupByName must not be NULL
 libvirt:  error : name in virDomainLookupByName must not be NULL
 libvirt: Storage Driver error : failed to remove volume
 'cloudstack/bd751250-de35-4d2e-a4e3-3ee4b636c2a7': Device or resource busy
 ./log/SubsystemMap.h: In function 'bool
 ceph::log::SubsystemMap::should_gather(unsigned int, int)' thread
 7f04427fc700 time 2015-02-21 06:39:38.839210
 ./log/SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())
  ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
  1: (()+0x1fe223) [0x7f060c932223]
  2: (ObjectCacher::flusher_entry()+0x155) [0x7f060c9866e5]
  3: (ObjectCacher::FlusherThread::entry()+0xd) [0x7f060c9976cd]
  4: (()+0x79d1) [0x7f06605ee9d1]
  5: (clone()+0x6d) [0x7f066033bb5d]
  NOTE: a copy of the executable, or `objdump -rdS executable` is needed
 to interpret this.
 terminate called after throwing an instance of 'ceph::FailedAssertion'
 21/02/2015 06:39:38 1905 jsvc.exec error: Service did not exit cleanly

 On 20 February 2015 at 21:56, Simon Weller swel...@ena.com wrote:

 Andrija,

 What is SELinux set to on this host?


 - SI


 
 From: Andrija Panic andrija.pa...@gmail.com
 Sent: Friday, February 20, 2015 6:06 AM
 To: d...@cloudstack.apache.org; users@cloudstack.apache.org
 Subject: Agent dies every night/morning memory violation

 Hi,

 I have a crazy agent on one of the hosts that is being killed each morning,
 and I found this in /var/log/audit.log:

 type=ANOM_ABEND msg=audit(1424321463.930:430678): auid=0 uid=0 gid=0
 ses=68891 pid=10831 comm=jsvc reason=memory violation sig=6

 I don't remember changing anything on the system, but this keeps happening
 each morning around the same time, 5.20am-5.40am.

 I'm wondering what the heck is happening; any suggestions on where to
 troubleshoot?
 Will check logs in details anyway...

 --

 Andrija Panić






Re: Network QoS (not bandwidth limiting)

2015-02-21 Thread Marcus
The points raised are certainly valid from an enterprise networking
standpoint, and don't fall on deaf ears, but we should keep things in
perspective. To provide the aforementioned features would be
relatively uncharted territory in the cloud orchestration world (at
least not considering vendor provided networking solutions that only
handle the network part of the equation), so while it would be good to
aspire to providing those things, it should be no surprise that the
platform works that way and lacks such features.

For further perspective, keep in mind that cloud orchestration in
general has been a pitch to software developers and management for
easy infrastructure. Cloud consumers are end users, web developers,
application developers, so again it should be no surprise that the
product provides features that cater to that, rather than providing
the bells and whistles that a network admin would want to see in their
infrastructure. CloudStack was never built to be pitched to network
teams as a cure for managing their infra deployments, the only cloud
product providers doing that are network vendors who have cloud
networking products. This is of course why a VPC needs IPs defined, as
applications care more about how to serve up a web page than network
engineering and managing distinct layer 2 and 3, so the whole network
stack is sandwiched into a simple orchestration mechanism that gets
the application what it needs.

In designing and deploying cloud, the most common complaint I see from
people who are infrastructure maintainers is "why can't I just build
the infrastructure the way I want and then have it orchestrated?".
Unfortunately, we can't just automate and integrate with anyone's pet
design. CloudStack supports many novel and custom network designs
simply by allowing the option of letting you manage the network
hardware and being hands-off (shared/public networks), while also
being pluggable to allow vendors to take over whatever features
they wish. I've seen some pretty advanced overlay networking provided
through third party plugins to CloudStack that take over all network
functionality and provide more.

What's really being asked for here is for CloudStack to provide and
maintain a fully fledged and featured router distribution in its
provided virtual router. It's an admirable project to have if we can
get support for it. My guess is there's a bit of a disconnect in
interest though, because many (but not all) enterprises who want
CloudStack for infrastructure automation are skeptical about a VM as
software router and prefer to bring in aforementioned enterprise
vendors who have their own plugins. People who provide cloud hosting
and other services tend to use the routers, but their interest in
enterprise level routing and redundancy varies greatly, and their
customers are designing their apps to be resilient to infrastructure
loss (e.g. most AWS customers). That's of course not entirely the
whole truth, as is evidenced by the work we are seeing on redundant
routers, but I do believe that's why we haven't seen these things from
the beginning. They just haven't been all that important to the target
customers, even though infrastructure engineers are used to providing
them.

So now comes my philosophy. In the end, I think the great thing about
open source communities is that if there's the right level of
interest, it will happen.  I'm the kind of person who feels a pang of
stress at the idea that something I work on can't be all things to all
people, but after building a hosting business over the last few years
I've begun to realize that it's really only practical to try to be
good for a subset of the market and focus on that. You'll never please
everyone, there are limits to what you can accomplish, and sometimes
it's OK to just concede that your product is not going to work for
everyone. If you don't, you'll spread yourself too thin and fail
everyone. In order to make something great you have to have a limit on
your scope. That's not to say you don't listen to your customers, but
you sometimes have to make hard choices on who to listen to and who to
upset.

None of this should be taken as a discouragement to the topics at
hand, but again as someone who takes it personally when I don't deliver,
I wanted to provide some follow up to address the rant and try to
provide perspective on why the things are the way they are.

On Sat, Feb 21, 2015 at 1:58 PM, Somesh Naidu somesh.na...@citrix.com wrote:
 Adrian,

 Rant or not, I believe you have raised a valid point that reflects a certain
 group of people's requirements.

 Based on your requirement, I believe you are looking for something like 
 Vyatta.

 Regards,
 Somesh

 -Original Message-
 From: Adrian Lewis [mailto:adr...@alsiconsulting.co.uk]
 Sent: Friday, February 20, 2015 8:50 PM
 To: users@cloudstack.apache.org
 Subject: RE: Network QoS (not bandwidth limiting)

 Tempted to suggest some sort of special interest group where networking
 people 

Re: [VOTE] Apache CloudStack 4.5.0 RC1

2015-01-13 Thread Marcus
+1, ran some of the smoke tests that cover basic deployments of vm,
vpc, and several storage types.

On Mon, Jan 12, 2015 at 11:36 PM, Rohit Yadav rohit.ya...@shapeblue.com wrote:
 (+ users)

 Hi everyone,

 David has started the voting process for 4.5.0 candidate, please help test 
 this candidate.
 In case you’re unable to build from source, you may use following repository 
 built from SHA 8db3cbd4ff62b17a8b496026b68cf60ee0c76740:

 DEB: http://packages.bhaisaab.org/cloudstack/testing/debian/4.5/
 RPM: http://packages.bhaisaab.org/cloudstack/testing/centos/4.5/
 SystemVM Templates: http://packages.shapeblue.com/systemvmtemplate/4.5/4.5.0

 On 13-Jan-2015, at 4:46 am, David Nalley da...@gnsa.us wrote:

 Hi folks,

 I've created a 4.5.0 release candidate, with the following artifacts
 up for a vote:

 Git Branch and Commit SHA:
 https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;a=tree;h=refs/heads/4.5-RC20150112T2256;hb=4.5-RC20150112T2256
 Commit: 8db3cbd4ff62b17a8b496026b68cf60ee0c76740

 Source release (checksums and signatures are available at the same
 location):
 https://dist.apache.org/repos/dist/dev/cloudstack/4.5.0-rc1/

 PGP release keys (signed using 6FE50F1C):
 https://dist.apache.org/repos/dist/release/cloudstack/KEYS

 Vote will be open for at least 72 hours.

 For sanity in tallying the vote, can PMC members please be sure to
 indicate (binding) with their vote?

 [ ] +1  approve
 [ ] +0  no opinion
 [ ] -1  disapprove (and reason why)

 Regards,
 Rohit Yadav
 Software Architect, ShapeBlue
 M. +91 88 262 30892 | rohit.ya...@shapeblue.com
 Blog: bhaisaab.org | Twitter: @_bhaisaab





Re: Add Storage Network

2014-12-18 Thread Marcus
My guess is that you're right. I haven't seen anything that ties the
'physical network' to a specific NIC, thus if you register the traffic type
it should work so long as the hosts are able to route to that subnet on
*any* NIC. I'm not sure what hypervisor you're using, but for KVM at least,
you've also got to think about the bridge that SSVMs will want to attach
to. This is determined by bridge name (traffic label), and I believe if it
is set up in advance on the hosts it won't try to bridge to the primary
interface.
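
A quick way to confirm on a KVM host that a bridge matching the traffic label is already in place is to look for it in sysfs. This is a minimal sketch, assuming a Linux host; the label "cloudbr1" is a hypothetical example, so substitute whatever label is configured for the storage traffic type. Linux exposes a "bridge" subdirectory under /sys/class/net/<device> only for bridge devices.

```python
import os


def bridge_exists(name: str, sysfs_net: str = "/sys/class/net") -> bool:
    """True if `name` is a Linux bridge device on this host.

    Linux exposes a `bridge` subdirectory under /sys/class/net/<dev>
    for bridge devices only.
    """
    return os.path.isdir(os.path.join(sysfs_net, name, "bridge"))


if __name__ == "__main__":
    label = "cloudbr1"  # hypothetical traffic label / bridge name
    state = "present" if bridge_exists(label) else "missing"
    print(f"bridge {label}: {state}")
```

The sysfs_net parameter exists only so the check can be pointed at a different directory for testing.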

On Thu, Dec 18, 2014 at 8:06 AM, Logan Barfield lbarfi...@tqhosting.com
wrote:

 During initial deployment we went with a simpler network design by letting
 secondary storage traffic run over the management network.  We would like
 to offload secondary storage to a separate network, and are trying to
 figure out the best method of doing so.

 Based on the recommendations in the documentation we would need to add a
 new Physical Network, and add the Storage traffic type to that network.

 Could we instead just add a new traffic type to the existing network, just
 set up the specified VLAN + Storage subnet on a separate NIC on the
 hypervisor?

 For Primary Storage the NIC used is just determined by the hypervisor's
 routing table, so wouldn't it work the same for the secondary storage
 network, or is there a reason it should be added as a separate Physical
 Network in CloudStack?



Re: Port forwarding (web) - doesnt show real client IP

2014-12-08 Thread Marcus
It sounds like some iptables rules got broken at some point for the static
NAT, and since there's still a catch-all SNAT for outbound it gets caught
by that and still keeps working, but is broken in a subtle way that goes
unnoticed.
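
To illustrate why a lost static NAT rule can go unnoticed, here is a toy first-match model of source NAT. It is not the virtual router's real iptables ruleset; the function name snat_source, the rule layout, and all addresses (taken from documentation ranges) are assumptions made for the example.

```python
import ipaddress
from typing import List, Tuple

# (source network to match, public IP to rewrite the source to)
SnatRule = Tuple[str, str]


def snat_source(private_ip: str, rules: List[SnatRule]) -> str:
    """Return the public source IP chosen by the first matching rule."""
    addr = ipaddress.ip_address(private_ip)
    for network, public_ip in rules:
        if addr in ipaddress.ip_network(network):
            return public_ip
    return private_ip  # no rule matched: source left untouched


if __name__ == "__main__":
    # All addresses below are documentation/example values, not from the thread.
    catch_all = ("10.1.1.0/24", "203.0.113.1")      # main public IP of the router
    static_nat = ("10.1.1.50/32", "203.0.113.50")   # one VM's static NAT IP

    healthy = [static_nat, catch_all]
    broken = [catch_all]                             # static NAT rule lost

    print(snat_source("10.1.1.50", healthy))  # VM's own public IP
    print(snat_source("10.1.1.50", broken))   # falls through to the router's IP
```

In the broken case traffic still flows, which is the point: connectivity keeps working while the visible source address silently changes.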

On Mon, Dec 8, 2014 at 2:55 PM, Andrija Panic andrija.pa...@gmail.com
wrote:

 And just to spice things up a little bit, ALL remote connections appear to
 come from the main Public IP of the VPC VR.
 So we can not block some stuff on the firewall on the VM (while doing port
 forwarding) because all connections appear to come from the main Public IP of
 the VPC VR.

 This is terrible design/bug - can we change this ?
 I'm on the ACS 4.3 currently...

 cheers

 On 8 December 2014 at 23:42, Andrija Panic andrija.pa...@gmail.com
 wrote:

  Hi,
 
  when doing port forwarding on VPC VR - port 80 - when some client accesses
  the web site - only the main Public IP of the VPC is logged in the apache
  access logs as remote IP.
 
  Why is this the behaviour - and can this be changed ?
  My understanding is that this is kind of a bug (unless needed for some other
  reasons) - port forwarding is DNAT in essence, so only the destination
  IP/port should be changed, not proxied all the way, as seems to be the
  case here...
 
  I read in another mailing list thread - same behavior for the loadbalancer...
 
  Any suggestion ?
 
  Thanks,
 
  --
 
  Andrija Panić
 



 --

 Andrija Panić



Re: Automatic KVM host reboot on Primary Storage failure

2014-11-14 Thread Marcus
It is there (I believe) because cloudstack is acting as a cluster manager
for KVM. It is using NFS to determine if it is 'alive' on the network, and
if it is not, it reboots itself to avoid having a split brain scenario
where VMs start coming up on other hosts when they are already running on
this host.  It generally works, if the problem is the host, but as you
point out, there's a situation where the problem can be the NFS server.
This is fairly rare for enterprise NFS with high availability, but there are a
fair number of people who have NFS on servers that are relatively low
availability (non-clustered, or get overloaded and unresponsive).

There's plenty of room for improvement in that script, I agree the original
implementation seems fairly rudimentary, but we have to be careful in
thinking about all scenarios and make sure there's no chance of split
brain. In the meantime, one could also partition the resources such that
you have more clusters and only one primary storage per cluster (or
something else, like storage/host tags to guarantee each host only uses one
NFS).
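
The effect of that partitioning can be shown with a small model. This is only arithmetic on a made-up layout, not the logic of kvmheartbeat.sh: it assumes a host fences itself when any pool it uses stops responding, and the pool and host names are hypothetical.

```python
from typing import Dict, List, Set


def hosts_rebooting(host_pools: Dict[str, Set[str]], failed_pool: str) -> List[str]:
    """Hosts that would fence themselves if any pool they use stops responding."""
    return sorted(h for h, pools in host_pools.items() if failed_pool in pools)


if __name__ == "__main__":
    pools = ["nfs1", "nfs2", "nfs3"]           # hypothetical pool names
    hosts = [f"kvm{i}" for i in range(1, 7)]   # hypothetical host names

    # Layout 1: every host uses all three pools.
    shared = {h: set(pools) for h in hosts}
    # Layout 2: hosts partitioned so each uses exactly one pool.
    partitioned = {h: {pools[i % 3]} for i, h in enumerate(hosts)}

    for name, layout in (("shared", shared), ("partitioned", partitioned)):
        down = hosts_rebooting(layout, "nfs1")
        print(f"{name}: {len(down)}/{len(hosts)} hosts reboot")
```

With every host on every pool, one failed NFS server takes out all hosts; with one pool per host group, only the hosts on the failed pool are affected.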

On Fri, Nov 14, 2014 at 8:07 AM, Andrija Panic andrija.pa...@gmail.com
wrote:

 Hi guys,

 I'm wondering why there is a check inside
 /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/kvmheartbeat.sh ?

 I understand that the KVM host checks availability of Primary Storage, and
 reboots itself if it can't write to storage.

 But if we have, say, 3 NFS in a cluster and a lot of KVM hosts, then 1 primary
 storage going down (server crashing or whatever) will bring probably 99%
 of KVM hosts down for reboot as well?
 So instead of losing uptime for 1/3 of my VMs (1 storage out of 3), I
 lose uptime for 99%-100% of my VMs?

 I manually edited this script to disable reboots - but why is it there in
 any case?
 It doesn't make sense to me - unless I'm missing a point (probably)...

 Thanks,
 --

 Andrija Panić



Re: CloudStack Ports

2014-10-15 Thread Marcus
From outside, (say from hotel, through home router, to mgmt server) you
need access to the web ui and for the web ui to have access to the api
server. That would just be 8080 (UI) and 8096 (API), I believe. You wouldn't
need libvirt and the others unless you are stringing mgmt servers and hosts
across the link.

On Wed, Oct 15, 2014 at 10:43 AM, Mo m...@daoenix.com wrote:

 Hello,

 I’ve setup Cloudstack on my home server. However, it works without issues
 locally. When I attempt to pull up console outside, it times out. I have of
 course enabled ports for SSH / UI, so I can setup instances, but I am not
 sure what else I need to permit through my router to allow all the
 necessary ports to be opened.

 According to the site, I have done the following:

 22 (SSH)
 1798
 16509 (libvirt)
 5900 - 6100 (VNC consoles)
 49152 - 49216 (libvirt live migration)
 Anything else?

 // Mo


Re: CloudStack Ports

2014-10-15 Thread Marcus
Ah, I see. I believe you'd need access to whatever IP the consoleproxy vm
is listening on. I don't actually use the console proxy vm for my purposes,
but I don't think you need to open the vnc console or libvirt ports to the
outside. If the console proxy works internally, you probably just don't
have access to the console proxy vm's IP when it opens the link to redirect
you. Are you NAT'ing to the mgmt server from outside? I think you'd need
the console proxy vm to be publicly reachable, and cloudstack seems to be
assigning it an RFC 1918 address (192.168), which you'll never be able to
reach from the outside. Your best bet might be to set up a remote access
VPN in your home if you want to use the system from outside, such that you
are treated like you are inside. Something like OpenVPN.
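
For anyone unsure whether an address is one of the private ranges, the check is easy to do with Python's standard ipaddress module. A small sketch; the helper name is_rfc1918 is made up, and 192.168.1.43 is the console proxy address mentioned in the quoted message below.

```python
import ipaddress

# The three private ranges defined by RFC 1918.
RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def is_rfc1918(ip: str) -> bool:
    """True if `ip` falls in one of the three RFC 1918 private ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in RFC1918)


if __name__ == "__main__":
    for ip in ("192.168.1.43", "8.8.8.8"):
        kind = "private (not routable from the internet)" if is_rfc1918(ip) else "public"
        print(ip, kind)
```

The ranges are listed explicitly rather than using the module's is_private attribute, which also covers loopback, link-local and other reserved blocks.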

On Wed, Oct 15, 2014 at 11:02 AM, Mo m...@daoenix.com wrote:

 Would this be on the Console VM, Or from the node? Need to know which
 local IP I need to redirect it to.

 I see in the log, it’s coming from 192.168.1.43 (which is console vm) so I
 suspect there?


 --
 Mo
 Sent with Airmail

 On October 15, 2014 at 1:00:12 PM, Marcus (shadow...@gmail.com) wrote:

 From outside, (say from hotel, through home router, to mgmt server) you
 need access to the web ui and for the web ui to have access to the api
 server. That would just be 8080 (UI) and 8096(API), I believe. you
 wouldn't
 need libvirt and the others unless you are stringing mgmt servers and
 hosts
 across the link.

 On Wed, Oct 15, 2014 at 10:43 AM, Mo m...@daoenix.com wrote:

  Hello,
 
  I’ve setup Cloudstack on my home server. However, it works without
 issues
  locally. When I attempt to pull up console outside, it times out. I have
 of
  course enabled ports for SSH / UI, so I can setup instances, but I am
 not
  sure what else I need to permit through my router to allow all the
  necessary ports to be opened.
 
  According to the site, I have done the following:
 
  22 (SSH)
  1798
  16509 (libvirt)
  5900 - 6100 (VNC consoles)
  49152 - 49216 (libvirt live migration)
  Anything else?
 
  // Mo




Re: Upgrade from 4.1.1 to 4.3.0 ( KVM, Traffic labels, Adv. VLAN ) VR's bug

2014-04-20 Thread Marcus
No idea, but have you verified that the vm is running the new system
vm template? What happens if you destroy the router and let it
recreate?

On Sun, Apr 20, 2014 at 6:20 PM, Serg Senko kernc...@gmail.com wrote:
 Hi

 After upgrading and restarting the system VMs,
 all VRs started with some bad network configuration; egress rules stopped
 working,
 and also some static NAT rules.


 Here is 'ip addr show' from one of the VRs:

 root@r-256-VM:~# ip addr show
 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
     inet 127.0.0.1/8 scope host lo
     inet6 ::1/128 scope host
        valid_lft forever preferred_lft forever
 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
     link/ether 02:00:6b:16:00:09 brd ff:ff:ff:ff:ff:ff
     inet 10.1.1.1/24 brd 10.1.1.255 scope global eth0
     inet6 fe80::6bff:fe16:9/64 scope link
        valid_lft forever preferred_lft forever
 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
     link/ether 0e:00:a9:fe:01:38 brd ff:ff:ff:ff:ff:ff
     inet 169.254.1.56/16 brd 169.254.255.255 scope global eth1
     inet6 fe80::c00:a9ff:fefe:138/64 scope link
        valid_lft forever preferred_lft forever
 4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
     link/ether 06:06:ec:00:00:0e brd ff:ff:ff:ff:ff:ff
     inet XXX.XXX.XXX.219/26 brd 46.165.231.255 scope global eth2
     inet6 fe80::406:ecff:fe00:e/64 scope link
        valid_lft forever preferred_lft forever
 5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
     link/ether 06:81:44:00:00:0e brd ff:ff:ff:ff:ff:ff
     inet XXX.XXX.XXX.219/26 brd 46.165.231.255 scope global eth3
     inet XXX.XXX.XXX.230/26 brd 46.165.231.255 scope global secondary eth3
     inet XXX.XXX.XXX.228/26 brd 46.165.231.255 scope global secondary eth3
     inet XXX.XXX.XXX.209/26 brd 46.165.231.255 scope global secondary eth3
     inet XXX.XXX.XXX.247/26 brd 46.165.231.255 scope global secondary eth3
     inet XXX.XXX.XXX.227/26 brd 46.165.231.255 scope global secondary eth3
     inet6 fe80::481:44ff:fe00:e/64 scope link
        valid_lft forever preferred_lft forever
 6: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
     link/ether 06:e5:36:00:00:0e brd ff:ff:ff:ff:ff:ff
     inet XXX.XXX.XXX.219/26 brd 46.165.231.255 scope global eth4
     inet XXX.XXX.XXX.247/26 brd 46.165.231.255 scope global secondary eth4
     inet XXX.XXX.XXX.209/26 brd 46.165.231.255 scope global secondary eth4
     inet XXX.XXX.XXX.227/26 brd 46.165.231.255 scope global secondary eth4
     inet XXX.XXX.XXX.230/26 brd 46.165.231.255 scope global secondary eth4
     inet6 fe80::4e5:36ff:fe00:e/64 scope link
        valid_lft forever preferred_lft forever
 7: eth5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
     link/ether 06:6f:3a:00:00:0e brd ff:ff:ff:ff:ff:ff
     inet XXX.XXX.XXX.219/26 brd 46.165.231.255 scope global eth5
     inet XXX.XXX.XXX.228/26 brd 46.165.231.255 scope global secondary eth5
     inet XXX.XXX.XXX.227/26 brd 46.165.231.255 scope global secondary eth5
     inet XXX.XXX.XXX.209/26 brd 46.165.231.255 scope global secondary eth5
     inet XXX.XXX.XXX.247/26 brd 46.165.231.255 scope global secondary eth5
     inet XXX.XXX.XXX.230/26 brd 46.165.231.255 scope global secondary eth5
     inet6 fe80::46f:3aff:fe00:e/64 scope link
        valid_lft forever preferred_lft forever
 8: eth6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
     link/ether 06:b0:30:00:00:0e brd ff:ff:ff:ff:ff:ff
     inet XXX.XXX.XXX.219/26 brd 46.165.231.255 scope global eth6
     inet XXX.XXX.XXX.209/26 brd 46.165.231.255 scope global secondary eth6
     inet XXX.XXX.XXX.247/26 brd 46.165.231.255 scope global secondary eth6
     inet XXX.XXX.XXX.230/26 brd 46.165.231.255 scope global secondary eth6
     inet XXX.XXX.XXX.227/26 brd 46.165.231.255 scope global secondary eth6
     inet6 fe80::4b0:30ff:fe00:e/64 scope link
        valid_lft forever preferred_lft forever
 9: eth7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
     link/ether 06:26:b4:00:00:0e brd ff:ff:ff:ff:ff:ff
     inet XXX.XXX.XXX.219/26 brd 46.165.231.255 scope global eth7
     inet XXX.XXX.XXX.247/26 brd 46.165.231.255 scope global secondary eth7
     inet XXX.XXX.XXX.228/26 brd 46.165.231.255 scope global secondary eth7
     inet XXX.XXX.XXX.230/26 brd 46.165.231.255 scope global secondary eth7
     inet XXX.XXX.XXX.209/26 brd 46.165.231.255 scope global secondary eth7
     inet XXX.XXX.XXX.227/26 brd 46.165.231.255 scope global secondary eth7
     inet6 fe80::426:b4ff:fe00:e/64 scope link
        valid_lft forever preferred_lft forever


 --
 ttyv0 /usr/libexec/gmail Pc  webcons on secure


Re: unable to connect to /var/lib/libvirt/qemu/v-1-VM.agent

2014-04-20 Thread Marcus
No, it has nothing to do with ssh or libvirt daemon. It's the literal
unix socket that is created for virtio-serial communication when the
qemu process starts. The question is why the system is refusing access
to the socket. I assume this is being attempted as root.
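
To tell apart the possible reasons a connect to that socket fails, something like the following probe can be run as root on the host while the system VM is up. It is a rough sketch with a made-up function name, probe_unix_socket, and the path is simply the one from the error message in this thread.

```python
import errno
import socket


def probe_unix_socket(path: str, timeout: float = 2.0) -> str:
    """Try to connect to a unix stream socket and describe what happened."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return "ok: something is listening"
    except FileNotFoundError:
        return "missing: socket file does not exist"
    except PermissionError:
        return "denied: no permission to access the socket"
    except ConnectionRefusedError:
        return "refused: socket file exists but nothing is listening"
    except OSError as exc:
        return f"error: {errno.errorcode.get(exc.errno, exc.errno)}"
    finally:
        sock.close()


if __name__ == "__main__":
    print(probe_unix_socket("/var/lib/libvirt/qemu/v-1-VM.agent"))
```

A "refused" result with the socket file still on disk usually means the process that created it is no longer listening, for example because the qemu process has exited.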

On Sat, Apr 19, 2014 at 9:58 AM, Nux! n...@li.nux.ro wrote:
 On 19.04.2014 15:24, Giri Prasad wrote:


 # grep listen_ /etc/libvirt/libvirtd.conf
 listen_tls=0
 listen_tcp=1
 #listen_addr = 192.XX.XX.X
 listen_addr = 0.0.0.0

 #
 /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl
 -n v-1-VM -p

 %template=domP%type=consoleproxy%host=192.XXX.XX.5%port=8250%name=v-1-VM%zone=1%pod=1%guid=Proxy.1%proxy_vm=1%disable_rp_filter=true%eth2ip=192.XXX.XX.173%eth2mask=255.255.255.0%gateway=192.XXX.XX.1%eth0ip=169.254.0.173%eth0mask=255.255.0.0%eth1ip=192.XXX.XX.166%eth1mask=255.255.255.0%mgmtcidr=192.XXX.XX.0/24%localgw=192.XXX.XX.1%internaldns1=192.XXX.XX.1%dns1=192.XXX.XX.1
 .
 ERROR: unable to connect to /var/lib/libvirt/qemu/v-1-VM.agent -
 Connection refused


 Do you have -l or --listen as LIBVIRTD_ARGS in /etc/sysconfig/libvirtd?

 (kind of stabbing in the dark)


 --
 Sent from the Delta quadrant using Borg technology!

 Nux!
 www.nux.ro


Re: unable to connect to /var/lib/libvirt/qemu/v-1-VM.agent

2014-04-20 Thread Marcus
You may want to look in the qemu log of the vm to see if there's
something deeper going on, perhaps the qemu process is not fully
starting due to some other issue. /var/log/libvirt/qemu/v-1-VM.log, or
something like that.

On Sun, Apr 20, 2014 at 11:22 PM, Marcus shadow...@gmail.com wrote:
 No, it has nothing to do with ssh or libvirt daemon. It's the literal
 unix socket that is created for virtio-serial communication when the
 qemu process starts. The question is why the system is refusing access
 to the socket. I assume this is being attempted as root.

 On Sat, Apr 19, 2014 at 9:58 AM, Nux! n...@li.nux.ro wrote:
 On 19.04.2014 15:24, Giri Prasad wrote:


 # grep listen_ /etc/libvirt/libvirtd.conf
 listen_tls=0
 listen_tcp=1
 #listen_addr = 192.XX.XX.X
 listen_addr = 0.0.0.0

 #
 /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl
 -n v-1-VM -p

 %template=domP%type=consoleproxy%host=192.XXX.XX.5%port=8250%name=v-1-VM%zone=1%pod=1%guid=Proxy.1%proxy_vm=1%disable_rp_filter=true%eth2ip=192.XXX.XX.173%eth2mask=255.255.255.0%gateway=192.XXX.XX.1%eth0ip=169.254.0.173%eth0mask=255.255.0.0%eth1ip=192.XXX.XX.166%eth1mask=255.255.255.0%mgmtcidr=192.XXX.XX.0/24%localgw=192.XXX.XX.1%internaldns1=192.XXX.XX.1%dns1=192.XXX.XX.1
 .
 ERROR: unable to connect to /var/lib/libvirt/qemu/v-1-VM.agent -
 Connection refused


 Do you have -l or --listen as LIBVIRTD_ARGS in /etc/sysconfig/libvirtd?

 (kind of stabbing in the dark)


 --
 Sent from the Delta quadrant using Borg technology!

 Nux!
 www.nux.ro


Re: unable to connect to /var/lib/libvirt/qemu/v-1-VM.agent

2014-04-20 Thread Marcus
Sorry, actually I see the 'connection refused' is just your own test
after the fact. By that time the vm may be shut down, so connection
refused would make sense.

What happens if you do this:

'virsh dumpxml v-1-VM > /tmp/v-1-VM.xml' while it is running
stop the cloudstack agent
'virsh destroy v-1-VM'
'virsh create /tmp/v-1-VM.xml'
Then try connecting to that VM via VNC to watch it boot up, or running
that command manually, repeatedly? Does it time out?

In the end this may not mean much, because in CentOS 6.x that command
is retried over and over while the system vm is coming up anyway (in
other words, some failures are expected). It could be related, but it
could also be that the system vm is failing to come up for any other
reason, and this is just the thing you noticed.

On Sun, Apr 20, 2014 at 11:25 PM, Marcus shadow...@gmail.com wrote:
 You may want to look in the qemu log of the vm to see if there's
 something deeper going on, perhaps the qemu process is not fully
 starting due to some other issue. /var/log/libvirt/qemu/v-1-VM.log, or
 something like that.

 On Sun, Apr 20, 2014 at 11:22 PM, Marcus shadow...@gmail.com wrote:
 No, it has nothing to do with ssh or libvirt daemon. It's the literal
 unix socket that is created for virtio-serial communication when the
 qemu process starts. The question is why the system is refusing access
 to the socket. I assume this is being attempted as root.

 On Sat, Apr 19, 2014 at 9:58 AM, Nux! n...@li.nux.ro wrote:
 On 19.04.2014 15:24, Giri Prasad wrote:


 # grep listen_ /etc/libvirt/libvirtd.conf
 listen_tls=0
 listen_tcp=1
 #listen_addr = 192.XX.XX.X
 listen_addr = 0.0.0.0

 #
 /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl
 -n v-1-VM -p

 %template=domP%type=consoleproxy%host=192.XXX.XX.5%port=8250%name=v-1-VM%zone=1%pod=1%guid=Proxy.1%proxy_vm=1%disable_rp_filter=true%eth2ip=192.XXX.XX.173%eth2mask=255.255.255.0%gateway=192.XXX.XX.1%eth0ip=169.254.0.173%eth0mask=255.255.0.0%eth1ip=192.XXX.XX.166%eth1mask=255.255.255.0%mgmtcidr=192.XXX.XX.0/24%localgw=192.XXX.XX.1%internaldns1=192.XXX.XX.1%dns1=192.XXX.XX.1
 .
 ERROR: unable to connect to /var/lib/libvirt/qemu/v-1-VM.agent -
 Connection refused


 Do you have -l or --listen as LIBVIRTD_ARGS in /etc/sysconfig/libvirtd?

 (kind of stabbing in the dark)


 --
 Sent from the Delta quadrant using Borg technology!

 Nux!
 www.nux.ro


Re: Live migration failed to newly provisioned KVM host

2014-04-18 Thread Marcus
Yes, it looks as though the two machines are running different
versions of qemu/libvirt, as the destination doesn't support the
machine type that the VM has defined in its XML on the source host.
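
A quick way to confirm this is to compare the machine type in the VM's domain XML (the machine attribute on the os/type element, as shown by 'virsh dumpxml') with the list the destination prints. The sketch below does that comparison; the function names are made up, and the XML snippet with machine='pc-1.2' is a hypothetical example since the thread does not show the real XML. The supported list is a shortened copy of the one in the quoted error.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Set


def machine_type(domain_xml: str) -> Optional[str]:
    """Return the machine attribute of <os><type> in a libvirt domain XML."""
    node = ET.fromstring(domain_xml).find("./os/type")
    return None if node is None else node.get("machine")


def supported_machines(listing: str) -> Set[str]:
    """Collect machine names from a 'Supported machines are:' style listing."""
    names = set()
    for line in listing.splitlines():
        parts = line.split()
        # The first column is the machine name; skip the header line.
        if parts and parts[0] != "Supported":
            names.add(parts[0])
    return names


if __name__ == "__main__":
    # Hypothetical source-host XML; the thread does not show the real one.
    xml = ("<domain type='kvm'><os><type arch='x86_64' machine='pc-1.2'>"
           "hvm</type></os></domain>")
    dest = supported_machines("""Supported machines are:
pc         Standard PC (alias of pc-1.0)
pc-1.0     Standard PC (default)
pc-0.14    Standard PC
isapc      ISA-only PC""")
    wanted = machine_type(xml)
    print(wanted, "supported" if wanted in dest else "NOT supported on destination")
```

If the source VM's machine type is missing from the destination's list, the two hosts need matching qemu versions before live migration will work.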

On Fri, Apr 18, 2014 at 3:43 PM, Nux! n...@li.nux.ro wrote:
 On 18.04.2014 19:45, Indra Pramana wrote:

 Unable to migrate due to internal error Process exited while reading
 console log output: Supported machines are:
 pc         Standard PC (alias of pc-1.0)
 pc-1.0     Standard PC (default)
 pc-0.14    Standard PC
 pc-0.13    Standard PC
 pc-0.12    Standard PC
 pc-0.11    Standard PC, qemu 0.11
 pc-0.10    Standard PC, qemu 0.10
 isapc      ISA-only PC


 What OS versions are you running and also what KVM versions, do you have
 anything extra enabled in the agent (eg a specific CPU type vs the generic
 KVM cpu)?
 Additionally do check
 https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Administration_Guide/sect-Live_migration_and_RHEL_compatibility.html#Live_Migration_Compatibility

 --
 Sent from the Delta quadrant using Borg technology!

 Nux!
 www.nux.ro


Re: KVM, Re-create VR failed

2014-04-14 Thread Marcus
Your agent snippet just looks like the system trying to stop the vm.
If a vm fails to start, it will also run through the stop routine to
clean up all of the prework, so the 'failed to stop' debug is all
normal. You may need to go above and look at why it failed to start.

On Fri, Apr 11, 2014 at 3:53 PM, Serg Senko kernc...@gmail.com wrote:
 Hi,

 Could this be a known bug?
 It may already be solved in newer releases of CS, but I need a
 workaround or fix before upgrading, or a reference to the bug ID.

 Environment:
 CS 4.1.1
 libvirt-1.0.1
 qemu-kvm-1.2
 NFS Storage ( as primary for VR's )
 Advanced VLAN isolation

 After a hypervisor host crashed, one of the VRs failed to start during
 failover.
 I force-stopped it through the UI, then removed the VR so that it would
 be re-created by a start/create VM API call.


 I then tried to start the instance associated with this network, but it
 failed because the newly created VR can't be started.

 cloudstack-agent:

 2014-04-11 07:05:34,546 DEBUG [kvm.resource.LibvirtComputingResource]
 (agentRequest-Handler-2:null) Failed to get dom xml:
 org.libvirt.LibvirtException: Domain not found: no domain with matching
 uuid '373ab4a9-cb8c-3275-a455-b9b4b963a983'

 2014-04-11 07:05:34,547 DEBUG [kvm.resource.LibvirtComputingResource]
 (agentRequest-Handler-2:null) Failed to get dom xml:
 org.libvirt.LibvirtException: Domain not found: no domain with matching
 uuid '373ab4a9-cb8c-3275-a455-b9b4b963a983'

 2014-04-11 07:05:34,548 DEBUG [kvm.resource.LibvirtComputingResource]
 (agentRequest-Handler-2:null) Failed to get dom xml:
 org.libvirt.LibvirtException: Domain not found: no domain with matching
 uuid '373ab4a9-cb8c-3275-a455-b9b4b963a983'

 2014-04-11 07:05:34,548 DEBUG [kvm.resource.LibvirtComputingResource]
 (agentRequest-Handler-2:null) Executing:
 /usr/share/cloudstack-common/scripts/vm/network/security_group.py
 destroy_network_rules_for_vm --vmname r-377-VM

 2014-04-11 07:05:34,663 DEBUG [kvm.resource.LibvirtComputingResource]
 (agentRequest-Handler-2:null) Execution is successful.

 2014-04-11 07:05:34,664 DEBUG [kvm.resource.LibvirtComputingResource]
 (agentRequest-Handler-2:null) Try to stop the vm at first

 2014-04-11 07:05:34,665 DEBUG [kvm.resource.LibvirtComputingResource]
 (agentRequest-Handler-2:null) Failed to stop VM :r-377-VM :

 org.libvirt.LibvirtException: Domain not found: no domain with matching
 uuid '373ab4a9-cb8c-3275-a455-b9b4b963a983'

 at org.libvirt.ErrorHandler.processError(Unknown Source)

 at org.libvirt.Connect.processError(Unknown Source)

 at org.libvirt.Connect.domainLookupByUUIDString(Unknown Source)

 at org.libvirt.Connect.domainLookupByUUID(Unknown Source)

 at
 com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.stopVM(LibvirtComputingResource.java:4021)

 at
 com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.stopVM(LibvirtComputingResource.java:3970)

 at
 com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.execute(LibvirtComputingResource.java:2894)

 at
 com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1032)

 at com.cloud.agent.Agent.processRequest(Agent.java:525)

 at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:852)

 at com.cloud.utils.nio.Task.run(Task.java:83)

 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)

 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)

 at java.lang.Thread.run(Thread.java:679)

 2014-04-11 07:05:34,666 DEBUG [kvm.resource.LibvirtComputingResource]
 (agentRequest-Handler-2:null) Failed to get vm status:Domain not found: no
 domain with matching uuid '373ab4a9-cb8c-3275-a455-b9b4b963a983'

 2014-04-11 07:05:34,667 DEBUG [kvm.resource.LibvirtComputingResource]
 (agentRequest-Handler-2:null) Failed to get vm status:Domain not found: no
 domain with matching uuid '373ab4a9-cb8c-3275-a455-b9b4b963a983'

 2014-04-11 07:05:34,668 DEBUG [kvm.resource.LibvirtComputingResource]
 (agentRequest-Handler-2:null) Failed to get vm status:Domain not found: no
 domain with matching uuid '373ab4a9-cb8c-3275-a455-b9b4b963a983'




 Management CS:

 2014-04-11 07:05:40,503 DEBUG
 [network.router.VirtualNetworkApplianceManagerImpl]
 (Job-Executor-114:job-3001) Found 5 ip(s) to apply as a part of domR
 VM[DomainRouter|r-377-VM] start.

 2014-04-11 07:05:40,528 DEBUG
 [network.router.VirtualNetworkApplianceManagerImpl]
 (Job-Executor-114:job-3001) Resending ipAssoc, port forwarding, load
 balancing rules as a part of Virtual router start

 2014-04-11 07:05:40,542 DEBUG
 [network.router.VirtualNetworkApplianceManagerImpl]
 (Job-Executor-114:job-3001) Found 1 firewall Egress rule(s) to apply as a
 part of domR VM[DomainRouter|r-377-VM] start.

 2014-04-11 07:05:40,581 ERROR [cloud.vm.VirtualMachineManagerImpl]
 (Job-Executor-114:job-3001) Failed to start instance
 VM[DomainRouter|r-377-VM]

 java.lang.NullPointerException

 at
 

Re: ALARM - ACS reboots host servers!!!

2014-03-04 Thread Marcus
On Tue, Mar 4, 2014 at 3:34 AM, France mailingli...@isg.si wrote:
 Hi Marcus and others.

 There is no need to kill off the entire hypervisor if one of the primary
 storages fails.
 You just need to kill the VMs and probably disable the SR on XenServer, because
 all other SRs and VMs have no problems.
 If you kill those, then you can safely start them elsewhere. On XenServer
 6.2 you can destroy the VMs which lost access to NFS without any problems.

That's a great idea, but as already mentioned, it doesn't work in
practice. You can't kill a VM that is hanging in D state, waiting on
storage. I also mentioned that it causes problems for libvirt and much
of the other system not using the storage.


 If you really still want to kill the entire host and its VMs in one go, I
 would suggest first live migrating the VMs which have not lost their storage,
 and then killing the VMs on the stale NFS with a hard reboot.
 Additional time, while migrating working VMs, would even give some grace
 time for NFS to maybe recover. :-)

You won't be able to live migrate a VM that is stuck in D state, or
use libvirt to do so if one of its storage pools is unresponsive,
anyway.


 Hard reboot to recover from D state of NFS client can also be avoided by
 using soft mount options.

As mentioned, soft and intr very rarely actually work, in my
experience. I wish they did as I truly have come to loathe NFS for it.


 I run a bunch of Pacemaker/Corosync/Cman/Heartbeat/etc clusters and we don't
 just kill whole nodes but fence services from specific nodes. STONITH is
 used only when the node loses quorum.

Sure, but how do you fence a KVM host from an NFS server? I don't
think we've written a firewall plugin that works to fence hosts from
any NFS server. Regardless, what CloudStack does is more of a poor
man's clustering: the mgmt server provides the locking in the sense that it
is managing what's going on, but it's not a real clustering service.
Heck, it doesn't even STONITH, it tries to clean shutdown, which fails
as well due to hanging NFS (per the mentioned bug, to fix it they'll
need IPMI fencing or something like that).

I didn't write the code, I'm just saying that I can completely
understand why it kills nodes when it deems that their storage has
gone belly-up. It's dangerous to leave that D state VM hanging around,
and it will until the NFS storage comes back. In a perfect world you'd
just stop the VMs that were having the issue, or if there were no VMs
you'd just de-register the storage from libvirt, I agree.
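
To see whether a host actually has processes stuck in D state (uninterruptible sleep), the state field of `/proc/<pid>/stat` can be inspected. A minimal Python sketch, assuming a Linux host; the function names are hypothetical, and this only reports the processes, it cannot unblock them.

```python
import os


def parse_stat(stat_line):
    """Split a /proc/<pid>/stat line into (pid, command, state)."""
    # The command is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' rather than on whitespace.
    left, right = stat_line.rsplit(")", 1)
    pid, comm = left.split("(", 1)
    state = right.split()[0]
    return int(pid), comm, state


def d_state_processes(proc_root="/proc"):
    """Return (pid, command) for every process currently in D state."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError):
            continue  # the process exited or the line was unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    if os.path.isdir("/proc"):
        for pid, comm in d_state_processes():
            print(pid, comm)
```

A kvm or agent process that stays in this list while the NFS server is unreachable is the situation described above: it cannot be killed or migrated until the storage comes back.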


 Regards,
 F.


 On 3/3/14 5:35 PM, Marcus wrote:

 It's the standard clustering problem. Any software that does any sort
 of active clustering is going to fence nodes that have problems, or
 should if it cares about your data. If the risk of losing a host due
 to a storage pool outage is too great, you could perhaps look at
 rearranging your pool-to-host correlations (certain hosts run vms from
 certain pools) via clusters. Note that if you register a storage pool
 with a cluster, it will register the pool with libvirt when the pool
 is not in maintenance, which, when the storage pool goes down will
 cause problems for the host even if no VMs from that storage are
 running (fetching storage stats for example will cause agent threads
 to hang if it's NFS), so you'd need to put ceph in its own cluster and
 NFS in its own cluster.

 It's far more dangerous to leave a host in an unknown/bad state. If a
 host loses contact with one of your storage nodes, with HA, cloudstack
 will want to start the affected VMs elsewhere. If it does so, and your
 original host wakes up from its NFS hang, you suddenly have a VM
 running in two locations, corruption ensues. You might think we could
 just stop the affected VMs, but NFS tends to make things that touch it
 go into D state, even with 'intr' and other parameters, which affects
 libvirt and the agent.

 We could perhaps open a feature request to disable all HA and just
 leave things as-is, disallowing operations when there are outages. If
 that sounds useful you can create the feature request on
 https://issues.apache.org/jira.


 On Mon, Mar 3, 2014 at 5:37 AM, Andrei Mikhailovsky and...@arhont.com
 wrote:

 Koushik, I understand that and I will put the storage into
 maintenance mode next time. However, things happen and servers crash from
 time to time, which is no reason to reboot all host servers, even those
 which do not have any running vms with volumes on the nfs storage. The
 bloody agent just rebooted every single host server regardless of whether it was
 running vms with volumes on the rebooted nfs server. 95% of my vms are
 running from ceph and those should never have been affected in the first
 place.
 - Original Message -

 From: Koushik Das koushik@citrix.com
 To: users@cloudstack.apache.org users@cloudstack.apache.org
 Cc: d...@cloudstack.apache.org
 Sent: Monday, 3 March, 2014 5:55:34 AM
 Subject: Re: ALARM - ACS reboots host servers

Re: KVM

2014-03-02 Thread Marcus
It doesn't look like you've enabled debug, you're only getting WARN
and INFO messages. Please enable debug.
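
A quick way to check whether debug logging is active is to count the levels that actually appear in the agent log. A minimal Python sketch, assuming log lines in the timestamp/level layout shown in the quoted log below; the function name is hypothetical, and the log levels themselves are configured in the log4j file the agent reports at startup (`/etc/cloudstack/agent/log4j-cloud.xml` in that log).

```python
import re
from collections import Counter

# Matches lines such as:
# 2014-03-03 00:32:44,320 INFO  [cloud.agent.AgentShell] (main:null) Agent started
LEVEL_RE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} (\w+)\s")


def level_counts(lines):
    """Count how many log entries were written at each level."""
    counts = Counter()
    for line in lines:
        match = LEVEL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


SAMPLE = [
    "2014-03-03 00:32:44,320 INFO  [cloud.agent.AgentShell] (main:null) Agent started",
    "2014-03-03 00:32:45,020 WARN  [kvm.resource.LibvirtComputingResource] (main:null) LibVirt version 0.9.10 required",
    "started",  # continuation of a wrapped entry, ignored
]

counts = level_counts(SAMPLE)
print(dict(counts))
print("debug enabled" if counts["DEBUG"] else "no DEBUG entries found")
```

If the count for DEBUG stays at zero after a restart of the agent, the logging change has not taken effect.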

On Sun, Mar 2, 2014 at 4:40 PM, María Noelia Gil marianoelia@um.es wrote:
 When I run cloudstack-setup-agent, it shows the following:

 Starting to configure your system:
 Configure Apparmor ...[OK]
 Configure Network ... [OK]
 Configure Libvirt ...
 [OK]
 Configure Firewall ...
 [OK]
 Configure Nfs ... [OK]
 Configure cloudAgent ...
 [OK]
 CloudStack Agent setup is done!

 The log file displays the following.

 2014-03-03 00:32:44,320 INFO  [cloud.agent.AgentShell] (main:null) Agent 
 started
 2014-03-03 00:32:44,321 INFO  [cloud.agent.AgentShell] (main:null) 
 Implementation Version is 4.2.1
 2014-03-03 00:32:44,322 INFO  [cloud.agent.AgentShell] (main:null) 
 agent.properties found at /etc/cloudstack/agent/agent.properties
 2014-03-03 00:32:44,323 INFO  [cloud.agent.AgentShell] (main:null) Defaulting 
 to using properties file for storage
 2014-03-03 00:32:44,324 INFO  [cloud.agent.AgentShell] (main:null) Defaulting 
 to the constant time backoff algorithm
 2014-03-03 00:32:44,326 INFO  [cloud.utils.LogUtils] (main:null) log4j 
 configuration found at /etc/cloudstack/agent/log4j-cloud.xml
 2014-03-03 00:32:44,384 INFO  [cloud.agent.Agent] (main:null) id is 0
 2014-03-03 00:32:44,396 INFO  
 [resource.virtualnetwork.VirtualRoutingResource] (main:null) 
 VirtualRoutingResource _scriptDir to use: scripts/network/domr/kvm
 2014-03-03 00:32:45,020 WARN  [kvm.resource.LibvirtComputingResource] 
 (main:null) LibVirt version 0.9.10 required for guest cpu mode, but version 
 0.9.8 detected, so it will be disabled
 2014-03-03 00:32:45,114 INFO  [kvm.resource.LibvirtComputingResource] 
 (main:null) No libvirt.vif.driver specified. Defaults to BridgeVifDriver.
 2014-03-03 00:32:45,145 INFO  [cloud.agent.Agent] (main:null) Agent [id = 0 : 
 type = LibvirtComputingResource : zone = default : pod = default : workers = 
 5 : host = localhost : port = 8250
 2014-03-03 00:32:45,154 INFO  [utils.nio.NioClient] (Agent-Selector:null) 
 Connecting to localhost:8250
 2014-03-03 00:32:45,333 INFO  [utils.nio.NioClient] (Agent-Selector:null) 
 SSL: Handshake done
 2014-03-03 00:32:45,334 INFO  [utils.nio.NioClient] (Agent-Selector:null) 
 Connected to localhost:8250
 2014-03-03 00:32:45,662 INFO  [cloud.serializer.GsonHelper] 
 (Agent-Handler-1:null) Default Builder inited.
 2014-03-03 00:32:45,733 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) 
 Proccess agent startup answer, agent id = 0
 2014-03-03 00:32:45,733 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Set 
 agent id 0
 2014-03-03 00:32:45,737 INFO  [cloud.agent.Agent] (AgentShutdownThread:null) 
 Stopping the agent: Reason = sig.kill
 2014-03-03 00:32:45,738 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) 
 Startup Response Received: agent id = 0

 I have not been able to fix the error.

 Thanks.

 El 02/03/2014, a las 00:25, Marcus shadow...@gmail.com escribió:

 changing



Re: ALARM - ACS reboots host servers!!!

2014-03-02 Thread Marcus
I'm not sure I understand. How do you expect to reboot your primary
storage while vms are running?  It sounds like the host is being
fenced since it cannot contact the resources it depends on.

On Sun, Mar 2, 2014 at 3:24 PM, Nux! n...@li.nux.ro wrote:
 On 02.03.2014 21:17, Andrei Mikhailovsky wrote:

 Hello guys,


 I recently came across the bug CLOUDSTACK-5429, which has rebooted
 all of my host servers without properly shutting down the guest vms.
 I simply upgraded and rebooted one of the nfs primary storage
 servers and a few minutes later, to my horror, I found out that all
 of my host servers had been rebooted. Is it just me, or should this
 bug be fixed ASAP and be a blocker for any new
 ACS release? I mean, not only does it cause downtime, but also possible
 data loss and server corruption.


 Hi Andrei,

 Do you have HA enabled and did you put that primary storage in maintenance
 mode before rebooting it?
 It's my understanding that ACS relies on the shared storage to perform HA so
 if the storage goes it's expected to go berserk. I've noticed similar
 behaviour in Xenserver pools without ACS.
 I'd imagine a cure for this would be to use network distributed
 filesystems like GlusterFS or CEPH.

 Lucian

 --
 Sent from the Delta quadrant using Borg technology!

 Nux!
 www.nux.ro


Re: ALARM - ACS reboots host servers!!!

2014-03-02 Thread Marcus
Or do you mean you have multiple primary storages and this one was not
in use and put into maintenance?

On Sun, Mar 2, 2014 at 5:25 PM, Marcus shadow...@gmail.com wrote:
 I'm not sure I understand. How do you expect to reboot your primary
 storage while vms are running?  It sounds like the host is being
 fenced since it cannot contact the resources it depends on.

 On Sun, Mar 2, 2014 at 3:24 PM, Nux! n...@li.nux.ro wrote:
 On 02.03.2014 21:17, Andrei Mikhailovsky wrote:

 Hello guys,


 I recently came across the bug CLOUDSTACK-5429, which has rebooted
 all of my host servers without properly shutting down the guest vms.
 I simply upgraded and rebooted one of the nfs primary storage
 servers and a few minutes later, to my horror, I found out that all
 of my host servers had been rebooted. Is it just me, or should this
 bug be fixed ASAP and be a blocker for any new
 ACS release? I mean, not only does it cause downtime, but also possible
 data loss and server corruption.


 Hi Andrei,

 Do you have HA enabled and did you put that primary storage in maintenance
 mode before rebooting it?
 It's my understanding that ACS relies on the shared storage to perform HA so
 if the storage goes it's expected to go berserk. I've noticed similar
 behaviour in Xenserver pools without ACS.
 I'd imagine a cure for this would be to use network distributed
 filesystems like GlusterFS or CEPH.

 Lucian

 --
 Sent from the Delta quadrant using Borg technology!

 Nux!
 www.nux.ro


Re: ALARM - ACS reboots host servers!!!

2014-03-02 Thread Marcus
Also, please note that in the bug you referenced it doesn't have a
problem with the reboot being triggered, but with the fact that reboot
never completes due to hanging NFS mount (which is why the reboot
occurs, inaccessible primary storage).

On Sun, Mar 2, 2014 at 5:26 PM, Marcus shadow...@gmail.com wrote:
 Or do you mean you have multiple primary storages and this one was not
 in use and put into maintenance?

 On Sun, Mar 2, 2014 at 5:25 PM, Marcus shadow...@gmail.com wrote:
 I'm not sure I understand. How do you expect to reboot your primary
 storage while vms are running?  It sounds like the host is being
 fenced since it cannot contact the resources it depends on.

 On Sun, Mar 2, 2014 at 3:24 PM, Nux! n...@li.nux.ro wrote:
 On 02.03.2014 21:17, Andrei Mikhailovsky wrote:

 Hello guys,


 I recently came across the bug CLOUDSTACK-5429, which has rebooted
 all of my host servers without properly shutting down the guest vms.
 I simply upgraded and rebooted one of the nfs primary storage
 servers and a few minutes later, to my horror, I found out that all
 of my host servers had been rebooted. Is it just me, or should this
 bug be fixed ASAP and be a blocker for any new
 ACS release? I mean, not only does it cause downtime, but also possible
 data loss and server corruption.


 Hi Andrei,

 Do you have HA enabled and did you put that primary storage in maintenance
 mode before rebooting it?
 It's my understanding that ACS relies on the shared storage to perform HA so
 if the storage goes it's expected to go berserk. I've noticed similar
 behaviour in Xenserver pools without ACS.
 I'd imagine a cure for this would be to use network distributed
 filesystems like GlusterFS or CEPH.

 Lucian

 --
 Sent from the Delta quadrant using Borg technology!

 Nux!
 www.nux.ro


Re: mem.overprovisioning.facto and KVM

2014-01-24 Thread Marcus Sorensen
I guess not. It should work though. We ran into the same issue with
storage, everything hardcoded to only work with vmware. I'll take a
look.

On Mon, Oct 7, 2013 at 1:09 PM, Sebastien Goasguen run...@gmail.com wrote:

 On Sep 25, 2013, at 2:59 AM, Harikrishna Patnala 
 harikrishna.patn...@citrix.com wrote:

 As far as I know, mem over provisioning is intended to work only with the VMware
 hypervisor, to allocate reserved memory for a VM.


 @Marcus, could you comment on this: is mem over provisioning supposed to work 
 with KVM ?

 On 25-Sep-2013, at 11:11 AM, Nikolay Kabadjov niki...@yahoo.com wrote:

 Yes Kirk, I did



 
 From: Kirk Jantzer kirk.jant...@gmail.com
 To: Cloudstack users mailing list users@cloudstack.apache.org; Nikolay 
 Kabadjov niki...@yahoo.com
 Sent: Tuesday, September 24, 2013 5:50 PM
 Subject: Re: mem.overprovisioning.facto and KVM



 Did you restart the management service after making the change?



 Regards,

 Kirk Jantzer
 http://about.me/kirkjantzer



 On Tue, Sep 24, 2013 at 10:25 AM, Nikolay Kabadjov niki...@yahoo.com 
 wrote:

 Hi all,
 I've noticed that increasing mem.overprovisioning.factor doesn't take
 effect.
 I mean, the dashboard still shows exactly the total physical memory of
 all the hosts added together.

 It's CS 4.1.1 with one zone, one pod, one cluster, 6 KVM hosts

 Any idea?

 Thanks
 Niki




Re: mem.overprovisioning.facto and KVM

2014-01-24 Thread Marcus Sorensen
Looks like it works as of 4.2, but you need to update existing cluster
settings, rather than global (or both, I suppose).
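
As a rough illustration of what the factor is expected to do once it is applied at the cluster level, the advertised capacity should grow from the plain sum of host memory to that sum times the factor. A minimal Python sketch of the arithmetic; the host sizes and the factor of 1.5 are hypothetical, and the real capacity calculation may also account for reserved memory.

```python
def advertised_memory_gb(host_memory_gb, factor):
    """Total memory the dashboard would show with overprovisioning applied."""
    return sum(host_memory_gb) * factor


# Six hypothetical KVM hosts with 64 GB each.
hosts = [64] * 6

print(advertised_memory_gb(hosts, 1.0))  # factor not applied: plain sum
print(advertised_memory_gb(hosts, 1.5))  # factor applied
```

If the dashboard still shows the plain sum after the change, the factor is not being picked up from the scope where it was set.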

On Fri, Jan 24, 2014 at 3:17 PM, Marcus Sorensen shadow...@gmail.com wrote:
 I guess not. It should work though. We ran into the same issue with
 storage, everything hardcoded to only work with vmware. I'll take a
 look.

 On Mon, Oct 7, 2013 at 1:09 PM, Sebastien Goasguen run...@gmail.com wrote:

 On Sep 25, 2013, at 2:59 AM, Harikrishna Patnala 
 harikrishna.patn...@citrix.com wrote:

 As far as I know, mem over provisioning is intended to work only with the VMware
 hypervisor, to allocate reserved memory for a VM.


 @Marcus, could you comment on this: is mem over provisioning supposed to 
 work with KVM ?

 On 25-Sep-2013, at 11:11 AM, Nikolay Kabadjov niki...@yahoo.com wrote:

 Yes Kirk, I did



 
 From: Kirk Jantzer kirk.jant...@gmail.com
 To: Cloudstack users mailing list users@cloudstack.apache.org; Nikolay 
 Kabadjov niki...@yahoo.com
 Sent: Tuesday, September 24, 2013 5:50 PM
 Subject: Re: mem.overprovisioning.facto and KVM



 Did you restart the management service after making the change?



 Regards,

 Kirk Jantzer
 http://about.me/kirkjantzer



 On Tue, Sep 24, 2013 at 10:25 AM, Nikolay Kabadjov niki...@yahoo.com 
 wrote:

 Hi all,
 I've noticed that increasing mem.overprovisioning.factor doesn't take
 effect.
 I mean, the dashboard still shows exactly the total physical memory of
 all the hosts added together.

 It's CS 4.1.1 with one zone, one pod, one cluster, 6 KVM hosts

 Any idea?

 Thanks
 Niki




Re: Status of CLVM?

2014-01-08 Thread Marcus Sorensen
Yes, because the current snapshot is really "copy raw-formatted LVM volume
to qcow2 file on secondary storage". So there is no real LVM snapshot,
and if there were, it wouldn't be copied internally.
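
Conceptually, that copy is the same kind of operation as a `qemu-img convert` from raw to qcow2. A minimal Python sketch that only builds such a command line; it is an illustration of the idea, not the script CloudStack actually runs, and both paths are hypothetical.

```python
import shlex


def convert_command(lv_path, qcow2_path):
    """Build a qemu-img command that copies a raw volume into a qcow2 file."""
    return ["qemu-img", "convert", "-f", "raw", "-O", "qcow2", lv_path, qcow2_path]


# Hypothetical source logical volume and destination on secondary storage.
cmd = convert_command("/dev/vg0/vm-volume", "/mnt/secondary/snapshots/vm-volume.qcow2")
print(shlex.join(cmd))
```

Because the result is a standalone file on secondary storage, nothing on the LVM side records that a snapshot exists.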

On Wed, Jan 8, 2014 at 3:47 PM, Nux! n...@li.nux.ro wrote:
 Hi,

 I've just watched Marcus Sorensen's presentation on CLVM on YouTube, and he
 mentioned that migrating a VM with snapshots will make the snapshots
 disappear.
 Can anyone confirm whether this is still the case?
 While at it, are there any alternative ways of using a multipathed iSCSI LUN
 with Cloudstack (KVM)? I'm thinking of clustered filesystems such as GFS or
 OCFS, but I'm afraid of the performance penalty.

 Regards,
 Lucian

 --
 Sent from the Delta quadrant using Borg technology!

 Nux!
 www.nux.ro


Re: Status of CLVM?

2014-01-08 Thread Marcus Sorensen
You'd create a sharedmountpoint style primary storage, which would
host qcow2 files. You can do this via iscsi, fibrechannel, or any
other SAN tech.

On Wed, Jan 8, 2014 at 3:49 PM, Marcus Sorensen shadow...@gmail.com wrote:
 Yes, because the current snapshot is really "copy raw-formatted LVM volume
 to qcow2 file on secondary storage". So there is no real LVM snapshot,
 and if there were, it wouldn't be copied internally.

 On Wed, Jan 8, 2014 at 3:47 PM, Nux! n...@li.nux.ro wrote:
 Hi,

 I've just watched Marcus Sorensen's presentation on CLVM on YouTube, and he
 mentioned that migrating a VM with snapshots will make the snapshots
 disappear.
 Can anyone confirm whether this is still the case?
 While at it, are there any alternative ways of using a multipathed iSCSI LUN
 with Cloudstack (KVM)? I'm thinking of clustered filesystems such as GFS or
 OCFS, but I'm afraid of the performance penalty.

 Regards,
 Lucian

 --
 Sent from the Delta quadrant using Borg technology!

 Nux!
 www.nux.ro


Re: 4.2.0 Upgrade and System Templates (KVM - Ubuntu 12.04)

2013-10-26 Thread Marcus Sorensen
Yes, you do need to upgrade your system VMs, and you should also have a new
systemvm.iso that was bundled in the cloudstack-common deb file that would
have been installed as an upgrade on your KVM hosts. I also feel that the
documentation of system vm upgrade is lacking. The only place I know of is
in the release notes:
http://cloudstack.apache.org/docs/en-US/Apache_CloudStack/4.2.0/html-single/Release_Notes/index.html,
see 3.1 Upgrade Instructions, item 12. It references a script
cloudstack-sysvmadm, but the upgrade of the system vm template should be
done beforehand.

Now look at the section just below, 3.2. This
documentation is obviously messed up because it first says this applies
only to VMware, and then it promptly gives system vm upgrade instructions
for XenServer, KVM, and VMware hosts. It's unclear why this system vm
upgrade would only apply to zones which had VMware hosts, and why these
instructions aren't also in the 4.1.x to 4.2.x instructions. At any rate,
the system vm instructions there for KVM should apply. Register the template
(optionally, check the database to ensure the template is set as system
type), then restart the system vms per the item 12 script. If your KVM
hosts relaunch the system vms per the new template and they have the new
systemvm.iso, they should work.
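
To check that every KVM host really carries the same systemvm.iso after the package upgrade, the file's MD5 can be computed on each host and compared. A minimal Python sketch, assuming the ISO path mentioned in the quoted message below; the function name is hypothetical, and the sketch does not know which checksum is the correct one for 4.2.0, it only makes hosts comparable.

```python
import hashlib
import os

# Path as reported in this thread; adjust if your packages install elsewhere.
ISO_PATH = "/usr/share/cloudstack-common/vms/systemvm.iso"


def md5sum(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    if os.path.exists(ISO_PATH):
        print(md5sum(ISO_PATH), ISO_PATH)
    else:
        print("not found:", ISO_PATH)
```

A host whose digest differs from the others most likely still has the old ISO or did not receive the cloudstack-common upgrade.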


On Oct 26, 2013 2:19 PM, Marty Sweet msweet@gmail.com wrote:

 Hi Guys,

 I have just upgraded to 4.2.0 from 4.1.1 and am having some issues with the
 SystemVMs.
 I understand that we are meant to upgrade to the new system image? I did
 this using the script in the 'Prepare systemvm' documentation, to no
 avail; editing the database to match what I thought would work has not
 worked either.

 Restoring a backup, I now have my original 4.1.1 acton systemvm templates.
 What steps should I take to launch a systemVM successfully?

 The upgrade documentation is pretty lacking in this respect, and just says
 restart the systemvms, with no reference to upgrading the image.

 I also note that the new systemvms don't seem to be mounting the NFS and
 are instead using  /usr/share/cloudstack-common/vms/systemvm.iso.

 Opening a VNC session to the VM, shows the following messages:
 Cannot assign requested address: make_sock: could not bind address
 dnsmasq: unknown interface eth0
 dnsmasq apache2 ... failed!

 My MD5 sum for the CD boot file is below and is consistent across all 4
 nodes:
 092a299932bda93cc522b1c3e56af4a8
  /usr/share/cloudstack-common/vms/systemvm.iso


 Many thanks,
 Marty



Re: High CPU utilization on KVM hosts while doing RBD snapshot - was Re: snapshot caused host disconnected

2013-10-07 Thread Marcus Sorensen
You may want to post this to the ceph mailing list as well.

On Mon, Oct 7, 2013 at 8:59 PM, Indra Pramana in...@sg.or.id wrote:
 Dear Wido and all,

 I performed some further tests last night:

 (1) CPU utilization of the KVM host while an RBD snapshot is running still
 shoots up even after I set the global setting
 concurrent.snapshots.threshold.perhost to 2.

 (2) Most of the concurrent snapshot processes fail, either stuck
 in the Creating state or with a CreatedOnPrimary error message.

 (3) I have also adjusted some other related global settings, such as
 backup.snapshot.wait and job.expire.minutes, without any luck.

 Any advice on what causes the high CPU utilization would be greatly
 appreciated.

 Looking forward to your reply, thank you.

 Cheers.


 On Mon, Oct 7, 2013 at 11:03 PM, Indra Pramana in...@sg.or.id wrote:

 Dear all,

 I also found out that while the RBD snapshot is running, the CPU
 utilisation on the KVM host shoots up very high, which might
 explain why the host becomes disconnected.

 top - 22:49:32 up 3 days, 19:31,  1 user,  load average: 7.85, 4.97, 3.47
 Tasks: 297 total,   3 running, 294 sleeping,   0 stopped,   0 zombie
 Cpu(s):  4.5%us,  1.2%sy,  0.0%ni, 94.1%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
 Mem:  264125244k total, 77203460k used, 186921784k free,   154888k buffers
 Swap:   545788k total,        0k used,   545788k free, 60677092k cached

   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 18161 root      20   0 3871m  31m 8444 S  101  0.0 301:58.09 kvm
  2790 root      20   0 43.5g 1.6g  19m S   97  0.7  45:52.42 jsvc
 24544 root      20   0 4583m  31m 8364 S   97  0.0 425:29.48 kvm
  6537 root      20   0     0    0    0 R   71  0.0   0:17.49 kworker/3:2
 22546 root      20   0 6143m 2.0g 8452 S   26  0.8  55:14.07 kvm
  4219 root      20   0 7671m 4.0g 8524 S    6  1.6 106:12.26 kvm
  5989 root      20   0 43.2g 1.6g  232 D    6  0.6   0:08.13 jsvc
  5993 root      20   0 43.3g 1.6g  224 D    6  0.6   0:08.36 jsvc

 Is it normal that, while a snapshot is being taken of a VM running on that host,
 the host's CPU utilisation is higher than usual? How can I limit the
 CPU resources used by the snapshot?


 Looking forward to your reply, thank you.

 Cheers.



 On Mon, Oct 7, 2013 at 7:18 PM, Indra Pramana in...@sg.or.id wrote:

 Dear all,

 I did some tests on snapshots since they are now supported for my Ceph RBD
 primary storage in CloudStack 4.2. When I ran a snapshot for a particular
 VM instance earlier, I noticed that it caused the host (where the VM
 is running) to become disconnected.

 Here's the excerpt from the agent.log:

 http://pastebin.com/dxVV7stu

 The management-server.log doesn't show much other than
 detecting that the host was down and that HA is being activated:

 http://pastebin.com/UeLiSm9K

 Can anyone advise what is causing the problem? So far there is only one
 user taking snapshots and it has already caused issues for the host. I can't
 imagine what happens if multiple users take snapshots at the same time.

 I read about snapshot job throttling, which is described in the manual:


 http://cloudstack.apache.org/docs/en-US/Apache_CloudStack/4.2.0/html/Admin_Guide/working-with-snapshots.html

 But I am not sure whether this will help resolve the problem, since
 there is only one user trying to take a snapshot and we already encounter
 the problem.

 Can anyone advise how I can troubleshoot further and find a solution to
 the problem?

 Looking forward to your reply, thank you.

 Cheers.





Re: [URGENT] Mounting issues from KVM hosts to secondary storage

2013-10-03 Thread Marcus Sorensen
Are you using the release artifacts, or your own 4.2 build?

When you rebooted the host, did the problem go away or come back the
same? You may want to look at 'virsh pool-list' to see if libvirt is
mounting/registering the secondary storage.

Is this happening on multiple hosts, the same way? You may want to
look at /etc/mtab, if the system reports it's mounted, though it's
not, it might be in there. Look at /proc/mounts as well.

On Thu, Oct 3, 2013 at 9:53 AM, Indra Pramana in...@sg.or.id wrote:
 Dear all,

 We face a major problem after upgrading to 4.2.0. Mounting from KVM hosts
 to secondary storage seems to fail; every time a new VM instance is
 created, it uses up the / (root) partition of the KVM hosts instead.

 Here is the df result:

 
 root@hv-kvm-02:/home/indra# df
 Filesystem                                              1K-blocks    Used Available Use% Mounted on
 /dev/sda1                                                 4195924 4195924         0 100% /
 udev                                                    132053356       4 132053352   1% /dev
 tmpfs                                                    52825052     440  52824612   1% /run
 none                                                         5120       0      5120   0% /run/lock
 none                                                    132062620       0 132062620   0% /run/shm
 cgroup                                                  132062620       0 132062620   0% /sys/fs/cgroup
 /dev/sda6                                                10650544 2500424   7609092  25% /home
 10.237.11.31:/mnt/vol1/sec-storage2/template/tmpl/2/288   4195924 4195924         0 100% /mnt/5230667e-9c58-3ff6-983c-5fc2a72df669
 

 The strange thing is that df shows it as mounted, but it's
 actually not mounted. Notice that the total capacity of the mount point
 is exactly the same as the capacity of the / (root) partition. It
 should show 7 TB instead of just 4 GB.

 This causes VM creation to fail due to lack of disk space. It also
 affects KVM operations, since the / (root) partition becomes full and we
 can only release the space by rebooting the KVM host.

 Has anyone experienced this problem before? We are at a loss as to how to
 resolve the problem.

 Looking forward to your reply, thank you.

 Cheers.


Re: [URGENT] Mounting issues from KVM hosts to secondary storage

2013-10-03 Thread Marcus Sorensen
It sort of looks like running out of space triggered the issue. Libvirt shows:

2013-10-03 15:38:57.414+: 2710: error : virCommandWait:2348 :
internal error Child process (/bin/umount
/mnt/5230667e-9c58-3ff6-983c-5fc2a72df669) unexpected exit status 32:
error writing /etc/mtab.tmp: No space left on device

So that entry of the filesystem being mounted is orphaned in
/etc/mtab, since it can't be removed. That seems to be the source of
why 'df' shows the thing mounted when it isn't, at least.
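
Orphaned entries of this kind can be spotted by comparing the mount points listed in /etc/mtab with those the kernel reports in /proc/mounts. A minimal Python sketch, assuming the usual whitespace-separated mount table layout; the function names and the `ext4` filesystem type in the sample are hypothetical. On systems where /etc/mtab is a symlink to /proc/mounts the result is always empty.

```python
def mount_points(table_text):
    """Return the set of mount points (second column) in a mount table."""
    points = set()
    for line in table_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            points.add(fields[1])
    return points


def orphaned_mounts(mtab_text, proc_mounts_text):
    """Mount points recorded in mtab that the kernel does not know about."""
    return sorted(mount_points(mtab_text) - mount_points(proc_mounts_text))


# Sample data modelled on the situation in this thread.
MTAB = (
    "/dev/sda1 / ext4 rw 0 0\n"
    "10.237.11.31:/mnt/vol1/sec-storage2/template/tmpl/2/288 "
    "/mnt/5230667e-9c58-3ff6-983c-5fc2a72df669 nfs rw 0 0\n"
)
PROC_MOUNTS = "/dev/sda1 / ext4 rw 0 0\n"

print(orphaned_mounts(MTAB, PROC_MOUNTS))

if __name__ == "__main__":
    try:
        with open("/etc/mtab") as mtab, open("/proc/mounts") as proc:
            for point in orphaned_mounts(mtab.read(), proc.read()):
                print("stale mtab entry:", point)
    except OSError as exc:
        print("could not read mount tables:", exc)
```

Anything written under a stale mount point lands on the filesystem that holds the directory, which fits the root partition filling up as described in this thread.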


On Thu, Oct 3, 2013 at 11:45 AM, Indra Pramana in...@sg.or.id wrote:
 Hi Marcus and all,

 I also find some strange and interesting error messages from the libvirt
 logs:

 http://pastebin.com/5ByfNpAf

 Looking forward to your reply, thank you.

 Cheers.



 On Fri, Oct 4, 2013 at 1:38 AM, Indra Pramana in...@sg.or.id wrote:

 Hi Marcus,

 Good day to you, and thank you for your e-mail. See my reply inline.

 On Fri, Oct 4, 2013 at 12:29 AM, Marcus Sorensen shadow...@gmail.com wrote:

 Are you using the release artifacts, or your own 4.2 build?


 [Indra:] We are using the release artifacts from below repo since we are
 using Ubuntu:

 deb http://cloudstack.apt-get.eu/ubuntu precise 4.2


 When you rebooted the host, did the problem go away or come back the
 same?


 [Indra:] When I rebooted the host, the problem went away for a while, but it
 comes back after some time. It appears at random, when we need to create a
 new instance on that host or start an existing stopped instance.


 You may want to look at 'virsh pool-list' to see if libvirt is
 mounting/registering the secondary storage.


 [Indra:] This is the result of the virsh pool-list command:

 root@hv-kvm-02:/var/log/libvirt# virsh pool-list
 Name                                  State    Autostart
 ---------------------------------------------------------
 301071ac-4c1d-4eac-855b-124126da0a38  active   no
 5230667e-9c58-3ff6-983c-5fc2a72df669  active   no
 d433809b-01ea-3947-ba0f-48077244e4d6  active   no

 Strange thing is that none of my secondary storage IDs are there. Could it
 be that the ID might have changed during Cloudstack upgrade? Here is the
 list of my secondary storage (there are two of them) even though they are
 on the same NFS server:
 c02da448-b9f4-401b-b8d5-83e8ead5cfde nfs://10.237.11.31/mnt/vol1/sec-storage NFS
 5937edb6-2e95-4ae2-907b-80fe4599ed87 nfs://10.237.11.31/mnt/vol1/sec-storage2 NFS

 Is this happening on multiple hosts, the same way?


 [Indra:] Yes, it is happening on both of my hosts, in the same way.


 You may want to
 look at /etc/mtab, if the system reports it's mounted, though it's
 not, it might be in there. Look at /proc/mounts as well.


 [Indra:] Please find result of df, /etc/mtab and /proc/mounts below. The
 ghost mount point is on df and /etc/mtab, but not on /proc/mounts.

 root@hv-kvm-02:/etc# df

 Filesystem  1K-blocks     Used  Available Use% Mounted on
 /dev/sda1     4195924  4192372          0 100% /
 udev        132053356        4  132053352   1% /dev
 tmpfs        52825052      704   52824348   1% /run
 none             5120        0       5120   0% /run/lock
 none        132062620        0  132062620   0% /run/shm
 cgroup      132062620        0  132062620   0% /sys/fs/cgroup
 /dev/sda6    10650544  2500460    7609056  25% /home
 10.237.11.31:/mnt/vol1/sec-storage2/template/tmpl/2/288
               4195924  4192372          0 100% /mnt/5230667e-9c58-3ff6-983c-5fc2a72df669

 root@hv-kvm-02:/etc# cat /etc/mtab
 /dev/sda1 / ext4 rw,errors=remount-ro 0 0
 proc /proc proc rw,noexec,nosuid,nodev 0 0
 sysfs /sys sysfs rw,noexec,nosuid,nodev 0 0
 none /sys/fs/fuse/connections fusectl rw 0 0
 none /sys/kernel/debug debugfs rw 0 0
 none /sys/kernel/security securityfs rw 0 0
 udev /dev devtmpfs rw,mode=0755 0 0
 devpts /dev/pts devpts rw,noexec,nosuid,gid=5,mode=0620 0 0
 tmpfs /run tmpfs rw,noexec,nosuid,size=10%,mode=0755 0 0
 none /run/lock tmpfs rw,noexec,nosuid,nodev,size=5242880 0 0
 none /run/shm tmpfs rw,nosuid,nodev 0 0
 cgroup /sys/fs/cgroup tmpfs rw,relatime,mode=755 0 0
 cgroup /sys/fs/cgroup/cpuset cgroup rw,relatime,cpuset 0 0
 cgroup /sys/fs/cgroup/cpu cgroup rw,relatime,cpu 0 0
 cgroup /sys/fs/cgroup/cpuacct cgroup rw,relatime,cpuacct 0 0
 cgroup /sys/fs/cgroup/memory cgroup rw,relatime,memory 0 0
 cgroup /sys/fs/cgroup/devices cgroup rw,relatime,devices 0 0
 cgroup /sys/fs/cgroup/freezer cgroup rw,relatime,freezer 0 0
 cgroup /sys/fs/cgroup/blkio cgroup rw,relatime,blkio 0 0
 cgroup /sys/fs/cgroup/perf_event cgroup rw,relatime,perf_event 0 0
 /dev/sda6 /home ext4 rw 0 0
 rpc_pipefs /run/rpc_pipefs rpc_pipefs rw 0 0
 binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc

Re: Resize data-disk doesn't work after upgrade

2013-10-03 Thread Marcus Sorensen
I just tested local storage qcow2 and CLVM resize on 4.2, they both worked.

Resize works like this:

1. Do sanity checks
2. Send resize command to the agent
3. Resize the disk/lun/file
4. Inform the VM instance that the disk has changed by making a
libvirt volBlockResize call (this is not fatal, some guest types can't
resize online and need to be restarted)
5. Update the database

You can check #3 looking at the disks themselves on storage to see if
they've grown. You can check #4 by restarting the VM to see if it
picks up the change.
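
For file-based qcow2 volumes, the check for #3 could be scripted roughly as
below. This is a sketch, not part of CloudStack: it assumes qemu-img is
installed on the host, that the volume is a plain file (it does not apply to
CLVM volumes), and the function names are invented here.

```python
import json
import subprocess

GIB = 1024 ** 3


def virtual_size_gib(qemu_img_json):
    """Extract the virtual size in GiB from `qemu-img info --output=json` text."""
    info = json.loads(qemu_img_json)
    return info["virtual-size"] / GIB


def image_virtual_size_gib(path):
    """Run qemu-img against an image file; the path is supplied by the caller."""
    out = subprocess.check_output(
        ["qemu-img", "info", "--output=json", path], text=True
    )
    return virtual_size_gib(out)
```

If the reported virtual size is already the new size but the guest still
sees the old one, the resize on storage worked and the problem is in
notifying the guest (#4).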

It may be that libvirt was unable to inform the VM of the change (for
example if you haven't upgraded to a supported version of Ubuntu or
CentOS and it has an old libvirt that doesn't support volBlockResize).
The way to know for sure is to stop and start the VM if you can.

Look at those two things and let us know.

On Thu, Oct 3, 2013 at 2:33 PM, Indra Pramana in...@sg.or.id wrote:
 Dear all,

 After upgrading to 4.2.0, I tried to resize a data disk of a VM instance
 from 20 GB to 60 GB, through the Cloudstack GUI. The UI reports that the
 resize was successful, and that the data disk is now showing 60 GB instead
 of 20 GB. However, when I check the actual disk on the VM, it seems that
 it's still 20 GB.

 Any idea what might have caused the problem? I even tried to re-partition
 the disk to see if the size had changed, but it had not and is still 20 GB.
 Which logs do I need to look into?

 Any help on this is greatly appreciated.

 Looking forward to your reply, thank you.

 Cheers.


Re: Resize data-disk doesn't work after upgrade

2013-10-03 Thread Marcus Sorensen
What primary storage are you using? Any errors in agent log?
On Oct 3, 2013 3:16 PM, Indra Pramana in...@sg.or.id wrote:

 Hi Marcus,

 Good day to you, and thank you for your e-mail.

 I have tried restarting the VM and even stop and start the VM, but after
 logging in to the VM, I still see the hard drive's size as 20 GB instead of
 60 GB.

 I tried to check /var/log/libvirt/libvirtd.log file on the KVM host where
 the VM is hosted, and can't find any messages related to volBlockResize.

 Any other troubleshooting steps you can recommend, i.e. any other area I
 can look into?

 Looking forward to your reply, thank you.

 Cheers.






RE: CloudStack VPC - VPN - VPC

2013-08-23 Thread Marcus Sorensen
It is possible, sort of. You have to bring both up at the same time,
otherwise they will time out and fail. There is no mode to make one side or
the other just listen for connections.
On Aug 23, 2013 12:37 AM, Kimihiko Kitase kimihiko.kit...@citrix.co.jp
wrote:

 Thanks! Is it in 4.2?

 -Original Message-
 From: Sheng Yang [mailto:sh...@yasker.org]
 Sent: Friday, August 23, 2013 2:52 PM
 To: users@cloudstack.apache.org
 Cc: d...@cloudstack.apache.org
 Subject: Re: CloudStack VPC - VPN - VPC

 Not now. But I think it's in the road map.

 --Sheng


 On Thu, Aug 22, 2013 at 10:42 PM, Kimihiko Kitase 
 kimihiko.kit...@citrix.co.jp wrote:

  Hello
 
  Is it possible to make site-to-site VPN connection between VPC and VPC
  on CloudStack?
 
  Thanks
  Kimi
 
  -Original Message-
  From: Ahmad Emneina [mailto:aemne...@gmail.com]
  Sent: Friday, August 23, 2013 2:38 PM
  To: users@cloudstack.apache.org
  Cc: users@cloudstack.apache.org
  Subject: Re: CloudStack VPC - VPN - VPC
 
  Don't think so, you might want to confirm with dev@cloudstack and or
  create an enhancement ticket (if one doesn't exist) in Jira.
 
  Ahmad
 
  On Aug 22, 2013, at 10:32 PM, Kimihiko Kitase 
  kimihiko.kit...@citrix.co.jp wrote:
 
   Hello
  
   Can we make site to site VPN connection between VPC and VPC on
  CloudStack?
  
   Thanks
   Kimi
  
  
 



Re: FW: CS 4.1.0 - this will help a number of people who struggle with Advanced Networking

2013-08-01 Thread Marcus Sorensen
I'm short on time, but here's the KVM advanced networking config we
use for testing. If someone wants to write a doc based around it that
would be nice.

Start out with a KVM host that has two networks, eth0 and eth1. eth1 is
intended for public traffic; eth0 will carry the guest vlans and the
management vlan. Then create a bridge interface for each:

[root@devcloud-kvm ~]# brctl show
bridge name     bridge id               STP enabled     interfaces
cloud0          8000.                   no
br0             8000.5254004eff4f       no              eth0
br1             8000.52540052b15e       no              eth1

br0  Link encap:Ethernet  HWaddr 52:54:00:4E:FF:4F
  inet addr:172.17.10.10  Bcast:172.17.10.255  Mask:255.255.255.0
  inet6 addr: fe80::5054:ff:fe4e:ff4f/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:127 errors:0 dropped:0 overruns:0 frame:0
  TX packets:30 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:5846 (5.7 KiB)  TX bytes:4345 (4.2 KiB)

br1  Link encap:Ethernet  HWaddr 52:54:00:52:B1:5E
  inet addr:192.168.100.10  Bcast:192.168.100.255  Mask:255.255.255.0
  inet6 addr: fe80::5054:ff:fe52:b15e/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:343 errors:0 dropped:0 overruns:0 frame:0
  TX packets:153 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:24227 (23.6 KiB)  TX bytes:29108 (28.4 KiB)

eth0  Link encap:Ethernet  HWaddr 52:54:00:4E:FF:4F
  inet6 addr: fe80::5054:ff:fe4e:ff4f/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:157 errors:0 dropped:0 overruns:0 frame:0
  TX packets:38 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:12276 (11.9 KiB)  TX bytes:4897 (4.7 KiB)

eth1  Link encap:Ethernet  HWaddr 52:54:00:52:B1:5E
  inet6 addr: fe80::5054:ff:fe52:b15e/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:377 errors:0 dropped:0 overruns:0 frame:0
  TX packets:163 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:34044 (33.2 KiB)  TX bytes:29748 (29.0 KiB)

lo    Link encap:Local Loopback
  inet addr:127.0.0.1  Mask:255.0.0.0
  inet6 addr: ::1/128 Scope:Host
  UP LOOPBACK RUNNING  MTU:16436  Metric:1
  RX packets:863 errors:0 dropped:0 overruns:0 frame:0
  TX packets:863 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:120247 (117.4 KiB)  TX bytes:120247 (117.4 KiB)

Ok, now kvm host is ready. Just define the kvm traffic label for
Management traffic to be 'br0', for guest to be 'br0', and for public
to be 'br1'. Cloudstack will create any necessary bridges or vlans.
You can leave the vlan option empty if you don't want it to create a
vlan (say for management). I can perhaps go into more detail later.
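
Before entering the traffic labels it can help to confirm that each label
names a bridge that really exists on the host. A rough sketch, assuming the
usual four-column `brctl show` layout; the function names and the label
dictionary are illustrative only:

```python
def parse_brctl_show(output):
    """Map bridge name -> list of enslaved interfaces from `brctl show` text."""
    bridges = {}
    current = None
    for line in output.splitlines()[1:]:  # skip the header row
        if not line.strip():
            continue
        fields = line.split()
        if line[0].isspace():
            # continuation row: an extra interface of the previous bridge
            if current is not None:
                bridges[current].extend(fields)
            continue
        current = fields[0]
        # columns: name, bridge id, STP flag, then an optional first interface
        bridges[current] = fields[3:]
    return bridges


def missing_labels(labels, bridges):
    """Traffic labels that do not name an existing bridge."""
    return sorted({b for b in labels.values() if b not in bridges})
```

With the host above, labels of management=br0, guest=br0 and public=br1
would come back with nothing missing, while a label such as cloudbr0 would
be reported.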

On Wed, Jul 31, 2013 at 12:33 PM, Marcus Sorensen shadow...@gmail.com wrote:
 Yes, that's correct. I think we need to update the documentation. The
 user simply needs to create a bridge where 'public' traffic will work,
 and then set that bridge name as the traffic label for public traffic.
 Then it will create the vlan device and the bridge necessary for
 public based on the physical ethernet device of that bridge.

 Note, in this example, it is only looking for cloudVirBr for
 compatibility, if there are existing cloudVirBr bridges then the agent
 will continue to create cloudVirBr bridges, otherwise, it will create
 breth bridges, which allow the same vlan number on different physical
 interfaces.

 We can easily create some concrete examples for this... such as the
 one represented in devcloud-kvm by
 tools/devcloud-kvm/devcloud-kvm-advanced.cfg

 On Wed, Jul 31, 2013 at 12:06 PM, Edison Su edison...@citrix.com wrote:
 The KVM installation guide at
 http://cloudstack.apache.org/docs/en-US/Apache_CloudStack/4.1.0/html/Installation_Guide/hypervisor-kvm-install-flow.html
 is unnecessarily complicated and inaccurate.
 For example, users don't need to configure vlans on the kvm host
 themselves; cloudstack-agent will create the vlans automatically.
 All users need to do is create bridges (if the default bridge created by
 cloudstack-agent is not enough), then add these bridge names in the
 cloudstack mgt server UI during zone creation.

 -Original Message-
 From: Noel Kendall [mailto:noeldkend...@hotmail.com]
 Sent: Wednesday, July 31, 2013 9:49 AM
 To: users@cloudstack.apache.org
 Subject: CS 4.1.0 - this will help a number of people who struggle with 
 Advanced Networking

 The documentation for installation in a KVM environment is utterly 
 misleading.
 The documentation reads as though one can set up the bridge for the public 
 network with any name one chooses, the default being cloudbr0.
 You cannot use just any old name. That simply

Re: FW: CS 4.1.0 - this will help a number of people who struggle with Advanced Networking

2013-08-01 Thread Marcus Sorensen
Here's a simple (not recommended) one-nic setup:

http://marcus.mlsorensen.com/cloudstack-extras/cs-4.1-kvm-networking-one-nic.rtf

And a simple two-nic setup:

http://marcus.mlsorensen.com/cloudstack-extras/cs-4.1-kvm-networking-two-nic.rtf

Hasty docs put together on the road...



Re: FW: CS 4.1.0 - this will help a number of people who struggle with Advanced Networking

2013-07-31 Thread Marcus Sorensen
Yes, that's correct. I think we need to update the documentation. The
user simply needs to create a bridge where 'public' traffic will work,
and then set that bridge name as the traffic label for public traffic.
Then it will create the vlan device and the bridge necessary for
public based on the physical ethernet device of that bridge.

Note, in this example, it is only looking for cloudVirBr for
compatibility, if there are existing cloudVirBr bridges then the agent
will continue to create cloudVirBr bridges, otherwise, it will create
breth bridges, which allow the same vlan number on different physical
interfaces.
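
The naming decision described above can be pictured with the sketch below.
It is not the agent's code, only an illustration. The "br<device>-<vlan>"
pattern is taken from the example brbond1-1224 given elsewhere in this
archive, and the function name is invented.

```python
def guest_bridge_name(existing_bridges, pif, vlan_id):
    """Pick the bridge name for a guest VLAN, as described in this thread."""
    legacy = any(name.startswith("cloudVirBr") for name in existing_bridges)
    if legacy:
        # older scheme: one bridge per VLAN number, regardless of the NIC
        return "cloudVirBr%s" % vlan_id
    # newer scheme: the physical interface is part of the name, so the same
    # VLAN number can exist on two different interfaces
    return "br%s-%s" % (pif, vlan_id)
```

So a host that already carries cloudVirBr bridges keeps getting them, and a
clean host gets the device-qualified names.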

We can easily create some concrete examples for this... such as the
one represented in devcloud-kvm by
tools/devcloud-kvm/devcloud-kvm-advanced.cfg

On Wed, Jul 31, 2013 at 12:06 PM, Edison Su edison...@citrix.com wrote:
 The KVM installation guide at
 http://cloudstack.apache.org/docs/en-US/Apache_CloudStack/4.1.0/html/Installation_Guide/hypervisor-kvm-install-flow.html
 is unnecessarily complicated and inaccurate.
 For example, users don't need to configure vlans on the kvm host
 themselves; cloudstack-agent will create the vlans automatically.
 All users need to do is create bridges (if the default bridge created by
 cloudstack-agent is not enough), then add these bridge names in the
 cloudstack mgt server UI during zone creation.

 -Original Message-
 From: Noel Kendall [mailto:noeldkend...@hotmail.com]
 Sent: Wednesday, July 31, 2013 9:49 AM
 To: users@cloudstack.apache.org
 Subject: CS 4.1.0 - this will help a number of people who struggle with 
 Advanced Networking

 The documentation for installation in a KVM environment is utterly misleading.
 The documentation reads as though one can set up the bridge for the public 
 network with any name one chooses, the default being cloudbr0.
 You cannot use just any old name. That simply will not work.
 Let's suppose I have a public network that I isolate on VLAN 5, which is 
 interfaced on ethernet adapter eth4. I will need to define an adapter eth4.5 
 with VLAN set to yes.
 So far, so good.
 Next, for the bridge...
 By enabling debugging output in the log, I was able to see that the code 
 looks for a bridge with the name cloudVirBr5 for my public network.
 I tried several different approaches; none would work unless I named my
 bridge cloudVirBr5 and set the traffic label in the network configuration
 to the same name.
 I have seen numerous posts in the mailing lists, blog entries, you name it, 
 representing frustrations of throngs of users trying to validate a CS setup.
 The documentation is utterly wrong and misleading.
 Summary:
 Does not work: traffic label cloudbr0 with eth4.5 pointing to cloudbr0. The
 code still tries to create a breth4.5 and enlist eth4.5 to it, but cannot
 because it is already enlisted to cloudbr0.
 Good luck everyone with advanced networking with VLAN isolation on CentOS KVM 
 hosts.



Re: Unable to migrate VM to another host

2013-07-23 Thread Marcus Sorensen
There is no cloud user on the host/agent side. Agent runs as root. Or is
this an experiment to try to run the agent as another user?
On Jul 23, 2013 4:44 AM, Prasanna Santhanam t...@apache.org wrote:

 On Tue, Jul 23, 2013 at 06:19:06PM +0800, Indra Pramana wrote:
  Dear Prasanna and all,
 
  I have tried to remove the host from CloudStack, uninstall and reinstall
  cloudstack-agent and added the host back into CloudStack. The cloud
  user-id is still not yet created. I then tried to add the cloud user
  manually, using exactly the same credentials as the cloud user on the
  other host.
 
  From the host, I tried to do a virsh connect qemu+ssh to the other host
  using the cloud user (instead of root), and getting this error:
 
  cloud@hv-kvm-01:~$ virsh --connect qemu+ssh://cloud@10.237.3.22/system list
  cloud@10.237.3.22's password:
  error: failed to connect to the hypervisor
  error: no valid connection
  error: End of file while reading data: : Input/output error
 
  If you notice, this is exactly the same error message I see in
  management-server.log when I try to do a live migration of a VM. So I tried
  to follow this lead and implemented these instructions to allow the cloud
  user to have access to the hypervisor on the other host:
 
  http://wiki.libvirt.org/page/SSHSetup
 
  usermod -G libvirtd -a cloud
 
  Enabled these on /etc/libvirt/libvirtd.conf:
 
  unix_sock_group = libvirtd
  unix_sock_rw_perms = 0770
 
  I also copied the SSH keys of the cloud user to the other host, so that it
  will not prompt for a password.
 
  And I am now able to do a virsh connect using the cloud user:
 
  cloud@hv-kvm-01:~$ virsh --connect qemu+ssh://cloud@10.237.3.22/system list
   Id Name                 State
  ----------------------------------
   2  i-2-275-VM           running
   3  i-2-276-VM           running
   4  i-2-293-VM           running
 
  But I am still not able to perform live migration of VM.
 
  May I know how CloudStack connects to the hypervisors when it performs a
  live migration, after finding a suitable target host? Does it ask the
  source host to do a qemu+ssh connect to the target host?
 

 I'm not sure about the specifics here. Perhaps someone else will chime
 in - But from looking at the resource for KVM in cloudstack it appears
 the libvirt XML contains a qemu+tcp:// connection on migrate.
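
 If migration does go over qemu+tcp, a successful qemu+ssh test as the cloud
 user does not exercise the same path. The two URI forms differ as in this
 small sketch. The helper name is made up, and whether CloudStack really uses
 the tcp transport here is only the reading given above, not something
 confirmed in this thread.

```python
def libvirt_uri(host, transport="tcp", user=None):
    """Build a system-level QEMU connection URI for a remote libvirtd."""
    authority = "%s@%s" % (user, host) if user else host
    return "qemu+%s://%s/system" % (transport, authority)
```

 Passing the tcp form for the target host to virsh --connect from the source
 host (followed by list) would show whether libvirtd on the target accepts
 TCP connections at all.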

 Can you tell us if the KVM hosts share the same management subnet?
 Are they in the same cluster as CloudStack denotes a cluster?

 Is there a way to trap the XMLs sent to the KVM resource in the
 libvirt logs? I'd try to enable that if so.

 --
 Prasanna.,

 
 Powered by BigRock.com




Re: cloudstack 4.1 QinQ vlan behaviour

2013-07-10 Thread Marcus Sorensen
I created that document, as a suggestion. I never got feedback. The
way it worked previously was sort of  a happy accident, which was
'fixed' when the code changed to accept overlapping vlan numbers on
multiple physical devices (hence the bridge name change).

However... I believe there is still a way to do what you want with the
stock code. What is your guest KVM traffic label set to?  Cloudstack
is looking for the 'parent' physical device of the bridge, so if it
sees that it's on a vlan, it goes up one more to find the real device.
It only does this once. So if instead of:

cloudbrguest      8000.90e2ba317614   yes vlan211

You create:

cloudbrguest-10   8000.90e2ba317614   yes vlan211.10

And tell it to use cloudbrguest-10 as the traffic label, it will go up
one from vlan211.10 and settle on vlan211 as the physical device. The nice
thing about the new behavior is that I believe it will work on ANY
type, not just 'vlan' ones (so you could bond, for instance).
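
The "go up one level, once" behaviour can be modelled as below. This is only
a sketch of the idea, not the agent's implementation. The dictionary stands
in for what the "Device:" line of /proc/net/vlan/<name> reports, and the
names are illustrative.

```python
def physical_device(bridge_member, vlan_parents):
    """Resolve the device VLANs get created on, going up at most one level.

    vlan_parents maps a VLAN interface name to its parent device.
    """
    # Only one hop is taken: if the member is a VLAN device, use its parent.
    return vlan_parents.get(bridge_member, bridge_member)
```

With a bridge on vlan211 the lookup lands on the bond underneath, while a
bridge on vlan211.10 lands on vlan211, which is the effect wanted for QinQ.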

On Wed, Jul 10, 2013 at 2:34 AM, Valery Ciareszka
valery.teres...@gmail.com wrote:
 Hi all.

 I was able to change vlan creation behaviour by source code modification
 (plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java),
 had to comment several lines of code:

  private String getPif(String bridge) {
      String pif = matchPifFileInDirectory(bridge);
      //File vlanfile = new File("/proc/net/vlan/" + pif);

      //if (vlanfile.isFile()) {
      //    pif = Script.runSimpleBashScript("grep ^Device\\: /proc/net/vlan/"
      //            + pif + " | awk {'print $2'}");
      //}

      return pif;
  }

 Could someone please comment on this new vlan creation behaviour? Why does
 it try to create the vlan on the real physical device rather than on the
 vlan device (vlan in vlan)? There is nothing about this in the documentation.
 I have found Q-in-Q for isolated networks functional spec -
 https://cwiki.apache.org/CLOUDSTACK/q-in-q-for-isolated-networks-functional-spec.html
 The admin simply needs to create any 'vlan#' devices, and CloudStack uses
 them as physical devices.

 That worked for me in CS 4.0.2. But as you can see, current version of
 cloudstack DOES NOT use 'vlan#' devices as physical devices!!!
 Is that a bug ?



 On Tue, Jul 9, 2013 at 12:39 PM, Valery Ciareszka valery.teres...@gmail.com
 wrote:

 So, nobody uses q in q and cloudstack 4.1 ?


 On Mon, Jul 8, 2013 at 3:13 PM, Valery Ciareszka 
 valery.teres...@gmail.com wrote:

 Hi all,

 I use the following environment: CS 4.1, KVM, Centos 6.4
 (management+node1+node2), OpenIndiana NFS server as primary and secondary
 storage
 I have advanced networking in zone. I split management/public/guest
 traffic into different vlans, and use kvm network labels (bridge names):
 # cat /etc/cloud/agent/agent.properties |grep device
 guest.network.device=cloudbrguest
 private.network.device=cloudbrmanage
 public.network.device=cloudbrpublic

 I have following network configuration:
 eth0+eth1=bond0
 eth2+eth3=bond1

 I use  vlan with id=211 on bond1 interface for guest traffic:
 cloudbrguest    8000.90e2ba317614   yes vlan211
 cloudbrmanage   8000.90e2ba317614   yes bond1.210
 cloudbrpublic   8000.90e2ba317614   yes bond1.221
 cloudbrstor     8000.0025908814a4   yes bond0


 The problem appeared after I have upgraded CS from 4.0.2 to 4.1.

 How it works in 4.0.2:
 -bridge interface cloudVirBr#VLANID is created on hypervisor, #VLANID -
 value from 1024 to 4096(is specified when creating zone), i.e.
 cloudVirBr1224
 -vlan interface vlan211.#VLANID is created on hypervisor and is plugged
 into cloudVirBr#VLANID
 I only had to permit vlan id 211 on the switch ports, and all guest traffic
 (vlans 1024-4096) was encapsulated inside it.

 How it works in 4.1:
 -bridge interface br#ETHNAME-#VLANID is created on hypervisor, where
 #VLANID - value from 1024 to 4096(is specified when creating zone) and
 #ETHNAME - name of device on top of which vlan will be created
 i.e. brbond1-1224
 -vlan interface bond1.#VLANID is created on hypervisor and is plugged
 into br#ETHNAME-#VLANID
 However, vlan interface is created on top of bond1 interface, while I
 would like it to be created on top of vlan211 (bond1.211)
 Now I have to permit vlan ids 1024-4096 on the switch ports, which is not
 convenient.

 How do I configure CS 4.1 so that it could work with guest vlans the same
 way as it had worked in CS 4.0 ?

 --
 Regards,
 Valery

 http://protocol.by/slayer




 --
 Regards,
 Valery

 http://protocol.by/slayer




 --
 Regards,
 Valery

 http://protocol.by/slayer


Re: ebtables

2013-04-19 Thread Marcus Sorensen
you can go back and disable security groups in the zone if you don't care
about the ebtables rules, or you can start up ebtables and then restart any
associated VMs through cloudstack. The rules are dynamic, so they're not
going to be saved anywhere on the host to be reinstated, they have to be
reapplied by cloudstack via a restart of the vms.


On Fri, Apr 19, 2013 at 11:12 AM, Maurice Lawler maurice.law...@me.com wrote:

 Anyone know how to correct my mistake?

 - Maurice


 On Apr 19, 2013, at 2:01 AM, Maurice Lawler maurice.law...@me.com wrote:

  Perhaps this was not the best thing; now my ports are open. How can I
  revert to ebtables?
 
  Along with that, when reverted, how can I drop rules for a particular VM
 to allow communication via second IP address.
 
 
  On Apr 18, 2013, at 10:34 PM, Maurice Lawler maurice.law...@me.com
 wrote:
 
  Disregard, for now, I have disabled/removed ebtables as shown here:
 
 
 http://mail-archives.apache.org/mod_mbox/incubator-cloudstack-users/201302.mbox/%3cb1df26ecc0458748ac97cece2da98d41012fa47b6...@sjcpmailbox01.citrite.net%3E
 
 
  On Apr 18, 2013, at 11:28 PM, Maurice Lawler maurice.law...@me.com
 wrote:
 
  Hello --
 
  Previously one told me how to do this, but I cannot find my notes on
 this, so I hope you can help me out.
 
  I am attempting to allow a secondary IP address on an instance by-pass
 the routing rules set forth in ebtables. I recall doing something like
 
  ebtables nat i-2-25-VM something ... I cannot for the life of me
 remember.
 
  How to list and/or drop the rules per VM.
 
  Can you guys assist?
 




Re: IP tables blocking KVM/Console

2013-04-19 Thread Marcus Sorensen
That's reflected by this line:

ACCEPT tcp  --  anywhere anywhere tcp dpts:vnc-server:synchronet-db

Although we don't know what interfaces it applies to because we don't have
an 'iptables -L -v'

If stopping iptables fixes Maurice's problem it would be interesting to
know, as the rules seem to let VNC through. It should be easy to tcpdump
and see what traffic is actually being blocked because his rules suggest
that VNC is wide open on the KVM host.
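
The reasoning is that iptables stops at the first matching rule, so an
ACCEPT that sits above the final REJECT wins. A toy model of that, under
stated assumptions: the rule list is a simplified subset of the listing
(interfaces and connection state are ignored, since they are not visible
without -v), and 5900 and 6100 are assumed to be the port numbers behind
vnc-server and synchronet-db.

```python
def first_match(rules, proto, port):
    """Return the target of the first rule matching a packet, iptables-style."""
    for target, rule_proto, ports in rules:
        proto_ok = rule_proto in ("all", proto)
        port_ok = ports is None or ports[0] <= port <= ports[1]
        if proto_ok and port_ok:
            return target
    return "ACCEPT"  # the chain policy in the listing is ACCEPT


# Simplified subset of the INPUT chain, in listing order.
INPUT = [
    ("ACCEPT", "tcp", (49152, 49216)),
    ("ACCEPT", "tcp", (5900, 6100)),   # assumed vnc-server:synchronet-db
    ("ACCEPT", "tcp", (16509, 16509)),
    ("ACCEPT", "tcp", (22, 22)),
    ("REJECT", "all", None),
]
```

In this model a TCP packet to port 5900 is accepted before the REJECT rule
is ever reached, while a port with no ACCEPT rule of its own falls through
to REJECT.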


On Fri, Apr 19, 2013 at 12:15 PM, Edison Su edison...@citrix.com wrote:

 This rule will reject all ingress traffic:
 REJECT all  --  anywhere anywhere reject-with icmp-host-prohibited
 You can try:
 iptables -I INPUT -p tcp -m tcp --dport 5900:6100 -j ACCEPT
 to allow console access.

 From: Maurice Lawler [mailto:maurice.law...@me.com]
 Sent: Wednesday, April 17, 2013 7:48 PM
 To: Cloud Dev
 Cc: users@cloudstack.apache.org; users@cloudstack.apache.org
 Subject: IP tables blocking KVM/Console

 I have stopped iptables at least 15 times, because it keeps blocking
 console access to my instances. How can I either a) disable iptables
 altogether or b) add a rule to allow this access?

 Right now, it has this:

 [root@lunder ~]# iptables -L
 Chain INPUT (policy ACCEPT)
 target     prot opt source      destination
 ACCEPT     udp  --  anywhere    anywhere     udp dpt:bootps
 ACCEPT     tcp  --  anywhere    anywhere     tcp dpt:bootps
 ACCEPT     tcp  --  anywhere    anywhere     tcp dpts:49152:49216
 ACCEPT     tcp  --  anywhere    anywhere     tcp dpts:vnc-server:synchronet-db
 ACCEPT     tcp  --  anywhere    anywhere     tcp dpt:16509
 ACCEPT     tcp  --  anywhere    anywhere     tcp dpt:websm
 ACCEPT     tcp  --  anywhere    anywhere     tcp dpt:8250
 ACCEPT     tcp  --  anywhere    anywhere     tcp dpt:empowerid
 ACCEPT     tcp  --  anywhere    anywhere     tcp dpt:webcache
 ACCEPT     all  --  anywhere    anywhere     state RELATED,ESTABLISHED
 ACCEPT     icmp --  anywhere    anywhere
 ACCEPT     all  --  anywhere    anywhere
 ACCEPT     tcp  --  anywhere    anywhere     state NEW tcp dpt:ssh
 REJECT     all  --  anywhere    anywhere     reject-with icmp-host-prohibited

 Chain FORWARD (policy ACCEPT)
 target prot opt source   destination

 Chain OUTPUT (policy ACCEPT)
 target prot opt source   destination
 [root@lunder ~]#

 But there were plenty of other rules before I stopped it.





Re: IP tables blocking KVM/Console

2013-04-19 Thread Marcus Sorensen
What do you see in :



On Fri, Apr 19, 2013 at 2:17 PM, Maurice Lawler maurice.law...@me.com wrote:

 I've tried it with security groups disabled (iptables rules still get
 written) and enabled (the same issue).

 The cron job seemed to do the trick, until someone suggested trying:

   iptables -I INPUT -p tcp -m tcp --dport 5900:6100 -j ACCEPT

 That's not working, so I am going back to my cron job!

 - Maurice


 On Apr 19, 2013, at 02:08 PM, Edison Su edison...@citrix.com wrote:



  -----Original Message-----
  From: Jason Pavao [mailto:jason.pa...@oracle.com]
  Sent: Thursday, April 18, 2013 8:50 AM
  To: d...@cloudstack.apache.org
  Cc: Maurice Lawler; users@cloudstack.apache.org
  Subject: Re: IP tables blocking KVM/Console
 
  Maurice,
  I was having the same issues. I tried a number of iptables rule changes,
  but it seems that whenever a new instance was deployed it would overwrite
  my changes and break things again. My temporary fix is a cron job that
  runs every minute and issues a service iptables stop.

 Did you disable security groups when creating the zone? If security groups
 are disabled, no iptables rules should be created on the KVM host when a
 new instance is created.

 
  It's not elegant, but it works, since I don't need security groups and
  am supporting a Jenkins continuous testing environment with no need for
  network ingress/egress rules.
 
  Does anyone else know why this is happening?
 
  I am running CloudStack 4.0.1 on OEL 6.3 x64.
 
  Any help would be appreciated.
  Thanks.
  -jason
 
  On 4/17/2013 7:47 PM, Maurice Lawler wrote:
   I have stopped iptables at least 15 times, because it keeps blocking
   my console access to my instances. How can I either a) disable
   iptables altogether, or b) add a rule to allow console access?
  
   Right now, it has this:
  
   [root@lunder ~]# iptables -L
   Chain INPUT (policy ACCEPT)
   target prot opt source destination
   ACCEPT udp -- anywhere anywhere udp
   dpt:bootps
   ACCEPT tcp -- anywhere anywhere tcp
   dpt:bootps
   ACCEPT tcp -- anywhere anywhere tcp
   dpts:49152:49216
   ACCEPT tcp -- anywhere anywhere tcp
   dpts:vnc-server:synchronet-db
   ACCEPT tcp -- anywhere anywhere tcp
   dpt:16509
   ACCEPT tcp -- anywhere anywhere tcp
   dpt:websm
   ACCEPT tcp -- anywhere anywhere tcp dpt:8250
   ACCEPT tcp -- anywhere anywhere tcp
   dpt:empowerid
   ACCEPT tcp -- anywhere anywhere tcp
   dpt:webcache
   ACCEPT all -- anywhere anywhere state
   RELATED,ESTABLISHED
   ACCEPT icmp -- anywhere anywhere
   ACCEPT all -- anywhere anywhere
   ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:ssh
   REJECT all -- anywhere anywhere reject-with
   icmp-host-prohibited
  
   Chain FORWARD (policy ACCEPT)
   target prot opt source destination
  
   Chain OUTPUT (policy ACCEPT)
   target prot opt source destination
   [root@lunder ~]#
  
   But there were plenty of other rules before I stopped it.
  
  
 
  --
  Thanks.
  -Jason




Re: IP tables blocking KVM/Console

2013-04-19 Thread Marcus Sorensen
what do you see in:

 cat /proc/sys/net/bridge/bridge*

?  I think I've seen issues with these being set to 1, but they might
need to be set to 1 if you're using security groups.
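
Because cat on a glob prints only the values, it can be hard to tell which setting is which. A small sketch that prints each file name next to its value; the function name and the optional directory argument are illustrative, not part of any CloudStack tooling.

```shell
#!/bin/sh
# Hypothetical helper: print "name = value" for each bridge netfilter setting.
show_bridge_nf() {
    dir="${1:-/proc/sys/net/bridge}"
    for f in "$dir"/bridge*; do
        # Skip the unexpanded pattern when nothing matches the glob.
        [ -f "$f" ] || continue
        printf '%s = %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}
```

Called without arguments it reads /proc/sys/net/bridge; if the bridge module is not loaded, that directory is absent and nothing is printed.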


On Fri, Apr 19, 2013 at 5:20 PM, Marcus Sorensen shadow...@gmail.com wrote:

 What do you see in :








Re: Emergency: Cloud NOT starting

2013-04-13 Thread Marcus Sorensen
-SecondaryStorageManagerImpl.startSecStorageVm:257
 2013-04-13 12:43:28,970 INFO  [cloud.vm.VirtualMachineManagerImpl] (secstorage-1:null) Insufficient capacity
 com.cloud.exception.InsufficientAddressCapacityException: Unable to get a management ip addressScope=interface com.cloud.dc.Pod; id=1
     at com.cloud.network.guru.PodBasedNetworkGuru.reserve(PodBasedNetworkGuru.java:121)
     at com.cloud.network.NetworkManagerImpl.prepareNic(NetworkManagerImpl.java:2143)
     at com.cloud.network.NetworkManagerImpl.prepare(NetworkManagerImpl.java:2113)
     at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:752)
     at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:472)
     at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:465)
     at com.cloud.storage.secondary.SecondaryStorageManagerImpl.startSecStorageVm(SecondaryStorageManagerImpl.java:257)
     at com.cloud.storage.secondary.SecondaryStorageManagerImpl.allocCapacity(SecondaryStorageManagerImpl.java:684)
     at com.cloud.storage.secondary.SecondaryStorageManagerImpl.expandPool(SecondaryStorageManagerImpl.java:1310)
     at com.cloud.secstorage.PremiumSecondaryStorageManagerImpl.scanPool(PremiumSecondaryStorageManagerImpl.java:119)
     at com.cloud.secstorage.PremiumSecondaryStorageManagerImpl.scanPool(PremiumSecondaryStorageManagerImpl.java:50)
     at com.cloud.vm.SystemVmLoadScanner.loadScan(SystemVmLoadScanner.java:106)
     at com.cloud.vm.SystemVmLoadScanner.access$100(SystemVmLoadScanner.java:34)
     at com.cloud.vm.SystemVmLoadScanner$1.reallyRun(SystemVmLoadScanner.java:83)
     at com.cloud.vm.SystemVmLoadScanner$1.run(SystemVmLoadScanner.java:73)
     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
     at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
     at java.lang.Thread.run(Thread.java:679)
 2013-04-13 12:43:28,973 DEBUG [cloud.vm.VirtualMachineManagerImpl]
 (secstorage-1:null) Cleaning up resources for the vm
 VM[SecondaryStorageVm|s-588-VM] in Starting state
 2013-04-13 12:43:28,975 DEBUG [agent.transport.Request]
 (secstorage-1:null) Seq 1-751304715: Waiting for Seq 751304714 Scheduling:
  { Cmd , MgmtId: 219948120943996, via: 1, Ver: v1, Flags: 100111,
 [{StopCommand:{isProxy:false,vmName:s-588-VM,wait:0}}] }
 2013-04-13 12:43:29,186 DEBUG
 [network.router.VirtualNetworkApplianceManagerImpl]
 (RouterStatusMonitor-1:null) Found 0 routers.
 2013-04-13 12:43:37,927 DEBUG [agent.manager.AgentManagerImpl]
 (AgentManager-Handler-14:null) Ping from 1
 2013-04-13 12:43:43,240 DEBUG [cloud.server.StatsCollector]
 (StatsCollector-3:null) VmStatsCollector is running...
 2013-04-13 12:43:43,323 DEBUG [cloud.server.StatsCollector]
 (StatsCollector-3:null) StorageCollector is running...
 2013-04-13 12:43:43,327 DEBUG [cloud.server.StatsCollector]
 (StatsCollector-3:null) There is no secondary storage VM for secondary
 storage host nfs://96.31.67.232/secondary
 2013-04-13 12:43:43,400 DEBUG [agent.transport.Request]
 (StatsCollector-3:null) Seq 1-751304716: Received:  { Ans: , MgmtId:
 219948120943996, via: 1, Ver: v1, Flags: 10, { GetStorageStatsAnswer } }
 2013-04-13 12:43:43,936 DEBUG [cloud.server.StatsCollector]
 (StatsCollector-3:null) HostStatsCollector is running...
 2013-04-13 12:43:44,545 DEBUG [agent.transport.Request]
 (StatsCollector-3:null) Seq 1-751304717: Received:  { Ans: , MgmtId:
 219948120943996, via: 1, Ver: v1, Flags: 10, { GetHostStatsAnswer } }
 2013-04-13 12:43:58,231 DEBUG [cloud.server.ManagementServerImpl]
 (EventChecker-1:null) Deleting events older than: Fri Apr 12 12:43:58 CDT
 2013
 2013-04-13 12:43:58,233 DEBUG [cloud.server.ManagementServerImpl]
 (EventChecker-1:null) Found 0 events to be purged
 2013-04-13 12:43:58,235 DEBUG [cloud.server.ManagementServerImpl]
 (EventChecker-1:null) Deleting events older than: Fri Apr 12 12:43:58 CDT
 2013
 2013-04-13 12:43:58,238 DEBUG [cloud.server.ManagementServerImpl]
 (EventChecker-1:null) Found 0 events to be purged
 2013-04-13 12:43:59,186 DEBUG
 [network.router.VirtualNetworkApplianceManagerImpl]
 (RouterStatusMonitor-1:null) Found 0 routers.
 [root@lunder agent]#




 On Apr 13, 2013, at 12:30 PM, Marcus Sorensen shadow...@gmail.com wrote:

  Well you've got something trying to start, because you have vnet
  interfaces. You need to look

Re: Emergency: Cloud NOT starting

2013-04-13 Thread Marcus Sorensen
A brctl show would also be good to have.
On Apr 13, 2013 11:52 AM, Marcus Sorensen shadow...@gmail.com wrote:

 If you do a virsh list on the agent there's a good chance you would see
 a VM running, however the system will only wait so long for it to boot up
 before shutting it down, so it will come and go. You can do virsh
 vncdisplay (vmname) and it will tell you what port to vnc to on the host
 in order to connect to the VM and see what state it is in.
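
As a rough illustration of the virsh steps just described: virsh vncdisplay reports a display such as ":1", and VNC display N listens on TCP port 5900 + N. The function names below are hypothetical, and s-588-VM is simply the system VM name that appears in the management log in this thread.

```shell
#!/bin/sh
# Convert a VNC display such as ":1" or "127.0.0.1:1" to a TCP port.
vnc_display_to_port() {
    display="${1##*:}"          # keep the part after the last colon
    echo $((5900 + display))
}

# Print the VNC port for a running libvirt domain (needs virsh on the host).
vm_vnc_port() {
    vnc_display_to_port "$(virsh vncdisplay "$1")"
}

# Example (on the KVM host):
#   virsh list
#   vm_vnc_port s-588-VM
```

Since the system VM may be shut down again after a short wait, the lookup may need to be repeated while the VM is in the Starting state.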

 I see in the agent log that at one point it failed to start due to no
 private bridge. Is cloudbr0 your private as defined in agent.properties?

 You can also open /etc/cloud/agent/log4j-cloud.xml and change every INFO
 to DEBUG, restart the agent, and get more info.
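
One way to apply the log-level change described above is a sed substitution that keeps a backup copy. This is only a sketch: the function names are hypothetical, and the service name cloud-agent is an assumption for a 4.0.x package install and may differ on other versions.

```shell
#!/bin/sh
# Hypothetical helper: switch every INFO to DEBUG in the agent's log4j config.
enable_agent_debug() {
    conf="${1:-/etc/cloud/agent/log4j-cloud.xml}"
    # -i.bak edits in place and leaves the original as <file>.bak.
    sed -i.bak 's/INFO/DEBUG/g' "$conf"
}

# Assumption: the agent service is named cloud-agent on this release.
restart_agent() {
    service cloud-agent restart
}
```

To undo the change, move the .bak file back over the edited one and restart the agent again.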
 On Apr 13, 2013 11:45 AM, Maurice Lawler maurice.law...@me.com wrote:

 Thank you.

 The fsck was already completed during boot; it was forced. However,
 how can I access the VMs while they are in the Starting state to see if
 they need an fsck?

 Agent log is showing this presently.


 2013-04-13 12:35:09,989 INFO  [cloud.agent.Agent]
 (AgentShutdownThread:null) Stopping the agent: Reason = sig.kill
 2013-04-13 12:37:32,244 INFO  [utils.component.ComponentLocator]
 (main:null) Unable to find components.xml
 2013-04-13 12:37:32,285 INFO  [utils.component.ComponentLocator]
 (main:null) Skipping configuration using components.xml
 2013-04-13 12:37:32,285 INFO  [cloud.agent.AgentShell] (main:null)
 Implementation Version is 4.0.1.20130201075054
 2013-04-13 12:37:32,286 INFO  [cloud.agent.AgentShell] (main:null)
 agent.properties found at /etc/cloud/agent/agent.properties
 2013-04-13 12:37:32,287 INFO  [cloud.agent.AgentShell] (main:null)
 Defaulting to using properties file for storage
 2013-04-13 12:37:32,289 INFO  [cloud.agent.AgentShell] (main:null)
 Defaulting to the constant time backoff algorithm
 2013-04-13 12:37:32,413 INFO  [cloud.agent.Agent] (main:null) id is 1
 2013-04-13 12:37:32,418 ERROR [cloud.resource.ServerResourceBase]
 (main:null) Nics are not configured!
 2013-04-13 12:37:32,420 ERROR [cloud.agent.AgentShell] (main:null) Unable
 to start agent: Private NIC is not configured
 2013-04-13 12:42:30,653 INFO  [utils.component.ComponentLocator]
 (main:null) Unable to find components.xml
 2013-04-13 12:42:30,654 INFO  [utils.component.ComponentLocator]
 (main:null) Skipping configuration using components.xml
 2013-04-13 12:42:30,654 INFO  [cloud.agent.AgentShell] (main:null)
 Implementation Version is 4.0.1.20130201075054
 2013-04-13 12:42:30,655 INFO  [cloud.agent.AgentShell] (main:null)
 agent.properties found at /etc/cloud/agent/agent.properties
 2013-04-13 12:42:30,656 INFO  [cloud.agent.AgentShell] (main:null)
 Defaulting to using properties file for storage
 2013-04-13 12:42:30,658 INFO  [cloud.agent.AgentShell] (main:null)
 Defaulting to the constant time backoff algorithm
 2013-04-13 12:42:30,721 INFO  [cloud.agent.Agent] (main:null) id is 1
 2013-04-13 12:42:30,820 INFO
  [resource.virtualnetwork.VirtualRoutingResource] (main:null)
 VirtualRoutingResource _scriptDir to use: scripts/network/domr/kvm
 2013-04-13 12:42:32,094 INFO  [kvm.resource.LibvirtComputingResource]
 (main:null) No libvirt.vif.driver specififed. Defaults to BridgeVifDriver.
 2013-04-13 12:42:32,147 INFO  [cloud.agent.Agent] (main:null) Agent [id =
 1 : type = LibvirtComputingResource : zone = 1 : pod = 1 : workers = 5 :
 host = 96.31.67.232 : port = 8250
 2013-04-13 12:42:32,154 INFO  [utils.nio.NioClient] (Agent-Selector:null)
 Connecting to myipaddress:8250
 2013-04-13 12:42:32,444 INFO  [utils.nio.NioClient] (Agent-Selector:null)
 SSL: Handshake done
 2013-04-13 12:42:32,599 INFO  [cloud.serializer.GsonHelper]
 (Agent-Handler-1:null) Default Builder inited.
 2013-04-13 12:42:32,803 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
 Proccess agent startup answer, agent id = 1
 2013-04-13 12:42:32,803 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
 Set agent id 1
 2013-04-13 12:42:32,808 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
 Startup Response Received: agent id = 1


 The management log says this:

 2013-04-13 12:43:28,952 DEBUG [cloud.network.NetworkManagerImpl]
 (secstorage-1:null) Lock is released for network id 201 as a part of
 network implement
 2013-04-13 12:43:28,969 DEBUG [db.Transaction.Transaction]
 (secstorage-1:null) Rolling back the transaction: Time = 1 Name =
  
 -SystemVmLoadScanner$1.run:71-Executors$RunnableAdapter.call:471-FutureTask$Sync.innerRunAndReset:351-FutureTask.runAndReset:178-ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201:165-ScheduledThreadPoolExecutor$ScheduledFutureTask.run:267-ThreadPoolExecutor.runWorker:1146-ThreadPoolExecutor$Worker.run:615-Thread.run:679;
 called by
 -Transaction.rollback:887-DataCenterIpAddressDaoImpl.takeIpAddress:57-DatabaseCallback.intercept:34-DataCenterDaoImpl.allocatePrivateIpAddress:228-DatabaseCallback.intercept:34-PodBasedNetworkGuru.reserve:119-NetworkManagerImpl.prepareNic:2143-NetworkManagerImpl.prepare:2113

Re: Cloudstack KVM Failing

2013-04-02 Thread Marcus Sorensen
All I can suggest is to troubleshoot step-by-step.

Can you connect directly to the VMs via VNC client?

Can you log into consoleproxy system VM?

Can consoleproxy system VM ping the KVM hosts?

Can consoleproxy system VM telnet to VM VNC server ports?

Is consoleproxy running the java consoleproxy process?

Can you ping consoleproxy's public address from where your UI is running?

Do consoleproxy's routes allow it to get back to the system where your UI
is running?

etc...
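
The checklist above could be scripted roughly as follows. This is a sketch under stated assumptions: the helper names are hypothetical, the host names in the usage comments are placeholders, and tcp_open relies on bash's /dev/tcp redirection plus the coreutils timeout command.

```shell
#!/bin/sh
# Run a command quietly and report whether it succeeded.
run_check() {
    label="$1"; shift
    if "$@" >/dev/null 2>&1; then
        echo "ok   $label"
    else
        echo "FAIL $label"
    fi
}

# Succeed if a TCP connection to host:port opens within 3 seconds.
tcp_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2"
}

# Example usage (placeholder names; run from the relevant machine):
#   run_check "ping KVM host"        ping -c 1 -W 2 kvm-host.example
#   run_check "VNC port on KVM host" tcp_open kvm-host.example 5900
#   run_check "ping console proxy"   ping -c 1 -W 2 consoleproxy.example
```

Running the checks in the order of the list narrows the problem to the first step that reports FAIL.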


On Mon, Apr 1, 2013 at 11:15 AM, Maurice Lawler maurice.law...@me.com wrote:

 I added ports 5900+ to the ingress rules for that particular security
 profile; I would think that is how it would be done, correct? I restarted
 the proxy VM again, yet I am still unable to get the KVM console to
 connect. The instances are fine, because I can access them via SSH
 without issue.

 What would you suggest?

 On Mar 31, 2013, at 12:36 AM, Marcus Sorensen shadow...@gmail.com wrote:

  This looks like your console proxy VM isn't working. It could be a
  firewall on the KVM host no longer allowing access to ports 5900+, or the
  consoleproxy VM may need to be restarted (the system VM whose name starts
  with v). Has it worked in the past? You could try connecting to the
  instances with a VNC client to see if they are accessible, skipping the
  console proxy.
  On Mar 30, 2013 10:40 PM, Maurice Lawler maurice.law...@me.com
 wrote:
 
  For some reason (I am unsure of the cause), attempting to view the KVM
  console of any instance times out:
 
  cloudstack realhostip.com took too long to respond
 
  Has anyone encountered this error and how can it be corrected?
 
  Thanks,
  Maurice