Re: some button labels not displaying correctly in the UI 4.19

2024-03-29 Thread Marcus Torres
I'm embarrassed to say that worked. Sorry for wasting your time, lol.
Thank you, and thank you guys for all the work on this project.

On Fri, Mar 29, 2024 at 1:19 PM Wei ZHOU  wrote:

> It seems like an issue with the browser cache.
> Try incognito mode, or clear the browser cache.
>
> -Wei
>
> On Fri, Mar 29, 2024 at 5:55 PM Marcus Torres  wrote:
> >
> > Hi!
> > First off, I would like to tip my hat to everyone involved in the release
> > of 4.19.0. The major new features and changes just put CloudStack over the
> > top in terms of functionality and feature sets compared to competitive
> > platforms. It's quite amazing.
> >
> > After upgrading to 4.19.0 (management server first, usage second,
> > hypervisors third, then a restart of all services), I noticed that some of
> > the new buttons in the UI display 'label.x' instead of a proper title. Or
> > is it like that intentionally?
> > For instance, drilling down to a cluster view, the DRS tab shows
> > 'label.drs', and if I click that tab, the button to generate a plan is
> > labeled 'label.drs.generate.plan' instead of 'Generate DRS Plan' as the
> > release screenshots show. I'm seeing the same thing with label.bucks and
> > label.object.storage.
> >
> > The buttons work; it seems to be just a display issue. I have screenshots
> > if needed.
>


some button labels not displaying correctly in the UI 4.19

2024-03-29 Thread Marcus Torres
Hi!
First off, I would like to tip my hat to everyone involved in the release of
4.19.0. The major new features and changes just put CloudStack over the top
in terms of functionality and feature sets compared to competitive
platforms. It's quite amazing.

After upgrading to 4.19.0 (management server first, usage second,
hypervisors third, then a restart of all services), I noticed that some of the
new buttons in the UI display 'label.x' instead of a proper title. Or is it
like that intentionally?
For instance, drilling down to a cluster view, the DRS tab shows
'label.drs', and if I click that tab, the button to generate a plan is
labeled 'label.drs.generate.plan' instead of 'Generate DRS Plan' as the
release screenshots show. I'm seeing the same thing with label.bucks and
label.object.storage.

The buttons work; it seems to be just a display issue. I have screenshots if
needed.


Re: Method to edit global settings from command line

2024-01-30 Thread Marcus Torres
Hello Suresh

1. Cloudstack version 4.18.1.0
2. Management server = Rocky Linux 8.5
3. Hypervisors = Rocky Linux 8.5
4. The only change was enabling SAML in the global config in the UI.
5. I saw some entries in the log about the 'admin' user not being able to
authenticate against SAML. Not sure if it's related.

I've sent the management log to your Gmail address, if that's OK. It's
pretty large and I've scrubbed it of any sensitive data.

Thanks Suresh.

On Tue, Jan 30, 2024 at 11:52 AM Suresh Kumar Anaparti <
sureshkumar.anapa...@gmail.com> wrote:

> Hi Marcus,
>
> Thanks for the update.
>
> There may be some issue after enabling SAML. Can you share the CloudStack
> version and the error log from the management server?
>
> Regards,
> Suresh
>
> On Tue, Jan 30, 2024 at 9:21 PM Marcus Torres  wrote:
>
> > @SureshKumarAnaparti
> >
> > That worked! After a restart of the management service, I'm able to hit
> > the UI on port 8080 now! Thank you for that tip!!
> >
> > It's peculiar that simply enabling SAML in the global config and having a
> > faulty SAML config would stop the UI from opening port 8080 to serve the
> > webpage.
> >
> > Thanks again!
> >
> > On Mon, Jan 29, 2024 at 11:32 PM Suresh Kumar Anaparti <
> > sureshkumar.anapa...@gmail.com> wrote:
> >
> > > Hi Marcus,
> > >
> > > You can revert the config (disable SAML) using the UPDATE SQL query
> > > below.
> > >
> > > UPDATE cloud.configuration SET value = 'false' WHERE name = 'saml2.enabled';
> > >
> > > Regards,
> > > Suresh
> > >
> > > On Tue, Jan 30, 2024 at 5:41 AM Marcus Torres wrote:
> > >
> > > > Hi!
> > > > I recently enabled SAML in the global config settings in the UI, and
> > > > upon a restart of the management service, the cloudstack-management
> > > > process starts successfully and I'm seeing activity and traffic to and
> > > > from the hypervisors. It looks like the management server is working,
> > > > but the UI is not reachable at all on port 8080.
> > > >
> > > >
> > > >   *   I do not have SSL/HTTPS enabled
> > > >   *   SELinux is disabled
> > > >   *   iptables is disabled
> > > >   *   I don't see port 8080 open/listening in netstat
> > > >   *   port 9090 is open and listening
> > > >   *   MySQL is up and running fine
> > > >   *   the CloudMonkey API client is no longer able to connect since 8080 is down
> > > >
> > > > The SAML config could in fact be a red herring and unrelated, but
> > > > that's the last change besides adding a new isolated VLAN guest network.
> > > >
> > > > Is there a way to revert or edit, from the command line, global config
> > > > settings that were originally made in the UI?
> > > >
> > > > Thanks for your time!
> > > >
> > >
> >
>


Re: Method to edit global settings from command line

2024-01-30 Thread Marcus Torres
@SureshKumarAnaparti

That worked! After a restart of the management service, I'm able to hit the
UI on port 8080 now! Thank you for that tip!!

It's peculiar that simply enabling SAML in the global config and having a
faulty SAML config would stop the UI from opening port 8080 to serve the
webpage.

Thanks again!

On Mon, Jan 29, 2024 at 11:32 PM Suresh Kumar Anaparti <
sureshkumar.anapa...@gmail.com> wrote:

> Hi Marcus,
>
> You can revert the config (disable SAML) using the UPDATE SQL query below.
>
> UPDATE cloud.configuration SET value = 'false' WHERE name = 'saml2.enabled';
>
> Regards,
> Suresh
>
> On Tue, Jan 30, 2024 at 5:41 AM Marcus Torres  wrote:
>
> > Hi!
> > I recently enabled SAML in the global config settings in the UI, and upon
> > a restart of the management service, the cloudstack-management process
> > starts successfully and I'm seeing activity and traffic to and from the
> > hypervisors. It looks like the management server is working, but the UI is
> > not reachable at all on port 8080.
> >
> >
> >   *   I do not have SSL/HTTPS enabled
> >   *   SELinux is disabled
> >   *   iptables is disabled
> >   *   I don't see port 8080 open/listening in netstat
> >   *   port 9090 is open and listening
> >   *   MySQL is up and running fine
> >   *   the CloudMonkey API client is no longer able to connect since 8080 is down
> >
> > The SAML config could in fact be a red herring and unrelated, but that's
> > the last change besides adding a new isolated VLAN guest network.
> >
> > Is there a way to revert or edit, from the command line, global config
> > settings that were originally made in the UI?
> >
> > Thanks for your time!
> >
>
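
For reference, the revert suggested in this thread can be scripted. The following is a minimal Python sketch, not an official CloudStack tool: it builds the same UPDATE statement for a given setting and a mysql client command line to run it. The helper names (build_update_sql, build_mysql_argv), the 'root' database user and the character whitelist are assumptions for illustration; only the table name, the setting name and the value come from the thread.

```python
import re
import shlex

# Assumption: setting names and values only need letters, digits, '.', '_' and '-'.
# Anything else is rejected so the strings can be embedded in the SQL text safely.
_SAFE = re.compile(r"[A-Za-z0-9._-]+")


def build_update_sql(name: str, value: str) -> str:
    """Return the UPDATE statement from the thread for one global setting."""
    if not (_SAFE.fullmatch(name) and _SAFE.fullmatch(value)):
        raise ValueError("unexpected characters in setting name or value")
    return f"UPDATE cloud.configuration SET value = '{value}' WHERE name = '{name}';"


def build_mysql_argv(sql: str, user: str = "root") -> list[str]:
    """Build a mysql client command line; -p makes the client prompt for a password."""
    return ["mysql", "-u", user, "-p", "-e", sql]


if __name__ == "__main__":
    sql = build_update_sql("saml2.enabled", "false")
    # Print a copy-pasteable shell command instead of running it directly.
    print(shlex.join(build_mysql_argv(sql)))
```

As reported above, the management service still has to be restarted afterwards before the UI comes back on port 8080.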


Method to edit global settings from command line

2024-01-29 Thread Marcus Torres
Hi!
I recently enabled SAML in the global config settings in the UI, and upon a
restart of the management service, the cloudstack-management process starts
successfully and I'm seeing activity and traffic to and from the hypervisors.
It looks like the management server is working, but the UI is not reachable at
all on port 8080.


  *   I do not have SSL/HTTPS enabled
  *   SELinux is disabled
  *   iptables is disabled
  *   I don't see port 8080 open/listening in netstat
  *   port 9090 is open and listening
  *   MySQL is up and running fine
  *   the CloudMonkey API client is no longer able to connect since 8080 is down

The SAML config could in fact be a red herring and unrelated, but that's the
last change besides adding a new isolated VLAN guest network.

Is there a way to revert or edit, from the command line, global config
settings that were originally made in the UI?

Thanks for your time!


Re: Root disk resizing

2021-10-11 Thread Marcus
Cloud-init is always fun to debug :-). It will probably require some
playing around to get a pattern down.

There is perhaps a way to get it to re-check and grow on every reboot:
adjust/override the module frequency, delete the module semaphore in
/var/lib/cloud/sem, or, worst case, clear the metadata via 'cloud-init
clean' or by deleting /var/lib/cloud.

On Mon, Oct 11, 2021 at 3:07 AM Wido den Hollander  wrote:

>
>
> On 10/10/21 10:35 AM, Ranjit Jadhav wrote:
> > Hello folks,
> >
> > I have implemented CloudStack with a XenServer host. The template has been
> > made out of a VM with basic CentOS 7 and the following packages installed on it:
> > 
> > sudo yum -y install cloud-init
> > sudo yum -y install cloud-utils-growpart
> > sudo yum -y install gdisk
> > 
> >
> > After creating a new VM with this template, the root disk is created as per
> > the size mentioned in the template, or we are able to increase it at the
> > time of creation.
> >
> > But later, when we try to increase the root disk again, it increases the
> > disk space but the "/" partition does not get auto-resized.
> >
>
> As far as I know it only grows the partition once, e.g., upon first boot.
> It won't do it again afterwards.
>
> Wido
>
> >
> > Following parameters were passed in userdata
> > 
> > #cloud-config
> > growpart:
> >   mode: auto
> >   devices: ["/"]
> >   ignore_growroot_disabled: true
> > 
> >
> > Thanks & Regards,
> > Ranjit
> >
>


Re: [DISCUSS] Using qemu-kvm vs qemu-kvm-ev with CloudStack

2021-04-10 Thread Marcus
Yes, +1 on EV. It is more current, better maintained and I think it is
generally considered the go-to for EL based hypervisors (largely due to the
oVirt use).


On Saturday, April 10, 2021, Rohit Yadav  wrote:

> Great, thanks for sharing Simon. If we've consensus and there are no
> objections I would propose we update the docs around CentOS/KVM to use the
> -ev packages.
>
> Let's hear from others.
>
>
> Thanks and regards.
>
> 
> From: Simon Weller 
> Sent: Friday, April 9, 2021 19:09
> To: d...@cloudstack.apache.org ;
> users@cloudstack.apache.org 
> Subject: Re: [DISCUSS] Using qemu-kvm vs qemu-kvm-ev with CloudStack
>
> Hi Rohit,
>
> We've been using ev exclusively for a few years now. Our main reason was
> in order to support features we upstreamed around KVM iop limits a couple
> of years back.
> Short of one challenge that was addressed on the ACS side a while ago
> related to the patchviasocket integration, it has worked very well and has
> been very stable.
>
> -Si
>
> 
> From: Rohit Yadav 
> Sent: Friday, April 9, 2021 2:26 AM
> To: d...@cloudstack.apache.org ;
> users@cloudstack.apache.org 
> Subject: [DISCUSS] Using qemu-kvm vs qemu-kvm-ev with CloudStack
>
> All,
>
> We've recently seen some tests around live VM with storage failing on
> CentOS7 which is addressed in this PR:
> https://github.com/apache/cloudstack/pull/4801
>
> Some users have added on the original issue ticket that it works with
> qemu-kvm-ev on CentOS:
> https://github.com/apache/cloudstack/issues/4757#issuecomment-812595973
>
> I also see many other IaaS platforms notably oVirt using qemu-kvm-ev, is
> there any interest and argument in saying we test and update our docs to
> advise users to use qemu-kvm-ev on CentOS? Are there any CloudStack users
> who want to share their experience with it who may be using it already?
>
> The installation steps don't require configuring any 3rd party repository
> manually and are usually done with:
>
> yum install centos-release-qemu-ev
> yum install qemu-kvm-ev
>
> Additional references:
> https://lists.centos.org/pipermail/centos-virt/2015-October/004717.html
> (what is qemu-kvm vs qemu-kvm-ev)
> https://wiki.centos.org/SpecialInterestGroup/Virtualization (the SIG that
> is behind the qemu-kvm-ev repository)
>
> Thanks and regards.
>
> rohit.ya...@shapeblue.com
> www.shapeblue.com
> 3 London Bridge Street,  3rd floor, News Building, London  SE1 9SGUK
> @shapeblue


Re: [VOTE] Apache CloudStack 4.14.0.0 RC3

2020-05-20 Thread Marcus
 would not expect to encounter it.
> > >
> > > As for recovery, we've managed to recover vCenter and Cloudstack
> > after
> > > reboots of the vCenter machine and the Cloudstack management service.
> > > There's no exact points to recover for now, but restart seems to work.
> > > By graceful failure I mean, cloudstack erroring out the deployment
> > and
> > > VM finished in ERROR state, meanwhile connection and operability with
> > > vCenter cluster remains the same.
> > >
> > > We're currently exploring options to fix this, one could be to
> > disable
> > > the feature for VMWare and work to introduce more sustainable fix in
> next
> > > release. Other is to look for more guarding code when installing a
> > > template, since VMware doesn’t actually allow you install that
> particular
> > > template but cloudstack does. We'll keep you posted.
> > >
> > > Thanks,
> > > Bobby.
> > >
> > > On 18.05.20, 23:01, "Marcus"  wrote:
> > >
> > > The issue sounds severe enough that a release note probably
> won't
> > > suffice -
> > > unless there's a documented way to recover we'd never want to
> > > leave a
> > > system susceptible to being unrecoverable, even if it's rarely
> > > triggered.
> > >
> > > What's involved in "failing gracefully"? Is this a small fix,
> or
> > an
> > > overhaul?  Perhaps the new feature could be disabled for
> VMware,
> > or
> > > disabled altogether until a fix is made in a patch release.
> > >
> > > Does it only affect new templates, or is there a risk that an
> > > existing
> > > template out in vSphere could suddenly cause problems?
> > >
> > > On Mon, May 18, 2020 at 12:49 AM Boris Stoyanov <
> > > boris.stoya...@shapeblue.com> wrote:
> > >
> > > > Hi guys,
> > > >
> > > > A little further info on this, it appears when we use a
> > > corrupted template
> > > > and UEFI/Legacy mode when deploy a VM, it breaks the
> connection
> > > between
> > > > cloudstack and vCenter.
> > > >
> > > > All hosts become unreachable and basically the cluster is not
> > > functional,
> > > > have not investigated a way to recover this but seems like a
> > > huge mess..
> > > > Please note that user is not able to register such template
> in
> > > vCenter
> > > > directly, but cloudstack allows using it.
> > > >
> > > > Open to discuss if we'll fix this, since it's expected users
> to
> > > use
> > > > working templates, I think we should be failing gracefully
> and
> > > such action
> > > > should not be able to create downtime on such a large scale.
> > > >
> > > > I believe the boot type feature is new one and it's not
> > > available in older
> > > > releases, so this issue should be limited to 4.14/current
> > master.
> > > >
> > > > Thanks,
> > > > Bobby.
> > > >
> > > > On 15.05.20, 17:07, "Boris Stoyanov" <
> > > boris.stoya...@shapeblue.com>
> > > > wrote:
> > > >
> > > > I'll have to -1 RC3, we've discovered details about an
> > issue
> > > which is
> > > > causing severe consequences with a particular hypervisor in
> the
> > > afternoon.
> > > > We'll need more time to investigate before disclosing.
> > > >
> > > > Bobby.
> > > >
> > > > On 15.05.20, 9:12, "Boris Stoyanov" <
> > > boris.stoya...@shapeblue.com>
> > > > wrote:
> > > >
> > > > +1 (binding)
> > > >
> > > > I've executed upgrade tests with the following
> > > configurations:
> > > >
> > > > 4.13.1 with KVM on CentOS7 hosts
> > > > 4.13 with VMware6.5 hosts
> > &

Re: [VOTE] Apache CloudStack 4.14.0.0 RC3

2020-05-18 Thread Marcus
The issue sounds severe enough that a release note probably won't suffice -
unless there's a documented way to recover we'd never want to leave a
system susceptible to being unrecoverable, even if it's rarely triggered.

What's involved in "failing gracefully"? Is this a small fix, or an
overhaul?  Perhaps the new feature could be disabled for VMware, or
disabled altogether until a fix is made in a patch release.

Does it only affect new templates, or is there a risk that an existing
template out in vSphere could suddenly cause problems?

On Mon, May 18, 2020 at 12:49 AM Boris Stoyanov <
boris.stoya...@shapeblue.com> wrote:

> Hi guys,
>
> A little further info on this: it appears that when we use a corrupted
> template with UEFI/Legacy mode when deploying a VM, it breaks the connection
> between CloudStack and vCenter.
>
> All hosts become unreachable and basically the cluster is not functional.
> We have not investigated a way to recover from this, but it seems like a huge
> mess. Please note that a user is not able to register such a template in
> vCenter directly, but CloudStack allows using it.
>
> I'm open to discussing whether we fix this, since users are expected to use
> working templates, but I think we should be failing gracefully, and such an
> action should not be able to create downtime on such a large scale.
>
> I believe the boot type feature is a new one and is not available in older
> releases, so this issue should be limited to 4.14/current master.
>
> Thanks,
> Bobby.
>
> On 15.05.20, 17:07, "Boris Stoyanov" 
> wrote:
>
> I'll have to -1 RC3, we've discovered details about an issue which is
> causing severe consequences with a particular hypervisor in the afternoon.
> We'll need more time to investigate before disclosing.
>
> Bobby.
>
> On 15.05.20, 9:12, "Boris Stoyanov" 
> wrote:
>
> +1 (binding)
>
> I've executed upgrade tests with the following configurations:
>
> 4.13.1 with KVM on CentOS7 hosts
> 4.13 with VMware6.5 hosts
> 4.11.3 with KVM on CentOS7 hosts
> 4.11.2 with XenServer7 hosts
> 4.11.1 with VMware 6.7
> 4.9.3 with XenServer 7 hosts
> 4.9.2 with KVM on CentOS 7 hosts
>
> Also I've run basic lifecycle operations on the following
> components:
> VMs
> Volumes
> Infra (zones, pod, clusters, hosts)
> Networks
> and more
>
> I did not come across any problems during this testing.
>
> Thanks,
> Bobby.
>
>
> On 11.05.20, 18:21, "Andrija Panic" 
> wrote:
>
> Hi All,
>
> I've created a 4.14.0.0 release (RC3), with the following
> artefacts up for
> testing and a vote:
>
> Git Branch and Commit SH:
>
> https://gitbox.apache.org/repos/asf?p=cloudstack.git;a=shortlog;h=refs/heads/4.14.0.0-RC20200511T1503
> Commit: 6f96b3b2b391a9b7d085f76bcafa3989d9832b4e
>
> Source release (checksums and signatures are available at the same
> location):
> https://dist.apache.org/repos/dist/dev/cloudstack/4.14.0.0/
>
> PGP release keys (signed using 3DC01AE8):
> https://dist.apache.org/repos/dist/release/cloudstack/KEYS
>
> The vote will be open until 14th May 2020, 17.00 CET (72h).
>
> For sanity in tallying the vote, can PMC members please be sure to indicate
> "(binding)" with their vote?
>
> [ ] +1 approve
> [ ] +0 no opinion
> [ ] -1 disapprove (and reason why)
>
> Additional information:
>
> For users' convenience, I've built packages from
> 6f96b3b2b391a9b7d085f76bcafa3989d9832b4e and published RC3
> repository here:
> http://packages.shapeblue.com/testing/41400rc3/  (CentOS 7 and
> Debian/generic, both with noredist support)
> and here
>
> https://download.cloudstack.org/testing/4.14.0.0-RC20200506T2028/ubuntu/bionic/
>  (Ubuntu 18.04 specific, no noredist support - thanks to
> Gabriel):
>
> The release notes are still work-in-progress, but for the
> upgrade
> instructions (including the new systemVM templates) you may
> refer to the
> following URL:
>
> https://acs-www.shapeblue.com/docs/WIP-PROOFING/pr112/upgrading/index.html
>
> 4.14.0.0 systemVM templates are available from here:
> http://download.cloudstack.org/systemvm/4.14/
>
> NOTES on issues fixed in this RC3 release:
>
> (this one does *NOT* require a full retest if you were testing RC1/RC2
> already - only if you were affected by this issue):
> - https://github.com/apache/cloudstack/pull/4064 - affects hostnames when
> attaching a VM to additional networks
> Regards,
>
>
> Andrija Panić
>
>
>
>
>
> boris.stoya...@shapeblue.com
> www.shapeblue.com
> 3 London Bridge Street,  3rd fl

Re: [DISCUSS] Use of Github packages for hosting version specific maven packages?

2019-11-19 Thread Marcus
I've played with this a little, and it shows promise. I did run into two
issues:

1) My Maven build began downloading packages and stopped at 'cloud-engine'.
The artifact seems to be missing: "Failed to read artifact descriptor for
org.apache.cloudstack:cloud-engine-api:jar:4.13.0.0: Could not find
artifact org.apache.cloudstack:cloud-engine:pom:4.13.0.0"

2) GitHub Packages seems to require authentication. While this isn't the
end of the world, it complicates setup slightly. I'm not sure if this is a
GitHub setting or if it's just not possible to have public artifacts with
GitHub.

On Mon, Nov 18, 2019 at 10:24 AM Marcus  wrote:

> I'll try to find time to see if I can point my plugins archetype generator
> at that. It would be extremely useful and simplify building plugin packages.
>
> On Fri, Nov 15, 2019 at 5:10 AM Rohit Yadav 
> wrote:
>
>> All,
>>
>> This has come up a few times in the past when someone wants to
>> build/extend CloudStack and for that they would need to extract and use
>> version specific jars from deb/rpm packages for their apps. I was
>> experimenting with the new Github packages feature against the recent
>> 4.13.0.0 release and could easily publish the jar artifacts here:
>> https://github.com/apache/cloudstack/packages
>>
>> Thoughts if that's a good way to proceed or find any suitable way of
>> version specific jar artifact/packages publication and hosting?
>>
>>
>> Regards,
>>
>> Rohit Yadav
>>
>> Software Architect, ShapeBlue
>>
>> https://www.shapeblue.com
>>
>> rohit.ya...@shapeblue.com
>> www.shapeblue.com
>> Amadeus House, Floral Street, London  WC2E 9DPUK
>> @shapeblue
>>
>>
>>
>>


Re: [DISCUSS] Use of Github packages for hosting version specific maven packages?

2019-11-18 Thread Marcus
I'll try to find time to see if I can point my plugins archetype generator
at that. It would be extremely useful and simplify building plugin packages.

On Fri, Nov 15, 2019 at 5:10 AM Rohit Yadav 
wrote:

> All,
>
> This has come up a few times in the past when someone wants to
> build/extend CloudStack and for that they would need to extract and use
> version specific jars from deb/rpm packages for their apps. I was
> experimenting with the new Github packages feature against the recent
> 4.13.0.0 release and could easily publish the jar artifacts here:
> https://github.com/apache/cloudstack/packages
>
> Thoughts if that's a good way to proceed or find any suitable way of
> version specific jar artifact/packages publication and hosting?
>
>
> Regards,
>
> Rohit Yadav
>
> Software Architect, ShapeBlue
>
> https://www.shapeblue.com
>
> rohit.ya...@shapeblue.com
> www.shapeblue.com
> Amadeus House, Floral Street, London  WC2E 9DPUK
> @shapeblue
>
>
>
>


Re: Hackathon @apachecon

2019-09-07 Thread Marcus
I think some direction may come out of what we see at the conference.

UX - UI, API, CLI
KVM agent communication model

On Friday, September 6, 2019, Paul Angus  wrote:

> Hi All,
>
> We have a large room for the day on Wednesday for a hackathon.  I think it
> might be a good idea if we marshal some thoughts around what we'd like to
> do with the time.
> I guess that we'll end up with some splinter groups who want to work on
> something very specific together as well, I can't see that being a problem.
>
> Some thing that I'd like to put out there as a discussion is the
> networking models - there has been talk of rationalising and dropping
> 'basic' networks as a separate model and using advanced networks with
> security groups instead.  Also the combining of the VR and VPC code to make
> an isolated network a single tier VPC.   I'd like to have a group
> discussion around what everyone would like to do and how we might do it.
>
> Are there any other topics that people think that would benefit from a
> group discussion ?
>
>
> Cheers
>
>
> Paul Angus
>
>
> paul.an...@shapeblue.com
> www.shapeblue.com
> Amadeus House, Floral Street, London  WC2E 9DPUK
> @shapeblue
>
>
>
>


Re: Runing cloudstack failed in docker

2018-11-06 Thread Marcus
Caused by: java.net.SocketException: Protocol family unavailable

This is a common Java issue when trying to containerize; I believe you can
search for it and find a generic answer. It basically has to do with whether
Java is trying to use IPv4 or IPv6 and what your Docker setup supports.
If I recall correctly, there is a 'bind.interface' field in
/etc/cloudstack/management/server.properties
that determines whether it binds to IPv4 or IPv6.
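
A minimal sketch of that change, assuming server.properties is a plain key=value file: the Python helper below (set_property is a hypothetical name, not part of CloudStack) rewrites one key in the file's text. The sample contents and the IPv4 wildcard value 0.0.0.0 are assumptions, not taken from a real installation, so check your own file before changing it.

```python
def set_property(text: str, key: str, value: str) -> str:
    """Return properties text with `key` set to `value`, appending it if absent."""
    lines = text.splitlines()
    found = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Leave comments and lines without '=' untouched.
        if stripped.startswith("#") or "=" not in stripped:
            continue
        if stripped.split("=", 1)[0].strip() == key:
            lines[i] = f"{key}={value}"
            found = True
    if not found:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Assumed sample contents; a real server.properties has more keys.
    sample = "# binding interface\nbind.interface=::\nhttp.port=8080\n"
    print(set_property(sample, "bind.interface", "0.0.0.0"), end="")
```

The generic Java-side answer usually involves the standard JVM system property java.net.preferIPv4Stack=true, which tells Java to prefer IPv4 sockets.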

On Mon, Nov 5, 2018 at 2:37 AM li li  wrote:

> Hi ALL
>
>
> I'm trying to encapsulate cloudstack 4.11 into docker. After build is
> successful, cloudstack-management cannot function properly.
>
> Can someone help me? Thank you very much.
>
>
> Dockerfile:
>
>
> https://github.com/apache/cloudstack/blob/4.11/tools/docker/Dockerfile.centos6
>
>
> from cloudstack-management.err Error:
>
> 05/11/2018 09:07:39 128 jsvc.exec error: Cannot start daemon
> 05/11/2018 09:07:39 126 jsvc.exec error: Service exit with a return value
> of 5
> java.lang.reflect.InvocationTargetException
>at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>at java.lang.reflect.Method.invoke(Method.java:498)
>at
> org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:241)
> Caused by: java.net.SocketException: Protocol family unavailable
>at sun.nio.ch.Net.bind0(Native Method)
>at sun.nio.ch.Net.bind(Net.java:433)
>at sun.nio.ch.Net.bind(Net.java:425)
>at sun.nio.ch
> .ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>at
> org.eclipse.jetty.server.ServerConnector.openAcceptChannel(ServerConnector.java:334)
>at
> org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:302)
>at
> org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:80)
>at
> org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:238)
>at
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
>at org.eclipse.jetty.server.Server.doStart(Server.java:397)
>at
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
>at org.apache.cloudstack.ServerDaemon.start(ServerDaemon.java:200)
>... 5 more
> OpenJDK 64-Bit Server VM warning: ignoring option PermSize=512M; support
> was removed in 8.0
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=800m;
> support was removed in 8.0
>
>
> From management-server.log:
>
> 2018-11-05 09:24:14,967 INFO  [o.a.c.s.m.m.i.DefaultModuleDefinitionSet]
> (main:null) (logid:) Starting module [outofbandmanagement]
> 2018-11-05 09:24:14,967 INFO  [o.a.c.s.l.CloudStackExtendedLifeCycle]
> (main:null) (logid:) Starting CloudStack Components
> 2018-11-05 09:24:14,967 INFO  [o.a.c.s.l.CloudStackExtendedLifeCycle]
> (main:null) (logid:) Done Starting CloudStack Components
> 2018-11-05 09:24:14,967 INFO  [o.a.c.s.m.m.i.DefaultModuleDefinitionSet]
> (main:null) (logid:) Starting module [ipmitool]
> 2018-11-05 09:24:14,967 INFO  [o.a.c.s.l.CloudStackExtendedLifeCycle]
> (main:null) (logid:) Starting CloudStack Components
> 2018-11-05 09:24:14,977 DEBUG
> [o.a.c.o.d.i.IpmitoolOutOfBandManagementDriver] (main:null) (logid:)
> OutOfBandManagementDriver ipmitool initialized: ipmitool version 1.8.15
>
> 2018-11-05 09:24:14,977 INFO  [o.a.c.s.l.CloudStackExtendedLifeCycle]
> (main:null) (logid:) Done Starting CloudStack Components
> 2018-11-05 09:24:14,977 INFO  [o.a.c.s.m.m.i.DefaultModuleDefinitionSet]
> (main:null) (logid:) Starting module [nested-cloudstack]
> 2018-11-05 09:24:14,977 INFO  [o.a.c.s.l.CloudStackExtendedLifeCycle]
> (main:null) (logid:) Starting CloudStack Components
> 2018-11-05 09:24:14,977 INFO  [o.a.c.s.l.CloudStackExtendedLifeCycle]
> (main:null) (logid:) Done Starting CloudStack Components
> 2018-11-05 09:24:14,978 INFO  [o.e.j.s.h.C.client] (main:null) (logid:)
> Initializing Spring root WebApplicationContext
> 2018-11-05 09:24:15,043 INFO  [o.a.c.s.l.CloudStackExtendedLifeCycle]
> (main:null) (logid:) Configuring CloudStack Components
> 2018-11-05 09:24:15,043 INFO  [o.a.c.s.l.CloudStackExtendedLifeCycle]
> (main:null) (logid:) Done Configuring CloudStack Components
> 2018-11-05 09:24:15,056 INFO  [c.c.u.LogUtils] (main:null) (logid:) log4j
> configuration found at /etc/cloudstack/management/log4j-cloud.xml
> 2018-11-05 09:24:15,072 INFO  [o.e.j.s.h.ContextHandler] (main:null)
> (logid:) Started o.e.j.w.WebAppContext@6b1274d2
> {/client,file:///usr/share/cloudstack-management/webapp/,AVAILABLE}{/usr/share/cloudstack-management/webapp}
> 2018-11-05 09:24:15,073 INFO  [o.e.j.s.h.ContextHandler] (main:null)
> (logid:) Started o.e.j.s.h.MovedContextHa

Re: [RFC] Metrics views for CloudStack UI

2015-11-12 Thread Marcus
Hi Nux,
   The thing about GHz is that it is the unit of capacity for CPU: VMs are
allocated to hosts according to the number of "cycles" each host has.  As a
customer, I agree, core count is more important. As an admin, if you have a
single host in a cluster that is using much more CPU than the others and
want to try to balance, the GHz number for the VM can tell you 1) which VMs
on a host are the 'biggest' when cgroup throttling kicks in, that is, how
much of the host CPU share a VM will get, and 2) whether that VM will fit on
another host. The old UI helps you know which hosts don't have capacity
for a migration, but it doesn't tell you how full each host is and doesn't
give you the data to know how full you'll make a host if you migrate.

Many people will want these metrics to go into a time series system and
use something like the graphite publisher instead, as that will give better
visibility into what's going on over time, but this seems like a good
out-of-the-box solution to expose the data we already have buried in the UI.
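
To make the "will it fit" question concrete, here is a simplified Python sketch. It is not CloudStack's allocator: it treats a host's CPU capacity as cores times MHz per core, ignores overprovisioning and reserved capacity, and all names and numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    cores: int
    mhz_per_core: int
    allocated_mhz: int  # sum of (cores * MHz) for VMs already on the host

    @property
    def capacity_mhz(self) -> int:
        return self.cores * self.mhz_per_core


def fill_after_migration(host: Host, vm_cores: int, vm_mhz: int) -> float:
    """Fraction of the host's CPU capacity allocated if the VM moves there."""
    return (host.allocated_mhz + vm_cores * vm_mhz) / host.capacity_mhz


def migration_targets(hosts, vm_cores, vm_mhz):
    """Hosts the VM fits on, least full after the migration first."""
    scored = [(fill_after_migration(h, vm_cores, vm_mhz), h.name) for h in hosts]
    return [(name, fill) for fill, name in sorted(scored) if fill <= 1.0]


if __name__ == "__main__":
    hosts = [
        Host("host-a", cores=16, mhz_per_core=2000, allocated_mhz=28000),
        Host("host-b", cores=16, mhz_per_core=2000, allocated_mhz=12000),
    ]
    # A 4-core VM at 2000 MHz per core asks for 8000 MHz in this model.
    for name, fill in migration_targets(hosts, vm_cores=4, vm_mhz=2000):
        print(f"{name}: {fill:.1%} allocated after migration")
```

In this toy data only host-b can take the VM, which is the kind of answer a per-host metrics view makes visible at a glance.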

On Thu, Nov 5, 2015 at 9:35 AM, Nux!  wrote:

> Great work Rohit,
>
> What I'd like to see:
> - vCPU list/count for instance metrics (GHz is meaningless to me)
> - can we make the whole thing wider so we can fit more columns there
> without that ugly horizontal scroll bar? So much wasted screen space
>
> --
> Sent from the Delta quadrant using Borg technology!
>
> Nux!
> www.nux.ro
>
> - Original Message -
> > From: "Rohit Yadav" 
> > To: d...@cloudstack.apache.org
> > Cc: users@cloudstack.apache.org
> > Sent: Thursday, 5 November, 2015 14:09:14
> > Subject: [RFC] Metrics views for CloudStack UI
>
> > Hi all,
> >
> > The present CloudStack UI hides most of the metrics data, such as CPU,
> > memory, disk and network usage, in inner detail views. Such information is
> > critical for finding issues in one’s cloud, for example finding clusters
> > where hosts are failing, or finding storage pools where disk space has
> > depleted beyond configured global or cluster thresholds.
> >
> > The metrics views for the CloudStack UI are an attempt to solve those
> > problems. They bring in several UI enhancements such as sortable tables,
> > new status icons, methods to control breadcrumb navigation, making the UI’s
> > global list* API pagesize dynamic, and a new table widget based on the
> > listView widget that is both horizontally and vertically scrollable and
> > supports cell/threshold coloring and collapsible columns, along with
> > navigation from one view to another and quick-view actions. For example,
> > the currently supported navigation paths are: Zone to Cluster to Host to
> > Instance to Volumes, and Storage Pool to Volumes.
> >
> > The current version implements six resource views: zone, cluster, host,
> > instance, volume and storage pool (primary storage). The metrics framework
> > (based on the listView widget) would allow developers to write more such
> > views where information can be densely packed.
> >
> > Please checkout the FS (with some screenshots) and the PR;
> >
> > FS: https://issues.apache.org/jira/browse/CLOUDSTACK-9020
> > JIRA: https://issues.apache.org/jira/browse/CLOUDSTACK-9020
> > PR: https://github.com/apache/cloudstack/pull/1038
> >
> > Comments and suggestions?
> >
> > Regards,
> > Rohit Yadav
> > Software Architect, ShapeBlue

Re: devcloud-kvm doesn't work (tag 4.5.1)

2015-07-02 Thread Marcus
I'm not a Marvin guru, but if all else fails you can set up a zone manually
via the UI, using the parameters outlined in the advanced cfg.
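
For anyone hitting the logger error described in the quoted message below, here is a minimal Python sketch of the check involved. It assumes the Marvin cfg has already been parsed into a dict and that the element lives under a top-level "logger" key; the function name and both sample configs are made up for illustration and are not Marvin's own code.

```python
def logger_problem(cfg):
    """Return a short description of what looks wrong with cfg["logger"], or None."""
    logger = cfg.get("logger")
    if logger is None:
        return "no logger element"
    if isinstance(logger, list):
        # Per the thread, the cfg must define the logger as a single object;
        # a list triggers "'list' object has no attribute '__dict__'".
        return "logger is a list, expected an object"
    if not isinstance(logger, dict):
        return "logger has unexpected type %s" % type(logger).__name__
    return None


# Hypothetical sample configs, not copied from the real cfg files.
old_style = {"logger": [{"name": "TestClient", "file": "/tmp/testclient.log"}]}
new_style = {"logger": {"LogFolderPath": "/tmp/"}}

print(logger_problem(old_style))
print(logger_problem(new_style))
```

This only flags the list-versus-object mismatch; it says nothing about the later "TestClient Creation Failed" message.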

On Thu, Jul 2, 2015 at 2:55 AM, Sebastien Goasguen  wrote:

> Marcus in cc wrote devcloud-kvm a while back,
>
> he might be able to help.
>
> > On Jul 2, 2015, at 11:40 AM, Fabio Da Soghe
>  wrote:
> >
> > Hello everybody.
> >
> > I'm experimenting with devcloud-kvm following the instructions at
> > https://cwiki.apache.org/confluence/display/CLOUDSTACK/devcloud-kvm on
> > Ubuntu 14.04.
> >
> > I get stuck on the very last step, the command: "python
> > marvin/marvin/deployDataCenter.py -i
> > devcloud-kvm/devcloud-kvm-advanced.cfg" (the wiki page here is a little
> > messy, there are some typos that could be corrected).
> >
> > I got this error:
> >
> > Exception Occurred Under createLogs :['Traceback (most recent call
> last):', '  File
> "/usr/local/lib/python2.7/dist-packages/Marvin-4.5.1-py2.7.egg/marvin/marvinLog.py",
> line 157, in createLogs (\'LogFolderPath\' in log_cfg.__dict__.keys())
> and', "AttributeError: 'list' object has no attribute '__dict__'\n"]
> >
> > I found an old discussion on this mailing list about this same error
> > (here: http://markmail.org/thread/n673ecf2eztn3nqo ). To summarize: there
> > was an error in tools/devcloud/devcloud-advanced.cfg in the definition of
> > the logger element (it should be an object and not a list). This was
> > corrected in the codebase (commit here:
> > https://github.com/apache/cloudstack/commit/7cf33f96aa113a389d8e75f64dca810f9f032065)
> > BUT not for devcloud-kvm, where the logger is still defined as a list.
> >
> > I corrected the file tools/devcloud-kvm/devcloud-kvm-advanced.cfg as in
> > tools/devcloud/devcloud-advanced.cfg, ran the last python command again,
> > and now I get:
> >
> >  Log Folder Path:
> /tmp//MarvinLogs//DeployDataCenter__Jul_02_2015_03_25_08_FQ3N8R. All logs
> will be available here 
> >
> > === TestClient Creation Failed===
> >
> > Now I'm really stuck. Does anybody have any hint for me, please?
> >
> > Thank you in advance,
> >
> > Fabio Da Soghe
> >
>
>


Re: ACS 4.5.1 mgmt DB HA - not working - invalid load balancing strategy

2015-06-11 Thread Marcus
When I build CloudStack RPMs on 4.5 branch I get mysql ha RPMs. Looking at
the specfile:

%if "%{_ossnoss}" == "noredist"
%package mysql-ha
Summary: Apache CloudStack Balancing Strategy for MySQL
Requires: mysql-connector-java
Requires: %{_tomcatversion}
Group: System Environmnet/Libraries
%description mysql-ha
Apache CloudStack Balancing Strategy for MySQL

%endif


Looks like you need to "./package.sh -p noredist" when packaging. Not sure
what the equivalent is for .deb packaging. Building with noredist also means
you have to be set up with the non-oss dependencies. If you're not familiar
with that let us know.
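
A quick way to confirm whether a given install actually shipped the HA plugin is to look for the jar. This is a rough Python sketch: the directory is the one from Andrija's 4.3.2 install quoted below and is only assumed to be the same on 4.5.1, and the function name is made up for illustration.

```python
import glob
import os


def find_mysqlha_jars(lib_dir):
    """Return sorted paths of MySQL HA plugin jars found in lib_dir."""
    pattern = os.path.join(lib_dir, "cloud-plugin-database-mysqlha-*.jar")
    return sorted(glob.glob(pattern))


# Assumed location, taken from the 4.3.2 path mentioned in this thread.
lib_dir = "/usr/share/cloudstack-management/webapps/client/WEB-INF/lib"
jars = find_mysqlha_jars(lib_dir)
if jars:
    print("mysql-ha plugin found:", ", ".join(jars))
else:
    print("no mysql-ha plugin jar in", lib_dir)
```

If no jar turns up, the "Invalid load balancing strategy" error is consistent with the plugin simply not being on the classpath.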

On Thu, Jun 11, 2015 at 7:06 AM, Andrija Panic 
wrote:

> Actually, on another ACS installation, there is file:
>
>  
> /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-plugin-database-mysqlha-4.3.2.jar
> (acs 4.3.2 :) )
>
> But on this 4.5.1 there is no such file.
>
> Is it possible that we didn't build 4.5.1 the appropriate way?
>
> Thanks,
> Andrija
>
> On 11 June 2015 at 15:14, Andrija Panic  wrote:
>
> > Hi,
> >
> > I'm trying the DB HA setup, by changing 3 lines in the db.properties file
> > (enable HA, define slaves for cloud, define slaves for usage DB).
> >
> > After restart, the mgmt server doesn't start and gives the following error;
> > it seems the load balancing strategy variable is invalid...
> >
> > Any clues on this?
> >
> > The mysql setup is a galera 3 node cluster, with the 1st node being used as
> > master in db.properties, and only the second node being used as slave
> > (galera2 node). I confirmed I can telnet, login etc... also mysql login
> > from mgmt to these master/slaves works of course...
> >
> >
> >
> > 2015-06-11 14:25:03,149 INFO  [c.c.u.d.T.Transaction] (main:null) Is Data
> > Base High Availiability enabled? Ans : true
> > 2015-06-11 14:25:03,191 INFO  [c.c.u.d.T.Transaction] (main:null) The
> > slaves configured for Cloud Data base is/are : 10.20.10.6
> > 2015-06-11 14:25:03,269 ERROR [c.c.u.d.Merovingian2] (main:null) Unable
> to
> > get a new db connection
> > java.sql.SQLException: Invalid load balancing strategy
> > 'com.cloud.utils.db.StaticStrategy'.
> > at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:927)
> > at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:924)
> > at com.mysql.jdbc.Util.loadExtensions(Util.java:602)
> > at com.mysql.jdbc.LoadBalancingConnectionProxy.<init>(LoadBalancingConnectionProxy.java:285)
> > at com.mysql.jdbc.FailoverConnectionProxy.<init>(FailoverConnectionProxy.java:67)
> > at com.mysql.jdbc.NonRegisteringDriver.connectFailover(NonRegisteringDriver.java:430)
> > at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:343)
> > at java.sql.DriverManager.getConnection(DriverManager.java:571)
> > at java.sql.DriverManager.getConnection(DriverManager.java:215)
> > at org.apache.commons.dbcp.DriverManagerConnectionFactory.createConnection(DriverManagerConnectionFactory.java:75)
> > at org.apache.commons.dbcp.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:582)
> > at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1188)
> > at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
> > at com.cloud.utils.db.TransactionLegacy.getStandaloneConnectionWithException(TransactionLegacy.java:203)
> > at com.cloud.utils.db.Merovingian2.<init>(Merovingian2.java:68)
> > at com.cloud.utils.db.Merovingian2.createLockMaster(Merovingian2.java:80)
> > at com.cloud.server.LockMasterListener.<init>(LockMasterListener.java:33)
> > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> > at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> > at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> > at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> > at org.springframework.beans.BeanUtils.instantiateClass(BeanUtils.java:148)
> > at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:121)
> > at org.springframework.beans.factory.support.ConstructorResolver.autowireConstructor(ConstructorResolver.java:280)
> > at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.autowireConstructor(AbstractAutowireCapableBeanFactory.java:1045)
> > at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:949)
> > at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:487)
> > at org.springframew

Re: Problem Upload Windows volume to ACS 4.5.1

2015-06-09 Thread Marcus
... and this is the relevant portion of the logs indicating the failure
reason:

"Template
content is unsupported, or mismatch between selected format and template
content. Found  : x86 boot sector; partition 1"

On Tue, Jun 9, 2015 at 4:53 PM, Marcus  wrote:

> Looks like it is seeing a raw disk, when you specified it was QCOW2.
> That's what the 'file' command is doing. We used to just trust the name of
> the file, but this was enhanced to inspect the first 1MB of the download
> and validate that you are supplying the image format that CS expects.
>
> Because the first 1MB shows that this image has a boot sector, I'm
> assuming it is a raw disk image, instead of a qcow2. Note that CloudStack
> 4.5 should support raw format for KVM images, but you have to specify it on
> the API.
>
> [root@devcloud-kvm7 tmp]# qemu-img create -f qcow2 img.qcow2 1G
> Formatting 'img.qcow2', fmt=qcow2 size=1073741824 encryption=off cluster_size=65536 lazy_refcounts=off
>
> [root@devcloud-kvm7 tmp]# qemu-img create -f raw img.raw 1G
> Formatting 'img.raw', fmt=raw size=1073741824
>
> [root@devcloud-kvm7 tmp]# file img.qcow2
> img.qcow2: QEMU QCOW Image (v3), 1073741824 bytes
>
> [root@devcloud-kvm7 tmp]# parted img.raw mklabel
> New disk label type? msdos
>
> [root@devcloud-kvm7 tmp]# file img.raw
> img.raw: x86 boot sector, code offset 0xb8
>
> On Tue, Jun 9, 2015 at 3:01 AM, Jochim, Ingo 
> wrote:
>
>> Hi Andrija,
>>
>> have you tried converting it (qemu-img convert) to RAW and uploading it then?
>> Ceph is using RAW devices. The conversion takes place on storage
>> migration, but does it also happen on upload?
>>
>> Regards,
>> Ingo
>>
>> -Original Message-
>> From: Andrija Panic [mailto:andrija.pa...@gmail.com]
>> Sent: Tuesday, 9 June 2015 10:24
>> To: d...@cloudstack.apache.org; users@cloudstack.apache.org
>> Subject: Problem Upload Windows volume to ACS 4.5.1
>>
>> Hi guys,
>>
>> we are trying to move some volumes from one ACS installation to another
>> (from 4.3.2 to 4.5.1).
>>
>> Since we are using CEPH, and volume extract/download doesn't work at the
>> moment, we use a workaround: we snapshot the Windows DATA volume, convert
>> the snapshot to a template, and then extract the URL / download.
>>
>> Then we use this URL to "Upload Volume" to ACS 4.5.1 - but it fails
>> almost immediately with an error inside the SSVM (nothing useful in the
>> management log) - and the volume is deleted from ACS:
>>
>> I see the disk is being inspected with "file" commands... any thoughts on
>> why this is failing?
>>
>> The source is a Windows DATA disk, btw:
>>
>> 2015-06-09 07:58:47,811 DEBUG [cloud.agent.Agent]
>> (agentRequest-Handler-10:null) Request:Seq 81-4133460032995983410:  { Cmd
>> ,
>> MgmtId: 90520741174948, via: 81, Ver: v1, Flags: 100011,
>> [{"org.apache.cloudstack.storage.command.DownloadCommand":{"hvm":false,"maxDownloadSizeInBytes":5497558138880,"id":468,"resourceType":"VOLUME","installPath":"volumes/2/468","_store":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://
>> 10.13.2.1/data/tank/secondary","_role":"Image"}},"url":"
>> http://46.232.180.244/userdata/6f2280e7-86d6-4fc7-abe8-e2bbfdeed442.qcow2
>> ","format":"QCOW2","accountId":2,"name":"andrija2","wait":0}}]
>> }
>> 2015-06-09 07:58:47,814 DEBUG [cloud.agent.Agent]
>> (agentRequest-Handler-10:null) Processing command:
>> org.apache.cloudstack.storage.command.DownloadCommand
>> 2015-06-09 07:58:47,815 INFO
>>  [storage.resource.NfsSecondaryStorageResource]
>> (agentRequest-Handler-10:null) Determined host 10.13.2.1 corresponds to IP
>> 10.13.2.1
>> 2015-06-09 07:58:47,873 INFO  [storage.template.HttpTemplateDownloader]
>> (agentRequest-Handler-10:null) No credentials configured for host=
>> 46.232.180.244:80
>> 2015-06-09 07:58:47,895 INFO  [storage.template.HttpTemplateDownloader]
>> (pool-1-thread-3:null) Starting download from
>> http://46.232.180.244/userdata/6f2280e7-86d6-4fc7-abe8-e2bbfdeed442.qcow2
>> to
>>
>> /mnt/SecStorage/dd40674b-575d-3bd2-87b3-7f1e1db4c02e/volumes/2/468/dnld855093017660470tmp_
>> remoteSize=21474836480 , max size=5497558138880
>> 2015-06-09 07:58:47,909 DEBUG [utils.script.Script] (pool-1-thread-3:null)
>> Executing: /bin/bash -c file
>>
>> /mnt/SecStorage/dd40674b-575d-3bd

Re: Problem Upload Windows volume to ACS 4.5.1

2015-06-09 Thread Marcus
Looks like it is seeing a raw disk, when you specified it was QCOW2. That's
what the 'file' command is doing. We used to just trust the name of the
file, but this was enhanced to inspect the first 1MB of the download and
validate that you are supplying the image format that CS expects.

Because the first 1MB shows that this image has a boot sector, I'm assuming
it is a raw disk image, instead of a qcow2. Note that CloudStack 4.5 should
support raw format for KVM images, but you have to specify it on the API.

[root@devcloud-kvm7 tmp]# qemu-img create -f qcow2 img.qcow2 1G
Formatting 'img.qcow2', fmt=qcow2 size=1073741824 encryption=off cluster_size=65536 lazy_refcounts=off

[root@devcloud-kvm7 tmp]# qemu-img create -f raw img.raw 1G
Formatting 'img.raw', fmt=raw size=1073741824

[root@devcloud-kvm7 tmp]# file img.qcow2
img.qcow2: QEMU QCOW Image (v3), 1073741824 bytes

[root@devcloud-kvm7 tmp]# parted img.raw mklabel
New disk label type? msdos

[root@devcloud-kvm7 tmp]# file img.raw
img.raw: x86 boot sector, code offset 0xb8
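
To make the distinction above concrete, here is a small Python sketch that looks at the same leading bytes the 'file' command keys on. It is a simplified stand-in, not what CloudStack runs (the logs show it shelling out to 'file'); the function names are made up, and the two sample headers are synthetic rather than read from real images.

```python
QCOW_MAGIC = b"QFI\xfb"      # first four bytes of a QCOW/QCOW2 image
MBR_SIGNATURE = b"\x55\xaa"  # bytes 510-511 of a sector holding an MBR


def sniff_image_format(header):
    """Guess the image type from the first 512 bytes of a file."""
    if header[:4] == QCOW_MAGIC:
        return "qcow"
    if len(header) >= 512 and header[510:512] == MBR_SIGNATURE:
        # A partition table at offset 0 means the file is a plain disk dump.
        return "raw (boot sector found)"
    return "unknown"


def sniff_file(path):
    """Read the first sector of path and classify it."""
    with open(path, "rb") as image:
        return sniff_image_format(image.read(512))


# Synthetic headers standing in for img.qcow2 and img.raw above.
qcow_header = QCOW_MAGIC + b"\x00" * 508
raw_header = b"\x00" * 510 + MBR_SIGNATURE

print(sniff_image_format(qcow_header))
print(sniff_image_format(raw_header))
```

Running sniff_file() against the file behind the upload URL before registering it should show whether the ".qcow2" extension matches the content.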

On Tue, Jun 9, 2015 at 3:01 AM, Jochim, Ingo 
wrote:

> Hi Andrija,
>
> have you tried converting it (qemu-img convert) to RAW and uploading it then?
> Ceph is using RAW devices. The conversion takes place on storage migration,
> but does it also happen on upload?
>
> Regards,
> Ingo
>
> -Original Message-
> From: Andrija Panic [mailto:andrija.pa...@gmail.com]
> Sent: Tuesday, 9 June 2015 10:24
> To: d...@cloudstack.apache.org; users@cloudstack.apache.org
> Subject: Problem Upload Windows volume to ACS 4.5.1
>
> Hi guys,
>
> we are trying to move some volumes from one ACS installation to another
> (from 4.3.2 to 4.5.1).
>
> Since we are using CEPH, and volume extract/download doesn't work at the
> moment, we use a workaround: we snapshot the Windows DATA volume, convert
> the snapshot to a template, and then extract the URL / download.
>
> Then we use this URL to "Upload Volume" to ACS 4.5.1 - but it fails
> almost immediately with an error inside the SSVM (nothing useful in the
> management log) - and the volume is deleted from ACS:
>
> I see the disk is being inspected with "file" commands... any thoughts on
> why this is failing?
>
> The source is a Windows DATA disk, btw:
>
> 2015-06-09 07:58:47,811 DEBUG [cloud.agent.Agent]
> (agentRequest-Handler-10:null) Request:Seq 81-4133460032995983410:  { Cmd ,
> MgmtId: 90520741174948, via: 81, Ver: v1, Flags: 100011,
> [{"org.apache.cloudstack.storage.command.DownloadCommand":{"hvm":false,"maxDownloadSizeInBytes":5497558138880,"id":468,"resourceType":"VOLUME","installPath":"volumes/2/468","_store":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://
> 10.13.2.1/data/tank/secondary","_role":"Image"}},"url":"
> http://46.232.180.244/userdata/6f2280e7-86d6-4fc7-abe8-e2bbfdeed442.qcow2
> ","format":"QCOW2","accountId":2,"name":"andrija2","wait":0}}]
> }
> 2015-06-09 07:58:47,814 DEBUG [cloud.agent.Agent]
> (agentRequest-Handler-10:null) Processing command:
> org.apache.cloudstack.storage.command.DownloadCommand
> 2015-06-09 07:58:47,815 INFO
>  [storage.resource.NfsSecondaryStorageResource]
> (agentRequest-Handler-10:null) Determined host 10.13.2.1 corresponds to IP
> 10.13.2.1
> 2015-06-09 07:58:47,873 INFO  [storage.template.HttpTemplateDownloader]
> (agentRequest-Handler-10:null) No credentials configured for host=
> 46.232.180.244:80
> 2015-06-09 07:58:47,895 INFO  [storage.template.HttpTemplateDownloader]
> (pool-1-thread-3:null) Starting download from
> http://46.232.180.244/userdata/6f2280e7-86d6-4fc7-abe8-e2bbfdeed442.qcow2
> to
>
> /mnt/SecStorage/dd40674b-575d-3bd2-87b3-7f1e1db4c02e/volumes/2/468/dnld855093017660470tmp_
> remoteSize=21474836480 , max size=5497558138880
> 2015-06-09 07:58:47,909 DEBUG [utils.script.Script] (pool-1-thread-3:null)
> Executing: /bin/bash -c file
>
> /mnt/SecStorage/dd40674b-575d-3bd2-87b3-7f1e1db4c02e/volumes/2/468/dnld855093017660470tmp_
> | cut -d: -f2
> 2015-06-09 07:58:47,936 DEBUG [utils.script.Script] (pool-1-thread-3:null)
> Execution is successful.
> 2015-06-09 07:58:47,941 INFO  [storage.template.DownloadManagerImpl]
> (pool-1-thread-3:null) Download Completion for jobId:
> c95b3d60-f904-4ecc-ab9f-7ed1c36eba20, status=UNRECOVERABLE_ERROR
> 2015-06-09 07:58:47,942 INFO  [storage.template.DownloadManagerImpl]
> (pool-1-thread-3:null) local:
>
> /mnt/SecStorage/dd40674b-575d-3bd2-87b3-7f1e1db4c02e/volumes/2/468/dnld855093017660470tmp_,
> bytes=1053854, error=Template content is unsupported, or mismatch between
> selected format and template content. Found  : x86 boot sector; partition
> 1, pct=0
> 2015-06-09 07:58:50,876 DEBUG [cloud.agent.Agent]
> (agentRequest-Handler-10:null) Seq 81-4133460032995983410:  { Ans: ,
> MgmtId: 90520741174948, via: 81, Ver: v1, Flags: 10,
> [{"com.cloud.agent.api.storage.DownloadAnswer":{"jobId":"c95b3d60-f904-4ecc-ab9f-7ed1c36eba20","downloadPct":0,"errorString":"Template
> content is unsupported, or mismatch between selected format and template
> content. Found  : x86 boot sector; partition
> 1","

Re: ACS 4.5.1 KVM live migration problem

2015-05-15 Thread Marcus
Hmmm, this seems like an unrelated issue, though the culprits are the same
fields.  It has me wondering if there's a bug in the vm sync or network
persistence. It would be interesting to know if:

1) The null values are somehow reproducible

2) If stopping a VM with null values is possible

3) If starting a vm with null values fixes them

Are the networks these belong to marked as persistent? Network ids can be
dynamic in certain situations: if a network is not used it gives back its
vlan id, then gets a new one when you spin up vms again. This means these
fields on the nic also need to be updated to reflect that, and I'm
wondering if there's some issue there.
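
To narrow down which running VMs are affected, the query Andrija posted (quoted below) can be tightened to return only NICs with NULL URIs. The sketch below runs it against an in-memory SQLite database so it is self-contained; the table and column names come from this thread, while the rows and VM names are invented test data and the real cloud database is MySQL with many more columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vm_instance (id INTEGER PRIMARY KEY, name TEXT, state TEXT);
CREATE TABLE nics (id INTEGER PRIMARY KEY, instance_id INTEGER,
                   isolation_uri TEXT, broadcast_uri TEXT);
-- Invented rows: two user VMs, one router, one stopped VM.
INSERT INTO vm_instance VALUES (564, 'i-2-564-VM', 'Running'),
                               (664, 'i-2-664-VM', 'Running'),
                               (700, 'r-700-VM', 'Running'),
                               (701, 'i-2-701-VM', 'Stopped');
INSERT INTO nics VALUES (1, 564, 'vlan://96', 'vlan://96'),
                        (2, 664, NULL, NULL),
                        (3, 700, NULL, NULL),
                        (4, 701, NULL, NULL);
""")

# Running user VMs (system VM name prefixes excluded) with a NULL URI.
NULL_URI_QUERY = """
SELECT n.instance_id, v.name
FROM nics n JOIN vm_instance v ON v.id = n.instance_id
WHERE v.state = 'Running'
  AND v.name NOT LIKE 'r-%' AND v.name NOT LIKE 'v-%' AND v.name NOT LIKE 's-%'
  AND (n.isolation_uri IS NULL OR n.broadcast_uri IS NULL)
ORDER BY n.instance_id
"""

rows = conn.execute(NULL_URI_QUERY).fetchall()
print(rows)
```

On a real install the same SELECT would also want a filter on the nics "removed" column, so that NICs of expunged VMs do not show up.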

On Fri, May 15, 2015 at 6:01 AM, Andrija Panic 
wrote:

> Ok, but since they are guest NICs, it confuses me - this is an advanced zone
> with vlans, right? Then my understanding is that all NICs (of a user VM) need
> to have some isolation method...
>
> Anyway - I'm running an advanced zone + vlans, and all VMs (VMs behind a VPC
> and VMs on the internet/public network - but that's still a Guest network)
> have some vlan://x value.
>
> For VR, SSVM, CPVM - there are NICs on the "ACS public" network that don't
> use a vlan - they have "vlan://untagged", and "NULL" is only used for
> LinkLocal (169.x) NICs, and for the mgmt/sec-storage NICs of SSVM/CPVM in my
> case.
>
>
>
> On 15 May 2015 at 13:47, Andrei Mikhailovsky  wrote:
>
> > Andrija,
> >
> > I've run the command and it showed me a bunch of running vms with NULLs.
> > I would roughly say about 20% of my total running vms have NULL under
> > the isolation and broadcast URIs.
> >
> > All of these vms are working perfectly well (in terms of network
> > connectivity) and there is nothing special about them. They all have at
> > least one guest NIC.
> >
> > Andrei
> > - Original Message -
> >
> > From: "Andrija Panic" 
> > To: d...@cloudstack.apache.org
> > Cc: users@cloudstack.apache.org
> > Sent: Friday, 15 May, 2015 12:34:24 PM
> > Subject: Re: ACS 4.5.1 KVM live migration problem
> >
> > Andrei,
> >
> > select instance_id,isolation_uri,broadcast_uri from nics where
> > instance_id in (select id from vm_instance where state='Running' and name
> > not like 'r-%' and name not like 'v-%' and name not like 's-%') order by
> > instance_id;
> >
> > This gives me every NIC that does not belong to a router or SSVM/CPVM. I
> > always have vlan values - since these are all Guest NICs - they must have
> > a vlan ID...
> > NULL values are only present when a VM is deleted/stopped in my case...
> >
> > Can you check your VM 664 - what is so specific about it?
> > All NICs (in my understanding, if this is an advanced zone) must have some
> > vlan, and cannot be NULL or untagged?
> >
> > On 15 May 2015 at 12:58, Andrei Mikhailovsky  wrote:
> >
> > >
> > >
> > > Hi Andrija, Marcus,
> > >
> > > Thanks for your comments and suggestions. I've checked the cloud.nics
> > > table
> > >
> > > mysql> select instance_id,isolation_uri,broadcast_uri from nics where
> > > instance_id=564 or instance_id=664 or instance_id=;
> > > +-------------+---------------+---------------+
> > > | instance_id | isolation_uri | broadcast_uri |
> > > +-------------+---------------+---------------+
> > > |         564 | vlan://96     | vlan://96     |
> > > |         664 | NULL          | NULL          |
> > > |             | vlan://1127   | vlan://1127   |
> > > +-------------+---------------+---------------+
> > >
> > >
> > > From my tests, instance_ids 564 and  are migrating correctly, but
> > > instance 664 is not, and is showing an NPE similar to the one I've given.
> > >
> > >
> > > Is this what is causing the migration issues? If so, should I change all
> > > isolation_uri and broadcast_uri values to the corresponding network vlan
> > > ids?
> > > Thanks
> > >
> > > Andrei
> > >
> > > - Original Message -
> > >
> > > From: "Andrija Panic" 
> > > To: d...@cloudstack.apache.org
> > > Sent: Thursday, 14 May, 2015 4:00:07 PM
> > > Subject: Re: Fwd: ACS 4.5.1 KVM live migration problem
> > >
> > > That would probably be a bug that I had...but we updated main VLAN
> table
> > > with change URI or something... Marcus saved me that time :)
> > > Andrei, please provide more info and the info Marcus said, I will try
> to
> > > co

Re: venom/CVE-2015-3456 Update your KVM folks

2015-05-14 Thread Marcus
Yes, and follow the best practice of running qemu as a non-root user that
has no privileges and a restricted shell! Change user and group in
/etc/libvirt/qemu.conf
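
A rough Python sketch for checking what a host is currently configured with. It assumes qemu.conf uses the usual key = "value" lines with "#" comments; the function name is made up, and the sample text (including the "qemu" account name) is illustrative, not a recommended configuration.

```python
import re

# Matches uncommented lines such as: user = "qemu"
SETTING_RE = re.compile(r'^\s*(user|group)\s*=\s*"([^"]*)"')


def qemu_user_group(conf_text):
    """Return the uncommented user/group settings found in conf_text."""
    found = {}
    for line in conf_text.splitlines():
        match = SETTING_RE.match(line)
        if match:
            # A later line overrides an earlier one for the same key.
            found[match.group(1)] = match.group(2)
    return found


# Illustrative sample, not copied from a real host.
sample = '''
# user = "root"
user = "qemu"
group = "qemu"
'''

settings = qemu_user_group(sample)
print(settings)
if settings.get("user") in (None, "root", "0", "+0"):
    print("qemu may be running as root, check the distro default")
```

An empty result only means the file does not override the defaults, which differ between distributions.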

On Wed, May 13, 2015 at 7:23 AM, Nux!  wrote:

> https://access.redhat.com/articles/1444903
>
> People running KVM might want to update their stuff.
>
> --
> Sent from the Delta quadrant using Borg technology!
>
> Nux!
> www.nux.ro
>


Re: Cloudstack and KVM clusters,

2015-03-31 Thread Marcus
Don't forget SharedMountPoint. This (in theory, haven't tried it
recently) allows you to use any clustered filesystem that has a
consistent mountpoint across all KVM hosts in a CS cluster, e.g. mount
an OCFS2 filesystem at /vmstore1, then register /vmstore1 as a
SharedMountPoint.

The Ceph support is in the form of RBD, by the way. You could use
CephFS if you wished via SharedMountPoint.

On Tue, Mar 31, 2015 at 2:09 PM, Simon Weller  wrote:
> The hosts need to be part of the same Cloudstack cluster, and depending on 
> the underlying storage technology, you may need a clustered file system as 
> well.
>
> A Cloudstack cluster is basically a group of physical hosts.
>
> For example:
>
> You build a new Zone in Cloudstack. Under the zone you have a pod. Within the
> pod, you build a new cluster (just a group of hosts). Then you assign 4
> servers (hosts) to that cluster. You will be able to live migrate between
> the 4 hosts assuming the originally mentioned criteria are met.
>
> - Si
>
> 
> From: Rafael Weingartner 
> Sent: Tuesday, March 31, 2015 4:02 PM
> To: d...@cloudstack.apache.org
> Cc: users@cloudstack.apache.org
> Subject: Re: Cloudstack and KVM clusters,
>
> Thanks Simon,
>
>
> I think I got it.
>
> So, the hosts do not need to be in a cluster to perform the live migration.
>
> On Tue, Mar 31, 2015 at 5:59 PM, Simon Weller  wrote:
>
>> Rafael,
>>
>> KVM live migration really relies on whether the underlying shared storage
>> (and file system) supports the ability to provide data consistency during a
>> migration. You never ever want a situation where 2 hosts are able  to mount
>> and write to the same volume concurrently.
>>
>> You can live migrate in KVM today using the following underlying file
>> systems/methods:
>>
>> 1. NFS
>> 2. CEPH
>> 3. Clustered Logical Volume Management (CLVM) on top of SAN exposed
>> storage via iSCSI, FC or FCoE.
>>
>> It's also possible to build your own storage driver and set a LUN to read
>> only on a particular host using your SAN's API.
>>
>> Solidfire, Nexenta and Cloudbyte have also added storage drivers more
>> recently that may provide support for live migration, but as I'm not
>> personally familiar with these storage platforms, I'll leave it up to
>> others to comment if they wish.
>>
>> - Si
>>
>>
>>
>>
>>
>> 
>> From: Rafael Weingartner 
>> Sent: Tuesday, March 31, 2015 3:36 PM
>> To: users@cloudstack.apache.org; d...@cloudstack.apache.org
>> Subject: Cloudstack and KVM clusters,
>>
>> Hi folks,
>>
>> I was looking at a CloudStack compatibility matrix at
>> http://pt.slideshare.net/TimMackey/hypervisor-31754727,
>>
>> Slide 25 seemed to show that we cannot have clusters of KVM in CS. Is that
>> true? Is it possible to live migrate VMs between KVM hosts that are not
>> clustered in CS?
>>
>>
>> --
>> Rafael Weingärtner
>>
>
>
>
> --
> Rafael Weingärtner


Re: [URGENT - HELP NEEDED]

2015-03-16 Thread Marcus
Good to hear. Glad I could lend another pair of eyes.

On Mon, Mar 16, 2015 at 9:19 AM, Andrija Panic  wrote:
> FIXED !!!
>
> Thanks a lot Marcus, this is the second time you saved me from the deep
> $$it..
>
> It was only 1 VM that had only 1 NIC which was not set as default in the DB -
> so after setting default_nic=1 for it, I destroyed the VR, and a new one was
> recreated.
>
> Thanks a lot for the help!
>
>
> On 16 March 2015 at 16:52, Andrija Panic  wrote:
>
>> I will, thanks a lot Marcus for the hints...
>>
>> On 16 March 2015 at 16:49, Marcus  wrote:
>>
>>> Ok, just watch for those createdhcpentry mgmt server logs. Perhaps
>>> they're just triggered by you trying to fix the situation by
>>> migrating, but the original issue was something else entirely.
>>>
>>> On Mon, Mar 16, 2015 at 8:44 AM, Andrija Panic 
>>> wrote:
>>> > I did migrate and also changed accounts, unsuccessfully, so there are
>>> > definitely some bugs, or it's my specific setup...
>>> >
>>> > Thanks, I'm fixing this now and will let you know.
>>> >
>>> > On 16 March 2015 at 16:42, Marcus  wrote:
>>> >
>>> >> Yes, each VM should have at least one default nic, so if there's only
>>> >> one nic it should be set to default. Take a db backup first, of
>>> >> course, before messing with it. Any idea how it may have happened? Do
>>> >> you migrate VMs between networks ever?
>>> >>
>>> >> On Mon, Mar 16, 2015 at 8:39 AM, Andrija Panic <
>>> andrija.pa...@gmail.com>
>>> >> wrote:
>>> >> > Ok, so if the VM has only 1 NIC - and default_nic=0, then I need to
>>> >> > change all of them to default_nic=1... ?
>>> >> >
>>> >> >
>>> >> > On 16 March 2015 at 16:38, Marcus  wrote:
>>> >> >
>>> >> >> VMs can have multiple nics and be on multiple networks. If you set a
>>> >> >> nic as default, it becomes the network that the vm has its default
>>> >> >> route on. Every VM should have a default nic, and if it doesn't I
>>> >> >> wonder how it might have happened (maybe a specific combination of
>>> >> >> add/delete nic triggered a bug?). You should set a default nic for
>>> >> >> every VM that might be missing one, and see if that gets your router
>>> >> >> up.
>>> >> >>
>>> >> >> On Mon, Mar 16, 2015 at 8:34 AM, Andrija Panic <
>>> andrija.pa...@gmail.com
>>> >> >
>>> >> >> wrote:
>>> >> >> > Hi Marcus,
>>> >> >> >
>>> >> >> > Thanks a lot for the hint
>>> >> >> >
>>> >> >> > True, I have 0 as the value for some reason in the database, for a
>>> >> >> > couple of NICs
>>> >> >> > select * from nics where ip4_address like "46.232%" and
>>> >> >> > broadcast_uri = "vlan://500" and default_nic = 0;
>>> >> >> >
>>> >> >> > results: http://pastebin.com/rDAe2RY9
>>> >> >> >
>>> >> >> > or down there...
>>> >> >> >
>>> >> >> > This Techvee-FileServer server is already running (still not dead)
>>> >> >> > and I can see 1 NIC from the UI...
>>> >> >> >
>>> >> >> > Should I reset all of these to 1 ?
>>> >> >> > What is the purpose of this field default_nic = 0?
>>> >> >> >
>>> >> >> > vlan://500 in my case limits results only to the network for this
>>> >> >> > VR that is having problems...
>>> >> >> >
>>> >> >> > Any suggestions ?
>>> >> >> >
>>> >> >> > "id" "uuid" "instance_id" "mac_address" "ip4_address" "netmask"
>>> >> "gateway"
>>> >> >> > "ip_type" "broadcast_uri" "network_id" "mode" "state" "strategy"
>>> >> >> > "reserver_name" "reservation_id" "

Re: [URGENT - HELP NEEDED]

2015-03-16 Thread Marcus
Ok, just watch for those createdhcpentry mgmt server logs. Perhaps
they're just triggered by you trying to fix the situation by
migrating, but the original issue was something else entirely.

On Mon, Mar 16, 2015 at 8:44 AM, Andrija Panic  wrote:
> I did migrate and also changed accounts, unsuccessfully, so there are
> definitely some bugs, or it's my specific setup...
>
> Thanks, I'm fixing this now and will let you know.
>
> On 16 March 2015 at 16:42, Marcus  wrote:
>
>> Yes, each VM should have at least one default nic, so if there's only
>> one nic it should be set to default. Take a db backup first, of
>> course, before messing with it. Any idea how it may have happened? Do
>> you migrate VMs between networks ever?
>>
>> On Mon, Mar 16, 2015 at 8:39 AM, Andrija Panic 
>> wrote:
>> > Ok, so if the VM has only 1 NIC - and default_nic=0, then I need to change
>> > all of them to default_nic=1... ?
>> >
>> >
>> > On 16 March 2015 at 16:38, Marcus  wrote:
>> >
>> >> VMs can have multiple nics and be on multiple networks. If you set a
>> >> nic as default, it becomes the network that the vm has its default
>> >> route on. Every VM should have a default nic, and if it doesn't I
>> >> wonder how it might have happened (maybe a specific combination of
>> >> add/delete nic triggered a bug?). You should set a default nic for
>> >> every VM that might be missing one, and see if that gets your router
>> >> up.
>> >>
>> >> On Mon, Mar 16, 2015 at 8:34 AM, Andrija Panic > >
>> >> wrote:
>> >> > Hi Marcus,
>> >> >
>> >> > Thanks a lot for the hint
>> >> >
>> >> > True, I have 0 as the value for some reason in the database, for a
>> >> > couple of NICs
>> >> > select * from nics where ip4_address like "46.232%" and broadcast_uri
>> >> > = "vlan://500" and default_nic = 0;
>> >> >
>> >> > results: http://pastebin.com/rDAe2RY9
>> >> >
>> >> > or down there...
>> >> >
>> >> > This Techvee-FileServer server is already running (still not dead)
>> >> > and I can see 1 NIC from the UI...
>> >> >
>> >> > Should I reset all of these to 1 ?
>> >> > What is the purpose of this field default_nic = 0?
>> >> >
>> >> > vlan://500 in my case limits results only to the network for this VR
>> >> > that is having problems...
>> >> >
>> >> > Any suggestions ?
>> >> >
>> >> > "id" "uuid" "instance_id" "mac_address" "ip4_address" "netmask"
>> "gateway"
>> >> > "ip_type" "broadcast_uri" "network_id" "mode" "state" "strategy"
>> >> > "reserver_name" "reservation_id" "device_id" "update_time"
>> >> "isolation_uri"
>> >> > "ip6_address" "default_nic" "vm_type" "created" "removed"
>> "ip6_gateway"
>> >> > "ip6_cidr" "secondary_ip" "display_nic"
>> >> > "2816" "5066bc3a-dbec-4789-aa42-3b9eb8f50bb4" "1795"
>> "06:70:0a:00:00:ac"
>> >> > "46.232.180.101" "255.255.255.0" "46.232.180.1" "Ip4" "vlan://500"
>> "212"
>> >> > "Dhcp" "Deallocating" "Create" "DirectNetworkGuru" \N "1" "2015-02-04
>> >> > 23:06:23" "vlan://500" \N "0" "User" "2015-02-04 20:41:05" "2015-02-04
>> >> > 22:06:23" \N \N "0" "1"
>> >> > "3132" "c8a5f98e-5663-40e3-ac03-1ac3545eaa83" "1958"
>> "06:fc:c2:00:00:ad"
>> >> > "46.232.180.102" "255.255.255.0" "46.232.180.1" "Ip4" "vlan://500"
>> "212"
>> >> > "Dhcp" "Deallocating" "Create" "DirectNetworkGuru" \N "1" "2015-03-03
>> >> > 15:45:47" "vlan://500" \N "0" "User" "2015-02-18 15:50:35" "2015-03-03
>> >> 

Re: [URGENT - HELP NEEDED]

2015-03-16 Thread Marcus
Yes, each VM should have at least one default nic, so if there's only
one nic it should be set to default. Take a db backup first, of
course, before messing with it. Any idea how it may have happened? Do
you migrate VMs between networks ever?
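
A minimal sketch of the check described above, under stated assumptions: it uses an in-memory SQLite table as a stand-in for the real MySQL cloud.nics table, keeps only the instance_id, default_nic and removed columns seen in the pasted header, and the rows are invented for illustration. It lists the instances whose live nics include none marked default, so you can review them before updating anything:

```python
import sqlite3

# Stand-in for cloud.nics; only the columns needed for the check (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nics (id INTEGER, instance_id INTEGER, "
    "default_nic INTEGER, removed TEXT)"
)
# Illustrative rows, not real data.
conn.executemany(
    "INSERT INTO nics VALUES (?, ?, ?, ?)",
    [
        (1, 100, 1, None),          # VM 100: has a default nic
        (2, 101, 0, None),          # VM 101: single nic, not default
        (3, 102, 0, None),          # VM 102: two nics, one of them default
        (4, 102, 1, None),
        (5, 103, 0, "2015-02-04"),  # VM 103: nic already removed, ignored
    ],
)


def instances_missing_default_nic(db):
    """Return instance ids whose non-removed nics include no default nic."""
    rows = db.execute(
        "SELECT instance_id FROM nics "
        "WHERE removed IS NULL "
        "GROUP BY instance_id "
        "HAVING SUM(default_nic) = 0 "
        "ORDER BY instance_id"
    ).fetchall()
    return [row[0] for row in rows]


print(instances_missing_default_nic(conn))
```

Skipping rows that have a removed timestamp is a choice made for this sketch; whether removed nics matter to the router code is not established in this thread.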

On Mon, Mar 16, 2015 at 8:39 AM, Andrija Panic  wrote:
> Ok, so if the VM has only 1 NIC - and default_nic=0, then I need to change
> all of them to default_nic=1... ?
>
>
> On 16 March 2015 at 16:38, Marcus  wrote:
>
>> VMs can have multiple nics and be on multiple networks. If you set a
>> nic as default, it becomes the network that the vm has its default
>> route on. Every VM should have a default nic, and if it doesn't I
>> wonder how it might have happened (maybe a specific combination of
>> add/delete nic triggered a bug?). You should set a default nic for
>> every VM that might be missing one, and see if that gets your router
>> up.
>>
>> On Mon, Mar 16, 2015 at 8:34 AM, Andrija Panic 
>> wrote:
>> > Hi Marcus,
>> >
>> > Thanks a lot for the hint
>> >
>> > True, I have the 0 as the value for some reason in database, for couple
>> of
>> > NICs
>> > select * from nics where ip4_address like "46.232%" and broadcast_uri =
>> > "vlan://500" and default_nic = 0;
>> >
>> > results: http://pastebin.com/rDAe2RY9
>> >
>> > or down there...
>> >
>> > This Techvee-FileServer server is already running (still not dead) and I
>> > can see 1 NIC from UI...
>> >
>> > Should I reset all of these to 1 ?
>> > What is the purpose of this field default_nic = 0.
>> >
>> > vlan://500 in my case limits results only to the network for this VR that
>> > is having problems...
>> >
>> > Any suggestions ?
>> >
>> > "id" "uuid" "instance_id" "mac_address" "ip4_address" "netmask" "gateway"
>> > "ip_type" "broadcast_uri" "network_id" "mode" "state" "strategy"
>> > "reserver_name" "reservation_id" "device_id" "update_time"
>> "isolation_uri"
>> > "ip6_address" "default_nic" "vm_type" "created" "removed" "ip6_gateway"
>> > "ip6_cidr" "secondary_ip" "display_nic"
>> > "2816" "5066bc3a-dbec-4789-aa42-3b9eb8f50bb4" "1795" "06:70:0a:00:00:ac"
>> > "46.232.180.101" "255.255.255.0" "46.232.180.1" "Ip4" "vlan://500" "212"
>> > "Dhcp" "Deallocating" "Create" "DirectNetworkGuru" \N "1" "2015-02-04
>> > 23:06:23" "vlan://500" \N "0" "User" "2015-02-04 20:41:05" "2015-02-04
>> > 22:06:23" \N \N "0" "1"
>> > "3132" "c8a5f98e-5663-40e3-ac03-1ac3545eaa83" "1958" "06:fc:c2:00:00:ad"
>> > "46.232.180.102" "255.255.255.0" "46.232.180.1" "Ip4" "vlan://500" "212"
>> > "Dhcp" "Deallocating" "Create" "DirectNetworkGuru" \N "1" "2015-03-03
>> > 15:45:47" "vlan://500" \N "0" "User" "2015-02-18 15:50:35" "2015-03-03
>> > 14:45:47" \N \N "0" "1"
>> > "3139" "f5a41229-2267-4615-9128-63fbce69bb01" "1962" "06:d7:ac:00:00:ae"
>> > "46.232.180.103" "255.255.255.0" "46.232.180.1" "Ip4" "vlan://500" "212"
>> > "Dhcp" "Deallocating" "Create" "DirectNetworkGuru" \N "1" "2015-02-19
>> > 03:10:45" "vlan://500" \N "0" "User" "2015-02-19 00:09:02" "2015-02-19
>> > 02:10:45" \N \N "0" "1"
>> > "707" "99afa70a-39d5-4685-8fc0-9857fdc77c90" "511" "06:b5:72:00:00:72"
>> > "46.232.180.144" "255.255.255.0" "46.232.180.1" "Ip4" "vlan://500" "212"
>> > "Dhcp" "Deallocating" "Create" "DirectNetworkGuru" \N "0" "2014-01-27
>> > 14:38:52" "vlan://500" \N "0" "User" "20

Re: [URGENT - HELP NEEDED]

2015-03-16 Thread Marcus
VMs can have multiple nics and be on multiple networks. If you set a
nic as default, it becomes the network that the vm has its default
route on. Every VM should have a default nic, and if it doesn't I
wonder how it might have happened (maybe a specific combination of
add/delete nic triggered a bug?). You should set a default nic for
every VM that might be missing one, and see if that gets your router
up.

On Mon, Mar 16, 2015 at 8:34 AM, Andrija Panic  wrote:
> Hi Marcus,
>
> Thanks a lot for the hint
>
> True, I have 0 as the value in the database for some reason, for a couple of
> NICs
> select * from nics where ip4_address like "46.232%" and broadcast_uri =
> "vlan://500" and default_nic = 0;
>
> results: http://pastebin.com/rDAe2RY9
>
> or down there...
>
> This Techvee-FileServer server is already running (still not dead) and I
> can see 1 NIC from UI...
>
> Should I reset all of these to 1 ?
> What is the purpose of this field default_nic = 0.
>
> vlan://500 in my case limits results only to the network for this VR that
> is having problems...
>
> Any suggestions ?
>
> "id" "uuid" "instance_id" "mac_address" "ip4_address" "netmask" "gateway"
> "ip_type" "broadcast_uri" "network_id" "mode" "state" "strategy"
> "reserver_name" "reservation_id" "device_id" "update_time" "isolation_uri"
> "ip6_address" "default_nic" "vm_type" "created" "removed" "ip6_gateway"
> "ip6_cidr" "secondary_ip" "display_nic"
> "2816" "5066bc3a-dbec-4789-aa42-3b9eb8f50bb4" "1795" "06:70:0a:00:00:ac"
> "46.232.180.101" "255.255.255.0" "46.232.180.1" "Ip4" "vlan://500" "212"
> "Dhcp" "Deallocating" "Create" "DirectNetworkGuru" \N "1" "2015-02-04
> 23:06:23" "vlan://500" \N "0" "User" "2015-02-04 20:41:05" "2015-02-04
> 22:06:23" \N \N "0" "1"
> "3132" "c8a5f98e-5663-40e3-ac03-1ac3545eaa83" "1958" "06:fc:c2:00:00:ad"
> "46.232.180.102" "255.255.255.0" "46.232.180.1" "Ip4" "vlan://500" "212"
> "Dhcp" "Deallocating" "Create" "DirectNetworkGuru" \N "1" "2015-03-03
> 15:45:47" "vlan://500" \N "0" "User" "2015-02-18 15:50:35" "2015-03-03
> 14:45:47" \N \N "0" "1"
> "3139" "f5a41229-2267-4615-9128-63fbce69bb01" "1962" "06:d7:ac:00:00:ae"
> "46.232.180.103" "255.255.255.0" "46.232.180.1" "Ip4" "vlan://500" "212"
> "Dhcp" "Deallocating" "Create" "DirectNetworkGuru" \N "1" "2015-02-19
> 03:10:45" "vlan://500" \N "0" "User" "2015-02-19 00:09:02" "2015-02-19
> 02:10:45" \N \N "0" "1"
> "707" "99afa70a-39d5-4685-8fc0-9857fdc77c90" "511" "06:b5:72:00:00:72"
> "46.232.180.144" "255.255.255.0" "46.232.180.1" "Ip4" "vlan://500" "212"
> "Dhcp" "Deallocating" "Create" "DirectNetworkGuru" \N "0" "2014-01-27
> 14:38:52" "vlan://500" \N "0" "User" "2014-01-27 11:29:08" "2014-01-27
> 13:38:52" \N \N "0" "1"
> "1580" "bf56315e-b4c3-4338-88d9-3013ab2e2c37" "1088" "06:1d:90:00:00:72"
> "46.232.180.144" "255.255.255.0" "46.232.180.1" "Ip4" "vlan://500" "212"
> "Dhcp" "Deallocating" "Create" "DirectNetworkGuru" \N "1" "2014-07-23
> 10:15:18" "vlan://500" \N "0" "User" "2014-07-17 19:14:06" "2014-07-23
> 08:15:18" \N \N "0" "1"
> "3799" "712cbcb6-097f-4555-a73b-e8c2a5bd557f" "2306" "06:33:ac:00:00:77"
> "46.232.180.149" "255.255.255.0" "46.232.180.1" "Ip4" "vlan://500" "212"
> "Dhcp" "Deallocating" "Create" "DirectNetworkGuru" \N "1

Re: [URGENT - HELP NEEDED]

2015-03-16 Thread Marcus
FWIW, if your 4.3.2 is the same as what's in the source tree for the
router code, the null pointer indicates that there's no default nic
for one of your guests "Techvee-FileServer". I'd guess that if you
delete/move just that host you may be able to start the router, or at
least get past this. You can also look into the db and see if you can
find its nics in the cloud.nics table and see if there is in fact one
marked default.

On Mon, Mar 16, 2015 at 8:05 AM, Marcus  wrote:
> Looks like the issue is that null pointer in CreateDhcpEntry for
> either "Techvee-FileServer" or the DHCP entry immediately after that.
> It would suggest some inconsistent/unexpected data when creating a
> DHCP entry for one of the guests serviced by this router. It's too bad
> that one bad entry is fatal for the whole router.
>
>
> On Mon, Mar 16, 2015 at 7:16 AM, Andrija Panic  
> wrote:
>> Not really - we are painfully migrating stopped VMs, from VPS network
>> (Guest Shared network) to VPCs...
>>
>> MGMT server sends the STOP command to agent, even though the VM was never
>> started, BUT the storage provisioning from template to volume is done...
>>
>> We are also looking into some external help as we speak...
>>
>> On 16 March 2015 at 14:52, Nux!  wrote:
>>
>>> Hi,
>>>
>>> Have you managed to get to the bottom of this?
>>>
>>> --
>>> Sent from the Delta quadrant using Borg technology!
>>>
>>> Nux!
>>> www.nux.ro
>>>
>>> - Original Message -
>>> > From: "Andrija Panic" 
>>> > To: users@cloudstack.apache.org, d...@cloudstack.apache.org
>>> > Sent: Sunday, 15 March, 2015 16:19:07
>>> > Subject: [URGENT - HELP NEEDED]
>>>
>>> > Hi guys,
>>> >
>>> > we have updated the cloudstack from 4.3.0 to 4.3.2 (OS updated right
>>> before
>>> > that, from CentOS 6.5 to CentOS 6.6)
>>> >
>>> > And now I can not start SYSTEM VR - that is used for SHARED GUEST network
>>> > anymore.
>>> > And some VMs are down - and cant be started because they depend on this
>>> > VR...
>>> >
>>> > VPC VRs are created fine, so new VR for VPC are created fine, but this
>>> one
>>> > for Guest network fails to start:
>>> >
>>> > Here you can see, after agent copies template from secondary storage, to
>>> > primary local storage, it created base image, and backing file - so
>>> storage
>>> > setup seems completed.
>>> >
>>> > Then all of a sudden we have errors:
>>>
>>
>>
>>
>> --
>>
>> Andrija Panić


Re: [URGENT - HELP NEEDED]

2015-03-16 Thread Marcus
Looks like the issue is that null pointer in CreateDhcpEntry for
either "Techvee-FileServer" or the DHCP entry immediately after that.
It would suggest some inconsistent/unexpected data when creating a
DHCP entry for one of the guests serviced by this router. It's too bad
that one bad entry is fatal for the whole router.


On Mon, Mar 16, 2015 at 7:16 AM, Andrija Panic  wrote:
> Not really - we are painfully migrating stopped VMs, from VPS network
> (Guest Shared network) to VPCs...
>
> MGMT server sends the STOP command to agent, even though the VM was never
> started, BUT the storage provisioning from template to volume is done...
>
> We are also looking into some external help as we speak...
>
> On 16 March 2015 at 14:52, Nux!  wrote:
>
>> Hi,
>>
>> Have you managed to get to the bottom of this?
>>
>> --
>> Sent from the Delta quadrant using Borg technology!
>>
>> Nux!
>> www.nux.ro
>>
>> - Original Message -
>> > From: "Andrija Panic" 
>> > To: users@cloudstack.apache.org, d...@cloudstack.apache.org
>> > Sent: Sunday, 15 March, 2015 16:19:07
>> > Subject: [URGENT - HELP NEEDED]
>>
>> > Hi guys,
>> >
>> > we have updated the cloudstack from 4.3.0 to 4.3.2 (OS updated right
>> before
>> > that, from CentOS 6.5 to CentOS 6.6)
>> >
>> > And now I cannot start the SYSTEM VR that is used for the SHARED GUEST
>> > network anymore.
>> > And some VMs are down and can't be started, because they depend on this
>> > VR...
>> >
>> > VPC VRs are created fine, so new VRs for VPCs are created fine, but this
>> > one for the Guest network fails to start:
>> >
>> > Here you can see that after the agent copies the template from secondary
>> > storage to primary local storage, it created the base image and backing
>> > file, so storage setup seems completed.
>> >
>> > Then all of a sudden we have errors:
>>
>
>
>
> --
>
> Andrija Panić


Re: Replace systemvm template 4.3.0 with recent one during ACS upgrade

2015-03-14 Thread Marcus
It should pull the highest/last entry with the name 4.3 when
redeploying the routers, but I'm not sure if it will detect that the
router needs upgrade without a minor version change. I imagine it
would fetch the highest entry, see that the template id doesn't exist,
and install it, but you may want to test first.

On Sat, Mar 14, 2015 at 4:08 AM, Andrija Panic  wrote:
> Hi guys,
>
> I'm wondering, since I'm upgrading ACS 4.3.0 with original systemvm (from ~
> 24.05.2014), to ACS 4.3.2 - am I required to also register new systemVM
> template (i.e. from UI like when you upgrade from 4.3 to 4.4..) - or should
> I just upgrade ACS and ACS would somehow update systemVM template from
> original one (4.3.0) to newer one 4.3.2 (there is one on
> http://cloudstack.apt-get.eu/ from 24.09.2014 and there is 15.01.2015 on
> shapeblue site also)
>
> I'm trying (with upgrade) to mitigate some security risks of SSLs that has
> been happening recently, and solve some of Port Forwarding / Static NAT
> issues, where remote IP is not seen really...
>
> So basically what is the systemvm template upgrade/replace procedure for the
> same release version of ACS (4.3.0 - 4.3.x) ?
>
>
> Thanks,
>
> --
>
> Andrija Panić


Re: Updating VR template, without ACS update

2015-03-11 Thread Marcus
I've never used the official script to upgrade. I always set the
global setting to recreate systemvms on reboot; it has been more
robust for me to do it the cloudy way and get a fresh vm on every
boot. With various issues that have arisen in the past (file system
filling up, fsck required on unclean shutdown, etc.) it's just nice to
know that you're always a reboot away from getting a pristine config.
I was surprised when I heard about the patchviasocket issue, as I've
run thousands of routers and upgraded them multiple times, never once
having an issue. Nor in my nested vm dev environment. Perhaps it was
just our fast storage or something. I think someone added in some
retries or something like that.

Keep in mind that you usually don't need to drop in a new template for
a bugfix release, and it's sufficient to reboot. The exception to this
is if the bugfix release specifically indicates a new template, say
for a security fix on software in the OS of the template. Either way,
CloudStack will go through the full reprogramming process, and
stop/start the router to attach a new ISO with the new code and
install it on the router template, whether it images a fresh template
or uses an existing one.

On Wed, Mar 11, 2015 at 3:59 AM, Andrija Panic  wrote:
> Thanks Marcus.
>
> So anyway, I need to make some time to upgrade to 4.3.2.
>
> Can I manually reboot VR/s one by one after the upgrade is done (instead of
> using the script for rebooting ssvm, cpvm, and 66 VRs...)
> And is this really a reboot inside the OS - not destroying and recreating VRs ???
>
> Or would you still recommend rebooting VRs via the script - I understand that
> it reboots VRs one by one...
>
> I would not like to recreate VR, and then hit a bug with VR creation, that
> I'm having right now... :(
>
> Thanks
>
>
>
>
> On 10 March 2015 at 20:14, Marcus  wrote:
>
>> Hi,
>>   It's impossible to know without looking at the changes in 4.3.1,
>> 4.3.2.  Your routers will be running old code, and will probably work,
>> but might not, e.g. if a router script is called with parameters that
>> don't exist in the version of the script that the router runs. If you
>> don't plan on making any changes (add ACLs, spin up new VMs, etc) to
>> these VPCs they'll most likely run just fine as-is, but any changes
>> are a big ?
>>
>>  As far as your question about replacing the template, I believe
>> CloudStack looks for the latest of a specific version, so if you
>> retire your existing template and install a new one per the 4.3
>> upgrade instructions it should choose that. Note that for routers
>> specifically there's a global option 'router.template.kvm' that can be
>> pointed to a specific template name to use for routers.
>>
>> On Tue, Mar 10, 2015 at 7:46 AM, Andrija Panic 
>> wrote:
>> > Hi,
>> >
>> > I was wondering is it possible to update/replace the VR template somehow
>> > without actually updating the ACS.
>> >
>> > I'm running ACS 4.3.0, and having some issues with remote IP not being
>> > really shown during Port Forwarding and Static NAT (VR also does SNAT
>> > beside the DNAT)
>> >
>> > I know question is a little bit weird - but...
>> >
>> > Another Q: I can see that after ACS is upgraded, there is restart of each
>> > System VM needed - we have over 50-60 VPCs - this also means that I need
>> to
>> > wait for 60 VRs to reboot.
>> > Is there any drawback of running existing VRs after ACS 4.3.0 is updated
>> to
>> > 4.3.2 and then later manually reboot each VR from Infrastructure/Virtual
>> > Routers ?
>> >
>> >
>> >
>> > --
>> >
>> > Andrija Panić
>>
>
>
>
> --
>
> Andrija Panić


Re: Updating VR template, without ACS update

2015-03-10 Thread Marcus
Hi,
  It's impossible to know without looking at the changes in 4.3.1,
4.3.2.  Your routers will be running old code, and will probably work,
but might not, e.g. if a router script is called with parameters that
don't exist in the version of the script that the router runs. If you
don't plan on making any changes (add ACLs, spin up new VMs, etc) to
these VPCs they'll most likely run just fine as-is, but any changes
are a big ?

 As far as your question about replacing the template, I believe
CloudStack looks for the latest of a specific version, so if you
retire your existing template and install a new one per the 4.3
upgrade instructions it should choose that. Note that for routers
specifically there's a global option 'router.template.kvm' that can be
pointed to a specific template name to use for routers.

On Tue, Mar 10, 2015 at 7:46 AM, Andrija Panic  wrote:
> Hi,
>
> I was wondering is it possible to update/replace the VR template somehow
> without actually updating the ACS.
>
> I'm running ACS 4.3.0, and having some issues with remote IP not being
> really shown during Port Forwarding and Static NAT (VR also does SNAT
> beside the DNAT)
>
> I know question is a little bit weird - but...
>
> Another Q: I can see that after ACS is upgraded, there is restart of each
> System VM needed - we have over 50-60 VPCs - this also means that I need to
> wait for 60 VRs to reboot.
> Is there any drawback of running existing VRs after ACS 4.3.0 is updated to
> 4.3.2 and then later manually reboot each VR from Infrastructure/Virtual
> Routers ?
>
>
>
> --
>
> Andrija Panić


Re: Agent deleting/stoping main bridge on Linux ???

2015-03-04 Thread Marcus
I don't think anyone has ever tested what would happen if the admin
has manually defined the same guest bridges that CloudStack wants to
use. CloudStack creates them on the fly and deletes them when the last
VM has been removed. I assume you're using these bridges on the host
for something out of band that CloudStack isn't aware of?

On Wed, Mar 4, 2015 at 8:09 AM, Andrija Panic  wrote:
> I forgot to add - this is ACS 4.3.0 and CentOS 6.x... breth1-500 is the
> main bridge used for Shared Network (yes, I know, somewhat confusing name
> for the bridge...)
>
> On 4 March 2015 at 17:07, Andrija Panic  wrote:
>
>> Hi people.
>>
>> on physical host, I was having breth1-500(bridge) with eth1(joined to this
>> bridge) - all defined manually in Centos network config files.
>> (when you boot physical host - this bridge is active of course)
>>
>> When I deploy new VM with Shared Network with vlan 500, new device is
>> created eth1.500 and joined to this bridge - which is fine, and then vnet0
>> device from VM is also joined to the bridge...
>>
>>
>>
>> When I stop the last VM that is using this Shared Network, CloudStack
>> (agent?) removes eth1.500 from bridge (fine with me), and ***then removed
>> eth1 from bridge (which I manually configured !!!) and later stopped/removed
>> the whole bridge***
>>
>> If you try to start new VM again (joined to Shared Network with vlan 500)
>> - then no bridge is available, VM is started, but no vnet device, no
>> breth1-500 bridge up, and no eth1.500 up.
>> No bridge was available - but also bridge was not created on the fly...
>>
>>
>> Is there any explanation - why the heck would my manually configured
>> bridge get deleted?
>>
>> --
>>
>> Andrija Panić
>>
>
>
>
> --
>
> Andrija Panić


Re: Agent deleting/stoping main bridge on Linux ???

2015-03-04 Thread Marcus
Yeah, sorry. You could request an enhancement for an agent tunable
that keeps the bridge from being removed, but it will not help you
now.

If you want to get hacky, you can edit the
   /usr/share/cloudstack-common/scripts/vm/network/vnet/modifyvlan.sh
script on your agent, commenting out deleteVlan(). That *might* do
what you want.

On Wed, Mar 4, 2015 at 8:19 AM, Andrija Panic  wrote:
> Thanks Marcus - yes, don't ask me why I named the bridge like this... so this
> is obviously an unsupported scenario that I set up... crap...
>
> Thx again.
>
> On 4 March 2015 at 17:16, Marcus  wrote:
>
>> I don't think anyone has ever tested what would happen if the admin
>> has manually defined the same guest bridges that CloudStack wants to
>> use. CloudStack creates them on the fly and deletes them when the last
>> VM has been removed. I assume you're using these bridges on the host
>> for something out of band that CloudStack isn't aware of?
>>
>> On Wed, Mar 4, 2015 at 8:09 AM, Andrija Panic 
>> wrote:
>> > I forgot to add - this is ACS 4.3.0 and CentOS 6.x... breth1-500 is the
>> > main bridge used for Shared Network (yes, I know, somewhat confusing name
>> > for the bridge...)
>> >
>> > On 4 March 2015 at 17:07, Andrija Panic  wrote:
>> >
>> >> Hi people.
>> >>
>> >> on physical host, I was having breth1-500(bridge) with eth1(joined to
>> this
>> >> bridge) - all defined manually in Centos network config files.
>> >> (when you boot physical host - this bridge is active of course)
>> >>
>> >> When I deploy new VM with Shared Network with vlan 500, new device is
>> >> created eth1.500 and joined to this bridge - which is fine, and then
>> vnet0
>> >> device from VM is also joined to the bridge...
>> >>
>> >>
>> >>
>> >> When I stop the last VM that is using this Shared Network, CloudStack
>> >> (agent?) removes eth1.500 from bridge (fine with me), and ***then
>> removed
>> >> eth1 from bridge (which I manually configured !!!) and later
>> stoped/removed
>> >> the whole bridge***
>> >>
>> >> If you try to start new VM again (joined to Shared Network with vlan
>> 500)
>> >> - then no bridge is available, VM is started, but no vnet device, no
>> >> breth1-500 bridge up, and no eth1.500 up.
>> >> No bridge was available - but also bridge was not created on the fly...
>> >>
>> >>
>> >> Is there any explanation - why the heck would my manually configured
>> >> bridge get deleted?
>> >>
>> >> --
>> >>
>> >> Andrija Panić
>> >>
>> >
>> >
>> >
>> > --
>> >
>> > Andrija Panić
>>
>
>
>
> --
>
> Andrija Panić


Re: Agent deleting/stoping main bridge on Linux ???

2015-03-04 Thread Marcus
Er, problems for the hypervisor, that is. And an admin probably
doesn't want to deal with configuring all of those, even if it can be
scripted, so CloudStack does the creation/deletion.

On Wed, Mar 4, 2015 at 8:18 AM, Marcus  wrote:
> As for why, it's a scalability issue. There are people who are using
> (or have defined) 10,000+ guest networks. If CloudStack left bridges
> around that weren't being used it could cause problems for the guest,
> so it keeps things tidy by only keeping bridges that are used.
>
> On Wed, Mar 4, 2015 at 8:16 AM, Marcus  wrote:
>> I don't think anyone has ever tested what would happen if the admin
>> has manually defined the same guest bridges that CloudStack wants to
>> use. CloudStack creates them on the fly and deletes them when the last
>> VM has been removed. I assume you're using these bridges on the host
>> for something out of band that CloudStack isn't aware of?
>>
>> On Wed, Mar 4, 2015 at 8:09 AM, Andrija Panic  
>> wrote:
>>> I forgot to add - this is ACS 4.3.0 and CentOS 6.x... breth1-500 is the
>>> main bridge used for Shared Network (yes, I know, somewhat confusing name
>>> for the bridge...)
>>>
>>> On 4 March 2015 at 17:07, Andrija Panic  wrote:
>>>
>>>> Hi people.
>>>>
>>>> on physical host, I was having breth1-500(bridge) with eth1(joined to this
>>>> bridge) - all defined manually in Centos network config files.
>>>> (when you boot physical host - this bridge is active of course)
>>>>
>>>> When I deploy new VM with Shared Network with vlan 500, new device is
>>>> created eth1.500 and joined to this bridge - which is fine, and then vnet0
>>>> device from VM is also joined to the bridge...
>>>>
>>>>
>>>>
>>>> When I stop the last VM that is using this Shared Network, CloudStack
>>>> (agent?) removes eth1.500 from bridge (fine with me), and ***then removed
>>>> eth1 from bridge (which I manually configured !!!) and later stopped/removed
>>>> the whole bridge***
>>>>
>>>> If you try to start new VM again (joined to Shared Network with vlan 500)
>>>> - then no bridge is available, VM is started, but no vnet device, no
>>>> breth1-500 bridge up, and no eth1.500 up.
>>>> No bridge was available - but also bridge was not created on the fly...
>>>>
>>>>
>>>> Is there any explanation - why the heck would my manually configured
>>>> bridge get deleted?
>>>>
>>>> --
>>>>
>>>> Andrija Panić
>>>>
>>>
>>>
>>>
>>> --
>>>
>>> Andrija Panić


Re: Agent deleting/stoping main bridge on Linux ???

2015-03-04 Thread Marcus
As for why, it's a scalability issue. There are people who are using
(or have defined) 10,000+ guest networks. If CloudStack left bridges
around that weren't being used it could cause problems for the guest,
so it keeps things tidy by only keeping bridges that are used.
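
The create-on-first-use, delete-on-last-use behaviour described above can be sketched as a simple usage count. This is only a toy model, not the agent's actual code; the BridgeTracker class and its method names are hypothetical, and no real bridge is touched:

```python
class BridgeTracker:
    """Toy model: a bridge exists only while at least one VM uses it."""

    def __init__(self):
        self._users = {}  # bridge name -> set of VM names attached to it

    def attach(self, bridge, vm):
        """Attach a VM; return True if the bridge had to be created."""
        created = bridge not in self._users
        self._users.setdefault(bridge, set()).add(vm)
        return created

    def detach(self, bridge, vm):
        """Detach a VM; return True if this removed the now-unused bridge."""
        vms = self._users.get(bridge)
        if vms is None:
            return False
        vms.discard(vm)
        if not vms:
            del self._users[bridge]
            return True
        return False

    def bridges(self):
        """Names of the bridges that currently exist."""
        return sorted(self._users)


tracker = BridgeTracker()
first = tracker.attach("breth1-500", "vm-a")    # first VM creates the bridge
second = tracker.attach("breth1-500", "vm-b")   # bridge already exists
early = tracker.detach("breth1-500", "vm-a")    # one VM still attached
last = tracker.detach("breth1-500", "vm-b")     # last VM gone, bridge removed
print(first, second, early, last, tracker.bridges())
```

In this model nothing distinguishes a bridge the admin defined by hand from one created on the fly, which matches the surprise reported in this thread.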

On Wed, Mar 4, 2015 at 8:16 AM, Marcus  wrote:
> I don't think anyone has ever tested what would happen if the admin
> has manually defined the same guest bridges that CloudStack wants to
> use. CloudStack creates them on the fly and deletes them when the last
> VM has been removed. I assume you're using these bridges on the host
> for something out of band that CloudStack isn't aware of?
>
> On Wed, Mar 4, 2015 at 8:09 AM, Andrija Panic  wrote:
>> I forgot to add - this is ACS 4.3.0 and CentOS 6.x... breth1-500 is the
>> main bridge used for Shared Network (yes, I know, somewhat confusing name
>> for the bridge...)
>>
>> On 4 March 2015 at 17:07, Andrija Panic  wrote:
>>
>>> Hi people.
>>>
>>> on physical host, I was having breth1-500(bridge) with eth1(joined to this
>>> bridge) - all defined manually in Centos network config files.
>>> (when you boot physical host - this bridge is active of course)
>>>
>>> When I deploy new VM with Shared Network with vlan 500, new device is
>>> created eth1.500 and joined to this bridge - which is fine, and then vnet0
>>> device from VM is also joined to the bridge...
>>>
>>>
>>>
>>> When I stop the last VM that is using this Shared Network, CloudStack
>>> (agent?) removes eth1.500 from bridge (fine with me), and ***then removed
>>> eth1 from bridge (which I manually configured !!!) and later stopped/removed
>>> the whole bridge***
>>>
>>> If you try to start new VM again (joined to Shared Network with vlan 500)
>>> - then no bridge is available, VM is started, but no vnet device, no
>>> breth1-500 bridge up, and no eth1.500 up.
>>> No bridge was available - but also bridge was not created on the fly...
>>>
>>>
>>> Is there any explanation - why the heck would my manually configured
>>> bridge get deleted?
>>>
>>> --
>>>
>>> Andrija Panić
>>>
>>
>>
>>
>> --
>>
>> Andrija Panić


Re: Uploading of Volume - how does what here ?

2015-02-26 Thread Marcus
The volume is downloaded by the SSVM into secondary storage under the
volumes directory. It will sit there until you choose to attach it
somewhere, at which point a CopyCommand will be sent to a hypervisor
that has access to the primary storage for the cluster on which the
target VM is running to copy from secondary to primary. This will be
handled by the appropriate StorageAdaptor for the primary storage type
(most likely LibvirtStorageAdaptor, which will qemu-img it, converting
to RAW format for RBD/LVM, or just a plain cp last I checked for QCOW2
to QCOW2). Then an AttachCommand will be sent to the hypervisor on
which the target vm is running (if it is running) and it will be
hotplugged.
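
The sequence above can be sketched as a small decision function. This is an illustration of the description only, not CloudStack code; the function names and the returned strings are hypothetical, and the rules cover just the cases mentioned (RAW for RBD/LVM, a plain copy for QCOW2 to QCOW2):

```python
def plan_secondary_to_primary_copy(src_format, pool_type):
    """Pick the copy step described in the thread for an uploaded volume."""
    pool = pool_type.upper()
    fmt = src_format.upper()
    if pool in ("RBD", "LVM"):
        # These pools get a RAW image, produced by a qemu-img conversion.
        return ("qemu-img convert", "RAW")
    if fmt == "QCOW2":
        # QCOW2 to QCOW2: a plain file copy, as of the last check mentioned.
        return ("cp", "QCOW2")
    # Anything else is outside what the thread describes.
    raise ValueError(f"no rule sketched for {src_format} on {pool_type}")


def upload_flow(src_format, pool_type, vm_running):
    """List the steps an uploaded volume goes through, in order."""
    tool, dst_format = plan_secondary_to_primary_copy(src_format, pool_type)
    steps = [
        "SSVM downloads the volume to secondary storage (volumes directory)",
        f"CopyCommand on a hypervisor: {tool} to {dst_format} on primary",
    ]
    if vm_running:
        steps.append("AttachCommand: hotplug into the running VM")
    return steps


for step in upload_flow("QCOW2", "RBD", vm_running=True):
    print(step)
```

The copy only happens once the volume is attached somewhere; until then it stays on secondary storage.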

On Thu, Feb 26, 2015 at 1:05 AM, Andrija Panic  wrote:
> Thx Lucian,
>
> that was my guessing, but would like some confirmation if anyone familiar
> with this...
>
> Thanks
>
> On 25 February 2015 at 17:57, Nux!  wrote:
>
>> Not a CEPH user, but what I believe happens is your HV mounts the NFS
>> storage and then does something like "qemu-img convert" to move it into
>> CEPH.
>>
>> HTH
>> Lucian
>>
>> --
>> Sent from the Delta quadrant using Borg technology!
>>
>> Nux!
>> www.nux.ro
>>
>> - Original Message -
>> > From: "Andrija Panic" 
>> > To: d...@cloudstack.apache.org, users@cloudstack.apache.org
>> > Sent: Wednesday, 25 February, 2015 12:09:59
>> > Subject: Uploading of Volume - how does what here ?
>>
>> > Hi guys,
>> >
>> > I'm just uploading a Volume, and I guess it will end up on the CEPH
>> storage
>> > that we are using.
>> >
>> > So my question would be: the VOLUME will end up on the primary storage in the
>> > end, but right now, I can see that the SSVM is actually downloading the
>> > Volume from the internet, and writing it to Secondary Storage at the moment.
>> >
>> > What happens next - I know that both the SSVM and of course my NFS server
>> > cannot write/talk at all to CEPH ?
>> >
>> > Does maybe some randomly chosen host mount Secondary Storage NFS, read the
>> > Volume, and upload/write to CEPH ? (Since only hypervisor hosts can
>> > actually talk to CEPH)
>> >
>> > Thanks in advance for clarification...
>> >
>> > --
>> >
>> > Andrija Panić
>>
>
>
>
> --
>
> Andrija Panić


Re: Agent dies every night/morning.... memory violation

2015-02-23 Thread Marcus
It doesn't really sound like an agent problem, but some other root
problem that is causing issues for the agent. Perhaps it is specific
to the host simply because there is a particular VM that always runs
on that host and the VM itself is triggering the issue. Perhaps a
heavy logrotate or cron job on the vm causes issues for librados. Just
grasping at straws here. From the output provided it does seem that
the libvirt bindings that include ceph code are terminating the agent
execution.  My guess is that if you focus on "why this host" as
opposed to "what's going on", you'll find the answer to both. Sorry, I
know that's not much help.

On Mon, Feb 23, 2015 at 7:29 AM, Andrija Panic  wrote:
> Anybody?, before I start to cry :(
>
> On 21 February 2015 at 21:18, Andrija Panic  wrote:
>
>> HI Simon,
>>
>> selinux is disabled, I have just double checked.
>>
>> BTW, this is what I can see in the cloudstack-agent.err log - seems like
>> some CEPH related issues, but not sure why would agent die...
>> If I recall correctly, this might be happening since the CEPH update from
>> 0.80.3? to 0.87 - and this seems like some crash in librados
>>
>>
>> libust[1907/2046]: Warning: HOME environment variable not set. Disabling
>> LTTng-UST per-user tracing. (in setup_local_apps() at lttng-ust-comm.c:305)
>> libvirt:  error : name in virDomainLookupByName must not be NULL
>> libvirt:  error : name in virDomainLookupByName must not be NULL
>> libvirt:  error : name in virDomainLookupByName must not be NULL
>> libvirt:  error : name in virDomainLookupByName must not be NULL
>> libvirt: Storage Driver error : failed to remove volume
>> 'cloudstack/bd751250-de35-4d2e-a4e3-3ee4b636c2a7': Device or resource busy
>> ./log/SubsystemMap.h: In function 'bool
>> ceph::log::SubsystemMap::should_gather(unsigned int, int)' thread
>> 7f04427fc700 time 2015-02-21 06:39:38.839210
>> ./log/SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())
>>  ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
>>  1: (()+0x1fe223) [0x7f060c932223]
>>  2: (ObjectCacher::flusher_entry()+0x155) [0x7f060c9866e5]
>>  3: (ObjectCacher::FlusherThread::entry()+0xd) [0x7f060c9976cd]
>>  4: (()+0x79d1) [0x7f06605ee9d1]
>>  5: (clone()+0x6d) [0x7f066033bb5d]
>>  NOTE: a copy of the executable, or `objdump -rdS ` is needed
>> to interpret this.
>> terminate called after throwing an instance of 'ceph::FailedAssertion'
>> 21/02/2015 06:39:38 1905 jsvc.exec error: Service did not exit cleanly
>>
>> On 20 February 2015 at 21:56, Simon Weller  wrote:
>>
>>> Andrija,
>>>
>>> What is SELinux set to on this host?
>>>
>>>
>>> - SI
>>>
>>>
>>> 
>>> From: Andrija Panic 
>>> Sent: Friday, February 20, 2015 6:06 AM
>>> To: d...@cloudstack.apache.org; users@cloudstack.apache.org
>>> Subject: Agent dies every night/morning memory violation
>>>
>>> Hi,
>>>
>>> I have crazy agent on one of the hosts, that is being killed each morning
>>> and I found this in /var/log/audit.log:
>>>
>>> type=ANOM_ABEND msg=audit(1424321463.930:430678): auid=0 uid=0 gid=0
>>> ses=68891 pid=10831 comm="jsvc" reason="memory violation" sig=6
>>>
>>> I don't remember changing anything on the system, but this keeps happening
>>> each morning around the same time, 5.20am-5.40am.
>>>
>>> I'm wondering what the heck is happening, any suggestions where to
>>> troubleshoot ?
>>> Will check logs in details anyway...
>>>
>>> --
>>>
>>> Andrija Panić
>>>
>>
>>
>>
>> --
>>
>> Andrija Panić
>>
>
>
>
> --
>
> Andrija Panić


Re: Network QoS (not bandwidth limiting)

2015-02-21 Thread Marcus
The points raised are certainly valid from an enterprise networking
standpoint, and don't fall on deaf ears, but we should keep things in
perspective. To provide the aforementioned features would be
relatively uncharted territory in the cloud orchestration world (at
least not considering vendor provided networking solutions that only
handle the network part of the equation), so while it would be good to
aspire to providing those things, it should be no surprise that the
platform works that way and lacks such features.

For further perspective, keep in mind that cloud orchestration in
general has been a pitch to software developers and management for
"easy infrastructure". Cloud consumers are end users, web developers,
application developers, so again it should be no surprise that the
product provides features that cater to that, rather than providing
the bells and whistles that a network admin would want to see in their
infrastructure. CloudStack was never built to be pitched to network
teams as a cure for managing their infra deployments, the only cloud
product providers doing that are network vendors who have cloud
networking products. This is of course why a VPC needs IPs defined, as
applications care more about how to serve up a web page than network
engineering and managing distinct layer 2 and 3, so the whole network
stack is sandwiched into a simple orchestration mechanism that gets
the application what it needs.

In designing and deploying cloud, the most common complaint I see from
people who are infrastructure maintainers is "why can't I just build
the infrastructure the way I want and then have it orchestrated?".
Unfortunately, we can't just automate and integrate with anyone's pet
design. CloudStack supports many novel and custom network designs
simply by allowing the option of letting you manage the network
hardware and being hands-off (shared/public networks), while also
being pluggable to allow vendors to take over whatever features
they wish. I've seen some pretty advanced overlay networking provided
through third party plugins to CloudStack that take over all network
functionality and provide more.

What's really being asked for here is for CloudStack to provide and
maintain a fully fledged and featured router distribution in its
provided virtual router. It's an admirable project to have if we can
get support for it. My guess is there's a bit of a disconnect in
interest though, because many (but not all) enterprises who want
CloudStack for infrastructure automation are skeptical about a VM as
software router and prefer to bring in aforementioned enterprise
vendors who have their own plugins. People who provide cloud hosting
and other services tend to use the routers, but their interest in
enterprise level routing and redundancy varies greatly, and their
customers are designing their apps to be resilient to infrastructure
loss (e.g. most AWS customers). That's of course not entirely the
whole truth, as is evidenced by the work we are seeing on redundant
routers, but I do believe that's why we haven't seen these things from
the beginning. They just haven't been all that important to the target
customers, even though infrastructure engineers are used to providing
them.

So now comes my philosophy. In the end, I think the great thing about
open source communities is that if there's the right level of
interest, it will happen.  I'm the kind of person who feels a pang of
stress at the idea that something I work on can't be all things to all
people, but after building a hosting business over the last few years
I've begun to realize that it's really only practical to try to be
good for a subset of the market and focus on that. You'll never please
everyone, there are limits to what you can accomplish, and sometimes
it's OK to just concede that your product is not going to work for
everyone. If you don't, you'll spread yourself too thin and fail
everyone. In order to make something great you have to have a limit on
your scope. That's not to say you don't listen to your customers, but
you sometimes have to make hard choices on who to listen to and who to
upset.

None of this should be taken as a discouragement to the topics at
hand, but again as someone who takes it personally when I don't deliver
I wanted to provide some follow up to address the "rant" and try to
provide perspective on why the things are the way they are.

On Sat, Feb 21, 2015 at 1:58 PM, Somesh Naidu  wrote:
> Adrian,
>
> Rant or not, I believe you have raised a valid point that reflects the
> requirements of a certain group of people.
>
> Based on your requirement, I believe you are looking for something like 
> Vyatta.
>
> Regards,
> Somesh
>
> -Original Message-
> From: Adrian Lewis [mailto:adr...@alsiconsulting.co.uk]
> Sent: Friday, February 20, 2015 8:50 PM
> To: users@cloudstack.apache.org
> Subject: RE: Network QoS (not bandwidth limiting)
>
> Tempted to suggest some sort of special interest group where networking
> people

Re: Live migration ip address with kvm

2015-02-12 Thread Marcus
It's not a good idea to use resource tags. Resource tags should not be
used by CloudStack itself to do business logic, just for arbitrary
metadata the admin may want.

The trick is that you want the source host to know that the
destination host has a secondary IP that can be used for live
migration.  To do it right you'd probably want to add a field to the
host table, a new parameter to the AddHostCmd api call, and pass that
migration IP along with MigrateCommand to the hypervisor. Once you do
that then the info is available to all of the hypervisor plugins to
use.
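
A rough sketch of that idea, in Python rather than the real Java code, just to show the data flow. Everything here is hypothetical: the `migration_ip` field stands in for the new host-table column, and the `qemu+tcp://.../system` form is only assumed as the libvirt URI the source host would connect to.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Host:
    name: str
    mgmt_ip: str
    # Hypothetical new column on the host table, filled in when the host is added.
    migration_ip: Optional[str] = None


def migration_address(dest: Host) -> str:
    """Prefer the dedicated migration IP, fall back to the management IP."""
    return dest.migration_ip or dest.mgmt_ip


def migration_uri(dest: Host) -> str:
    """URI the source host would hand to libvirt (assumed qemu+tcp transport)."""
    return f"qemu+tcp://{migration_address(dest)}/system"


old_style = Host(name="kvm01", mgmt_ip="192.168.10.11")
with_10g = Host(name="kvm02", mgmt_ip="192.168.10.12", migration_ip="10.20.30.12")
print(migration_uri(old_style))
print(migration_uri(with_10g))
```

Falling back to the management IP keeps existing hosts working without any change, which is the main reason to make the new field optional.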

Outside of that, there are hacks that would allow this but they're at
least as much work as what I described above with the only benefit
being that you can avoid modifying the CloudStack code (writing api,
agent plugins). The easiest way would probably be to reconfigure the
network to allow mgmt traffic through a trunk on the 10G interfaces.

On Thu, Feb 12, 2015 at 5:46 PM, Star Guo  wrote:
> Hi, all,
>
>
>
>   Can I create tags in physical host in CloudStack ? I see template and 
> instance support tags.
>
>   I want to add a tag like “livemigrationipaddress” on a KVM host, and change 
> the KVM hypervisor code to support assigning a live migration IP address (maybe 
> on the 10 Gbps storage network) instead of the default management IP.
>
>
>
> CloudMonkey
>
>  > help create tags
>
> (createTags) Creates resource tag(s)
>
> This API is asynchronous.
>
> Required params are resourceids tags resourcetype
>
> Parameters
>
> ==
>
> customer = (string) identifies client specific tag. When the value is not 
> null, the tag can't be used by cloudStack code internally
>
> resourceids = (list) list of resources to create the tags for
>
> tags = (map) Map of tags (key/value pairs)
>
> resourcetype = (string) type of the resource
>
>
>
>   Are there another good idea to do this ? Thanks.
>
>
>
> Best Regards,
>
> Star Guo
>


Re: [VOTE] Apache CloudStack 4.5.0 RC1

2015-01-13 Thread Marcus
+1, ran some of the smoke tests that cover basic deployments of vm,
vpc, and several storage types.

On Mon, Jan 12, 2015 at 11:36 PM, Rohit Yadav  wrote:
> (+ users)
>
> Hi everyone,
>
> David has started the voting process for 4.5.0 candidate, please help test 
> this candidate.
> In case you’re unable to build from source, you may use following repository 
> built from SHA 8db3cbd4ff62b17a8b496026b68cf60ee0c76740:
>
> DEB: http://packages.bhaisaab.org/cloudstack/testing/debian/4.5/
> RPM: http://packages.bhaisaab.org/cloudstack/testing/centos/4.5/
> SystemVM Templates: http://packages.shapeblue.com/systemvmtemplate/4.5/4.5.0
>
>> On 13-Jan-2015, at 4:46 am, David Nalley  wrote:
>>
>> Hi folks,
>>
>> I've created a 4.5.0 release candidate, with the following artifacts
>> up for a vote:
>>
>> Git Branch and Commit SH:
>> https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;a=tree;h=refs/heads/4.5-RC20150112T2256;hb=4.5-RC20150112T2256
>> Commit: 8db3cbd4ff62b17a8b496026b68cf60ee0c76740
>>
>> Source release (checksums and signatures are available at the same
>> location):
>> https://dist.apache.org/repos/dist/dev/cloudstack/4.5.0-rc1/
>>
>> PGP release keys (signed using 6FE50F1C):
>> https://dist.apache.org/repos/dist/release/cloudstack/KEYS
>>
>> Vote will be open for at least 72 hours.
>>
>> For sanity in tallying the vote, can PMC members please be sure to
>> indicate "(binding)" with their vote?
>>
>> [ ] +1  approve
>> [ ] +0  no opinion
>> [ ] -1  disapprove (and reason why)
>
> Regards,
> Rohit Yadav
> Software Architect, ShapeBlue
> M. +91 88 262 30892 | rohit.ya...@shapeblue.com
> Blog: bhaisaab.org | Twitter: @_bhaisaab
>
>
>


Re: Add Storage Network

2014-12-18 Thread Marcus
My guess is that you're right. I haven't seen anything that ties the
'physical network' to a specific NIC, thus if you register the traffic type
it should work so long as the hosts are able to route to that subnet on
*any* NIC. I'm not sure what hypervisor you're using, but for KVM at least,
you've also got to think about the bridge that SSVMs will want to attach
to. This is determined by bridge name (traffic label), and I believe if it
is set up in advance on the hosts it won't try to bridge to the primary
interface.
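
To illustrate the "any NIC" point: the host simply picks the most specific route for the storage address, whatever interface that route sits on. A minimal sketch of that longest-prefix lookup, with a made-up routing table and bridge names (`cloudbr0`, `cloudbr2` are assumptions, not anything CloudStack requires):

```python
import ipaddress


def pick_interface(routes, destination):
    """Return the interface of the most specific route covering destination."""
    dest = ipaddress.ip_address(destination)
    matching = []
    for cidr, nic in routes:
        network = ipaddress.ip_network(cidr)
        if dest in network:
            matching.append((network.prefixlen, nic))
    if not matching:
        return None
    # Longest prefix wins, as in the kernel routing table.
    return max(matching)[1]


# Hypothetical host: management on cloudbr0, storage VLAN on a second NIC.
routes = [
    ("0.0.0.0/0", "cloudbr0"),
    ("10.0.0.0/24", "cloudbr0"),
    ("10.20.30.0/24", "cloudbr2"),
]
print(pick_interface(routes, "10.20.30.15"))
print(pick_interface(routes, "10.0.0.8"))
```

This only covers the host's own traffic; the bridge the SSVM attaches to is still chosen by traffic label, as described above.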

On Thu, Dec 18, 2014 at 8:06 AM, Logan Barfield 
wrote:
>
> During initial deployment we went with a simpler network design by letting
> secondary storage traffic run over the management network.  We would like
> to offload secondary storage to a separate network, and are trying to
> figure out the best method of doing so.
>
> Based on the recommendations in the documentation we would need to add a
> new Physical Network, and add the "Storage" traffic type to that network.
>
> Could we instead just add a new traffic type to the existing network, just
> set up the specified VLAN + Storage subnet on a separate NIC on the
> hypervisor?
>
> For Primary Storage the NIC used is just determined by the hypervisor's
> routing table, so wouldn't it work the same for the secondary storage
> network, or is there a reason it should be added as a separate Physical
> Network in CloudStack?
>


Re: Port forwarding (web) - doesnt show real client IP

2014-12-08 Thread Marcus
It sounds like some iptables rules got broken at some point for the static
NAT, and since there's still a catch-all SNAT for outbound it gets caught
by that and still keeps working, but is broken in a subtle way that goes
unnoticed.
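
A toy model of how an over-broad SNAT rule could produce that symptom. This is not the VR's actual rule set, just a sketch: the interface names and the documentation-range addresses are invented, and the two functions only mimic what a DNAT and a SNAT rule do to a packet.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    out_iface: str


def port_forward(pkt, public_ip, vm_ip, guest_iface):
    """DNAT: rewrite only the destination, then route towards the guest side."""
    if pkt.dst == public_ip:
        return replace(pkt, dst=vm_ip, out_iface=guest_iface)
    return pkt


def source_nat(pkt, to_source, only_out_iface: Optional[str] = None):
    """SNAT: rewrite the source, optionally only for one outgoing interface."""
    if only_out_iface is None or pkt.out_iface == only_out_iface:
        return replace(pkt, src=to_source)
    return pkt


CLIENT, PUBLIC_IP, VM_IP = "198.51.100.7", "203.0.113.10", "10.1.1.20"
inbound = Packet(src=CLIENT, dst=PUBLIC_IP, out_iface="")
forwarded = port_forward(inbound, PUBLIC_IP, VM_IP, guest_iface="eth2")

# SNAT limited to the public interface leaves the forwarded packet alone.
scoped = source_nat(forwarded, PUBLIC_IP, only_out_iface="eth1")
# A catch-all SNAT also rewrites traffic heading to the guest network.
catch_all = source_nat(forwarded, PUBLIC_IP)
print(scoped.src)
print(catch_all.src)
```

In the scoped case the web server still sees the real client address; in the catch-all case every connection appears to come from the router's public IP, which is what the access logs show.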

On Mon, Dec 8, 2014 at 2:55 PM, Andrija Panic 
wrote:

> And just to spice things up a little bit, ALL remote connections appear to
> come from the main Public IP of the VPC VR.
> So we cannot block some stuff on the firewall on the VM (while doing port
> forwarding) because all connections appear to come from the main Public IP of
> the VPC VR.
>
> This is terrible design/bug - can we change this ?
> I'm on the ACS 4.3 currently...
>
> cheers
>
> On 8 December 2014 at 23:42, Andrija Panic 
> wrote:
>
> > Hi,
> >
> > when doing port forwarding on VPC VR - port 80 - when some client access
> > web site - only the main Public IP of the VPC is logged in apache access
> > logs as remote IP.
> >
> > Why is this the behaviour - and can it be changed ?
> > My understanding is that this is a kind of bug (unless needed for some other
> > reasons) - port forwarding is DNAT in essence, so only the destination
> > IP/port should be changed, not proxied all the way, as seems to be the
> > case here...
> >
> > I read on another mailing list - same behavior for the load balancer...
> >
> > Any suggestion ?
> >
> > Thanks,
> >
> > --
> >
> > Andrija Panić
> >
>
>
>
> --
>
> Andrija Panić
>


Re: Automatic KVM host reboot on Primary Storage failure

2014-11-14 Thread Marcus
It is there (I believe) because cloudstack is acting as a cluster manager
for KVM. It is using NFS to determine if it is 'alive' on the network, and
if it is not, it reboots itself to avoid having a split brain scenario
where VMs start coming up on other hosts when they are already running on
this host.  It generally works, if the problem is the host, but as you
point out, there's a situation where the problem can be the NFS server.
This is fairly rare for enterprise NFS with high availability, but there are a
fair number of people who have NFS on servers that are relatively low
availability (non-clustered, or get overloaded and unresponsive).

There's plenty of room for improvement in that script, I agree the original
implementation seems fairly rudimentary, but we have to be careful in
thinking about all scenarios and make sure there's no chance of split
brain. In the mean time, one could also partition the resources such that
you have more clusters and only one primary storage per cluster (or
something else, like storage/host tags to guarantee each host only uses one
NFS).
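
To put numbers on the partitioning idea, here is a small sketch. It assumes, as described in the question, that a host reboots itself as soon as the heartbeat write to any one of its primary storage pools fails; the host and pool names are invented.

```python
def hosts_rebooted(host_pools, failed_pool):
    """Hosts that would fence themselves when failed_pool stops responding.

    Assumption: a host reboots if ANY pool it heartbeats to is unwritable.
    """
    return sorted(host for host, pools in host_pools.items() if failed_pool in pools)


# One big cluster: every host mounts all three NFS pools.
shared = {f"kvm{i}": {"nfs1", "nfs2", "nfs3"} for i in range(1, 7)}

# Three smaller clusters: each host only ever touches one pool.
partitioned = {
    "kvm1": {"nfs1"}, "kvm2": {"nfs1"},
    "kvm3": {"nfs2"}, "kvm4": {"nfs2"},
    "kvm5": {"nfs3"}, "kvm6": {"nfs3"},
}

print(len(hosts_rebooted(shared, "nfs2")), "of", len(shared))
print(len(hosts_rebooted(partitioned, "nfs2")), "of", len(partitioned))
```

With the shared layout a single NFS failure takes out all six hosts; with one pool per cluster it is limited to the two hosts whose VMs were on that storage anyway.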

On Fri, Nov 14, 2014 at 8:07 AM, Andrija Panic 
wrote:

> Hi guys,
>
> I'm wondering why is there a check
> inside
> /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/kvmheartbeat.sh
> ?
>
> I understand that the KVM host checks availability of Primary Storage, and
> reboots itself if it can't write to storage.
>
> But, if we have say, 3 NFS in a cluster, then lot of KVM hosts - 1 primary
> storage going down (server crashing or whatever) - will bring probably 99%
> of KVM hosts also down for reboot ?
> So instead of losing uptime for 1/3 of my VMs (1 storage out of 3) - I
> lose uptime for 99%-100% of my VMs ?
>
> I manually edited this script to disable reboots - but why is it there in
> any case ?
> It doesn't make sense to me - unless I'm missing a point (probably)...
>
> Thanks,
> --
>
> Andrija Panić
>


Re: CloudStack Ports

2014-10-15 Thread Marcus
Ah, I see. I believe you'd need access to whatever IP the consoleproxy vm
is listening on. I don't actually use the console proxy vm for my purposes,
but I don't think you need to open the vnc console or libvirt ports to the
outside. If the console proxy works internally, you probably just don't
have access to the console proxy vm's IP when it opens the link to redirect
you. Are you NAT'ing to the mgmt server from outside? I think you'd need
the console proxy vm to be publicly reachable, and cloudstack seems to be
assigning it an RFC 1918 address (192.168), which you'll never be able to
reach from the outside. Your best bet might be to set up a remote access
VPN in your home if you want to use the system from outside, such that you
are treated like you are inside. Something like OpenVPN.
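
A quick way to check whether an address falls in the RFC 1918 private ranges, as a minimal sketch using Python's standard `ipaddress` module (the helper name `is_rfc1918` is made up):

```python
import ipaddress

# The three private IPv4 blocks defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """True if address is in one of the RFC 1918 private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


print(is_rfc1918("192.168.1.43"))  # the console proxy VM address from the log
print(is_rfc1918("8.8.8.8"))
```

Any address for which this returns True is only reachable from inside the LAN or over a VPN, no matter which ports the router forwards to the management server.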

On Wed, Oct 15, 2014 at 11:02 AM, Mo  wrote:

> Would this be on the Console VM, Or from the node? Need to know which
> local IP I need to redirect it to.
>
> I see in the log, it’s coming from 192.168.1.43 (which is console vm) so I
> suspect there?
>
>
> --
> Mo
> Sent with Airmail
>
> On October 15, 2014 at 1:00:12 PM, Marcus (shadow...@gmail.com) wrote:
>
> From outside, (say from hotel, through home router, to mgmt server) you
> need access to the web ui and for the web ui to have access to the api
> server. That would just be 8080 (UI) and 8096(API), I believe. you wouldn't
> need libvirt and the others unless you are stringing mgmt servers and hosts
> across the link.
>
> On Wed, Oct 15, 2014 at 10:43 AM, Mo  wrote:
>
> > Hello,
> >
> > I’ve setup Cloudstack on my home server. However, it works without issues
> > locally. When I attempt to pull up console outside, it times out. I have of
> > course enabled ports for SSH / UI, so I can setup instances, but I am not
> > sure what else I need to permit through my router to allow all the
> > necessary ports to be opened.
> >
> > According to the site, I have done the following:
> >
> > 22 (SSH)
> > 1798
> > 16509 (libvirt)
> > 5900 - 6100 (VNC consoles)
> > 49152 - 49216 (libvirt live migration)
> > Anything else?
> >
> > // Mo
>
>


Re: CloudStack Ports

2014-10-15 Thread Marcus
From outside, (say from hotel, through home router, to mgmt server) you
need access to the web ui and for the web ui to have access to the api
server. That would just be 8080 (UI) and 8096(API), I believe. you wouldn't
need libvirt and the others unless you are stringing mgmt servers and hosts
across the link.
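
To verify from the outside which of those ports actually answer, a plain TCP connect test is enough. A minimal sketch with the standard `socket` module; the helper names are made up and the hostname in the comment is a placeholder, not a real endpoint:

```python
import socket


def port_open(host, port, timeout=3.0):
    """True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to whether it accepted a connection."""
    return {port: port_open(host, port, timeout) for port in ports}


# Example call, to be run from outside the home network:
# check_ports("cloud.example.org", [8080, 8096])
```

A timeout rather than an immediate refusal usually points at the router or a firewall dropping the traffic, not at the service itself.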

On Wed, Oct 15, 2014 at 10:43 AM, Mo  wrote:

> Hello,
>
> I’ve setup Cloudstack on my home server. However, it works without issues
> locally. When I attempt to pull up console outside, it times out. I have of
> course enabled ports for SSH / UI, so I can setup instances, but I am not
> sure what else I need to permit through my router to allow all the
> necessary ports to be opened.
>
> According to the site, I have done the following:
>
> 22 (SSH)
> 1798
> 16509 (libvirt)
> 5900 - 6100 (VNC consoles)
> 49152 - 49216 (libvirt live migration)
> Anything else?
>
> // Mo


Re: unable to connect to /var/lib/libvirt/qemu/v-1-VM.agent

2014-04-20 Thread Marcus
Sorry, actually I see the 'connection refused' is just your own test
after the fact. By that time the vm may be shut down, so connection
refused would make sense.

What happens if you do this:

'virsh dumpxml v-1-VM > /tmp/v-1-VM.xml' while it is running
stop the cloudstack agent
'virsh destroy v-1-VM'
'virsh create /tmp/v-1-VM.xml'
Then try connecting to that VM via VNC to watch it boot up, or running
that command manually, repeatedly? Does it time out?

In the end this may not mean much, because in CentOS 6.x that command
is retried over and over while the system vm is coming up anyway (in
other words, some failures are expected). It could be related, but it
could also be that the system vm is failing to come up for any other
reason, and this is just the thing you noticed.

On Sun, Apr 20, 2014 at 11:25 PM, Marcus  wrote:
> You may want to look in the qemu log of the vm to see if there's
> something deeper going on, perhaps the qemu process is not fully
> starting due to some other issue. /var/log/libvirt/qemu/v-1-VM.log, or
> something like that.
>
> On Sun, Apr 20, 2014 at 11:22 PM, Marcus  wrote:
>> No, it has nothing to do with ssh or libvirt daemon. It's the literal
>> unix socket that is created for virtio-serial communication when the
>> qemu process starts. The question is why the system is refusing access
>> to the socket. I assume this is being attempted as root.
>>
>> On Sat, Apr 19, 2014 at 9:58 AM, Nux!  wrote:
>>> On 19.04.2014 15:24, Giri Prasad wrote:
>>>
>>>>
>>>> # grep listen_ /etc/libvirt/libvirtd.conf
>>>> listen_tls=0
>>>> listen_tcp=1
>>>> #listen_addr = "192.XX.XX.X"
>>>> listen_addr = "0.0.0.0"
>>>>
>>>> #
>>>> /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl
>>>> -n v-1-VM -p
>>>>
>>>> %template=domP%type=consoleproxy%host=192.XXX.XX.5%port=8250%name=v-1-VM%zone=1%pod=1%guid=Proxy.1%proxy_vm=1%disable_rp_filter=true%eth2ip=192.XXX.XX.173%eth2mask=255.255.255.0%gateway=192.XXX.XX.1%eth0ip=169.254.0.173%eth0mask=255.255.0.0%eth1ip=192.XXX.XX.166%eth1mask=255.255.255.0%mgmtcidr=192.XXX.XX.0/24%localgw=192.XXX.XX.1%internaldns1=192.XXX.XX.1%dns1=192.XXX.XX.1
>>>> .
>>>> ERROR: unable to connect to /var/lib/libvirt/qemu/v-1-VM.agent -
>>>> Connection refused
>>>
>>>
>>> Do you have "-l" or "--listen" as LIBVIRTD_ARGS in /etc/sysconfig/libvirtd?
>>>
>>> (kind of stabbing in the dark)
>>>
>>>
>>> --
>>> Sent from the Delta quadrant using Borg technology!
>>>
>>> Nux!
>>> www.nux.ro


Re: unable to connect to /var/lib/libvirt/qemu/v-1-VM.agent

2014-04-20 Thread Marcus
You may want to look in the qemu log of the vm to see if there's
something deeper going on, perhaps the qemu process is not fully
starting due to some other issue. /var/log/libvirt/qemu/v-1-VM.log, or
something like that.

On Sun, Apr 20, 2014 at 11:22 PM, Marcus  wrote:
> No, it has nothing to do with ssh or libvirt daemon. It's the literal
> unix socket that is created for virtio-serial communication when the
> qemu process starts. The question is why the system is refusing access
> to the socket. I assume this is being attempted as root.
>
> On Sat, Apr 19, 2014 at 9:58 AM, Nux!  wrote:
>> On 19.04.2014 15:24, Giri Prasad wrote:
>>
>>>
>>> # grep listen_ /etc/libvirt/libvirtd.conf
>>> listen_tls=0
>>> listen_tcp=1
>>> #listen_addr = "192.XX.XX.X"
>>> listen_addr = "0.0.0.0"
>>>
>>> #
>>> /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl
>>> -n v-1-VM -p
>>>
>>> %template=domP%type=consoleproxy%host=192.XXX.XX.5%port=8250%name=v-1-VM%zone=1%pod=1%guid=Proxy.1%proxy_vm=1%disable_rp_filter=true%eth2ip=192.XXX.XX.173%eth2mask=255.255.255.0%gateway=192.XXX.XX.1%eth0ip=169.254.0.173%eth0mask=255.255.0.0%eth1ip=192.XXX.XX.166%eth1mask=255.255.255.0%mgmtcidr=192.XXX.XX.0/24%localgw=192.XXX.XX.1%internaldns1=192.XXX.XX.1%dns1=192.XXX.XX.1
>>> .
>>> ERROR: unable to connect to /var/lib/libvirt/qemu/v-1-VM.agent -
>>> Connection refused
>>
>>
>> Do you have "-l" or "--listen" as LIBVIRTD_ARGS in /etc/sysconfig/libvirtd?
>>
>> (kind of stabbing in the dark)
>>
>>
>> --
>> Sent from the Delta quadrant using Borg technology!
>>
>> Nux!
>> www.nux.ro


Re: unable to connect to /var/lib/libvirt/qemu/v-1-VM.agent

2014-04-20 Thread Marcus
No, it has nothing to do with ssh or libvirt daemon. It's the literal
unix socket that is created for virtio-serial communication when the
qemu process starts. The question is why the system is refusing access
to the socket. I assume this is being attempted as root.
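
To tell apart "socket file missing", "nobody listening" and "permission problem", the socket can be probed directly. A minimal sketch, assuming a stream-type unix socket; the function name is made up, and the path would be whatever the error message reports:

```python
import os
import socket


def probe_unix_socket(path):
    """Classify the state of a unix stream socket at path."""
    if not os.path.exists(path):
        return "missing"
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
        return "listening"
    except ConnectionRefusedError:
        # The socket file exists but no process is accepting on it.
        return "refused"
    except PermissionError:
        return "permission denied"
    finally:
        client.close()


# Example, run as root on the KVM host while the system VM is up:
# probe_unix_socket("/var/lib/libvirt/qemu/v-1-VM.agent")
```

A "refused" result with the file still present usually means the process that created the socket has gone away and left the file behind.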

On Sat, Apr 19, 2014 at 9:58 AM, Nux!  wrote:
> On 19.04.2014 15:24, Giri Prasad wrote:
>
>>
>> # grep listen_ /etc/libvirt/libvirtd.conf
>> listen_tls=0
>> listen_tcp=1
>> #listen_addr = "192.XX.XX.X"
>> listen_addr = "0.0.0.0"
>>
>> #
>> /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl
>> -n v-1-VM -p
>>
>> %template=domP%type=consoleproxy%host=192.XXX.XX.5%port=8250%name=v-1-VM%zone=1%pod=1%guid=Proxy.1%proxy_vm=1%disable_rp_filter=true%eth2ip=192.XXX.XX.173%eth2mask=255.255.255.0%gateway=192.XXX.XX.1%eth0ip=169.254.0.173%eth0mask=255.255.0.0%eth1ip=192.XXX.XX.166%eth1mask=255.255.255.0%mgmtcidr=192.XXX.XX.0/24%localgw=192.XXX.XX.1%internaldns1=192.XXX.XX.1%dns1=192.XXX.XX.1
>> .
>> ERROR: unable to connect to /var/lib/libvirt/qemu/v-1-VM.agent -
>> Connection refused
>
>
> Do you have "-l" or "--listen" as LIBVIRTD_ARGS in /etc/sysconfig/libvirtd?
>
> (kind of stabbing in the dark)
>
>
> --
> Sent from the Delta quadrant using Borg technology!
>
> Nux!
> www.nux.ro


Re: Upgrade from 4.1.1 to 4.3.0 ( KVM, Traffic labels, Adv. VLAN ) VR's bug

2014-04-20 Thread Marcus
No idea, but have you verified that the vm is running the new system
vm template? What happens if you destroy the router and let it
recreate?
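
One thing that stands out in the `ip addr show` output quoted in this thread is that the same public address shows up on several interfaces. A small sketch for listing such duplicates from pasted output; the function name is made up, and the sample uses documentation-range addresses since the real ones are masked:

```python
import re
from collections import defaultdict


def duplicate_addresses(ip_addr_output):
    """Map each IPv4 address/prefix to the interfaces it is bound on, if more than one."""
    owners = defaultdict(list)
    iface = None
    for raw in ip_addr_output.splitlines():
        line = raw.lstrip("> ").strip()  # tolerate mail quoting
        header = re.match(r"\d+:\s+([^:\s]+):", line)
        if header:
            iface = header.group(1)
            continue
        addr = re.match(r"inet\s+(\S+)", line)  # "inet6" lines do not match
        if addr and iface and iface not in owners[addr.group(1)]:
            owners[addr.group(1)].append(iface)
    return {address: nics for address, nics in owners.items() if len(nics) > 1}


sample = """\
2: eth2: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 203.0.113.219/26 brd 203.0.113.255 scope global eth2
    inet6 fe80::406:ecff:fe00:e/64 scope link
3: eth3: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 203.0.113.219/26 brd 203.0.113.255 scope global eth3
    inet 203.0.113.230/26 brd 203.0.113.255 scope global secondary eth3
"""
print(duplicate_addresses(sample))
```

Whether those duplicates are the cause of the broken egress and static NAT rules is a separate question, but it is an easy thing to compare before and after recreating the router.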

On Sun, Apr 20, 2014 at 6:20 PM, Serg Senko  wrote:
> Hi
>
> After upgrading and restarting the system VMs,
> all VRs started with a bad network configuration; egress rules stopped
> working,
> as did some static NAT rules.
>
>
> Here is "ip addr show" from one of the VRs:
>
> root@r-256-VM:~# ip addr show
>
> 1: lo:  mtu 16436 qdisc noqueue state UNKNOWN
>
> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>
> inet 127.0.0.1/8 scope host lo
>
> inet6 ::1/128 scope host
>
>valid_lft forever preferred_lft forever
>
> 2: eth0:  mtu 1500 qdisc pfifo_fast state
> UP qlen 1000
>
> link/ether 02:00:6b:16:00:09 brd ff:ff:ff:ff:ff:ff
>
> inet 10.1.1.1/24 brd 10.1.1.255 scope global eth0
>
> inet6 fe80::6bff:fe16:9/64 scope link
>
>valid_lft forever preferred_lft forever
>
> 3: eth1:  mtu 1500 qdisc pfifo_fast state
> UP qlen 1000
>
> link/ether 0e:00:a9:fe:01:38 brd ff:ff:ff:ff:ff:ff
>
> inet 169.254.1.56/16 brd 169.254.255.255 scope global eth1
>
> inet6 fe80::c00:a9ff:fefe:138/64 scope link
>
>valid_lft forever preferred_lft forever
>
> 4: eth2:  mtu 1500 qdisc pfifo_fast state
> UP qlen 1000
>
> link/ether 06:06:ec:00:00:0e brd ff:ff:ff:ff:ff:ff
>
> inet XXX.XXX.XXX.219/26 brd 46.165.231.255 scope global eth2
>
> inet6 fe80::406:ecff:fe00:e/64 scope link
>
>valid_lft forever preferred_lft forever
>
> 5: eth3:  mtu 1500 qdisc pfifo_fast state
> UP qlen 1000
>
> link/ether 06:81:44:00:00:0e brd ff:ff:ff:ff:ff:ff
>
> inet XXX.XXX.XXX.219/26 brd 46.165.231.255 scope global eth3
>
> inet XXX.XXX.XXX.230/26 brd 46.165.231.255 scope global secondary eth3
>
> inet XXX.XXX.XXX.228/26 brd 46.165.231.255 scope global secondary eth3
>
> inet XXX.XXX.XXX.209/26 brd 46.165.231.255 scope global secondary eth3
>
> inet XXX.XXX.XXX.247/26 brd 46.165.231.255 scope global secondary eth3
>
> inet XXX.XXX.XXX.227/26 brd 46.165.231.255 scope global secondary eth3
>
> inet6 fe80::481:44ff:fe00:e/64 scope link
>
>valid_lft forever preferred_lft forever
>
> 6: eth4:  mtu 1500 qdisc pfifo_fast state
> UP qlen 1000
>
> link/ether 06:e5:36:00:00:0e brd ff:ff:ff:ff:ff:ff
>
> inet XXX.XXX.XXX.219/26 brd 46.165.231.255 scope global eth4
>
> inet XXX.XXX.XXX.247/26 brd 46.165.231.255 scope global secondary eth4
>
> inet XXX.XXX.XXX.209/26 brd 46.165.231.255 scope global secondary eth4
>
> inet XXX.XXX.XXX.227/26 brd 46.165.231.255 scope global secondary eth4
>
> inet XXX.XXX.XXX.230/26 brd 46.165.231.255 scope global secondary eth4
>
> inet6 fe80::4e5:36ff:fe00:e/64 scope link
>
>valid_lft forever preferred_lft forever
>
> 7: eth5:  mtu 1500 qdisc pfifo_fast state
> UP qlen 1000
>
> link/ether 06:6f:3a:00:00:0e brd ff:ff:ff:ff:ff:ff
>
> inet XXX.XXX.XXX.219/26 brd 46.165.231.255 scope global eth5
>
> inet XXX.XXX.XXX.228/26 brd 46.165.231.255 scope global secondary eth5
>
> inet XXX.XXX.XXX.227/26 brd 46.165.231.255 scope global secondary eth5
>
> inet XXX.XXX.XXX.209/26 brd 46.165.231.255 scope global secondary eth5
>
> inet XXX.XXX.XXX.247/26 brd 46.165.231.255 scope global secondary eth5
>
> inet XXX.XXX.XXX.230/26 brd 46.165.231.255 scope global secondary eth5
>
> inet6 fe80::46f:3aff:fe00:e/64 scope link
>
>valid_lft forever preferred_lft forever
>
> 8: eth6:  mtu 1500 qdisc pfifo_fast state
> UP qlen 1000
>
> link/ether 06:b0:30:00:00:0e brd ff:ff:ff:ff:ff:ff
>
> inet XXX.XXX.XXX.219/26 brd 46.165.231.255 scope global eth6
>
> inet XXX.XXX.XXX.209/26 brd 46.165.231.255 scope global secondary eth6
>
> inet XXX.XXX.XXX.247/26 brd 46.165.231.255 scope global secondary eth6
>
> inet XXX.XXX.XXX.230/26 brd 46.165.231.255 scope global secondary eth6
>
> inet XXX.XXX.XXX.227/26 brd 46.165.231.255 scope global secondary eth6
>
> inet6 fe80::4b0:30ff:fe00:e/64 scope link
>
>valid_lft forever preferred_lft forever
>
> 9: eth7:  mtu 1500 qdisc pfifo_fast state
> UP qlen 1000
>
> link/ether 06:26:b4:00:00:0e brd ff:ff:ff:ff:ff:ff
>
> inet XXX.XXX.XXX.219/26 brd 46.165.231.255 scope global eth7
>
> inet XXX.XXX.XXX.247/26 brd 46.165.231.255 scope global secondary eth7
>
> inet XXX.XXX.XXX.228/26 brd 46.165.231.255 scope global secondary eth7
>
> inet XXX.XXX.XXX.230/26 brd 46.165.231.255 scope global secondary eth7
>
> inet XXX.XXX.XXX.209/26 brd 46.165.231.255 scope global secondary eth7
>
> inet XXX.XXX.XXX.227/26 brd 46.165.231.255 scope global secondary eth7
>
> inet6 fe80::426:b4ff:fe00:e/64 scope link
>
>valid_lft forever preferred_lft forever
>
>
> --
> ttyv0 "/usr/libexec/gmail Pc"  webcons on secure


Re: Live migration failed to newly provisioned KVM host

2014-04-18 Thread Marcus
Yes, it looks as though the two machines are running different
versions of qemu/libvirt, as the destination doesn't support the
machine type that the VM has defined in its XML on the source host.
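
A minimal sketch of that check: pull the machine names out of the "Supported machines are:" list and see whether the type from the VM's XML is among them. The function name and the `pc-1.2` value are only examples; the real machine type has to be read from the domain XML on the source host.

```python
def supported_machines(qemu_output):
    """Extract machine type names from qemu's 'Supported machines are:' listing."""
    machines = set()
    for line in qemu_output.splitlines():
        line = line.strip()
        if not line or line.endswith(":"):
            continue  # skip blanks and the header line
        machines.add(line.split()[0])
    return machines


destination_output = """\
Supported machines are:
pc         Standard PC (alias of pc-1.0)
pc-1.0     Standard PC (default)
pc-0.14    Standard PC
pc-0.13    Standard PC
isapc      ISA-only PC
"""

wanted = "pc-1.2"  # hypothetical machine type taken from the source VM's XML
available = supported_machines(destination_output)
print(wanted in available)
```

If the wanted type is missing, the destination is running an older qemu than the source, and the VM can only migrate to hosts whose list includes its machine type.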

On Fri, Apr 18, 2014 at 3:43 PM, Nux!  wrote:
> On 18.04.2014 19:45, Indra Pramana wrote:
>>
>> Unable to migrate due to internal error Process exited while reading
>> console log output: Supported machines are:
>> pc         Standard PC (alias of pc-1.0)
>> pc-1.0     Standard PC (default)
>> pc-0.14    Standard PC
>> pc-0.13    Standard PC
>> pc-0.12    Standard PC
>> pc-0.11    Standard PC, qemu 0.11
>> pc-0.10    Standard PC, qemu 0.10
>> isapc      ISA-only PC
>
>
> What OS versions are you running and also what KVM versions, do you have
> anything extra enabled in the agent (eg a specific CPU type vs the generic
> KVM cpu)?
> Additionally do check
> https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Administration_Guide/sect-Live_migration_and_RHEL_compatibility.html#Live_Migration_Compatibility
>
> --
> Sent from the Delta quadrant using Borg technology!
>
> Nux!
> www.nux.ro


Re: KVM, Re-create VR failed

2014-04-14 Thread Marcus
Your agent snippet just looks like the system trying to stop the vm.
If a vm fails to start, it will also run through the stop routine to
clean up all of the prework, so the 'failed to stop' debug is all
normal. You may need to look further up in the log to see why it failed to start.

On Fri, Apr 11, 2014 at 3:53 PM, Serg Senko  wrote:
> Hi,
>
> Could this be a known bug?
> Possibly it's already solved in newer releases of CS, but I need a
> workaround or fix before upgrading, or a reference to the bug ID.
>
> Environment:
> CS 4.1.1
> libvirt-1.0.1
> qemu-kvm-1.2
> NFS Storage ( as primary for VR's )
> Advanced VLAN isolation
>
> After a hypervisor host crashed, one of the VRs failed to start in the failover
> case.
> I stopped it through the UI with force, then removed the VR so that it would
> be re-created by the start/create VM API call.
>
>
> I tried to start the instance associated with this network, but it failed because
> the newly created VR can't be started.
>
> cloudstack-agent:
>
> 2014-04-11 07:05:34,546 DEBUG [kvm.resource.LibvirtComputingResource]
> (agentRequest-Handler-2:null) Failed to get dom xml:
> org.libvirt.LibvirtException: Domain not found: no domain with matching
> uuid '373ab4a9-cb8c-3275-a455-b9b4b963a983'
>
> 2014-04-11 07:05:34,547 DEBUG [kvm.resource.LibvirtComputingResource]
> (agentRequest-Handler-2:null) Failed to get dom xml:
> org.libvirt.LibvirtException: Domain not found: no domain with matching
> uuid '373ab4a9-cb8c-3275-a455-b9b4b963a983'
>
> 2014-04-11 07:05:34,548 DEBUG [kvm.resource.LibvirtComputingResource]
> (agentRequest-Handler-2:null) Failed to get dom xml:
> org.libvirt.LibvirtException: Domain not found: no domain with matching
> uuid '373ab4a9-cb8c-3275-a455-b9b4b963a983'
>
> 2014-04-11 07:05:34,548 DEBUG [kvm.resource.LibvirtComputingResource]
> (agentRequest-Handler-2:null) Executing:
> /usr/share/cloudstack-common/scripts/vm/network/security_group.py
> destroy_network_rules_for_vm --vmname r-377-VM
>
> 2014-04-11 07:05:34,663 DEBUG [kvm.resource.LibvirtComputingResource]
> (agentRequest-Handler-2:null) Execution is successful.
>
> 2014-04-11 07:05:34,664 DEBUG [kvm.resource.LibvirtComputingResource]
> (agentRequest-Handler-2:null) Try to stop the vm at first
>
> 2014-04-11 07:05:34,665 DEBUG [kvm.resource.LibvirtComputingResource]
> (agentRequest-Handler-2:null) Failed to stop VM :r-377-VM :
>
> org.libvirt.LibvirtException: Domain not found: no domain with matching
> uuid '373ab4a9-cb8c-3275-a455-b9b4b963a983'
>
> at org.libvirt.ErrorHandler.processError(Unknown Source)
>
> at org.libvirt.Connect.processError(Unknown Source)
>
> at org.libvirt.Connect.domainLookupByUUIDString(Unknown Source)
>
> at org.libvirt.Connect.domainLookupByUUID(Unknown Source)
>
> at
> com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.stopVM(LibvirtComputingResource.java:4021)
>
> at
> com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.stopVM(LibvirtComputingResource.java:3970)
>
> at
> com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.execute(LibvirtComputingResource.java:2894)
>
> at
> com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1032)
>
> at com.cloud.agent.Agent.processRequest(Agent.java:525)
>
> at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:852)
>
> at com.cloud.utils.nio.Task.run(Task.java:83)
>
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>
> at java.lang.Thread.run(Thread.java:679)
>
> 2014-04-11 07:05:34,666 DEBUG [kvm.resource.LibvirtComputingResource]
> (agentRequest-Handler-2:null) Failed to get vm status:Domain not found: no
> domain with matching uuid '373ab4a9-cb8c-3275-a455-b9b4b963a983'
>
> 2014-04-11 07:05:34,667 DEBUG [kvm.resource.LibvirtComputingResource]
> (agentRequest-Handler-2:null) Failed to get vm status:Domain not found: no
> domain with matching uuid '373ab4a9-cb8c-3275-a455-b9b4b963a983'
>
> 2014-04-11 07:05:34,668 DEBUG [kvm.resource.LibvirtComputingResource]
> (agentRequest-Handler-2:null) Failed to get vm status:Domain not found: no
> domain with matching uuid '373ab4a9-cb8c-3275-a455-b9b4b963a983'
>
>
>
>
> Management CS:
>
> 2014-04-11 07:05:40,503 DEBUG
> [network.router.VirtualNetworkApplianceManagerImpl]
> (Job-Executor-114:job-3001) Found 5 ip(s) to apply as a part of domR
> VM[DomainRouter|r-377-VM] start.
>
> 2014-04-11 07:05:40,528 DEBUG
> [network.router.VirtualNetworkApplianceManagerImpl]
> (Job-Executor-114:job-3001) Resending ipAssoc, port forwarding, load
> balancing rules as a part of Virtual router start
>
> 2014-04-11 07:05:40,542 DEBUG
> [network.router.VirtualNetworkApplianceManagerImpl]
> (Job-Executor-114:job-3001) Found 1 firewall Egress rule(s) to apply as a
> part of domR VM[DomainRouter|r-377-VM] start.
>
> 2014-04-11 07:05:40,581 ERROR [cloud.vm.VirtualMachineManagerImpl]
> (Job-Executor-114:job-3001) Failed to start insta

CloudStack implementations

2014-03-18 Thread Marcus
Do we have any general stats on how cloudstack is being used? Common
deployment sizes, largest deployments, etc? I'm curious as to how far
people have actually scaled it in real deployments, although I realize that
the info can be proprietary.


Re: ALARM - ACS reboots host servers!!!

2014-03-04 Thread Marcus
On Tue, Mar 4, 2014 at 3:34 AM, France  wrote:
> Hi Marcus and others.
>
> There is no need to kill of the entire hypervisor, if one of the primary
> storages fail.
> You just need to kill the VMs and probably disable SR on XenServer, because
> all other SRs and VMs have no problems.
> if you kill those, then you can safely start them elsewhere. On XenServer
> 6.2 you call destroy the VMs which lost access to NFS without any problems.

That's a great idea, but as already mentioned, it doesn't work in
practice. You can't kill a VM that is hanging in D state, waiting on
storage. I also mentioned that it causes problems for libvirt and much
of the rest of the system, even the parts not using that storage.
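
For readers who have not run into it: "D state" is the uninterruptible
sleep state that Linux reports as the third field of /proc/<pid>/stat, and
a process in that state typically does not act on signals (SIGKILL
included) until the blocking I/O returns. A minimal sketch of checking for
it follows. The class name, method names, and the sample line are
hypothetical; only the field layout is taken from the proc(5) man page.

```java
// Sketch only: ProcState and its methods are hypothetical helper names.
final class ProcState {
    /** Extracts the one-letter process state from a /proc/<pid>/stat line. */
    static char stateOf(String statLine) {
        // The command name is wrapped in parentheses and may itself contain
        // spaces or ')' characters, so search from the last ')'.
        int close = statLine.lastIndexOf(')');
        if (close < 0 || close + 2 >= statLine.length()) {
            throw new IllegalArgumentException("not a /proc/<pid>/stat line");
        }
        // Layout is "pid (comm) state ...": skip the ')' and the space.
        return statLine.charAt(close + 2);
    }

    /** True when the process is in uninterruptible sleep ("D"). */
    static boolean isUninterruptible(String statLine) {
        return stateOf(statLine) == 'D';
    }

    public static void main(String[] args) {
        // Made-up sample line, shortened to the first few fields.
        String sample = "4242 (qemu-kvm) D 1 4242 4242 0 -1 4194560";
        System.out.println(isUninterruptible(sample)); // prints "true"
    }
}
```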

>
> If you really want to still kill the entire host and it's VMs in one go, I
> would suggest live migrating the VMs which have had not lost their storage
> off first, and then kill those VMs on a stale NFS by doing hard reboot.
> Additional time, while migrating working VMs, would even give some grace
> time for NFS to maybe recover. :-)

You won't be able to live migrate a VM that is stuck in D state, or
use libvirt to do so if one of its storage pools is unresponsive,
anyway.

>
> Hard reboot to recover from D state of NFS client can also be avoided by
> using soft mount options.

As mentioned, soft and intr very rarely actually work, in my
experience. I wish they did as I truly have come to loathe NFS for it.

>
> I run a bunch of Pacemaker/Corosync/Cman/Heartbeat/etc clusters and we don't
> just kill whole nodes but fence services from specific nodes. STONITH is
> implemented only when the node looses the quorum.

Sure, but how do you fence a KVM host from an NFS server? I don't
think we've written a firewall plugin that works to fence hosts from
any NFS server. Regardless, what CloudStack does is more of a poor
man's clustering: the mgmt server acts as the lock in the sense that it
is managing what's going on, but it's not a real clustering service.
Heck, it doesn't even STONITH, it tries to clean shutdown, which fails
as well due to hanging NFS (per the mentioned bug, to fix it they'll
need IPMI fencing or something like that).

I didn't write the code, I'm just saying that I can completely
understand why it kills nodes when it deems that their storage has
gone belly-up. It's dangerous to leave that D state VM hanging around,
and it will hang around until the NFS storage comes back. In a perfect world you'd
just stop the VMs that were having the issue, or if there were no VMs
you'd just de-register the storage from libvirt, I agree.

>
> Regards,
> F.
>
>
> On 3/3/14 5:35 PM, Marcus wrote:
>>
>> It's the standard clustering problem. Any software that does any sort
>> of active clustering is going to fence nodes that have problems, or
>> should if it cares about your data. If the risk of losing a host due
>> to a storage pool outage is too great, you could perhaps look at
>> rearranging your pool-to-host correlations (certain hosts run vms from
>> certain pools) via clusters. Note that if you register a storage pool
>> with a cluster, it will register the pool with libvirt when the pool
>> is not in maintenance, which, when the storage pool goes down will
>> cause problems for the host even if no VMs from that storage are
>> running (fetching storage stats for example will cause agent threads
>> to hang if it's NFS), so you'd need to put ceph in its own cluster and
>> NFS in its own cluster.
>>
>> It's far more dangerous to leave a host in an unknown/bad state. If a
>> host loses contact with one of your storage nodes, with HA, cloudstack
>> will want to start the affected VMs elsewhere. If it does so, and your
>> original host wakes up from its NFS hang, you suddenly have a VM
>> running in two locations, corruption ensues. You might think we could
>> just stop the affected VMs, but NFS tends to make things that touch it
>> go into D state, even with 'intr' and other parameters, which affects
>> libvirt and the agent.
>>
>> We could perhaps open a feature request to disable all HA and just
>> leave things as-is, disallowing operations when there are outages. If
>> that sounds useful you can create the feature request on
>> https://issues.apache.org/jira.
>>
>>
>> On Mon, Mar 3, 2014 at 5:37 AM, Andrei Mikhailovsky 
>> wrote:
>>>
>>> Koushik, I understand that and I will put the storage into the
>>> maintenance mode next time. However, things happen and servers crash from
>>> time to time, which is not the reason to reboot all host servers, even those
>>> which do not have any running vms with volumes on the nf

Re: ALARM - ACS reboots host servers!!!

2014-03-03 Thread Marcus
It's the standard clustering problem. Any software that does any sort
of active clustering is going to fence nodes that have problems, or
should if it cares about your data. If the risk of losing a host due
to a storage pool outage is too great, you could perhaps look at
rearranging your pool-to-host correlations (certain hosts run vms from
certain pools) via clusters. Note that if you register a storage pool
with a cluster, it will register the pool with libvirt when the pool
is not in maintenance, which, when the storage pool goes down will
cause problems for the host even if no VMs from that storage are
running (fetching storage stats for example will cause agent threads
to hang if it's NFS), so you'd need to put ceph in its own cluster and
NFS in its own cluster.

It's far more dangerous to leave a host in an unknown/bad state. If a
host loses contact with one of your storage nodes, with HA, cloudstack
will want to start the affected VMs elsewhere. If it does so, and your
original host wakes up from its NFS hang, you suddenly have a VM
running in two locations, and corruption ensues. You might think we could
just stop the affected VMs, but NFS tends to make things that touch it
go into D state, even with 'intr' and other parameters, which affects
libvirt and the agent.

We could perhaps open a feature request to disable all HA and just
leave things as-is, disallowing operations when there are outages. If
that sounds useful you can create the feature request on
https://issues.apache.org/jira.


On Mon, Mar 3, 2014 at 5:37 AM, Andrei Mikhailovsky  wrote:
>
> Koushik, I understand that and I will put the storage into the maintenance 
> mode next time. However, things happen and servers crash from time to time, 
> which is not the reason to reboot all host servers, even those which do not 
> have any running vms with volumes on the nfs storage. The bloody agent just 
> rebooted every single host server regardless if they were running vms with 
> volumes on the rebooted nfs server. 95% of my vms are running from ceph and 
> those should have never been effected in the first place.
> - Original Message -
>
> From: "Koushik Das" 
> To: "" 
> Cc: d...@cloudstack.apache.org
> Sent: Monday, 3 March, 2014 5:55:34 AM
> Subject: Re: ALARM - ACS reboots host servers!!!
>
> The primary storage needs to be put in maintenance before doing any 
> upgrade/reboot as mentioned in the previous mails.
>
> -Koushik
>
> On 03-Mar-2014, at 6:07 AM, Marcus  wrote:
>
>> Also, please note that in the bug you referenced it doesn't have a
>> problem with the reboot being triggered, but with the fact that reboot
>> never completes due to hanging NFS mount (which is why the reboot
>> occurs, inaccessible primary storage).
>>
>> On Sun, Mar 2, 2014 at 5:26 PM, Marcus  wrote:
>>> Or do you mean you have multiple primary storages and this one was not
>>> in use and put into maintenance?
>>>
>>> On Sun, Mar 2, 2014 at 5:25 PM, Marcus  wrote:
>>>> I'm not sure I understand. How do you expect to reboot your primary
>>>> storage while vms are running? It sounds like the host is being
>>>> fenced since it cannot contact the resources it depends on.
>>>>
>>>> On Sun, Mar 2, 2014 at 3:24 PM, Nux!  wrote:
>>>>> On 02.03.2014 21:17, Andrei Mikhailovsky wrote:
>>>>>>
>>>>>> Hello guys,
>>>>>>
>>>>>>
>>>>>> I've recently came across the bug CLOUDSTACK-5429 which has rebooted
>>>>>> all of my host servers without properly shutting down the guest vms.
>>>>>> I've simply upgraded and rebooted one of the nfs primary storage
>>>>>> servers and a few minutes later, to my horror, i've found out that all
>>>>>> of my host servers have been rebooted. Is it just me thinking so, or
>>>>>> is this bug should be fixed ASAP and should be a blocker for any new
>>>>>> ACS release. I mean not only does it cause downtime, but also possible
>>>>>> data loss and server corruption.
>>>>>
>>>>>
>>>>> Hi Andrei,
>>>>>
>>>>> Do you have HA enabled and did you put that primary storage in maintenance
>>>>> mode before rebooting it?
>>>>> It's my understanding that ACS relies on the shared storage to perform HA 
>>>>> so
>>>>> if the storage goes it's expected to go berserk. I've noticed similar
>>>>> behaviour in Xenserver pools without ACS.
>>>>> I'd imagine a "cure" for this would be to use network distributed
>>>>> "filesystems" like GlusterFS or CEPH.
>>>>>
>>>>> Lucian
>>>>>
>>>>> --
>>>>> Sent from the Delta quadrant using Borg technology!
>>>>>
>>>>> Nux!
>>>>> www.nux.ro
>
>


Re: ALARM - ACS reboots host servers!!!

2014-03-02 Thread Marcus
Also, please note that in the bug you referenced it doesn't have a
problem with the reboot being triggered, but with the fact that reboot
never completes due to hanging NFS mount (which is why the reboot
occurs, inaccessible primary storage).

On Sun, Mar 2, 2014 at 5:26 PM, Marcus  wrote:
> Or do you mean you have multiple primary storages and this one was not
> in use and put into maintenance?
>
> On Sun, Mar 2, 2014 at 5:25 PM, Marcus  wrote:
>> I'm not sure I understand. How do you expect to reboot your primary
>> storage while vms are running?  It sounds like the host is being
>> fenced since it cannot contact the resources it depends on.
>>
>> On Sun, Mar 2, 2014 at 3:24 PM, Nux!  wrote:
>>> On 02.03.2014 21:17, Andrei Mikhailovsky wrote:
>>>>
>>>> Hello guys,
>>>>
>>>>
>>>> I've recently came across the bug CLOUDSTACK-5429 which has rebooted
>>>> all of my host servers without properly shutting down the guest vms.
>>>> I've simply upgraded and rebooted one of the nfs primary storage
>>>> servers and a few minutes later, to my horror, i've found out that all
>>>> of my host servers have been rebooted. Is it just me thinking so, or
>>>> is this bug should be fixed ASAP and should be a blocker for any new
>>>> ACS release. I mean not only does it cause downtime, but also possible
>>>> data loss and server corruption.
>>>
>>>
>>> Hi Andrei,
>>>
>>> Do you have HA enabled and did you put that primary storage in maintenance
>>> mode before rebooting it?
>>> It's my understanding that ACS relies on the shared storage to perform HA so
>>> if the storage goes it's expected to go berserk. I've noticed similar
>>> behaviour in Xenserver pools without ACS.
>>> I'd imagine a "cure" for this would be to use network distributed
>>> "filesystems" like GlusterFS or CEPH.
>>>
>>> Lucian
>>>
>>> --
>>> Sent from the Delta quadrant using Borg technology!
>>>
>>> Nux!
>>> www.nux.ro


Re: ALARM - ACS reboots host servers!!!

2014-03-02 Thread Marcus
I'm not sure I understand. How do you expect to reboot your primary
storage while vms are running?  It sounds like the host is being
fenced since it cannot contact the resources it depends on.

On Sun, Mar 2, 2014 at 3:24 PM, Nux!  wrote:
> On 02.03.2014 21:17, Andrei Mikhailovsky wrote:
>>
>> Hello guys,
>>
>>
>> I've recently came across the bug CLOUDSTACK-5429 which has rebooted
>> all of my host servers without properly shutting down the guest vms.
>> I've simply upgraded and rebooted one of the nfs primary storage
>> servers and a few minutes later, to my horror, i've found out that all
>> of my host servers have been rebooted. Is it just me thinking so, or
>> is this bug should be fixed ASAP and should be a blocker for any new
>> ACS release. I mean not only does it cause downtime, but also possible
>> data loss and server corruption.
>
>
> Hi Andrei,
>
> Do you have HA enabled and did you put that primary storage in maintenance
> mode before rebooting it?
> It's my understanding that ACS relies on the shared storage to perform HA so
> if the storage goes it's expected to go berserk. I've noticed similar
> behaviour in Xenserver pools without ACS.
> I'd imagine a "cure" for this would be to use network distributed
> "filesystems" like GlusterFS or CEPH.
>
> Lucian
>
> --
> Sent from the Delta quadrant using Borg technology!
>
> Nux!
> www.nux.ro


Re: ALARM - ACS reboots host servers!!!

2014-03-02 Thread Marcus
Or do you mean you have multiple primary storages and this one was not
in use and put into maintenance?

On Sun, Mar 2, 2014 at 5:25 PM, Marcus  wrote:
> I'm not sure I understand. How do you expect to reboot your primary
> storage while vms are running?  It sounds like the host is being
> fenced since it cannot contact the resources it depends on.
>
> On Sun, Mar 2, 2014 at 3:24 PM, Nux!  wrote:
>> On 02.03.2014 21:17, Andrei Mikhailovsky wrote:
>>>
>>> Hello guys,
>>>
>>>
>>> I've recently came across the bug CLOUDSTACK-5429 which has rebooted
>>> all of my host servers without properly shutting down the guest vms.
>>> I've simply upgraded and rebooted one of the nfs primary storage
>>> servers and a few minutes later, to my horror, i've found out that all
>>> of my host servers have been rebooted. Is it just me thinking so, or
>>> is this bug should be fixed ASAP and should be a blocker for any new
>>> ACS release. I mean not only does it cause downtime, but also possible
>>> data loss and server corruption.
>>
>>
>> Hi Andrei,
>>
>> Do you have HA enabled and did you put that primary storage in maintenance
>> mode before rebooting it?
>> It's my understanding that ACS relies on the shared storage to perform HA so
>> if the storage goes it's expected to go berserk. I've noticed similar
>> behaviour in Xenserver pools without ACS.
>> I'd imagine a "cure" for this would be to use network distributed
>> "filesystems" like GlusterFS or CEPH.
>>
>> Lucian
>>
>> --
>> Sent from the Delta quadrant using Borg technology!
>>
>> Nux!
>> www.nux.ro


Re: KVM

2014-03-02 Thread Marcus
It doesn't look like you've enabled debug, you're only getting WARN
and INFO messages. Please enable debug.

On Sun, Mar 2, 2014 at 4:40 PM, María Noelia Gil  wrote:
> When I run "CloudStack-setup-agent" shows the following:
>
> Starting to configure your system:
> Configure Apparmor ...[OK]
> Configure Network ... [OK]
> Configure Libvirt ...
> [OK]
> Configure Firewall ...
> [OK]
> Configure Nfs ... [OK]
> Configure cloudAgent ...
> [OK]
> CloudStack Agent setup is done!
>
> The log file displays the following.
>
> 2014-03-03 00:32:44,320 INFO  [cloud.agent.AgentShell] (main:null) Agent 
> started
> 2014-03-03 00:32:44,321 INFO  [cloud.agent.AgentShell] (main:null) 
> Implementation Version is 4.2.1
> 2014-03-03 00:32:44,322 INFO  [cloud.agent.AgentShell] (main:null) 
> agent.properties found at /etc/cloudstack/agent/agent.properties
> 2014-03-03 00:32:44,323 INFO  [cloud.agent.AgentShell] (main:null) Defaulting 
> to using properties file for storage
> 2014-03-03 00:32:44,324 INFO  [cloud.agent.AgentShell] (main:null) Defaulting 
> to the constant time backoff algorithm
> 2014-03-03 00:32:44,326 INFO  [cloud.utils.LogUtils] (main:null) log4j 
> configuration found at /etc/cloudstack/agent/log4j-cloud.xml
> 2014-03-03 00:32:44,384 INFO  [cloud.agent.Agent] (main:null) id is 0
> 2014-03-03 00:32:44,396 INFO  
> [resource.virtualnetwork.VirtualRoutingResource] (main:null) 
> VirtualRoutingResource _scriptDir to use: scripts/network/domr/kvm
> 2014-03-03 00:32:45,020 WARN  [kvm.resource.LibvirtComputingResource] 
> (main:null) LibVirt version 0.9.10 required for guest cpu mode, but version 
> 0.9.8 detected, so it will be disabled
> 2014-03-03 00:32:45,114 INFO  [kvm.resource.LibvirtComputingResource] 
> (main:null) No libvirt.vif.driver specified. Defaults to BridgeVifDriver.
> 2014-03-03 00:32:45,145 INFO  [cloud.agent.Agent] (main:null) Agent [id = 0 : 
> type = LibvirtComputingResource : zone = default : pod = default : workers = 
> 5 : host = localhost : port = 8250
> 2014-03-03 00:32:45,154 INFO  [utils.nio.NioClient] (Agent-Selector:null) 
> Connecting to localhost:8250
> 2014-03-03 00:32:45,333 INFO  [utils.nio.NioClient] (Agent-Selector:null) 
> SSL: Handshake done
> 2014-03-03 00:32:45,334 INFO  [utils.nio.NioClient] (Agent-Selector:null) 
> Connected to localhost:8250
> 2014-03-03 00:32:45,662 INFO  [cloud.serializer.GsonHelper] 
> (Agent-Handler-1:null) Default Builder inited.
> 2014-03-03 00:32:45,733 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) 
> Proccess agent startup answer, agent id = 0
> 2014-03-03 00:32:45,733 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Set 
> agent id 0
> 2014-03-03 00:32:45,737 INFO  [cloud.agent.Agent] (AgentShutdownThread:null) 
> Stopping the agent: Reason = sig.kill
> 2014-03-03 00:32:45,738 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) 
> Startup Response Received: agent id = 0
>
> I do not get to fix the error.
>
> Thanks.
>
> El 02/03/2014, a las 00:25, Marcus  escribió:
>
>> changing
>


Re: KVM

2014-03-01 Thread Marcus
Change the agent to debug mode by changing all 'INFO' to 'DEBUG' in
/etc/cloudstack/agent/log4j-cloud.xml. This will give you more info.
Usually the agent can't find the required network config on the host,
or something like that.
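
The change described above amounts to a text substitution on the agent's
log4j config. A minimal sketch, assuming the levels appear as value="INFO"
attributes: the XML fragment below is made up for illustration and is not
copied from a real agent config, and the class and method names are
hypothetical.

```java
// Sketch only: Log4jLevelBump is a hypothetical helper, not part of CloudStack.
final class Log4jLevelBump {
    /** Replaces every value="INFO" attribute with value="DEBUG". */
    static String infoToDebug(String xml) {
        // Matching the whole attribute avoids touching unrelated text
        // that merely contains the letters "INFO".
        return xml.replace("value=\"INFO\"", "value=\"DEBUG\"");
    }

    public static void main(String[] args) {
        // Illustrative fragment only; a real config will differ.
        String sample =
            "<category name=\"com.cloud\">\n" +
            "  <priority value=\"INFO\"/>\n" +
            "</category>\n";
        System.out.print(infoToDebug(sample));
    }
}
```

Restarting the agent afterwards is the safe way to make sure the new level
is picked up.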

On Sat, Mar 1, 2014 at 1:21 PM, María Noelia Gil  wrote:
> Hi, I'm throwing the CloudStack agent with "CloudStack-setup-agent" command. 
> Initially shows "OK" and the configuration of the interface is unchanged, but 
> the server can not find the agent. I have looked at the log file and I found 
> the following line:
>
> 02/27/2014 17:49:16,538 INFO [cloud.agent.Agent] (AgentShutdownThread: null) 
> Stopping the agent: Reason = sig.kill
>
> Can this be the problem? How can I fix this?
>
> Thanks.


devs looking to get involved?

2014-02-07 Thread Marcus
I occasionally have some minor improvement that I need to make to
whatever version of cloudstack we're currently running. This is almost
always behind what is currently being developed, and I don't always
feel like I have time to rebase/refactor it for master. I'm wondering
if there are any individuals out there who are looking for ways to
contribute that I can hand off these little projects to and mentor a
bit. You get an easy project to get your feet wet and I get the
improvements into master without using up time. :-) Maybe this sounds
like laziness on my part, but it seemed like a good idea.

I can post a patch that works for one version, and you can rework it
and test against master. Respond if you're interested, there may be
others who have easy work to farm out as well.

An example: I've made some improvements to the template downloader.
Once the first 1M of the template is pulled, we attempt to verify that
the image is actually what we think it is (qcow2 or vmdk or whatever)
by looking at the data. Up until now we just checked the file extension. It
also contains a new TemplateUtils class that has:

+    public static boolean isCorrectExtension(String path, String ext) {
+        if (path.toLowerCase().endsWith(ext)
+            || path.toLowerCase().endsWith(ext + ".gz")
+            || path.toLowerCase().endsWith(ext + ".bz2")
+            || path.toLowerCase().endsWith(ext + ".zip")) {
+            return true;
+        }
+        return false;
+    }

Which can be used to clean up the likes of:

private void checkFormat(String format, String url) {

if((!url.toLowerCase().endsWith("vhd"))&&(!url.toLowerCase().endsWith("vhd.zip"))

&&(!url.toLowerCase().endsWith("vhd.bz2"))&&(!url.toLowerCase().endsWith("vhd.gz"))

&&(!url.toLowerCase().endsWith("qcow2"))&&(!url.toLowerCase().endsWith("qcow2.zip"))

&&(!url.toLowerCase().endsWith("qcow2.bz2"))&&(!url.toLowerCase().endsWith("qcow2.gz"))

&&(!url.toLowerCase().endsWith("ova"))&&(!url.toLowerCase().endsWith("ova.zip"))

&&(!url.toLowerCase().endsWith("ova.bz2"))&&(!url.toLowerCase().endsWith("ova.gz"))

&&(!url.toLowerCase().endsWith("tar"))&&(!url.toLowerCase().endsWith("tar.zip"))

&&(!url.toLowerCase().endsWith("tar.bz2"))&&(!url.toLowerCase().endsWith("tar.gz"))
&&(!url.toLowerCase().endsWith("vmdk")) &&
(!url.toLowerCase().endsWith("vmdk.gz"))
&&(!url.toLowerCase().endsWith("vmdk.zip")) &&
(!url.toLowerCase().endsWith("vmdk.bz2")) &&
(!url.toLowerCase().endsWith("img"))
&&(!url.toLowerCase().endsWith("img.gz")) &&
(!url.toLowerCase().endsWith("img.zip")) &&
(!url.toLowerCase().endsWith("img.bz2"))
&&(!url.toLowerCase().endsWith("raw")) &&
(!url.toLowerCase().endsWith("raw.gz")) &&
(!url.toLowerCase().endsWith("raw.bz2"))
&&(!url.toLowerCase().endsWith("raw.zip"))){
throw new InvalidParameterValueException("Please specify a valid " + format.toLowerCase());
}
if ((format.equalsIgnoreCase("vhd")
 && (!url.toLowerCase().endsWith("vhd")
 && !url.toLowerCase().endsWith("vhd.zip")
 && !url.toLowerCase().endsWith("vhd.bz2")
 && !url.toLowerCase().endsWith("vhd.gz")))
|| (format.equalsIgnoreCase("vhdx")
 && (!url.toLowerCase().endsWith("vhdx")
 && !url.toLowerCase().endsWith("vhdx.zip")
 && !url.toLowerCase().endsWith("vhdx.bz2")
 && !url.toLowerCase().endsWith("vhdx.gz")))
|| (format.equalsIgnoreCase("qcow2")
 && (!url.toLowerCase().endsWith("qcow2")
 && !url.toLowerCase().endsWith("qcow2.zip")
 && !url.toLowerCase().endsWith("qcow2.bz2")
 && !url.toLowerCase().endsWith("qcow2.gz")))
|| (format.equalsIgnoreCase("ova")
 && (!url.toLowerCase().endsWith("ova")
 && !url.toLowerCase().endsWith("ova.zip")
 && !url.toLowerCase().endsWith("ova.bz2")
 && !url.toLowerCase().endsWith("ova.gz")))
|| (format.equalsIgnoreCase("tar")
 && (!url.toLowerCase().endsWith("tar")
 && !url.toLowerCase().endsWith("tar.zip")
 && !url.toLowerCase().endsWith("tar.bz2")
 && !url.toLowerCase().endsWith("tar.gz")))
|| (format.equalsIgnoreCase("raw")
 && (!url.toLowerCase().endsWith("img")
 && !url.toLowerCase().endsWith("img.zip")
 && !url.toLowerCase().endsWith("img.bz2")
 && !url.toLowerCase().endsWith("img.gz")
 && !url.toLowerCase().endsWith("raw")
 && !url.toLowerCase().endsWith("raw.bz2")
 && !url.toLowerCase().endsW
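
The listing above is cut off in the archive. As a rough sketch of the
cleanup being proposed, the long chain could collapse to a lookup plus the
isCorrectExtension helper from the patch. Everything other than
isCorrectExtension is an assumption made for illustration: the class name,
the EXTENSIONS_BY_FORMAT map, the urlMatchesFormat method, and the use of
IllegalArgumentException in place of CloudStack's
InvalidParameterValueException. The format-to-extension pairs are read off
the visible part of the original; anything hidden by the truncation is left
out rather than guessed. Unlike the original, this sketch rejects formats
it does not know about.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Sketch only: a hypothetical stand-in, not the actual CloudStack class.
final class TemplateFormatCheck {
    // Format name -> base extensions, taken from the visible part of checkFormat.
    private static final Map<String, List<String>> EXTENSIONS_BY_FORMAT = new HashMap<>();
    static {
        EXTENSIONS_BY_FORMAT.put("vhd", Arrays.asList("vhd"));
        EXTENSIONS_BY_FORMAT.put("vhdx", Arrays.asList("vhdx"));
        EXTENSIONS_BY_FORMAT.put("qcow2", Arrays.asList("qcow2"));
        EXTENSIONS_BY_FORMAT.put("ova", Arrays.asList("ova"));
        EXTENSIONS_BY_FORMAT.put("tar", Arrays.asList("tar"));
        EXTENSIONS_BY_FORMAT.put("raw", Arrays.asList("img", "raw"));
    }

    // Same checks as the helper in the patch, with an explicit locale
    // (the patch uses the default-locale toLowerCase()).
    static boolean isCorrectExtension(String path, String ext) {
        String p = path.toLowerCase(Locale.ROOT);
        return p.endsWith(ext)
            || p.endsWith(ext + ".gz")
            || p.endsWith(ext + ".bz2")
            || p.endsWith(ext + ".zip");
    }

    // True when the URL ends in one of the extensions allowed for the format.
    static boolean urlMatchesFormat(String format, String url) {
        List<String> exts = EXTENSIONS_BY_FORMAT.get(format.toLowerCase(Locale.ROOT));
        if (exts == null) {
            return false; // unknown format: rejected in this sketch
        }
        for (String ext : exts) {
            if (isCorrectExtension(url, ext)) {
                return true;
            }
        }
        return false;
    }

    static void checkFormat(String format, String url) {
        if (!urlMatchesFormat(format, url)) {
            throw new IllegalArgumentException(
                "Please specify a valid " + format.toLowerCase(Locale.ROOT));
        }
    }

    public static void main(String[] args) {
        checkFormat("qcow2", "http://example.com/template.qcow2.bz2"); // passes silently
        System.out.println(urlMatchesFormat("vhd", "http://example.com/disk.vhdx"));
    }
}
```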

Re: mem.overprovisioning.facto and KVM

2014-01-24 Thread Marcus Sorensen
Looks like it works as of 4.2, but you need to update existing cluster
settings, rather than global (or both, I suppose).
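
As a worked example of what the factor is expected to do to the capacity
shown on the dashboard, the sketch below multiplies physical memory by the
factor. The 64 GiB per host and the factor of 1.5 are made-up numbers, the
six hosts come from the original report, and the class and method names are
hypothetical. How CloudStack computes the figure internally is not shown in
this thread.

```java
// Sketch only: MemOverprovisioning is a hypothetical helper for the arithmetic.
final class MemOverprovisioning {
    /** Allocatable memory in MiB = per-host physical MiB * host count * factor. */
    static long effectiveMiB(long physicalMiBPerHost, int hosts, double factor) {
        return Math.round(physicalMiBPerHost * hosts * factor);
    }

    public static void main(String[] args) {
        // 6 hosts * 65536 MiB = 393216 MiB physical; * 1.5 = 589824 MiB.
        System.out.println(effectiveMiB(64L * 1024, 6, 1.5)); // prints 589824
    }
}
```

If the dashboard still shows 393216 MiB in a setup like this, the factor is
not being applied at the level where the capacity is calculated.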

On Fri, Jan 24, 2014 at 3:17 PM, Marcus Sorensen  wrote:
> I guess not. It should work though. We ran into the same issue with
> storage, everything hardcoded to only work with vmware. I'll take a
> look.
>
> On Mon, Oct 7, 2013 at 1:09 PM, Sebastien Goasguen  wrote:
>>
>> On Sep 25, 2013, at 2:59 AM, Harikrishna Patnala 
>>  wrote:
>>
>>> As far as I know men over provisioning is intended to work only with VMWare 
>>> hypervisor to allocate reserved memory for VM.
>>>
>>
>> @Marcus, could you comment on this: is mem over provisioning supposed to 
>> work with KVM ?
>>
>>> On 25-Sep-2013, at 11:11 AM, Nikolay Kabadjov  wrote:
>>>
>>>> Yes Kirk, I did
>>>>
>>>>
>>>>
>>>> 
>>>> From: Kirk Jantzer 
>>>> To: Cloudstack users mailing list ; Nikolay 
>>>> Kabadjov 
>>>> Sent: Tuesday, September 24, 2013 5:50 PM
>>>> Subject: Re: mem.overprovisioning.facto and KVM
>>>>
>>>>
>>>>
>>>> Did you restart the management service after making the change?
>>>>
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Kirk Jantzer
>>>> http://about.me/kirkjantzer
>>>>
>>>>
>>>>
>>>> On Tue, Sep 24, 2013 at 10:25 AM, Nikolay Kabadjov  
>>>> wrote:
>>>>
>>>> Hi all,
>>>>> I've noticed that increasing mem.overprovisioning.factor doesn't take 
>>>>> effect?
>>>>> I mean I still see in the dashboard the exact amount of memory I have 
>>>>> multiplying the memory of all the hosts.
>>>>>
>>>>> It's CS 4.1.1 with one zone, one pod, one cluster, 6 KVM hosts
>>>>>
>>>>> Any idea?
>>>>>
>>>>> Thanks
>>>>> Niki
>>>
>>


Re: mem.overprovisioning.facto and KVM

2014-01-24 Thread Marcus Sorensen
I guess not. It should work though. We ran into the same issue with
storage, everything hardcoded to only work with vmware. I'll take a
look.

On Mon, Oct 7, 2013 at 1:09 PM, Sebastien Goasguen  wrote:
>
> On Sep 25, 2013, at 2:59 AM, Harikrishna Patnala 
>  wrote:
>
>> As far as I know men over provisioning is intended to work only with VMWare 
>> hypervisor to allocate reserved memory for VM.
>>
>
> @Marcus, could you comment on this: is mem over provisioning supposed to work 
> with KVM ?
>
>> On 25-Sep-2013, at 11:11 AM, Nikolay Kabadjov  wrote:
>>
>>> Yes Kirk, I did
>>>
>>>
>>>
>>> 
>>> From: Kirk Jantzer 
>>> To: Cloudstack users mailing list ; Nikolay 
>>> Kabadjov 
>>> Sent: Tuesday, September 24, 2013 5:50 PM
>>> Subject: Re: mem.overprovisioning.facto and KVM
>>>
>>>
>>>
>>> Did you restart the management service after making the change?
>>>
>>>
>>>
>>> Regards,
>>>
>>> Kirk Jantzer
>>> http://about.me/kirkjantzer
>>>
>>>
>>>
>>> On Tue, Sep 24, 2013 at 10:25 AM, Nikolay Kabadjov  
>>> wrote:
>>>
>>> Hi all,
>>>> I've noticed that increasing mem.overprovisioning.factor doesn't take 
>>>> effect?
>>>> I mean I still see in the dashboard the exact amount of memory I have 
>>>> multiplying the memory of all the hosts.
>>>>
>>>> It's CS 4.1.1 with one zone, one pod, one cluster, 6 KVM hosts
>>>>
>>>> Any idea?
>>>>
>>>> Thanks
>>>> Niki
>>
>


Re: Status of CLVM?

2014-01-08 Thread Marcus Sorensen
You'd create a sharedmountpoint style primary storage, which would
host qcow2 files. You can do this via iscsi, fibrechannel, or any
other SAN tech.

On Wed, Jan 8, 2014 at 3:49 PM, Marcus Sorensen  wrote:
> Yes, because current snapshot is really "Copy raw-formatted LVM volume
> to qcow2 file on secondary storage". So there is no real LVM snapshot,
> and if there were, it wouldn't be copied internally.
>
> On Wed, Jan 8, 2014 at 3:47 PM, Nux!  wrote:
>> Hi,
>>
>> I've just watched Marcus Sorensen's presentation on CLVM on youtube and he
>> was mentioning that migrating a VM with snapshots will make the snapshots
>> disappear.
>> Can anyone testify if this is still the case?
>> Since at it, are there any alternative ways of using a multipathed iSCSI lun
>> with Cloudstack (KVM)? I'm thinking clustered filesystems such as GFS or
>> Ocfs, but afraid of the penalty performance.
>>
>> Regards,
>> Lucian
>>
>> --
>> Sent from the Delta quadrant using Borg technology!
>>
>> Nux!
>> www.nux.ro


Re: Status of CLVM?

2014-01-08 Thread Marcus Sorensen
Yes, because the current snapshot operation is really "Copy raw-formatted
LVM volume to qcow2 file on secondary storage". So there is no real LVM
snapshot, and if there were, it wouldn't be copied internally.

On Wed, Jan 8, 2014 at 3:47 PM, Nux!  wrote:
> Hi,
>
> I've just watched Marcus Sorensen's presentation on CLVM on youtube and he
> was mentioning that migrating a VM with snapshots will make the snapshots
> disappear.
> Can anyone testify if this is still the case?
> Since at it, are there any alternative ways of using a multipathed iSCSI lun
> with Cloudstack (KVM)? I'm thinking clustered filesystems such as GFS or
> Ocfs, but afraid of the penalty performance.
>
> Regards,
> Lucian
>
> --
> Sent from the Delta quadrant using Borg technology!
>
> Nux!
> www.nux.ro


Re: 4.2.0 Upgrade and System Templates (KVM - Ubuntu 12.04)

2013-10-26 Thread Marcus Sorensen
Yes, you do need to upgrade your system VMs, and you should also have a new
systemvm.iso that was bundled in the cloudstack-common deb file that would
have been installed as an upgrade on your KVM hosts. I also feel that the
documentation of the system VM upgrade is lacking. The only place I know of is
in the release notes:
http://cloudstack.apache.org/docs/en-US/Apache_CloudStack/4.2.0/html-single/Release_Notes/index.html,
see 3.1 "Upgrade Instructions", item 12. It references a script
"cloudstack-sysvmadm", but the upgrade of the system VM template should be
done beforehand.

Now look at the section just below, 3.2. This
documentation is obviously messed up because it first says "this applies
only to VMware", and then it promptly gives system VM upgrade instructions
for XenServer, KVM, and VMware hosts. It's unclear why this system VM
upgrade would only apply to zones which had VMware hosts, and why these
instructions aren't also in the 4.1.x to 4.2.x instructions.

At any rate,
the system VM instructions there for KVM should apply. Register the template
(optionally, check the database to ensure the template is set as system
type), then restart the system VMs per the item 12 script. If your KVM
hosts relaunch the system VMs per the new template and they have the new
systemvm.iso, they should work.
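
One way to confirm that every KVM host carries the same systemvm.iso is to
compare checksums, as Marty does in the quoted message. A minimal sketch,
assuming you collect one MD5 per host yourself; the host names in the demo
are hypothetical.

```python
import hashlib

def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def hosts_disagree(checksums_by_host):
    """Return True when the hosts do not all report the same checksum."""
    return len(set(checksums_by_host.values())) > 1

# Hypothetical host names; the checksum is the one quoted in this thread.
reported = {
    "kvm-host-1": "092a299932bda93cc522b1c3e56af4a8",
    "kvm-host-2": "092a299932bda93cc522b1c3e56af4a8",
}
print(hosts_disagree(reported))
```

On each host you would call md5_of_file on
/usr/share/cloudstack-common/vms/systemvm.iso. Matching checksums only show
the hosts agree with each other, not that the ISO is the 4.2 one.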


On Oct 26, 2013 2:19 PM, "Marty Sweet"  wrote:

> Hi Guys,
>
> I have just upgraded to 4.2.0 from 4.1.1 and am having some issues with the
> SystemVMs.
> I understand that we are meant to upgrade to the new system image? Using
> the script in the 'Prepare systemvm' documentation I did this with no
> avail, editing the database to suit what I think would work has also not
> worked.
>
> Restoring a backup, I now have my original 4.1.1 acton systemvm templates.
> What steps should I take to launch a systemVM successfully?
>
> The upgrade documentation is pretty lacking in this respect, and just says
> restart the systemvms, with no reference to upgrading the image.
>
> I also note that the new systemvms don't seem to be mounting the NFS and
> are instead using  /usr/share/cloudstack-common/vms/systemvm.iso.
>
> Opening a VNC session to the VM, shows the following messages:
> Cannot assign requested address: make_sock: could not bind address
> dnsmasq: unknown interface eth0
> dnsmasq apache2 ... failed!
>
> My MD5 sum for the CD boot file is below and is consistent across all 4
> nodes:
> 092a299932bda93cc522b1c3e56af4a8
>  /usr/share/cloudstack-common/vms/systemvm.iso
>
>
> Many thanks,
> Marty
>


Re: High CPU utilization on KVM hosts while doing RBD snapshot - was Re: snapshot caused host disconnected

2013-10-07 Thread Marcus Sorensen
You may want to post this to the ceph mailing list as well.

On Mon, Oct 7, 2013 at 8:59 PM, Indra Pramana  wrote:
> Dear Wido and all,
>
> I performed some further tests last night:
>
> (1) CPU utilization of the KVM host while RBD snapshot running is still
> shooting up high even after I set global setting:
> concurrent.snapshots.threshold.perhost to 2.
>
> (2) Most of the concurrent snapshot processes will fail with either stuck
> in "Creating" state, or "CreatedOnPrimary" error message.
>
> (3) I also have adjusted some other related global settings such as
> backup.snapshot.wait and job.expire.minutes, without any luck.
>
> Any advice on what is causing the high CPU utilization is greatly
> appreciated.
>
> Looking forward to your reply, thank you.
>
> Cheers.
>
>
> On Mon, Oct 7, 2013 at 11:03 PM, Indra Pramana  wrote:
>
>> Dear all,
>>
>> I also found out that when the RBD snapshot is being run, the CPU
>> utilisation on the KVM host will be shooting up very high, which might
>> explain why the host becomes disconnected.
>>
>> top - 22:49:32 up 3 days, 19:31,  1 user,  load average: 7.85, 4.97, 3.47
>> Tasks: 297 total,   3 running, 294 sleeping,   0 stopped,   0 zombie
>> Cpu(s):  4.5%us,  1.2%sy,  0.0%ni, 94.1%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
>> Mem:  264125244k total, 77203460k used, 186921784k free,   154888k buffers
>> Swap:   545788k total,        0k used,   545788k free, 60677092k cached
>>
>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
>> 18161 root      20   0 3871m  31m 8444 S  101  0.0 301:58.09 kvm
>>  2790 root      20   0 43.5g 1.6g  19m S   97  0.7  45:52.42 jsvc
>> 24544 root      20   0 4583m  31m 8364 S   97  0.0 425:29.48 kvm
>>  6537 root      20   0     0    0    0 R   71  0.0   0:17.49 kworker/3:2
>> 22546 root      20   0 6143m 2.0g 8452 S   26  0.8  55:14.07 kvm
>>  4219 root      20   0 7671m 4.0g 8524 S    6  1.6 106:12.26 kvm
>>  5989 root      20   0 43.2g 1.6g  232 D    6  0.6   0:08.13 jsvc
>>  5993 root      20   0 43.3g 1.6g  224 D    6  0.6   0:08.36 jsvc
>>
>> Is it normal when snapshot is being run on the VM running on that host,
>> the host's CPU utilisation will be higher than usual? How can I limit the
>> CPU resources used by the snapshot?
>>
>>
>> Looking forward to your reply, thank you.
>>
>> Cheers.
>>
>>
>>
>> On Mon, Oct 7, 2013 at 7:18 PM, Indra Pramana  wrote:
>>
>>> Dear all,
>>>
>>> I did some tests on snapshots since it's now supported for my Ceph RBD
>>> primary storage in CloudStack 4.2. When I ran the snapshot for a particular
>>> VM instance earlier, I noticed that this has caused the host (where the VM
>>> is on) becomes disconnected.
>>>
>>> Here's the excerpt from the agent.log:
>>>
>>> http://pastebin.com/dxVV7stu
>>>
>>> The management-server.log doesn't much showing anything other than
>>> detecting that the host was down and HA is being activated:
>>>
>>> http://pastebin.com/UeLiSm9K
>>>
>>> Anyone can advise what is causing the problem? So far there is only one
>>> user doing the snapshotting and it has caused issues to the host, I can't
>>> imagine what if multiple users try to do snapshotting at the same time?
>>>
>>> I read about snapshot job throttling which is described on the manual:
>>>
>>>
>>> http://cloudstack.apache.org/docs/en-US/Apache_CloudStack/4.2.0/html/Admin_Guide/working-with-snapshots.html
>>>
>>> But I am not too sure whether this will help to resolve the problem since
>>> there is only one user trying to perform snapshot and we already encounter
>>> the problem already.
>>>
>>> Anyone can advise how I can troubleshoot further and find a solution to
>>> the problem?
>>>
>>> Looking forward to your reply, thank you.
>>>
>>> Cheers.
>>>
>>
>>


Re: Resize data-disk doesn't work after upgrade

2013-10-03 Thread Marcus Sorensen
What primary storage are you using? Any errors in agent log?
On Oct 3, 2013 3:16 PM, "Indra Pramana"  wrote:

> Hi Marcus,
>
> Good day to you, and thank you for your e-mail.
>
> I have tried restarting the VM and even stop and start the VM, but after
> logging in to the VM, I still see the hard drive's size as 20 GB instead of
> 60 GB.
>
> I tried to check /var/log/libvirt/libvirtd.log file on the KVM host where
> the VM is hosted, and can't find any messages related to volBlockResize.
>
> Any other troubleshooting steps you can recommend, i.e. any other area I
> can look into?
>
> Looking forward to your reply, thank you.
>
> Cheers.
>
>
>
> On Fri, Oct 4, 2013 at 4:46 AM, Marcus Sorensen 
> wrote:
>
> > I just tested local storage qcow2 and CLVM resize on 4.2, they both
> worked.
> >
> > Resize works like this:
> >
> > 1. Do sanity checks
> > 2. Send resize command to the agent
> > 3. Resize the disk/lun/file
> > 4. Inform the VM instance that the disk has changed by making a
> > libvirt volBlockResize call (this is not fatal, some guest types can't
> > resize online and need to be restarted)
> > 5. Update the database
> >
> > You can check #3 looking at the disks themselves on storage to see if
> > they've grown. You can check #4 by restarting the VM to see if it
> > picks up the change.
> >
> > It may be that libvirt was unable to inform the VM of the change (for
> > example if you haven't upgraded to a supported version of Ubuntu or
> > CentOS and it has an old libvirt that doesn't support volBlockResize).
> >  The way to know for sure is stop/start the VM if you can.
> >
> > Look at those two things and let us know
> >
> > On Thu, Oct 3, 2013 at 2:33 PM, Indra Pramana  wrote:
> > > Dear all,
> > >
> > > After upgrading to 4.2.0, I tried to resize a data disk of a VM
> instance
> > > from 20 GB to 60 GB, through the Cloudstack GUI. The UI reports that
> the
> > > resize was successful, and that the data disk is now showing 60 GB
> > instead
> > > of 20 GB. However, when I check the actual disk on the VM, it seems
> that
> > > it's still 20 GB.
> > >
> > > Any reason what might have been the cause of the problem? I even tried
> to
> > > re-partition it to see if the size changed, but it wasn't and still at
> 20
> > > GB. Which logs I need to look into?
> > >
> > > Any help on this is greatly appreciated.
> > >
> > > Looking forward to your reply, thank you.
> > >
> > > Cheers.
> >
>


Re: Resize data-disk doesn't work after upgrade

2013-10-03 Thread Marcus Sorensen
I just tested local storage qcow2 and CLVM resize on 4.2, they both worked.

Resize works like this:

1. Do sanity checks
2. Send resize command to the agent
3. Resize the disk/lun/file
4. Inform the VM instance that the disk has changed by making a
libvirt volBlockResize call (this is not fatal, some guest types can't
resize online and need to be restarted)
5. Update the database

You can check #3 looking at the disks themselves on storage to see if
they've grown. You can check #4 by restarting the VM to see if it
picks up the change.

It may be that libvirt was unable to inform the VM of the change (for
example if you haven't upgraded to a supported version of Ubuntu or
CentOS and it has an old libvirt that doesn't support volBlockResize).
 The way to know for sure is stop/start the VM if you can.

Look at those two things and let us know
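
For qcow2 files, check #3 can be scripted. A minimal sketch, assuming the
JSON comes from `qemu-img info --output=json <file>`, which reports the
disk's "virtual-size" in bytes. The sample document below is made up, and
for CLVM you would look at the logical volume size instead.

```python
import json

GIB = 1024 ** 3

def virtual_size_gib(qemu_img_info_json):
    """Extract the virtual disk size, in GiB, from qemu-img's JSON output."""
    info = json.loads(qemu_img_info_json)
    return info["virtual-size"] / GIB

def resize_reached_storage(qemu_img_info_json, expected_gib):
    """True if the file on storage is at least as large as requested."""
    return virtual_size_gib(qemu_img_info_json) >= expected_gib

# Made-up sample standing in for real qemu-img output.
sample = json.dumps({"format": "qcow2",
                     "virtual-size": 60 * GIB,
                     "filename": "example.qcow2"})
print(resize_reached_storage(sample, 60))
```

If this reports the new size but the guest still sees the old one, the
problem is in step #4 (informing the VM), not in step #3.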

On Thu, Oct 3, 2013 at 2:33 PM, Indra Pramana  wrote:
> Dear all,
>
> After upgrading to 4.2.0, I tried to resize a data disk of a VM instance
> from 20 GB to 60 GB, through the Cloudstack GUI. The UI reports that the
> resize was successful, and that the data disk is now showing 60 GB instead
> of 20 GB. However, when I check the actual disk on the VM, it seems that
> it's still 20 GB.
>
> Any reason what might have been the cause of the problem? I even tried to
> re-partition it to see if the size changed, but it wasn't and still at 20
> GB. Which logs I need to look into?
>
> Any help on this is greatly appreciated.
>
> Looking forward to your reply, thank you.
>
> Cheers.


Re: [URGENT] Mounting issues from KVM hosts to secondary storage

2013-10-03 Thread Marcus Sorensen
You'll have to go back further in the log then, and/or go to the agent
log, because the issue of the ghost mount is due to the filesystem
being full, and if that's your secondary storage I'm not sure why it
would be written to from the kvm host unless you were doing backups or
snapshots.

On Thu, Oct 3, 2013 at 12:03 PM, Indra Pramana  wrote:
> Hi Marcus,
>
> It happens randomly, mounting fails, and then / gets filled. I suspect this
> occurs when the mounting fails, any files which is supposed to be
> transferred to the mount point (e.g. templates used to create VMs) will
> fill-up the / partition of the KVM host instead. This doesn't happen in
> 4.1.1, this only happens after we upgraded to 4.2.0.
>
> And how about virsh pool-list result which is not reflecting the actual IDs
> of the secondary storage we see from Cloudstack GUI, is it supposed to be
> the case? If not, how to rectify it?
>
> Looking forward to your reply, thank you.
>
> Cheers.
>
>
>
>
> On Fri, Oct 4, 2013 at 1:51 AM, Marcus Sorensen  wrote:
>
>> It sort of looks like the out of space triggered the issue. Libvirt shows
>>
>> 2013-10-03 15:38:57.414+: 2710: error : virCommandWait:2348 :
>> internal error Child process (/bin/umount
>> /mnt/5230667e-9c58-3ff6-983c-5fc2a72df669) unexpected exit status 32:
>> error writing /etc/mtab.tmp: No space left on device
>>
>> So that entry of the filesystem being mounted is orphaned in
>> /etc/mtab, since it can't be removed. That seems to be the source of
>> why 'df' shows the thing mounted when it isn't, at least.
>>
>>
>> On Thu, Oct 3, 2013 at 11:45 AM, Indra Pramana  wrote:
>> > Hi Marcus and all,
>> >
>> > I also find some strange and interesting error messages from the libvirt
>> > logs:
>> >
>> > http://pastebin.com/5ByfNpAf
>> >
>> > Looking forward to your reply, thank you.
>> >
>> > Cheers.
>> >
>> >
>> >
>> > On Fri, Oct 4, 2013 at 1:38 AM, Indra Pramana  wrote:
>> >
>> >> Hi Marcus,
>> >>
>> >> Good day to you, and thank you for your e-mail. See my reply inline.
>> >>
>> >> On Fri, Oct 4, 2013 at 12:29 AM, Marcus Sorensen wrote:
>> >>
>> >>> Are you using the release artifacts, or your own 4.2 build?
>> >>>
>> >>
>> >> [Indra:] We are using the release artifacts from below repo since we are
>> >> using Ubuntu:
>> >>
>> >> deb http://cloudstack.apt-get.eu/ubuntu precise 4.2
>> >>
>> >>
>> >>> When you rebooted the host, did the problem go away or come back the
>> >>> same?
>> >>
>> >>
>> >> [Indra:] When I rebooted the host, the problem go away for a while, but
>> it
>> >> will come back again after some time. It will come randomly at the time
>> >> when we need to create a new instance on that host, or start an existing
>> >> stopped instance.
>> >>
>> >>
>> >>> You may want to look at 'virsh pool-list' to see if libvirt is
>> >>> mounting/registering the secondary storage.
>> >>>
>> >>
>> >> [Indra:] This is the result of the virsh pool-list command:
>> >>
>> >> root@hv-kvm-02:/var/log/libvirt# virsh pool-list
>> >> Name                                 State  Autostart
>> >> -----------------------------------------------------
>> >> 301071ac-4c1d-4eac-855b-124126da0a38 active no
>> >> 5230667e-9c58-3ff6-983c-5fc2a72df669 active no
>> >> d433809b-01ea-3947-ba0f-48077244e4d6 active no
>> >>
>> >> Strange thing is that none of my secondary storage IDs are there. Could
>> it
>> >> be that the ID might have changed during Cloudstack upgrade? Here is the
>> >> list of my secondary storage (there are two of them) even though they
>> are
>> >> on the same NFS server:
>> >> c02da448-b9f4-401b-b8d5-83e8ead5cfde nfs://10.237.11.31/mnt/vol1/sec-storage NFS
>> >> 5937edb6-2e95-4ae2-907b-80fe4599ed87 nfs://10.237.11.31/mnt/vol1/sec-storage2 NFS
>> >>
>> >>> Is this happening on multiple hosts, the same way?
>> >>
>> >>
>> >> [Indra:] Yes, it is happening on all the two hosts that I have, the same
>> >> way.
>> >>
>> >>
>> >>> You may want to
>> >>> look a

Re: [URGENT] Mounting issues from KVM hosts to secondary storage

2013-10-03 Thread Marcus Sorensen
It sort of looks like running out of space triggered the issue. Libvirt shows

2013-10-03 15:38:57.414+: 2710: error : virCommandWait:2348 :
internal error Child process (/bin/umount
/mnt/5230667e-9c58-3ff6-983c-5fc2a72df669) unexpected exit status 32:
error writing /etc/mtab.tmp: No space left on device

So that entry of the filesystem being mounted is orphaned in
/etc/mtab, since it can't be removed. That seems to be the source of
why 'df' shows the thing mounted when it isn't, at least.
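
A quick way to spot such orphaned entries is to compare /etc/mtab with what
the kernel reports in /proc/mounts. A minimal sketch, assuming /etc/mtab is
a regular file as on these hosts (where it is a symlink to the kernel's
table, the two can never differ). The sample text is illustrative, pieced
together from the df output in this thread.

```python
def mount_points(table_text):
    """Collect the mount-point column from mtab-style text."""
    points = set()
    for line in table_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and not line.startswith("#"):
            points.add(fields[1])
    return points

def stale_mtab_entries(mtab_text, proc_mounts_text):
    """Mount points listed in mtab that the kernel does not know about."""
    return sorted(mount_points(mtab_text) - mount_points(proc_mounts_text))

# Illustrative samples; on a host, read /etc/mtab and /proc/mounts instead.
sample_mtab = (
    "/dev/sda1 / ext4 rw,errors=remount-ro 0 0\n"
    "10.237.11.31:/mnt/vol1/sec-storage2/template/tmpl/2/288 "
    "/mnt/5230667e-9c58-3ff6-983c-5fc2a72df669 nfs rw 0 0\n"
)
sample_proc_mounts = (
    "rootfs / rootfs rw 0 0\n"
    "/dev/sda1 / ext4 rw,errors=remount-ro 0 0\n"
)
print(stale_mtab_entries(sample_mtab, sample_proc_mounts))
```

Anything this returns is a "ghost": df trusts mtab, so it prints the entry
with the numbers of whatever filesystem the directory really lives on.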


On Thu, Oct 3, 2013 at 11:45 AM, Indra Pramana  wrote:
> Hi Marcus and all,
>
> I also find some strange and interesting error messages from the libvirt
> logs:
>
> http://pastebin.com/5ByfNpAf
>
> Looking forward to your reply, thank you.
>
> Cheers.
>
>
>
> On Fri, Oct 4, 2013 at 1:38 AM, Indra Pramana  wrote:
>
>> Hi Marcus,
>>
>> Good day to you, and thank you for your e-mail. See my reply inline.
>>
>> On Fri, Oct 4, 2013 at 12:29 AM, Marcus Sorensen wrote:
>>
>>> Are you using the release artifacts, or your own 4.2 build?
>>>
>>
>> [Indra:] We are using the release artifacts from below repo since we are
>> using Ubuntu:
>>
>> deb http://cloudstack.apt-get.eu/ubuntu precise 4.2
>>
>>
>>> When you rebooted the host, did the problem go away or come back the
>>> same?
>>
>>
>> [Indra:] When I rebooted the host, the problem go away for a while, but it
>> will come back again after some time. It will come randomly at the time
>> when we need to create a new instance on that host, or start an existing
>> stopped instance.
>>
>>
>>> You may want to look at 'virsh pool-list' to see if libvirt is
>>> mounting/registering the secondary storage.
>>>
>>
>> [Indra:] This is the result of the virsh pool-list command:
>>
>> root@hv-kvm-02:/var/log/libvirt# virsh pool-list
>> Name                                 State  Autostart
>> -----------------------------------------------------
>> 301071ac-4c1d-4eac-855b-124126da0a38 active no
>> 5230667e-9c58-3ff6-983c-5fc2a72df669 active no
>> d433809b-01ea-3947-ba0f-48077244e4d6 active no
>>
>> Strange thing is that none of my secondary storage IDs are there. Could it
>> be that the ID might have changed during Cloudstack upgrade? Here is the
>> list of my secondary storage (there are two of them) even though they are
>> on the same NFS server:
>> c02da448-b9f4-401b-b8d5-83e8ead5cfde nfs://10.237.11.31/mnt/vol1/sec-storage NFS
>> 5937edb6-2e95-4ae2-907b-80fe4599ed87 nfs://10.237.11.31/mnt/vol1/sec-storage2 NFS
>>
>>> Is this happening on multiple hosts, the same way?
>>
>>
>> [Indra:] Yes, it is happening on all the two hosts that I have, the same
>> way.
>>
>>
>>> You may want to
>>> look at /etc/mtab, if the system reports it's mounted, though it's
>>> not, it might be in there. Look at /proc/mounts as well.
>>>
>>
>> [Indra:] Please find result of df, /etc/mtab and /proc/mounts below. The
>> "ghost" mount point is on df and /etc/mtab, but not on /proc/mounts.
>>
>> root@hv-kvm-02:/etc# df
>>
>> Filesystem                                              1K-blocks    Used Available Use% Mounted on
>> /dev/sda1                                                 4195924 4192372         0 100% /
>> udev                                                    132053356       4 132053352   1% /dev
>> tmpfs                                                    52825052     704  52824348   1% /run
>> none                                                         5120       0      5120   0% /run/lock
>> none                                                    132062620       0 132062620   0% /run/shm
>> cgroup                                                  132062620       0 132062620   0% /sys/fs/cgroup
>> /dev/sda6                                                10650544 2500460   7609056  25% /home
>> 10.237.11.31:/mnt/vol1/sec-storage2/template/tmpl/2/288   4195924 4192372         0 100% /mnt/5230667e-9c58-3ff6-983c-5fc2a72df669
>>
>> root@hv-kvm-02:/etc# cat /etc/mtab
>> /dev/sda1 / ext4 rw,errors=remount-ro 0 0
>> proc /proc proc rw,noexec,nosuid,nodev 0 0
>> sysfs /sys sysfs rw,noexec,nosuid,nodev 0 0
>> none /sys/fs/fuse/connections fusectl rw 0 0
>> none /sys/kernel/debug debugfs rw 0 0
>> none /sys/kernel/security securityfs rw 0 0
>> udev /dev devtmpfs rw,mode=0755 0 0
>> devpts /dev/pts dev

Re: [URGENT] Mounting issues from KVM hosts to secondary storage

2013-10-03 Thread Marcus Sorensen
Are you using the release artifacts, or your own 4.2 build?

When you rebooted the host, did the problem go away or come back the
same? You may want to look at 'virsh pool-list' to see if libvirt is
mounting/registering the secondary storage.

Is this happening on multiple hosts, the same way? You may want to
look at /etc/mtab, if the system reports it's mounted, though it's
not, it might be in there. Look at /proc/mounts as well.
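
Since /etc/mtab can be stale, a check that asks the kernel is more reliable
than reading df. A minimal sketch using Python's os.path.ismount, which
compares the directory with its parent rather than consulting mtab. The
require_mounted helper is hypothetical, not something the CloudStack agent
provides.

```python
import os

def is_really_mounted(path):
    """True only if the kernel treats path as a mount point."""
    return os.path.ismount(path)

def require_mounted(path):
    """Raise instead of silently writing into the parent filesystem."""
    if not is_really_mounted(path):
        raise RuntimeError(f"{path} is not a mount point; refusing to write there")
    return path

print(is_really_mounted("/"))
```

On an affected host you would pass the /mnt/<pool-uuid> directory. A plain
directory there means anything copied into it lands on the root filesystem.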

On Thu, Oct 3, 2013 at 9:53 AM, Indra Pramana  wrote:
> Dear all,
>
> We face a major problem after upgrading to 4.2.0. Mounting from KVM hosts
> to secondary storage seems to fail, every time a new VM instance is
> created, it will use up the / (root) partition of the KVM hosts instead.
>
> Here is the df result:
>
> 
> root@hv-kvm-02:/home/indra# df
> Filesystem                                              1K-blocks    Used Available Use% Mounted on
> /dev/sda1                                                 4195924 4195924         0 100% /
> udev                                                    132053356       4 132053352   1% /dev
> tmpfs                                                    52825052     440  52824612   1% /run
> none                                                         5120       0      5120   0% /run/lock
> none                                                    132062620       0 132062620   0% /run/shm
> cgroup                                                  132062620       0 132062620   0% /sys/fs/cgroup
> /dev/sda6                                                10650544 2500424   7609092  25% /home
> 10.237.11.31:/mnt/vol1/sec-storage2/template/tmpl/2/288   4195924 4195924         0 100% /mnt/5230667e-9c58-3ff6-983c-5fc2a72df669
> 
>
> The strange thing is that df shows that it seems to be mounted, but it's
> actually not mounted. If you noticed, the total capacity of the mount point
> is exactly the same as the capacity of the / (root) partition. By right it
> should show 7 TB instead of just 4 GB.
>
> This caused VM creation to be error due to out of disk space. This also
> affect the KVM operations since the / (root) partition becomes full, and we
> can only release the space after we reboot the KVM host.
>
> Anyone experience this problem before? We are at loss on how to resolve the
> problem.
>
> Looking forward to your reply, thank you.
>
> Cheers.


RE: CloudStack VPC <- VPN -> VPC

2013-08-22 Thread Marcus Sorensen
It is possible, sort of. You have to bring both up at the same time,
otherwise they will time out and fail. There is no mode to make one side or
the other just listen for connections.
On Aug 23, 2013 12:37 AM, "Kimihiko Kitase" 
wrote:

> Thanks! Is it in 4.2?
>
> -Original Message-
> From: Sheng Yang [mailto:sh...@yasker.org]
> Sent: Friday, August 23, 2013 2:52 PM
> To: users@cloudstack.apache.org
> Cc: d...@cloudstack.apache.org
> Subject: Re: CloudStack VPC <- VPN -> VPC
>
> Not now. But I think it's in the road map.
>
> --Sheng
>
>
> On Thu, Aug 22, 2013 at 10:42 PM, Kimihiko Kitase <
> kimihiko.kit...@citrix.co.jp> wrote:
>
> > Hello
> >
> > Is it possible to make site-to-site VPN connection between VPC and VPC
> > on CloudStack?
> >
> > Thanks
> > Kimi
> >
> > -Original Message-
> > From: Ahmad Emneina [mailto:aemne...@gmail.com]
> > Sent: Friday, August 23, 2013 2:38 PM
> > To: users@cloudstack.apache.org
> > Cc: users@cloudstack.apache.org
> > Subject: Re: CloudStack VPC <- VPN -> VPC
> >
> > Don't think so, you might want to confirm with dev@cloudstack and or
> > create an enhancement ticket (if one doesn't exist) in Jira.
> >
> > Ahmad
> >
> > On Aug 22, 2013, at 10:32 PM, Kimihiko Kitase <
> > kimihiko.kit...@citrix.co.jp> wrote:
> >
> > > Hello
> > >
> > > Can we make site to site VPN connection between VPC and VPC on
> > CloudStack?
> > >
> > > Thanks
> > > Kimi
> > >
> > >
> >
>


Re: Single Server | Advanced Mode | KVM | Cent OS 6.4

2013-08-15 Thread Marcus Sorensen
You may be able to leverage the devcloud-kvm configuration as a
reference. You can either use marvin to deploy an edited version of
tools/devcloud-kvm/devcloud-kvm-advanced.cfg (just swapping out your
ip address ranges), or take a look at the example advanced KVM network
configs that I sent out a while back:
http://marcus.mlsorensen.com/cloudstack-extras/cs-4.1-kvm-networking-two-nic.rtf
and
http://marcus.mlsorensen.com/cloudstack-extras/cs-4.1-kvm-networking-one-nic.rtf
I only mention these options because I've used them quite a bit, and
the public access has consistently worked fine.

On Thu, Aug 15, 2013 at 4:32 PM, Maurice Lawler  wrote:
> Hello,
>
> I'm working with KVM | CloudStack 4.1.1 | CentOS 6.4, and I am running into an
> issue where it goes through all the motions of setting up just fine. However, 
> I notice when it attempts to download the CentOS template it fails,  with 
> error message: No Route To Host. I am utilizing two subnets one /27 and 
> another /29; this works without issue in basic mode.
>
> My thought is this, I am obviously missing an important step in Advanced Mode 
> setup, is there a need (or a step) that states to create virtual network 
> interfaces on the host server? If there, I am not seeing that step; as when I 
> sign into the System VM's (which provision and come online without issue) I 
> can ping the gateway of the /27 without issue; however, it does not permit my 
> downloading (No Route to Host) along with that, I cannot resolve DNS of any 
> kind of the Console Proxy VM /  Secondary Storage VM.
>
> If anyone can guide me into the right direction that would be greatly 
> appreciated !
>
>  - Maurice


Re: FW: CS 4.1.0 - this will help a number of people who struggle with Advanced Networking

2013-08-01 Thread Marcus Sorensen
Here's a simple (not recommended) one-nic setup:

http://marcus.mlsorensen.com/cloudstack-extras/cs-4.1-kvm-networking-one-nic.rtf

And a simple two-nic setup:

http://marcus.mlsorensen.com/cloudstack-extras/cs-4.1-kvm-networking-two-nic.rtf

Hasty docs put together on the road...


On Thu, Aug 1, 2013 at 11:28 AM, Marcus Sorensen  wrote:
> I'm short on time, but here's the KVM advanced networking config we
> use for testing. If someone wants to write a doc based around it that
> would be nice.
>
> Start out the KVM host with two networks, eth0 and eth1. eth1 is intended for
> public traffic; eth0 will carry the guest vlans and the management vlan. Then
> create a bridge interface for each:
>
> [root@devcloud-kvm ~]# brctl show
> bridge name bridge id STP enabled interfaces
> cloud0 8000. no
> br0 8000.5254004eff4f no eth0
> br1 8000.52540052b15e no eth1
>
> br0  Link encap:Ethernet  HWaddr 52:54:00:4E:FF:4F
>   inet addr:172.17.10.10  Bcast:172.17.10.255  Mask:255.255.255.0
>   inet6 addr: fe80::5054:ff:fe4e:ff4f/64 Scope:Link
>   UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>   RX packets:127 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:30 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:0
>   RX bytes:5846 (5.7 KiB)  TX bytes:4345 (4.2 KiB)
>
> br1  Link encap:Ethernet  HWaddr 52:54:00:52:B1:5E
>   inet addr:192.168.100.10  Bcast:192.168.100.255  Mask:255.255.255.0
>   inet6 addr: fe80::5054:ff:fe52:b15e/64 Scope:Link
>   UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>   RX packets:343 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:153 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:0
>   RX bytes:24227 (23.6 KiB)  TX bytes:29108 (28.4 KiB)
>
> eth0  Link encap:Ethernet  HWaddr 52:54:00:4E:FF:4F
>   inet6 addr: fe80::5054:ff:fe4e:ff4f/64 Scope:Link
>   UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>   RX packets:157 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:38 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:1000
>   RX bytes:12276 (11.9 KiB)  TX bytes:4897 (4.7 KiB)
>
> eth1  Link encap:Ethernet  HWaddr 52:54:00:52:B1:5E
>   inet6 addr: fe80::5054:ff:fe52:b15e/64 Scope:Link
>   UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>   RX packets:377 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:163 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:1000
>   RX bytes:34044 (33.2 KiB)  TX bytes:29748 (29.0 KiB)
>
> lo        Link encap:Local Loopback
>   inet addr:127.0.0.1  Mask:255.0.0.0
>   inet6 addr: ::1/128 Scope:Host
>   UP LOOPBACK RUNNING  MTU:16436  Metric:1
>   RX packets:863 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:863 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:0
>   RX bytes:120247 (117.4 KiB)  TX bytes:120247 (117.4 KiB)
>
> Ok, now kvm host is ready. Just define the kvm traffic label for
> Management traffic to be 'br0', for guest to be 'br0', and for public
> to be 'br1'. Cloudstack will create any necessary bridges or vlans.
> You can leave the vlan option empty if you don't want it to create a
> vlan (say for management). I can perhaps go into more detail later.
>
> On Wed, Jul 31, 2013 at 12:33 PM, Marcus Sorensen  wrote:
>> Yes, that's correct. I think we need to update the documentation. The
>> user simply needs to create a bridge where 'public' traffic will work,
>> and then set that bridge name as the traffic label for public traffic.
>> Then it will create the vlan device and the bridge necessary for
>> public based on the physical ethernet device of that bridge.
>>
>> Note, in this example, it is only looking for cloudVirBr for
>> compatibility, if there are existing cloudVirBr bridges then the agent
>> will continue to create cloudVirBr bridges, otherwise, it will create
>> breth bridges, which allow the same vlan number on different physical
>> interfaces.
>>
>> We can easily create some concrete examples for this... such as the
>> one represented in devcloud-kvm by
>> tools/devcloud-kvm/devcloud-kvm-advanced.cfg
>>
>> On Wed, Jul 31, 2013 at 12:06 PM, Edison Su  wrote:
>>> The KVM installation guide at 
>>> http://cloudstack.apache.org/docs/en-US/Apache_CloudStack/4.1.0/html/Installation_Guide/hypervisor-kvm-install-flow.html
>>>  , is unnecessarily complicated and inaccurate.

Re: FW: CS 4.1.0 - this will help a number of people who struggle with Advanced Networking

2013-08-01 Thread Marcus Sorensen
I'm short on time, but here's the KVM advanced networking config we
use for testing. If someone wants to write a doc based around it that
would be nice.

Start out the KVM host with two networks, eth0 and eth1. eth1 is intended for
public traffic; eth0 will carry the guest vlans and the management vlan. Then
create a bridge interface for each:

[root@devcloud-kvm ~]# brctl show
bridge name bridge id STP enabled interfaces
cloud0 8000. no
br0 8000.5254004eff4f no eth0
br1 8000.52540052b15e no eth1

br0  Link encap:Ethernet  HWaddr 52:54:00:4E:FF:4F
  inet addr:172.17.10.10  Bcast:172.17.10.255  Mask:255.255.255.0
  inet6 addr: fe80::5054:ff:fe4e:ff4f/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:127 errors:0 dropped:0 overruns:0 frame:0
  TX packets:30 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:5846 (5.7 KiB)  TX bytes:4345 (4.2 KiB)

br1  Link encap:Ethernet  HWaddr 52:54:00:52:B1:5E
  inet addr:192.168.100.10  Bcast:192.168.100.255  Mask:255.255.255.0
  inet6 addr: fe80::5054:ff:fe52:b15e/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:343 errors:0 dropped:0 overruns:0 frame:0
  TX packets:153 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:24227 (23.6 KiB)  TX bytes:29108 (28.4 KiB)

eth0  Link encap:Ethernet  HWaddr 52:54:00:4E:FF:4F
  inet6 addr: fe80::5054:ff:fe4e:ff4f/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:157 errors:0 dropped:0 overruns:0 frame:0
  TX packets:38 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:12276 (11.9 KiB)  TX bytes:4897 (4.7 KiB)

eth1  Link encap:Ethernet  HWaddr 52:54:00:52:B1:5E
  inet6 addr: fe80::5054:ff:fe52:b15e/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:377 errors:0 dropped:0 overruns:0 frame:0
  TX packets:163 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:34044 (33.2 KiB)  TX bytes:29748 (29.0 KiB)

lo        Link encap:Local Loopback
  inet addr:127.0.0.1  Mask:255.0.0.0
  inet6 addr: ::1/128 Scope:Host
  UP LOOPBACK RUNNING  MTU:16436  Metric:1
  RX packets:863 errors:0 dropped:0 overruns:0 frame:0
  TX packets:863 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:120247 (117.4 KiB)  TX bytes:120247 (117.4 KiB)

Ok, now kvm host is ready. Just define the kvm traffic label for
Management traffic to be 'br0', for guest to be 'br0', and for public
to be 'br1'. Cloudstack will create any necessary bridges or vlans.
You can leave the vlan option empty if you don't want it to create a
vlan (say for management). I can perhaps go into more detail later.
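
Before entering the traffic labels in the zone wizard, it can help to check
that each label names a bridge that actually exists on the host. A minimal
sketch that parses `brctl show` output like the listing above; the
missing_labels helper and the label dictionary are illustrative, not part of
CloudStack.

```python
def parse_brctl_show(output):
    """Map each bridge name to the list of interfaces attached to it."""
    bridges = {}
    current = None
    for line in output.splitlines():
        if not line.strip() or line.startswith("bridge name"):
            continue  # skip blank lines and the header row
        fields = line.split()
        if line[0].isspace() and current is not None:
            # Indented continuation line: more interfaces for the same bridge.
            bridges[current].extend(fields)
        else:
            current = fields[0]
            bridges[current] = fields[3:]
    return bridges

def missing_labels(traffic_labels, bridges):
    """Traffic labels that do not correspond to an existing bridge."""
    return sorted({label for label in traffic_labels.values()
                   if label not in bridges})

sample = """bridge name bridge id STP enabled interfaces
cloud0 8000. no
br0 8000.5254004eff4f no eth0
br1 8000.52540052b15e no eth1
"""
labels = {"management": "br0", "guest": "br0", "public": "br1"}
print(missing_labels(labels, parse_brctl_show(sample)))
```

An empty list means every label points at a real bridge. It says nothing
about whether the bridge sits on the right physical network.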

On Wed, Jul 31, 2013 at 12:33 PM, Marcus Sorensen  wrote:
> Yes, that's correct. I think we need to update the documentation. The
> user simply needs to create a bridge where 'public' traffic will work,
> and then set that bridge name as the traffic label for public traffic.
> Then it will create the vlan device and the bridge necessary for
> public based on the physical ethernet device of that bridge.
>
> Note, in this example, it is only looking for cloudVirBr for
> compatibility, if there are existing cloudVirBr bridges then the agent
> will continue to create cloudVirBr bridges, otherwise, it will create
> breth bridges, which allow the same vlan number on different physical
> interfaces.
>
> We can easily create some concrete examples for this... such as the
> one represented in devcloud-kvm by
> tools/devcloud-kvm/devcloud-kvm-advanced.cfg
>
> On Wed, Jul 31, 2013 at 12:06 PM, Edison Su  wrote:
>> The KVM installation guide at 
>> http://cloudstack.apache.org/docs/en-US/Apache_CloudStack/4.1.0/html/Installation_Guide/hypervisor-kvm-install-flow.html
>>  , is unnecessarily complicated and inaccurate.
>> For example, we don't need to configure vlan on kvm host by users 
>> themselves, cloudstack-agent will create vlans automatically.
>> All users need to do is to create bridges(if the default bridge created by 
>> cloudstack-agent is not enough), then add these bridge names from cloudstack 
>> mgt server UI during the zone creation.
>>
>> -Original Message-
>> From: Noel Kendall [mailto:noeldkend...@hotmail.com]
>> Sent: Wednesday, July 31, 2013 9:49 AM
>> To: users@cloudstack.apache.org
>> Subject: CS 4.1.0 - this will help a number of people who struggle with 
>> Advanced Networking
>>
>> The documentation for installation in a KVM e

Re: FW: CS 4.1.0 - this will help a number of people who struggle with Advanced Networking

2013-07-31 Thread Marcus Sorensen
Yes, that's correct. I think we need to update the documentation. The
user simply needs to create a bridge where 'public' traffic will work,
and then set that bridge name as the traffic label for public traffic.
Then it will create the vlan device and the bridge necessary for
public based on the physical ethernet device of that bridge.

Note that in this example it is only looking for cloudVirBr for
compatibility: if there are existing cloudVirBr bridges, the agent
will continue to create cloudVirBr bridges; otherwise it will create
breth bridges, which allow the same vlan number on different physical
interfaces.

We can easily create some concrete examples for this... such as the
one represented in devcloud-kvm by
tools/devcloud-kvm/devcloud-kvm-advanced.cfg
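
The naming rule described above can be sketched roughly as follows. This
is an illustration in Python, not the agent's actual Java code. The
function name guest_bridge_name and its arguments are hypothetical, and
the "br<device>-<vlan>" pattern is assumed from the brbond1-1224 style
names that users report seeing on 4.1.

```python
def guest_bridge_name(physical_dev, vlan_id, existing_bridges):
    """Pick a bridge name for a guest vlan (illustrative sketch only)."""
    # Compatibility mode: if any legacy cloudVirBr* bridge already exists
    # on the host, keep using the old naming scheme.
    if any(name.startswith("cloudVirBr") for name in existing_bridges):
        return "cloudVirBr%d" % vlan_id
    # New scheme: the physical device is part of the name, so the same
    # vlan number can be used on different physical interfaces.
    return "br%s-%d" % (physical_dev, vlan_id)


if __name__ == "__main__":
    # Host with only a user-defined bridge: new-style name is chosen.
    print(guest_bridge_name("eth4", 5, ["cloudbr0"]))
    # Host that still has a legacy bridge: old-style name is kept.
    print(guest_bridge_name("eth4", 5, ["cloudbr0", "cloudVirBr7"]))
```

With the new scheme, vlan 5 on eth4 and vlan 5 on eth5 get different
bridge names, which is what allows overlapping vlan numbers.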

On Wed, Jul 31, 2013 at 12:06 PM, Edison Su  wrote:
> The KVM installation guide at
> http://cloudstack.apache.org/docs/en-US/Apache_CloudStack/4.1.0/html/Installation_Guide/hypervisor-kvm-install-flow.html
> is unnecessarily complicated and inaccurate.
> For example, users don't need to configure vlans on the kvm host
> themselves; cloudstack-agent will create vlans automatically.
> All users need to do is create bridges (if the default bridge created by
> cloudstack-agent is not enough), then add these bridge names in the
> cloudstack mgt server UI during zone creation.
>
> -----Original Message-----
> From: Noel Kendall [mailto:noeldkend...@hotmail.com]
> Sent: Wednesday, July 31, 2013 9:49 AM
> To: users@cloudstack.apache.org
> Subject: CS 4.1.0 - this will help a number of people who struggle with 
> Advanced Networking
>
> The documentation for installation in a KVM environment is utterly misleading.
> The documentation reads as though one can set up the bridge for the public 
> network with any name one chooses, the default being cloudbr0.
> You cannot use just any old name. That simply will not work.
> Let's suppose I have a public network that I isolate on VLAN 5, which is 
> interfaced on ethernet adapter eth4. I will need to define an adapter eth4.5 
> with VLAN set to yes.
> So far, so good.
> Next, for the bridge...
> By enabling debugging output in the log, I was able to see that the code 
> looks for a bridge with the name cloudVirBr5 for my public network.
> I tried several different approaches; none would work unless I named
> my bridge cloudVirBr5 and set my traffic label in the network
> configuration to the same.
> I have seen numerous posts in the mailing lists, blog entries, you name it, 
> representing frustrations of throngs of users trying to validate a CS setup.
> The documentation is utterly wrong and misleading.
> Summary:
> Does not work: traffic label cloudbr0 with eth4.5 attached to cloudbr0. The
> code still tries to create a breth4.5 and enlist eth4.5 to it, but cannot
> because it is already enlisted to cloudbr0.
> Good luck everyone with advanced networking with VLAN isolation on CentOS KVM 
> hosts.
>


Re: Unable to migrate VM to another host

2013-07-23 Thread Marcus Sorensen
There is no cloud user on the host/agent side. Agent runs as root. Or is
this an experiment to try to run the agent as another user?
On Jul 23, 2013 4:44 AM, "Prasanna Santhanam"  wrote:

> On Tue, Jul 23, 2013 at 06:19:06PM +0800, Indra Pramana wrote:
> > Dear Prasanna and all,
> >
> > I have tried to remove the host from CloudStack, uninstall and reinstall
> > cloudstack-agent and added the host back into CloudStack. The "cloud"
> > user-id is still not yet created. I then tried to add the "cloud" user
> > manually, using exactly the same credentials as the "cloud" user on the
> > other host.
> >
> > From the host, I tried to do a virsh connect qemu+ssh to the other host
> > using the "cloud" user (instead of root), and got this error:
> >
> > cloud@hv-kvm-01:~$ virsh --connect qemu+ssh://cloud@10.237.3.22/system list
> > cloud@10.237.3.22's password:
> > error: failed to connect to the hypervisor
> > error: no valid connection
> > error: End of file while reading data: : Input/output error
> >
> > If you notice, the error is exactly the same error message I see in the
> > management-server.log when I tried to do a live migration of a VM. So I
> > tried to follow this lead and implement this instruction to allow the
> > "cloud" user to have access to the hypervisor on the other host:
> >
> > http://wiki.libvirt.org/page/SSHSetup
> >
> > usermod -G libvirtd -a cloud
> >
> > Enabled these on /etc/libvirt/libvirtd.conf:
> >
> > unix_sock_group = "libvirtd"
> > unix_sock_rw_perms = "0770"
> >
> > I also copied the SSH keys of the cloud user to the other host, so that
> > it will not prompt for a password.
> >
> > And I am now able to do a virsh connect using the "cloud" user:
> >
> > cloud@hv-kvm-01:~$ virsh --connect qemu+ssh://cloud@10.237.3.22/system list
> >  Id    Name                 State
> > ----------------------------------
> >  2     i-2-275-VM           running
> >  3     i-2-276-VM           running
> >  4     i-2-293-VM           running
> >
> > But I am still not able to perform live migration of VM.
> >
> > May I know how CloudStack connects to the hypervisors when it performs
> > live migration, after finding a suitable target host? Does it request
> > the source host to perform a qemu+ssh connect to the target host?
> >
>
> I'm not sure about the specifics here. Perhaps someone else will chime
> in - But from looking at the resource for KVM in cloudstack it appears
> the libvirt XML contains a qemu+tcp:// connection on migrate.
>
> Can you tell us if the KVM hosts share the same management subnet?
> Are they in the same cluster as CloudStack denotes a cluster?
>
> Is there a way to trap the XMLs sent to the KVM resource in the
> libvirt logs? I'd try to enable that if so.
>
> --
> Prasanna.,
>
>
>


Re: cloudstack 4.1 QinQ vlan behaviour

2013-07-10 Thread Marcus Sorensen
I created that document, as a suggestion. I never got feedback. The
way it worked previously was sort of  a happy accident, which was
'fixed' when the code changed to accept overlapping vlan numbers on
multiple physical devices (hence the bridge name change).

However... I believe there is still a way to do what you want with the
stock code. What is your guest KVM traffic label set to?  Cloudstack
is looking for the 'parent' physical device of the bridge, so if it
sees that it's on a vlan, it goes up one more to find the real device.
It only does this once. So if instead of:

cloudbrguest      8000.90e2ba317614   yes vlan211

You create:

cloudbrguest-10   8000.90e2ba317614   yes vlan211.10

And tell it to use cloudbrguest-10 as the traffic label, it will go up
one from vlan211.10 and settle on vlan211 as the physical device. The nice
thing about the new behavior is that I believe it will work on ANY
type, not just 'vlan' ones (so you could bond, for instance).
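
To illustrate the "go up one level, once" lookup described above, here is
a rough Python sketch. It is not the agent's Java code: the function
names, the dictionaries, and the abbreviated /proc/net/vlan contents are
all assumptions made for the example. The only detail taken from the
real files is that /proc/net/vlan/<device> carries a "Device: <parent>"
line.

```python
def parse_vlan_parent(proc_text):
    """Return the parent device named on the 'Device:' line, or None."""
    for line in proc_text.splitlines():
        if line.startswith("Device:"):
            fields = line.split()
            if len(fields) >= 2:
                return fields[1]
    return None


def resolve_physical_device(bridge, bridge_members, vlan_proc_files):
    """Find the device that guest vlans would be created on.

    bridge_members:  bridge name -> name of the interface enslaved to it
    vlan_proc_files: vlan device name -> text of /proc/net/vlan/<device>
    """
    pif = bridge_members[bridge]
    # If the bridge member is itself a vlan device, step up to its parent.
    # This is done exactly once, never recursively.
    if pif in vlan_proc_files:
        parent = parse_vlan_parent(vlan_proc_files[pif])
        if parent:
            pif = parent
    return pif


if __name__ == "__main__":
    members = {"cloudbrguest": "vlan211", "cloudbrguest-10": "vlan211.10"}
    # Abbreviated, made-up file contents; real files hold more lines.
    proc = {
        "vlan211": "vlan211  VID: 211\nDevice: bond1\n",
        "vlan211.10": "vlan211.10  VID: 10\nDevice: vlan211\n",
    }
    print(resolve_physical_device("cloudbrguest", members, proc))
    print(resolve_physical_device("cloudbrguest-10", members, proc))
```

In this sketch the first label resolves to bond1 and the second to
vlan211, so guest vlans under the second label would be stacked on top
of vlan211.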

On Wed, Jul 10, 2013 at 2:34 AM, Valery Ciareszka
 wrote:
> Hi all.
>
> I was able to change vlan creation behaviour by source code modification
> (plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java),
> had to comment several lines of code:
>
>  private String getPif(String bridge) {
>      String pif = matchPifFileInDirectory(bridge);
>      //File vlanfile = new File("/proc/net/vlan/" + pif);
>
>      //if (vlanfile.isFile()) {
>      //    pif = Script.runSimpleBashScript("grep ^Device\\: /proc/net/vlan/"
>      //            + pif + " | awk {'print $2'}");
>      //}
>
>      return pif;
>  }
>
> Could someone please comment on this new behaviour of vlan creation? Why
> does it try to create the vlan on the real physical device, and not on the
> vlan device (vlan in vlan)? There is nothing about this in the documentation.
> I have found Q-in-Q for isolated networks functional spec -
> https://cwiki.apache.org/CLOUDSTACK/q-in-q-for-isolated-networks-functional-spec.html
> "The admin simply needs to create any 'vlan#' devices, and CloudStack uses
> them as physical devices."
>
> That worked for me in CS 4.0.2. But as you can see, current version of
> cloudstack DOES NOT use 'vlan#' devices as physical devices!!!
> Is that a bug ?
>
>
>
> On Tue, Jul 9, 2013 at 12:39 PM, Valery Ciareszka > wrote:
>
>> So, nobody uses q in q and cloudstack 4.1 ?
>>
>>
>> On Mon, Jul 8, 2013 at 3:13 PM, Valery Ciareszka <
>> valery.teres...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I use the following environment: CS 4.1, KVM, Centos 6.4
>>> (management+node1+node2), OpenIndiana NFS server as primary and secondary
>>> storage
>>> I have advanced networking in zone. I split management/public/guest
>>> traffic into different vlans, and use kvm network labels (bridge names):
>>> # cat /etc/cloud/agent/agent.properties |grep device
>>> guest.network.device=cloudbrguest
>>> private.network.device=cloudbrmanage
>>> public.network.device=cloudbrpublic
>>>
>>> I have following network configuration:
>>> eth0+eth1=bond0
>>> eth2+eth3=bond1
>>>
>>> I use  vlan with id=211 on bond1 interface for guest traffic:
>>> cloudbrguest    8000.90e2ba317614   yes vlan211
>>> cloudbrmanage   8000.90e2ba317614   yes bond1.210
>>> cloudbrpublic   8000.90e2ba317614   yes bond1.221
>>> cloudbrstor 8000.0025908814a4   yes bond0
>>>
>>>
>>> The problem appeared after I have upgraded CS from 4.0.2 to 4.1.
>>>
>>> How it works in 4.0.2:
>>> -bridge interface cloudVirBr#VLANID is created on hypervisor, #VLANID -
>>> value from 1024 to 4096(is specified when creating zone), i.e.
>>> cloudVirBr1224
>>> -vlan interface vlan211.#VLANID is created on hypervisor and is plugged
>>> into cloudVirBr#VLANID
>>> I only had to permit vlan id 211 on the switch ports, and all guest
>>> traffic (vlans 1024-4096) was encapsulated.
>>>
>>> How it works in 4.1:
>>> -bridge interface br#ETHNAME-#VLANID is created on hypervisor, where
>>> #VLANID - value from 1024 to 4096(is specified when creating zone) and
>>> #ETHNAME - name of device on top of which vlan will be created
>>> i.e. brbond1-1224
>>> -vlan interface bond1.#VLANID is created on hypervisor and is plugged
>>> into br#ETHNAME-#VLANID
>>> However, vlan interface is created on top of bond1 interface, while I
>>> would like it to be created on top of vlan211 (bond1.211)
>>> Now I have to permit vlan ids 1024-4096 on the switch ports, which is
>>> not convenient.
>>>
>>> How do I configure CS 4.1 so that it could work with guest vlans the same
>>> way as it had worked in CS 4.0 ?
>>>
>>> --
>>> Regards,
>>> Valery
>>>
>>> http://protocol.by/slayer
>>>
>>
>>
>>
>> --
>> Regards,
>> Valery
>>
>> http://protocol.by/slayer
>>
>
>
>
> --
> Regards,
> Valery
>
> http://protocol.by/slayer


Re: unable to add a new VM:

2013-05-03 Thread Marcus Sorensen
Was the system VM rebooted to get new software?
On May 3, 2013 4:57 AM, "Shashi Dahal"  wrote:

> Hi All,
> Sending to both lists, as not sure if this is a dev or infra issue.
>
> I am not able to add a new VM after upgrading from 2.2.14 to 4.1.0 [CentOS
> 6.4 - Advanced Networking with Security Groups]
>
> Log snippet:
> 2013-05-03 12:41:55,212 DEBUG [agent.transport.Request] (Job-Executor-24:job-33) Seq 2-1920335888: Received: { Ans: , MgmtId: 90520747364525, via: 2, Ver: v1, Flags: 110, { Answer } }
> 2013-05-03 12:41:55,212 INFO [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-24:job-33) Unable to contact resource.
> com.cloud.exception.ResourceUnavailableException: Resource [DataCenter:1] is unreachable: Unable to apply dhcp entry on router
>     at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.applyRules(VirtualNetworkApplianceManagerImpl.java:3431)
>     at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.applyDhcpEntry(VirtualNetworkApplianceManagerImpl.java:2664)
>     at com.cloud.network.element.VirtualRouterElement.addDhcpEntry(VirtualRouterElement.java:831)
>     at com.cloud.network.NetworkManagerImpl.prepareElement(NetworkManagerImpl.java:1547)
>     at com.cloud.network.NetworkManagerImpl.prepareNic(NetworkManagerImpl.java:1658)
>     at com.cloud.network.NetworkManagerImpl.prepare(NetworkManagerImpl.java:1599)
>     at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:746)
>     at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:471)
>     at org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.deployVirtualMachine(VMEntityManagerImpl.java:212)
>     at org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.deploy(VirtualMachineEntityImpl.java:209)
>     at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3865)
>     at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3458)
>     at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3444)
>     at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
>     at org.apache.cloudstack.api.command.user.vm.DeployVMCmd.execute(DeployVMCmd.java:379)
>     at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:162)
>     at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:437)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:679)
> 2013-05-03 12:41:55,228 DEBUG [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-24:job-33) Cleaning up resources for the vm VM[User|MY-VM] in Starting state
>
> Full logs found in: https://issues.apache.org/jira/browse/CLOUDSTACK-2322
>
> Cheers,
> Shashi
>
>


Re: IP tables blocking KVM/Console

2013-04-19 Thread Marcus Sorensen
yea, so you can play with that and see if it makes any difference. Iptables
may be blocking access to your consoleproxy's service, or something else.


On Fri, Apr 19, 2013 at 5:37 PM, Maurice Lawler wrote:

> Output:
>
> [root@gizmo scripts]# cat /proc/sys/net/bridge/bridge*
> 1
> 1
> 1
> 0
> 0
> [root@gizmo scripts]#
>
>
>
>
> On Apr 19, 2013, at 07:21 PM, Marcus Sorensen  wrote:
>
> what do you see in:
>
> cat /proc/sys/net/bridge/bridge*
>
> ? I think I've seen issues with these being set to 1, but I think it might
> need to be set to 1 if you're using security groups.
>
>
> On Fri, Apr 19, 2013 at 5:20 PM, Marcus Sorensen  >wrote:
>
> > What do you see in :
> >
> >
> >
> > On Fri, Apr 19, 2013 at 2:17 PM, Maurice Lawler  >wrote:
> >
> >> I've tried it with them disabled (iptables get written) and enabled (the
> >> same issue)
> >>
> >> The cron job seemed to do the trick, until someone just mentioned to
> try:
> >>
> >> iptables -I INPUT -p tcp -m tcp --dport 5900:6100 -j ACCEPT
> >>
> >> That's not working, so I am going back to my cronjob!
> >>
> >> - Maurice
> >>
> >>
> >> On Apr 19, 2013, at 02:08 PM, Edison Su  wrote:
> >>
> >>
> >>
> >> > -----Original Message-----
> >> > From: Jason Pavao [mailto:jason.pa...@oracle.com]
> >> > Sent: Thursday, April 18, 2013 8:50 AM
> >> > To: d...@cloudstack.apache.org
> >> > Cc: Maurice Lawler; users@cloudstack.apache.org
> >> > Subject: Re: IP tables blocking KVM/Console
> >> >
> >> > Maurice,
> >> > I was having the same issues, I tried a number of iptables rule
> >> changes, but it
> >> > seems that whenever a new instance was deployed it would overwrite my
> >> > changes and break things again. My temporary fix is to run a cron job
> >> that
> >> > runs every minute that issues a service iptables stop.
> >>
> >> Do you disable security group when creating the zone? If security group
> >> is disabled, then there should be no iptables rules created on kvm host
> >> when a new instance created.
> >>
> >> >
> >> > It's not elegant but it works since I don't have a need for security
> >> groups and
> >> > am supporting a jenkins continuous testing environment with no need
> for
> >> > network ingress/egress rules.
> >> >
> >> > Does anyone else know why this is happening?
> >> >
> >> > I am running cs 4.0.1 on oel6.3x64
> >> >
> >> > Any help would be appreciated.
> >> > Thanks.
> >> > -jason
> >> >
> >> > On 4/17/2013 7:47 PM, Maurice Lawler wrote:
> >> > > I have stopped iptables at least 15 times, because it keeps blocking
> >> > > my console access to my instances. How can I either A) disable
> >> > > iptables altogether or B) add a rule to allow its access?
> >> > >
> >> > > Right now, it has this:
> >> > >
> >> > > [root@lunder ~]# iptables -L
> >> > > Chain INPUT (policy ACCEPT)
> >> > > target     prot opt source     destination
> >> > > ACCEPT     udp  --  anywhere   anywhere     udp dpt:bootps
> >> > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:bootps
> >> > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpts:49152:49216
> >> > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpts:vnc-server:synchronet-db
> >> > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:16509
> >> > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:websm
> >> > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:8250
> >> > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:empowerid
> >> > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:webcache
> >> > > ACCEPT     all  --  anywhere   anywhere     state RELATED,ESTABLISHED
> >> > > ACCEPT     icmp --  anywhere   anywhere
> >> > > ACCEPT     all  --  anywhere   anywhere
> >> > > ACCEPT     tcp  --  anywhere   anywhere     state NEW tcp dpt:ssh
> >> > > REJECT     all  --  anywhere   anywhere     reject-with icmp-host-prohibited
> >> > >
> >> > > Chain FORWARD (policy ACCEPT)
> >> > > target     prot opt source     destination
> >> > >
> >> > > Chain OUTPUT (policy ACCEPT)
> >> > > target     prot opt source     destination
> >> > > [root@lunder ~]#
> >> > >
> >> > > But there were plenty of other rules prior to my stopping it.
> >> > >
> >> > >
> >> >
> >> > --
> >> > Thanks.
> >> > -Jason
> >>
> >>
> >
>
>


Re: IP tables blocking KVM/Console

2013-04-19 Thread Marcus Sorensen
what do you see in:

 cat /proc/sys/net/bridge/bridge*

?  I think I've seen issues with these being set to 1, but I think it might
need to be set to 1 if you're using security groups.
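
If it helps to see which setting each of those numbers belongs to, a
small Python helper along these lines prints the file name next to each
value. This is only a sketch: the function name is made up, and it
assumes the usual layout in which the directory holds files such as
bridge-nf-call-iptables (the exact set of files depends on the kernel).

```python
import glob
import os


def read_bridge_nf_settings(directory="/proc/sys/net/bridge"):
    """Return {file name: value} for every bridge* entry in the directory."""
    settings = {}
    # Sorted order matches the order in which the shell expands the glob.
    for path in sorted(glob.glob(os.path.join(directory, "bridge*"))):
        with open(path) as handle:
            settings[os.path.basename(path)] = handle.read().strip()
    return settings


if __name__ == "__main__":
    # Prints nothing if the bridge module is not loaded on this host.
    for name, value in read_bridge_nf_settings().items():
        print("%s = %s" % (name, value))
```

Unlike a bare cat of the glob, this keeps each value attached to its
file name, so it is clear which one is set to 1.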


On Fri, Apr 19, 2013 at 5:20 PM, Marcus Sorensen wrote:

> What do you see in :
>
>
>
> On Fri, Apr 19, 2013 at 2:17 PM, Maurice Lawler wrote:
>
>> I've tried it with them disabled (iptables get written) and enabled (the
>> same issue)
>>
>> The cron job seemed to do the trick, until someone just mentioned to try:
>>
>>   iptables -I INPUT -p tcp -m tcp --dport 5900:6100 -j ACCEPT
>>
>> That's not working, so I am going back to my cronjob!
>>
>> - Maurice
>>
>>
>> On Apr 19, 2013, at 02:08 PM, Edison Su  wrote:
>>
>>
>>
>> > -----Original Message-----
>> > From: Jason Pavao [mailto:jason.pa...@oracle.com]
>> > Sent: Thursday, April 18, 2013 8:50 AM
>> > To: d...@cloudstack.apache.org
>> > Cc: Maurice Lawler; users@cloudstack.apache.org
>> > Subject: Re: IP tables blocking KVM/Console
>> >
>> > Maurice,
>> > I was having the same issues, I tried a number of iptables rule
>> changes, but it
>> > seems that whenever a new instance was deployed it would overwrite my
>> > changes and break things again. My temporary fix is to run a cron job
>> that
>> > runs every minute that issues a service iptables stop.
>>
>> Do you disable security group when creating the zone? If security group
>> is disabled, then there should be no iptables rules created on kvm host
>> when a new instance created.
>>
>> >
>> > It's not elegant but it works since I don't have a need for security
>> groups and
>> > am supporting a jenkins continuous testing environment with no need for
>> > network ingress/egress rules.
>> >
>> > Does anyone else know why this is happening?
>> >
>> > I am running cs 4.0.1 on oel6.3x64
>> >
>> > Any help would be appreciated.
>> > Thanks.
>> > -jason
>> >
>> > On 4/17/2013 7:47 PM, Maurice Lawler wrote:
>> > > I have stopped iptables at least 15 times, because it keeps blocking
>> > > my console access to my instances. How can I either A) disable
>> > > iptables altogether or B) add a rule to allow its access?
>> > >
>> > > Right now, it has this:
>> > >
>> > > [root@lunder ~]# iptables -L
>> > > Chain INPUT (policy ACCEPT)
>> > > target     prot opt source     destination
>> > > ACCEPT     udp  --  anywhere   anywhere     udp dpt:bootps
>> > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:bootps
>> > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpts:49152:49216
>> > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpts:vnc-server:synchronet-db
>> > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:16509
>> > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:websm
>> > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:8250
>> > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:empowerid
>> > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:webcache
>> > > ACCEPT     all  --  anywhere   anywhere     state RELATED,ESTABLISHED
>> > > ACCEPT     icmp --  anywhere   anywhere
>> > > ACCEPT     all  --  anywhere   anywhere
>> > > ACCEPT     tcp  --  anywhere   anywhere     state NEW tcp dpt:ssh
>> > > REJECT     all  --  anywhere   anywhere     reject-with icmp-host-prohibited
>> > >
>> > > Chain FORWARD (policy ACCEPT)
>> > > target     prot opt source     destination
>> > >
>> > > Chain OUTPUT (policy ACCEPT)
>> > > target     prot opt source     destination
>> > > [root@lunder ~]#
>> > >
>> > > But there were plenty of other rules prior to my stopping it.
>> > >
>> > >
>> >
>> > --
>> > Thanks.
>> > -Jason
>>
>>
>


Re: IP tables blocking KVM/Console

2013-04-19 Thread Marcus Sorensen
What do you see in :



On Fri, Apr 19, 2013 at 2:17 PM, Maurice Lawler wrote:

> I've tried it with them disabled (iptables get written) and enabled (the
> same issue)
>
> The cron job seemed to do the trick, until someone just mentioned to try:
>
>   iptables -I INPUT -p tcp -m tcp --dport 5900:6100 -j ACCEPT
>
> That's not working, so I am going back to my cronjob!
>
> - Maurice
>
>
> On Apr 19, 2013, at 02:08 PM, Edison Su  wrote:
>
>
>
> > -----Original Message-----
> > From: Jason Pavao [mailto:jason.pa...@oracle.com]
> > Sent: Thursday, April 18, 2013 8:50 AM
> > To: d...@cloudstack.apache.org
> > Cc: Maurice Lawler; users@cloudstack.apache.org
> > Subject: Re: IP tables blocking KVM/Console
> >
> > Maurice,
> > I was having the same issues, I tried a number of iptables rule changes,
> but it
> > seems that whenever a new instance was deployed it would overwrite my
> > changes and break things again. My temporary fix is to run a cron job
> that
> > runs every minute that issues a service iptables stop.
>
> Do you disable security group when creating the zone? If security group is
> disabled, then there should be no iptables rules created on kvm host when a
> new instance created.
>
> >
> > It's not elegant but it works since I don't have a need for security
> groups and
> > am supporting a jenkins continuous testing environment with no need for
> > network ingress/egress rules.
> >
> > Does anyone else know why this is happening?
> >
> > I am running cs 4.0.1 on oel6.3x64
> >
> > Any help would be appreciated.
> > Thanks.
> > -jason
> >
> > On 4/17/2013 7:47 PM, Maurice Lawler wrote:
> > > I have stopped iptables at least 15 times, because it keeps blocking
> > > > my console access to my instances. How can I either A) disable
> > > > iptables altogether or B) add a rule to allow its access?
> > >
> > > Right now, it has this:
> > >
> > > [root@lunder ~]# iptables -L
> > > > Chain INPUT (policy ACCEPT)
> > > > target     prot opt source     destination
> > > > ACCEPT     udp  --  anywhere   anywhere     udp dpt:bootps
> > > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:bootps
> > > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpts:49152:49216
> > > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpts:vnc-server:synchronet-db
> > > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:16509
> > > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:websm
> > > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:8250
> > > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:empowerid
> > > > ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:webcache
> > > > ACCEPT     all  --  anywhere   anywhere     state RELATED,ESTABLISHED
> > > > ACCEPT     icmp --  anywhere   anywhere
> > > > ACCEPT     all  --  anywhere   anywhere
> > > > ACCEPT     tcp  --  anywhere   anywhere     state NEW tcp dpt:ssh
> > > > REJECT     all  --  anywhere   anywhere     reject-with icmp-host-prohibited
> > > >
> > > > Chain FORWARD (policy ACCEPT)
> > > > target     prot opt source     destination
> > > >
> > > > Chain OUTPUT (policy ACCEPT)
> > > > target     prot opt source     destination
> > > > [root@lunder ~]#
> > > >
> > > > But there were plenty of other rules prior to my stopping it.
> > >
> > >
> >
> > --
> > Thanks.
> > -Jason
>
>


Re: IP tables blocking KVM/Console

2013-04-19 Thread Marcus Sorensen
That's reflected by this line:

ACCEPT     tcp  --  anywhere   anywhere     tcp dpts:vnc-server:synchronet-db

Although we don't know what interfaces it applies to because we don't have
an 'iptables -L -v'

If stopping iptables fixes Maurice's problem it would be interesting to
know, as the rules seem to let VNC through. It should be easy to tcpdump
and see what traffic is actually being blocked because his rules suggest
that VNC is wide open on the KVM host.
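
To spell out why the listing suggests VNC should already get through,
here is a rough Python model of first-match rule evaluation. It is a
simplification and not how iptables is implemented: interfaces and
connection state are ignored, only some of the rules from the listing
are included, and the names RULES and first_match are made up for the
example. The port numbers assume the usual /etc/services mapping
(bootps 67, vnc-server 5900, synchronet-db 6100, ssh 22).

```python
# Each rule: (target, protocol, (low_port, high_port)); None means any port.
RULES = [
    ("ACCEPT", "udp", (67, 67)),        # udp dpt:bootps
    ("ACCEPT", "tcp", (67, 67)),        # tcp dpt:bootps
    ("ACCEPT", "tcp", (49152, 49216)),  # tcp dpts:49152:49216
    ("ACCEPT", "tcp", (5900, 6100)),    # tcp dpts:vnc-server:synchronet-db
    ("ACCEPT", "tcp", (16509, 16509)),  # tcp dpt:16509
    ("ACCEPT", "tcp", (8250, 8250)),    # tcp dpt:8250
    ("ACCEPT", "tcp", (22, 22)),        # tcp dpt:ssh
    ("REJECT", "all", None),            # reject-with icmp-host-prohibited
]


def first_match(rules, protocol, port, policy="ACCEPT"):
    """Return the target of the first rule matching protocol and port."""
    for target, rule_proto, port_range in rules:
        if rule_proto not in ("all", protocol):
            continue
        if port_range is not None:
            low, high = port_range
            if not low <= port <= high:
                continue
        return target
    # No rule matched: fall back to the chain policy.
    return policy


if __name__ == "__main__":
    for port in (5900, 6100, 6101):
        print(port, first_match(RULES, "tcp", port))
```

In this model the 5900:6100 range is accepted before the final REJECT is
ever reached, so inserting another ACCEPT for the same range at the top
would not change the outcome. That is why a tcpdump is the more useful
next step.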


On Fri, Apr 19, 2013 at 12:15 PM, Edison Su  wrote:

> This rule will reject all the ingress activities:
> "REJECT all -- anywhere anywhere reject-with icmp-host-prohibited"
> You can try:
> iptables -I INPUT -p tcp -m tcp --dport 5900:6100 -j ACCEPT
> to allow console access.
>
> From: Maurice Lawler [mailto:maurice.law...@me.com]
> Sent: Wednesday, April 17, 2013 7:48 PM
> To: Cloud Dev
> Cc: users@cloudstack.apache.org; users@cloudstack.apache.org
> Subject: IP tables blocking KVM/Console
>
> I have stopped iptables at least 15 times, because it keeps blocking my
> console access to my instances. How can I either A) disable iptables
> altogether or B) add a rule to allow its access?
>
> Right now, it has this:
>
> [root@lunder ~]# iptables -L
> Chain INPUT (policy ACCEPT)
> target     prot opt source     destination
> ACCEPT     udp  --  anywhere   anywhere     udp dpt:bootps
> ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:bootps
> ACCEPT     tcp  --  anywhere   anywhere     tcp dpts:49152:49216
> ACCEPT     tcp  --  anywhere   anywhere     tcp dpts:vnc-server:synchronet-db
> ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:16509
> ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:websm
> ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:8250
> ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:empowerid
> ACCEPT     tcp  --  anywhere   anywhere     tcp dpt:webcache
> ACCEPT     all  --  anywhere   anywhere     state RELATED,ESTABLISHED
> ACCEPT     icmp --  anywhere   anywhere
> ACCEPT     all  --  anywhere   anywhere
> ACCEPT     tcp  --  anywhere   anywhere     state NEW tcp dpt:ssh
> REJECT     all  --  anywhere   anywhere     reject-with icmp-host-prohibited
>
> Chain FORWARD (policy ACCEPT)
> target     prot opt source     destination
>
> Chain OUTPUT (policy ACCEPT)
> target     prot opt source     destination
> [root@lunder ~]#
>
> But there were plenty of other rules prior to my stopping it.
>
>
>


Re: ebtables

2013-04-19 Thread Marcus Sorensen
you can go back and disable security groups in the zone if you don't care
about the ebtables rules, or you can start up ebtables and then restart any
associated VMs through cloudstack. The rules are dynamic, so they're not
going to be saved anywhere on the host to be reinstated, they have to be
reapplied by cloudstack via a restart of the vms.


On Fri, Apr 19, 2013 at 11:12 AM, Maurice Lawler wrote:

> Anyone know how to correct my mistake?
>
> - Maurice
>
>
> On Apr 19, 2013, at 2:01 AM, Maurice Lawler  wrote:
>
> > Perhaps this was not the best thing, now my ports are open; how can I
> > revert back to ebtables?
> >
> > Along with that, when reverted, how can I drop rules for a particular VM
> > to allow communication via a second IP address?
> >
> >
> > On Apr 18, 2013, at 10:34 PM, Maurice Lawler 
> wrote:
> >
> >> Disregard, for now, I have disabled/removed ebtables as shown here:
> >>
> >>
> http://mail-archives.apache.org/mod_mbox/incubator-cloudstack-users/201302.mbox/%3cb1df26ecc0458748ac97cece2da98d41012fa47b6...@sjcpmailbox01.citrite.net%3E
> >>
> >>
> >> On Apr 18, 2013, at 11:28 PM, Maurice Lawler 
> wrote:
> >>
> >>> Hello --
> >>>
> >>> Previously someone told me how to do this, but I cannot find my notes on
> >>> this, so I hope you can help me out.
> >>>
> >>> I am attempting to allow a secondary IP address on an instance to bypass
> >>> the routing rules set forth in ebtables. I recall doing something like
> >>>
> >>> ebtables nat i-2-25-VM something ... I cannot for the life of me
> >>> remember.
> >>>
> >>> How to list and/or drop the rules per VM.
> >>>
> >>> Can you guys assist?
> >
>
>


Re: VM with multiple nics connected to vpc n/w and isolated n/w - what is the usecase?

2013-04-17 Thread Marcus Sorensen
It's nice to have for migration, combined with the add/remove nic feature.
That was actually the #1 reason I heard about people wanting the add/remove
nic at the CS conference, to migrate between networks. Aside from that, I
can't think of a reason not to allow it. I can draw up a few arbitrary
scenarios where someone may want to connect a VM to two networks.
On Apr 17, 2013 5:37 AM, "Venkata SwamyBabu Budumuru" <
venkataswamybabu.budum...@citrix.com> wrote:

> Currently I see that we allow a VM to have multiple nics, one in a VPC n/w
> and another in an isolated network. Is there any use case for this?
>
> Thanks,
> SWAMY
>


Re: Emergency: Cloud NOT starting

2013-04-13 Thread Marcus Sorensen
A "brctl show" would also be good to have.
On Apr 13, 2013 11:52 AM, "Marcus Sorensen"  wrote:

> If you do a "virsh list" on the agent there's a good chance you would see
> a VM running, however the system will only wait so long for it to boot up
> before shutting it down, so it will come and go. You can do "virsh
> vncdisplay (vmname)" and it will tell you what port to vnc to on the host
> in order to connect to the VM and see what state it is in.
>
> I see in the agent log that at one point it failed to start due to no
> private bridge. Is cloudbr0 your private as defined in agent.properties?
>
> You can also open /etc/cloud/agent/log4j-cloud.xml and change every INFO
> to DEBUG, restart the agent, and get more info.
> On Apr 13, 2013 11:45 AM, "Maurice Lawler"  wrote:
>
>> Thank you.
>>
>> The FSCK was already completed during boot up, it was forced. However,
>> how can I access the VM's when they are in starting state to see if they
>> need a FSCK?
>>
>> Agent log is showing this presently.
>>
>>
>> 2013-04-13 12:35:09,989 INFO  [cloud.agent.Agent]
>> (AgentShutdownThread:null) Stopping the agent: Reason = sig.kill
>> 2013-04-13 12:37:32,244 INFO  [utils.component.ComponentLocator]
>> (main:null) Unable to find components.xml
>> 2013-04-13 12:37:32,285 INFO  [utils.component.ComponentLocator]
>> (main:null) Skipping configuration using components.xml
>> 2013-04-13 12:37:32,285 INFO  [cloud.agent.AgentShell] (main:null)
>> Implementation Version is 4.0.1.20130201075054
>> 2013-04-13 12:37:32,286 INFO  [cloud.agent.AgentShell] (main:null)
>> agent.properties found at /etc/cloud/agent/agent.properties
>> 2013-04-13 12:37:32,287 INFO  [cloud.agent.AgentShell] (main:null)
>> Defaulting to using properties file for storage
>> 2013-04-13 12:37:32,289 INFO  [cloud.agent.AgentShell] (main:null)
>> Defaulting to the constant time backoff algorithm
>> 2013-04-13 12:37:32,413 INFO  [cloud.agent.Agent] (main:null) id is 1
>> 2013-04-13 12:37:32,418 ERROR [cloud.resource.ServerResourceBase]
>> (main:null) Nics are not configured!
>> 2013-04-13 12:37:32,420 ERROR [cloud.agent.AgentShell] (main:null) Unable
>> to start agent: Private NIC is not configured
>> 2013-04-13 12:42:30,653 INFO  [utils.component.ComponentLocator]
>> (main:null) Unable to find components.xml
>> 2013-04-13 12:42:30,654 INFO  [utils.component.ComponentLocator]
>> (main:null) Skipping configuration using components.xml
>> 2013-04-13 12:42:30,654 INFO  [cloud.agent.AgentShell] (main:null)
>> Implementation Version is 4.0.1.20130201075054
>> 2013-04-13 12:42:30,655 INFO  [cloud.agent.AgentShell] (main:null)
>> agent.properties found at /etc/cloud/agent/agent.properties
>> 2013-04-13 12:42:30,656 INFO  [cloud.agent.AgentShell] (main:null)
>> Defaulting to using properties file for storage
>> 2013-04-13 12:42:30,658 INFO  [cloud.agent.AgentShell] (main:null)
>> Defaulting to the constant time backoff algorithm
>> 2013-04-13 12:42:30,721 INFO  [cloud.agent.Agent] (main:null) id is 1
>> 2013-04-13 12:42:30,820 INFO
>>  [resource.virtualnetwork.VirtualRoutingResource] (main:null)
>> VirtualRoutingResource _scriptDir to use: scripts/network/domr/kvm
>> 2013-04-13 12:42:32,094 INFO  [kvm.resource.LibvirtComputingResource]
>> (main:null) No libvirt.vif.driver specififed. Defaults to BridgeVifDriver.
>> 2013-04-13 12:42:32,147 INFO  [cloud.agent.Agent] (main:null) Agent [id =
>> 1 : type = LibvirtComputingResource : zone = 1 : pod = 1 : workers = 5 :
>> host = 96.31.67.232 : port = 8250
>> 2013-04-13 12:42:32,154 INFO  [utils.nio.NioClient] (Agent-Selector:null)
>> Connecting to myipaddress:8250
>> 2013-04-13 12:42:32,444 INFO  [utils.nio.NioClient] (Agent-Selector:null)
>> SSL: Handshake done
>> 2013-04-13 12:42:32,599 INFO  [cloud.serializer.GsonHelper]
>> (Agent-Handler-1:null) Default Builder inited.
>> 2013-04-13 12:42:32,803 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
>> Proccess agent startup answer, agent id = 1
>> 2013-04-13 12:42:32,803 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
>> Set agent id 1
>> 2013-04-13 12:42:32,808 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
>> Startup Response Received: agent id = 1
>>
>>
>> The management log says this:
>>
>> 2013-04-13 12:43:28,952 DEBUG [cloud.network.NetworkManagerImpl]
>> (secstorage-1:null) Lock is released for network id 201 as a part of
>> network implement
>> 2013-04-13 12:43:28,969 DEBUG [db.Transaction.Transaction]
>> (secstorage-1:null) Rolling back the transaction:

Re: Emergency: Cloud NOT starting

2013-04-13 Thread Marcus Sorensen
:43:58,233 DEBUG [cloud.server.ManagementServerImpl]
> (EventChecker-1:null) Found 0 events to be purged
> 2013-04-13 12:43:58,235 DEBUG [cloud.server.ManagementServerImpl]
> (EventChecker-1:null) Deleting events older than: Fri Apr 12 12:43:58 CDT
> 2013
> 2013-04-13 12:43:58,238 DEBUG [cloud.server.ManagementServerImpl]
> (EventChecker-1:null) Found 0 events to be purged
> 2013-04-13 12:43:59,186 DEBUG
> [network.router.VirtualNetworkApplianceManagerImpl]
> (RouterStatusMonitor-1:null) Found 0 routers.
> [root@lunder agent]#
>
>
>
>
> On Apr 13, 2013, at 12:30 PM, Marcus Sorensen  wrote:
>
> > Well, you've got something trying to start, because you have vnet
> > interfaces. You need to look at your agent logs to see why the system VMs
> > refuse to start. If the power went out, it could be corruption; the system
> > VMs may be waiting for you to fsck. It sounds like maybe the system was
> > put into production without testing to make sure the host settings were
> > persistent and would survive a reboot?
> >
> > So 1) look at your agent logs, and 2) use VNC to look at whatever system
> > VMs are running and see what state they are in. They will probably
> > continually try to start and then shut down.
> > On Apr 13, 2013 11:24 AM, "Maurice Lawler" wrote:
> >
> >> Greetings,
> >>
> >> I'm in a terrible spot: nothing I have done will start my cloud.
> >> None of my system VMs will start, which in turn does not permit the
> >> regular OS VMs to start. I first suffered a power outage, then manually
> >> rebooted my server. Now, nothing is coming back online.
> >>
> >> I was previously told that having cloud0 first is the cause of this. Even
> >> after running ifconfig cloud0 down, nothing seems to come back online.
> >>
> >> I have gone as far as stopping iptables / ebtables along with
> >> stopping/starting the network and the management console.
> >>
> >>
> >> Checking the system VMs, they continue to remain in a 'starting' status.
> >>
> >> [root@lunder ~]# service iptables status
> >> iptables: Firewall is not running.
> >> [root@lunder ~]# service ebtables status
> >> # Generated by ebtables-save v1.0 on Sat Apr 13 12:21:04 CDT 2013
> >> *nat
> >> :PREROUTING ACCEPT
> >> :OUTPUT ACCEPT
> >> :POSTROUTING ACCEPT
> >>
> >> [root@lunder ~]#
> >>
> >>
> >> [root@lunder daoenix]# ifconfig
> >> cloud0    Link encap:Ethernet  HWaddr FE:00:A9:FE:00:67
> >>  inet addr:169.254.0.1  Bcast:169.254.255.255  Mask:255.255.0.0
> >>  inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
> >>  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >>  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> >>  TX packets:658 errors:0 dropped:0 overruns:0 carrier:0
> >>  collisions:0 txqueuelen:0
> >>  RX bytes:0 (0.0 b)  TX bytes:28068 (27.4 KiB)
> >>
> >> cloudbr0  Link encap:Ethernet  HWaddr C8:0A:A9:9E:2D:7C
> >>  inet addr:myipaddress  Bcast:9myipaddress Mask:255.255.255.224
> >>  inet6 addr: fe80::fc2c:bcff:fe00:5/64 Scope:Link
> >>  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >>  RX packets:192832 errors:0 dropped:0 overruns:0 frame:0
> >>  TX packets:11251 errors:0 dropped:0 overruns:0 carrier:0
> >>  collisions:0 txqueuelen:0
> >>  RX bytes:11481135 (10.9 MiB)  TX bytes:25153331 (23.9 MiB)
> >>
> >> eth0  Link encap:Ethernet  HWaddr C8:0A:A9:9E:2D:7C
> >>  inet6 addr: fe80::ca0a:a9ff:fe9e:2d7c/64 Scope:Link
> >>  UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
> >>  RX packets:199794 errors:0 dropped:0 overruns:0 frame:0
> >>  TX packets:24157 errors:0 dropped:0 overruns:0 carrier:0
> >>  collisions:0 txqueuelen:1000
> >>  RX bytes:14647159 (13.9 MiB)  TX bytes:25994485 (24.7 MiB)
> >>  Memory:df6e-df70
> >>
> >> lo        Link encap:Local Loopback
> >>  inet addr:127.0.0.1  Mask:255.0.0.0
> >>  inet6 addr: ::1/128 Scope:Host
> >>  UP LOOPBACK RUNNING  MTU:16436  Metric:1
> >>  RX packets:7850808 errors:0 dropped:0 overruns:0 frame:0
> >>  TX packets:7850808 errors:0 dropped:0 overruns:0 carrier:0
> >>  collisions:0 txqueuelen:0

Re: Emergency: Cloud NOT starting

2013-04-13 Thread Marcus Sorensen
Well, you've got something trying to start, because you have vnet
interfaces. You need to look at your agent logs to see why the system VMs
refuse to start. If the power went out, it could be corruption; the system
VMs may be waiting for you to fsck. It sounds like maybe the system was put
into production without testing to make sure the host settings were
persistent and would survive a reboot?

So 1) look at your agent logs, and 2) use VNC to look at whatever system
VMs are running and see what state they are in. They will probably
continually try to start and then shut down.
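
For step 1, a minimal sketch of how the agent log could be filtered down to the entries that matter. It assumes the log4j line layout seen in the agent log excerpts in this thread (timestamp, level, logger in brackets, then the message) and one entry per line, as in the file on disk rather than the mail-wrapped copy. The function name find_problems and the chosen levels are illustrative, not part of CloudStack.

```python
import re

# Start of a log entry as it appears in the agent log excerpts in this thread, e.g.
# 2013-04-13 12:37:32,418 ERROR [cloud.resource.ServerResourceBase] (main:null) Nics are not configured!
ENTRY = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s*(?P<msg>.*)$"
)


def find_problems(lines, levels=("ERROR", "WARN", "FATAL")):
    """Return (timestamp, level, logger, message) for entries at the given levels.

    Lines that do not look like the start of a log entry are skipped.
    """
    problems = []
    for line in lines:
        match = ENTRY.match(line.rstrip("\n"))
        if match and match.group("level") in levels:
            problems.append(
                (
                    match.group("ts"),
                    match.group("level"),
                    match.group("logger"),
                    match.group("msg"),
                )
            )
    return problems


# Two entries taken from the agent log quoted in this thread (unwrapped).
SAMPLE = [
    "2013-04-13 12:37:32,413 INFO  [cloud.agent.Agent] (main:null) id is 1",
    "2013-04-13 12:37:32,418 ERROR [cloud.resource.ServerResourceBase] "
    "(main:null) Nics are not configured!",
]

for entry in find_problems(SAMPLE):
    print(*entry)
```

To run it against a real log, pass an open file object instead of SAMPLE; the location of the agent log depends on the install and is not assumed here.
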
On Apr 13, 2013 11:24 AM, "Maurice Lawler"  wrote:

> Greetings,
>
> I'm in a terrible spot: nothing I have done will start my cloud.
> None of my system VMs will start, which in turn does not permit the regular
> OS VMs to start. I first suffered a power outage, then manually
> rebooted my server. Now, nothing is coming back online.
>
> I was previously told that having cloud0 first is the cause of this. Even
> after running ifconfig cloud0 down, nothing seems to come back online.
>
> I have gone as far as stopping iptables / ebtables along with
> stopping/starting the network and the management console.
>
>
> Checking the system VMs, they continue to remain in a 'starting' status.
>
> [root@lunder ~]# service iptables status
> iptables: Firewall is not running.
> [root@lunder ~]# service ebtables status
> # Generated by ebtables-save v1.0 on Sat Apr 13 12:21:04 CDT 2013
> *nat
> :PREROUTING ACCEPT
> :OUTPUT ACCEPT
> :POSTROUTING ACCEPT
>
> [root@lunder ~]#
>
>
> [root@lunder daoenix]# ifconfig
> cloud0    Link encap:Ethernet  HWaddr FE:00:A9:FE:00:67
>   inet addr:169.254.0.1  Bcast:169.254.255.255  Mask:255.255.0.0
>   inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
>   UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>   RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:658 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:0
>   RX bytes:0 (0.0 b)  TX bytes:28068 (27.4 KiB)
>
> cloudbr0  Link encap:Ethernet  HWaddr C8:0A:A9:9E:2D:7C
>   inet addr:myipaddress  Bcast:9myipaddress Mask:255.255.255.224
>   inet6 addr: fe80::fc2c:bcff:fe00:5/64 Scope:Link
>   UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>   RX packets:192832 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:11251 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:0
>   RX bytes:11481135 (10.9 MiB)  TX bytes:25153331 (23.9 MiB)
>
> eth0  Link encap:Ethernet  HWaddr C8:0A:A9:9E:2D:7C
>   inet6 addr: fe80::ca0a:a9ff:fe9e:2d7c/64 Scope:Link
>   UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
>   RX packets:199794 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:24157 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:1000
>   RX bytes:14647159 (13.9 MiB)  TX bytes:25994485 (24.7 MiB)
>   Memory:df6e-df70
>
> lo        Link encap:Local Loopback
>   inet addr:127.0.0.1  Mask:255.0.0.0
>   inet6 addr: ::1/128 Scope:Host
>   UP LOOPBACK RUNNING  MTU:16436  Metric:1
>   RX packets:7850808 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:7850808 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:0
>   RX bytes:1611132695 (1.5 GiB)  TX bytes:1611132695 (1.5 GiB)
>
> virbr0    Link encap:Ethernet  HWaddr 52:54:00:D9:D9:9A
>   inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
>   UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>   RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:0
>   RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>
> vnet0 Link encap:Ethernet  HWaddr FE:00:A9:FE:00:67
>   inet6 addr: fe80::fc00:a9ff:fefe:67/64 Scope:Link
>   UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>   RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:116 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:500
>   RX bytes:0 (0.0 b)  TX bytes:5232 (5.1 KiB)
>
> vnet1 Link encap:Ethernet  HWaddr FE:84:4C:00:00:01
>   inet6 addr: fe80::fc84:4cff:fe00:1/64 Scope:Link
>   UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>   RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:256 errors:0 dropped:0 overruns:1 carrier:0
>   collisions:0 txqueuelen:500
>   RX bytes:0 (0.0 b)  TX bytes:17849 (17.4 KiB)
>
> vnet2 Link encap:Ethernet  HWaddr FE:2C:BC:00:00:05
>   inet6 addr: fe80::fc2c:bcff:fe00:5/64 Scope:Link
>   UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>   RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>   

Re: Cloudstack KVM Failing

2013-04-02 Thread Marcus Sorensen
All I can suggest is to troubleshoot step-by-step.

Can you connect directly to the VMs via VNC client?

Can you log into consoleproxy system VM?

Can consoleproxy system VM ping the KVM hosts?

Can consoleproxy system VM telnet to VM VNC server ports?

Is consoleproxy running the java consoleproxy process?

Can you ping consoleproxy's public address from where your UI is running?

Do consoleproxy's routes allow it to get back to the system where your UI
is running?

etc...
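
The "telnet to the VNC ports" check above can be sketched in a few lines of Python, for cases where telnet is not installed on the system VM. This is only an illustration: port_open and check_vnc_ports are hypothetical helper names, and the range of display numbers to probe is an assumption (VNC servers conventionally listen on 5900 plus the display number).

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_vnc_ports(host, displays=range(0, 5), timeout=3.0):
    """Map each VNC port (5900 + display number) to whether it accepts connections."""
    return {5900 + d: port_open(host, 5900 + d, timeout) for d in displays}
```

Run from the consoleproxy system VM, check_vnc_ports("<KVM host address>") returns a dict of port to True/False. Ports reported as closed for a running instance point at a firewall on the KVM host or a routing problem rather than at the console proxy process itself.
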


On Mon, Apr 1, 2013 at 11:15 AM, Maurice Lawler wrote:

> I added 5900+ to ingress for that particular security profile; I would
> think that is how it would be done, correct? I restarted the proxy VM again,
> yet I am still unable to get the KVM to connect. The instances are
> fine, because I can access them via SSH without issue.
>
> What would you suggest?
>
> On Mar 31, 2013, at 12:36 AM, Marcus Sorensen  wrote:
>
> > This looks like your console proxy VM isn't working. It could be a firewall
> > on the KVM host not allowing access to 5900+ anymore, or the consoleproxy
> > VM may need to be restarted (system VM starting with "v"). Has it worked
> > in the past?  You could try connecting to the instances with a VNC client
> > to see if they are accessible, skipping console proxy.
> > On Mar 30, 2013 10:40 PM, "Maurice Lawler" wrote:
> >
> >> For some reason, I am unsure of the issue; attempting to review the KVM
> >> of any instance times out:
> >>
> >> cloudstack realhostip.com took too long to respond
> >>
> >> Has anyone encountered this error and how can it be corrected?
> >>
> >> Thanks,
> >> Maurice
>
>


Re: Cloudstack KVM Failing

2013-03-30 Thread Marcus Sorensen
This looks like your console proxy VM isn't working. It could be a firewall
on the KVM host not allowing access to 5900+ anymore, or the consoleproxy
VM may need to be restarted (system VM starting with "v"). Has it worked in
the past?  You could try connecting to the instances with a VNC client to
see if they are accessible, skipping console proxy.
On Mar 30, 2013 10:40 PM, "Maurice Lawler"  wrote:

> For some reason, I am unsure of the issue; attempting to review the KVM of
> any instance times out:
>
> cloudstack realhostip.com took too long to respond
>
> Has anyone encountered this error and how can it be corrected?
>
> Thanks,
> Maurice