This email has been starred in my inbox for a week and I have only just had
a chance to get all the way through it.

I have spent quite a bit of time in ACS networking.  I built the Palo Alto
Networks integration and acted as a technical advisor for Syed's work on
the NetScaler SSL Termination feature.

To address QoS in ACS in general: instead of just limiting bandwidth, we
would be guaranteeing a specific rate as well.  This problem
is relatively complicated because usually ACS does not have orchestration
capabilities to the edge network it is implemented in.  Usually, the ACS
'public' traffic is handed off to spine switches and we lose control of the
networking before the traffic even gets out of the datacenter.  If that
spine is overloaded, there is no way that you can guarantee QoS.  Are you
guys just looking to add QoS up to the edge that we can control?
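
To illustrate why the handoff matters, here is a rough Python sketch of an
end-to-end admission check.  Everything in it (the `Hop` type,
`can_guarantee`, the hop names and the capacities) is hypothetical and is not
something ACS exposes today; it only shows that a guaranteed rate can be
promised as far as the last hop we can actually see and control.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hop:
    """One link on the path. All names and numbers here are hypothetical."""
    name: str
    # Spare capacity in Mbit/s, or None when the hop is outside our
    # orchestration reach (e.g. a spine switch we cannot query).
    spare_mbps: Optional[float]


def can_guarantee(rate_mbps: float, path: List[Hop]) -> str:
    """Return 'yes', 'no', or 'unknown' for a requested guaranteed rate."""
    verdict = "yes"
    for hop in path:
        if hop.spare_mbps is None:
            # We cannot see this hop, so we cannot promise anything past it.
            verdict = "unknown"
        elif hop.spare_mbps < rate_mbps:
            # A hop we control is already too full: definite refusal.
            return "no"
    return verdict


path = [Hop("vr-guest-nic", 800.0), Hop("vr-public-nic", 500.0), Hop("spine", None)]
print(can_guarantee(100.0, path))      # unknown: the spine is a blind spot
print(can_guarantee(100.0, path[:2]))  # yes: every hop is within our reach
print(can_guarantee(600.0, path))      # no: vr-public-nic has only 500 spare
```

The "unknown" case is the one I am worried about: as soon as one hop is
outside our reach, the best we can honestly offer is QoS up to that hop.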

I would argue that for an enterprise it would probably be valuable to be
able to support QoS to the edge of the orchestration reach.  A lot of
enterprise workloads are not actually as concerned about the speed of the
public link as much as the speed of the interconnections between different
guest networks.  That then brings up the fact that it is quite difficult
even to support communication between guest networks without going
through the public link.

I also think that the fact that the VR is considered ephemeral has its
limitations and I have to admit that I have struggled with that one.  The
Palo Alto firewall supports a lot more features than ACS
networking.  When I tell a customer using it that they can't make changes
on the PA for a specific network and expect the change to persist and not
cause problems, I usually get blank stares.  "But if the network is
created, why wouldn't it be persistent for the life of the network?"  Just
giving an option for persistent VRs would be a big step.

I have my own list of things that I would love to see addressed:
- Ordering firewall rules.
- Automatic VPC IP range provisioning (including the possibility to have
non-overlapping IP ranges).
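
To make the second item concrete, here is a minimal Python sketch of picking
a non-overlapping range automatically.  It uses only the standard `ipaddress`
module; the function name `next_free_cidr`, the supernet and the list of
ranges already in use are made up for illustration and do not correspond to
anything in the ACS code base.

```python
import ipaddress
from typing import Iterable, Optional


def next_free_cidr(supernet: str, prefix_len: int,
                   in_use: Iterable[str]) -> Optional[str]:
    """Return the first /prefix_len block inside supernet that does not
    overlap any CIDR in in_use, or None if the supernet is exhausted."""
    used = [ipaddress.ip_network(cidr) for cidr in in_use]
    for candidate in ipaddress.ip_network(supernet).subnets(new_prefix=prefix_len):
        if not any(candidate.overlaps(u) for u in used):
            return str(candidate)
    return None


# Hypothetical ranges already handed out to other VPCs.
existing = ["10.0.0.0/24", "10.0.1.0/25", "10.0.3.0/24"]
print(next_free_cidr("10.0.0.0/22", 24, existing))  # 10.0.2.0/24
```

A linear scan like this is fine for a sketch; a real implementation would
also have to deal with concurrent allocations and with ranges being released.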

So as not to rant, I will leave it there.  I am interested in this topic for
sure.  Being a developer who has worked in this code base, I can assure you
that some of the features/functionality we are discussing here are quite
difficult technically given the current code base.  For some of these
things to be implemented we would need to do a major refactor of the
networking code base (which may be needed at this point anyway).

To Paul's point, I do agree that user groups to understand what features
are most important to the community would be valuable.  This conversation
needs to include developers too, though, because even though in theory
'anything is possible', in reality we have to be very smart about how we
approach major changes to core components like this, especially since
this functionality has been extensively plugged into.  Maintaining the
validity of existing plugins when doing a rewrite is very important (or at
least getting those plugins updated to work with the new approach).  The
major problem there is that most of us don't even have access to the majority
of those networking devices to test and validate their functionality.

Something that concerns me about the user group approach is that it is
really nice that people can agree that there is a demand for some
functionality, but without the means to actually implement the
functionality, where does it go from there?  Unless someone just steps up
and funds the development outright, I expect a lot of people will get
behind the conversation and the idea, but getting dedicated resources on
the projects will be a harder problem.  Yes, if we can get good
communication and some sort of consensus together, that is the required
first step.  I think at some point we will need to address the elephant in
the room: how does that actually translate into implementation?

Sorry for the wall-of-text...

> Hi Adrian,
> Obviously I have to pick you up on not including the <rant> tag, so I
> can't tell where it started :)
> Otherwise I'm pretty much in complete agreement.
> The community is probably too developer focused and for the project to
> stay relevant we probably need to redress that balance. What we really need
> are user-community driven features and far more user-input into the feature
> development process, and I agree that means making it more friendly to
> non-developers.  I'm not anti-developer, some of my best friends are
> developers :) but due to the job description, they don't tend to spend a
> lot of time consuming the product that they build, especially not in the
> multitude of ways that users could be using it (SBP excepted).
> I won't go into a lengthy discussion about all your individual points,
> I'll just say that I largely agree, and point people to the site to site
> VPN feature as an example. It's implemented in openswan which I have no
> problem with, except that the documentation on how to connect it to
> various vendors' endpoints is sparse to non-existent.  Also, the only
> place to see errors relating to the VPN is a log file on the VPC virtual
> router, which the user has no access to. VPNs are tricky enough
> to set up at the best of times, doing it without any feedback is a
> nightmare.  This would never pass any decent user acceptance testing.
> To throw an idea in the ring, *maybe* for a feature to be accepted, its
> design document and functional specification have to have been 'signed off'
> by x independent 'users'.  Controversial I know, but I think it makes a
> good starting point for us to think about how the people on the user
> mailing list can get involved. A section in the CloudStack wiki, maintained
> by users and showing the features that they may put input into, might help.  Or
> a 'features & improvements' mailing list might serve as neutral territory
> for users and developers to have conversations...
> .... Just some thoughts..
>
> Hi Marcus, Somesh, Paul
> Thanks for responding and yes, I think in some aspects I may be asking too
> much but this isn't just me asking for extra/additional/new features per se.
> I think it's easier if I split up the issues that I perceive to be
> problems into four distinct areas. There's overlap between them but it
> might help me explain better:
> 1. New features
> 2. Features that are already there but that can’t be used
> 3. Features that are turned on that can't be turned off
> 4. Features that work but which could do with a more granular
>    implementation
> In more detail:
> 1. New features:
> Yes this is a contentious one (QoS, Dynamic Routing, OpenVPN etc). I think
> that this is the greedy ask and it's not feasible or even fair to ask for
> stuff for free. Believe it or not, I'm not asking "for CloudStack to
> provide and maintain a fully fledged and featured router distribution in
> its provided virtual router". If customer demand pays for something new and
> cool to be developed and it's contributed back then great but this sort of
> thing can't be an expectation by default. For those that do want more, the
> solution here though might be, IMHO the ability to either:
> a) Provide a relatively simple way for users (not devs) to customise the
> VR themselves, adding/swapping/deleting packages, orchestration agents
> (puppet ansible salt etc) and giving people access to the VR for
> configuration without the risk of deletion on recreation; making the VR a
> 'first-class citizen' vm instead of a disposable commodity.
> b) Allow the use of third party router/firewall VMs to be used instead of
> the default VR. Obviously, this would be at the sacrifice of functions that
> a 3rd party might not provide such as userdata and most other functions
> would have to be managed manually on the VR without the orchestration
> functionality that ACS provides with the official system VMs. There appears
> to already be a way to do this but I can’t find much information on this so
> I might be very wrong. The router.template.<hypervisor> setting lets me
> create my own VR template but things went very wrong when I tried using it
> so I gave up. This setting also applies to every single vr in the entire
> zone which is not going to fit for many, especially if the replacement is a
> commercially licensed product. This is what I gather Somesh alluded to in
> his reply. It would be great if we could offer a choice of built-in VR, a
> Vyatta VR or a Cisco VR depending on the customer requirements without
> having to code a separate network provider for each, get it committed and
> wait for a new release.
> 2. Features that are already there but that can’t be used:
> These are things like being able to configure the IPsec VPNs, iptables
> firewall/NAT rules,
> or routing tables on VRs in more detail. At the moment, if it can’t be
> orchestrated by ACS, it can’t be done at all. If I were to suggest that
> configuration of all VMs, including user VMs, had to be done by ACS, most
> would likely call that utterly ridiculous, yet we accept that
> limitation with the VR VMs. Also, some of the hardcoded rules that validate
> configuration changes to VRs are simply broken. For example, yes it is
> possible and perfectly normal to have two routes with destinations that
> overlap if the subnet mask is different - it's called route summarisation.
> Another example is that I might want to set my default route on a VPC VR to
> the private gateway but because it doesn’t fit with the specific use-case
> that whoever designed that aspect had at the time, it's not allowed. One
> last one, which would need more work but not much, is being able to use the
> DHCP relay functionality already built into DNSMasq. That one would solve a few
> problems for hybrid or private cloud deployments where IPAM is managed with
> Active Directory integrated tools.
> 3. Features that are turned on that can't be turned off:
> This one is a personal annoyance and while there are likely other
> examples, one that gets
> me is the fact that you can’t turn off source NAT on a VPC VR.
> Perhaps a sensible *default* might be to use source NAT but there are
> perfectly valid reasons why some people might want to turn it off. Some
> people might want public IPs inside one or more tiers, some people might be
> doing source NAT further up in the network and the 'public' network is
> merely a transit network (default route on the VR). These sorts of things
> are where I actually want ACS to do less and not to try to second guess
> what I want - just let me have a little more control over the basics
> without making things 'easier' for me (and enforcing it).
> 4. Features that work but which could do with a more granular
> implementation:
> A good example of this would be the fact that you have to
> specify a CIDR for a VPC that cannot be changed or added to once
> configured. Nobody in their right mind would ever place a constraint like
> this on an on-premise network.
> Yes if you're doing something very simple or you create a VPC for each
> application that you build it's less of an issue but if you're using ACS as
> part of your more traditional corporate infrastructure, for DR purposes,
> for hybrid cloud purposes etc, it's a very notable WTF to any network
> engineer when they first see it. This example may be in place to make
> things simpler elsewhere (such as routes, firewall rules & VPN connections)
> but in my mind is simply brushing additional complexity in the future under
> the metaphorical carpet.
> Trying to flip the dev/networking approach here, it would be like creating
> a really cool automated way of installing and configuring MySQL on every VM
> but stopping anyone from uninstalling it or using anything other than
> InnoDB. The network guy might say "Why do you need anything other than
> MySQL/InnoDB and if you don’t want it, don’t use it". The dev/ops/devops
> guy would say they'd rather have a plain VM with no DB at all and install
> themselves if needed than have the really cool automation. Let users decide
> whether they want ACS to do cool stuff for them or whether they just want
> the basics and they'll do the rest (if more is even needed).
> In summary, while my initial rant may have come across as wanting more for
> nothing, priority 1 for me is actually the option to have *less* but the
> ability for me to tweak stuff myself instead of having ACS enforce its view
> on how things should be done. Leave the templated network provisioning
> procedures alone for where they fit, perhaps leave them as defaults, but
> don’t enforce them or assume that everyone wants them. I don't think it is
> safe to assume that "Cloud consumers are end users, web developers,
> application developers". IMHO, making that assumption is the cause of the
> same effect.
> If we reject that assumption, we may find ACS to be a bit more welcoming
> to others. I don’t believe that more is necessarily needed, in fact less.
> Adrian
> </rant>
> The points raised are certainly valid from an enterprise networking
> standpoint, and don't fall on deaf ears, but we should keep things in
> perspective. To provide the aforementioned features would be relatively
> uncharted territory in the cloud orchestration world (at least not
> considering vendor provided networking solutions that only handle the
> network part of the equation), so while it would be good to aspire to
> providing those things, it should be no surprise that the platform works
> that way and lacks such features.
> For further perspective, keep in mind that cloud orchestration in general
> has been a pitch to software developers and management for "easy
> infrastructure". Cloud consumers are end users, web developers, application
> developers, so again it should be no surprise that the product provides
> features that cater to that, rather than providing the bells and whistles
> that a network admin would want to see in their infrastructure. CloudStack
> was never built to be pitched to network teams as a cure for managing their
> infra deployments, the only cloud product providers doing that are network
> vendors who have cloud networking products. This is of course why a VPC
> needs IPs defined, as applications care more about how to serve up a web
> page than network engineering and managing distinct layer 2 and 3, so the
> whole network stack is sandwiched into a simple orchestration mechanism
> that gets the application what it needs.
> In designing and deploying cloud, the most common complaint I see from
> people who are infrastructure maintainers is "why can't I just build the
> infrastructure the way I want and then have it orchestrated?".
> Unfortunately, we can't just automate and integrate with anyone's pet
> design. CloudStack supports many novel and custom network designs simply by
> allowing the option of letting you manage the network hardware and being
> hands-off (shared/public networks), while also being pluggable to allow
> vendors to take over whatever features they wish. I've seen some pretty
> advanced overlay networking provided through third party plugins to
> CloudStack that take over all network functionality and provide more.
> What's really being asked for here is for CloudStack to provide and
> maintain a fully fledged and featured router distribution in its provided
> virtual router. It's an admirable project to have if we can get support for
> it. My guess is there's a bit of a disconnect in interest though, because
> many (but not all) enterprises who want CloudStack for infrastructure
> automation are skeptical about a VM as software router and prefer to bring
> in aforementioned enterprise vendors who have their own plugins. People who
> provide cloud hosting and other services tend to use the routers, but their
> interest in enterprise level routing and redundancy varies greatly, and
> their customers are designing their apps to be resilient to infrastructure
> loss (e.g. most AWS customers). That's of course not entirely the whole
> truth, as is evidenced by the work we are seeing on redundant routers, but
> I do believe that's why we haven't seen these things from the beginning.
> They just haven't been all that important to the target customers, even
> though infrastructure engineers are used to providing them.
> So now comes my philosophy. In the end, I think the great thing about open
> source communities is that if there's the right level of interest, it will
> happen.  I'm the kind of person who feels a pang of stress at the idea that
> something I work on can't be all things to all people, but after building a
> hosting business over the last few years I've begun to realize that it's
> really only practical to try to be good for a subset of the market and
> focus on that. You'll never please everyone, there are limits to what you
> can accomplish, and sometimes it's OK to just concede that your product is
> not going to work for everyone. If you don't, you'll spread yourself too
> thin and fail everyone. In order to make something great you have to have a
> limit on your scope. That's not to say you don't listen to your customers,
> but you sometimes have to make hard choices on who to listen to and who to
> upset.
> None of this should be taken as a discouragement to the topics at hand,
> but again, as someone who takes it personally when I don't deliver, I wanted
> to provide some follow up to address the "rant" and try to provide
> perspective on why the things are the way they are.
> > Adrian,
> >
> > Rant or not, I believe you have raised a valid point and reflect a
> > certain group of people's requirements.
> >
> > Based on your requirement, I believe you are looking for something
> > like Vyatta.
> >
> > Regards,
> > Somesh
> >
> >
> > Tempted to suggest some sort of special interest group where
> > networking people can have some input into the dev process despite not
> > necessarily being able to produce any code themselves. As an example,
> > Schuberg Philis have recently done some great work on the redundant
> > VPC VR but to a network person, this sort of functionality is almost
> > taken for granted (please don't take this as a lack of appreciation).
> > Similarly, the lack of end-to-end QoS for applications running on ACS
> > seems to me at least to be a fairly significant oversight. ACS is
> > known as having very flexible networking compared with some of the
> > alternatives but there does still appear to be an enterprise focus on
> > most elements that a 'typical' developer (dare I say it, web
> > developer) faces but more of a home network approach to the
> > networking side (aside from some pretty
> > impressive niche features).
> >
> > We shouldn't need to rely on proprietary 3rd party products to provide
> > a similar level of versatility for networking in ACS in my opinion. It
> > seems bizarre to me that we have load balancing, distributed routing &
> > ACLs with the OVS controller, PVLANs for isolation,  etc, but yet
> > still don't have what I would consider basic functions such as better
> > control over NAT, firewalling, routing (no dynamic routing protocols
> > at all), IPsec, having to specify IP related attributes to what should
> > simply be L2 constructs (why does a VPC need to be given a CIDR?!?)
> > etc. AWS had a similar issue that led to the VPC being introduced -
> > enterprises consistently rejected the weird and illogical way that
> > they did networking back in the day that was overly focussed on
> > web/cloudy workloads.
> >
> > This sounds like a rant and to an extent it is but I'd like to turn it
> > into a positive. I feel fairly helpless when the typical response to
> > feedback like this is that I should just contribute code. There are a
> > number of people that embrace the concept that the community should be
> > a collective of not just developers, but at the same time it's pretty
> > difficult to feel part of a community that's run almost uniquely by
> > developers; it's even a bit intimidating at times. I've seen too many
> > commercial companies that abandon innovation in favour of satisfying
> > the 'large account' RFC/RFPs and in my opinion the same may apply to a
> > project driven largely by the needs of those that can contribute code.
> >
> > To flip the concept on its head, it would be like a network guy
> > creating an amazing cloud orchestration platform but where you can
> > only run CentOS 6 with a LAMP stack - yes this might work for a lot of
> > people (and it
> > would likely only be adopted by those people) but for those that just
> > want to do something a bit different, it would be a fairly frustrating
> > experience.
> >
> > Am I simply being a spoilt kid here or is there room for input that
> > might be constructive? Is there anyone here on the list with a
> > networking focus that can corroborate these concerns?
> >
> > Adrian
> >
> >
> > I don't think we can. QoS in CS is mostly throttling traffic on the
> > virtual interface.
> >
> > Regards,
> > Somesh
> >
> >
> >
> > Hi All,
> >
> > Does anyone know if it's possible to do network QoS in Cloudstack?  I
> > don't mean bandwidth limiting, but rather, prioritising different
> > traffic types for voice, etc.
> >
> > Thanks
> > Len
