On 01/23/2013 11:51 PM, Steve Loranz wrote:
On Jan 23, 2013, at 4:21 PM, Matt Wagner <[email protected]>
wrote:

On Wed, Jan 23, 2013 at 01:42:31PM -0500, Mo Morsi wrote:
On 01/23/2013 01:16 PM, Bryan Kearney wrote:
Well, just thinking out loud.. can any component go into the
cloud which may not be in the same VPN space?


-- bk


I'd imagine this would come down to the security policy of the
organization deploying to the cloud, namely how lax the firewall
can be in permitting connections from the cloud, as well as the IP
address assignment on the launched cloud instances.

For a while now, I've been watching this discussion thinking, "It
feels like we decided a while back that AMQP would be cool, and
are now trying to work backwards to come up with reasons to
justify it." Perhaps I'm just not forward-looking enough, or
perhaps I'm overlooking an important detail, but that's sort of how
it feels to me.

I think networking is an edge case, and a minor detail. We should
absolutely try to support running components on different boxes
(though I don't believe we have ever properly tested or supported
it). But if

Having Aeolus components on different boxes is not an edge case for me.
I think that having Imagefactory on a separate box will quite often be a
production setup requirement: if a production box is not bare metal and
doesn't support nested virtualization, then a separate box is the only
option. Also, Imagefactory's HW requirements (more storage, less CPU)
differ from those of the other Aeolus services (more CPU, less storage).
Personally, I use remote Imagefactory or Deltacloud services quite often.

you break Aeolus up and run it on disparate network segments,
you're going to have to somehow handle patching things together. I
don't think we should choose how we handle inter-component
messaging based on what happens if you break components up on
different networks and refuse to set up a VPN or appropriate
port-forwarding rules.


Agreed. I wouldn't care about cross-VPN support.

I don't mean to single out networking in general, though; it's
just the latest in the discussion. I just worry that the discussion
is largely theoretical and academic, focused on how different
means of inter-component communications differ. That might be a
good conversation to have if we didn't already have all our
components using one. What I'm missing from this discussion is an
exploration of what issues we're actually experiencing today with
our HTTP callbacks system, and whether the overhead of switching to
AMQP is a worthwhile trade-off. Is changing Factory and Conductor
to use AMQP worthwhile to prevent the issue where if you shut down
one of the two components in the middle of an exchange of messages,
some messages might be lost? Could that better be solved by
implementing some queuing or polling? Or, is it fair to say that if
you send a job to Factory from Conductor and then shut down
Conductor, it's just expected that the updated status might be
missed?


There are various non-theoretical failures that can cause a callback not
to be delivered; some off-hand examples:
- network/firewall error
- VPN is down
- callback receiver is down
- the Rails proxy (in our case Apache) is down

My impression is that the overall opinion is that these failures occur so
sporadically that there is no reason to take care of them. Based on my
experience, I believe this is wrong. I hit all of the above errors myself
when testing Imagefactory (except the last one, the Apache proxy error,
but only because I was accessing Rails directly).

Another opinion mentioned here and on IRC was that failures could be
covered by additional polling/status checking. This is a bit suboptimal:
1) it's not sufficient to check object states once a service comes back
after a failure (a callback can fail even while both ends are running),
so the polling would have to be done repeatedly; as a receiver you don't
know whether a callback delivery failed or whether it simply hasn't been
sent yet
2) in that case you could just use polling and get rid of callbacks
altogether
3) you have two ways through which an object's state is changed (through
callbacks and through polling)
4) you have to take care of potential race conditions between polling and
callbacks
5) polling is not efficient
6) it means a bunch of additional code/logic on the receiver side (see
the sketch below)
And I want to emphasize that this is not a problem only of the
Imagefactory<->Conductor communication, but of all the components I
mentioned in the first mail.
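
To illustrate point 6), here is a minimal sketch of what the
receiver-side polling/reconciliation logic would look like. This is
Python and not anything from the actual Conductor codebase; the
/builds/<id>/status endpoint, FACTORY_URL and the in-memory builds
table are all made-up assumptions, just to show the shape of the logic:

import threading
import time

import requests  # third-party HTTP client

FACTORY_URL = "http://imagefactory.example.com:8075"  # assumed address
POLL_INTERVAL = 30  # seconds
TERMINAL = ("COMPLETE", "FAILED")

builds = {}          # build_id -> {"status": ..., "updated_at": ...}
builds_lock = threading.Lock()

def handle_callback(build_id, status):
    """Normal path: the sender POSTs a status update to us."""
    with builds_lock:
        builds[build_id] = {"status": status, "updated_at": time.time()}

def reconcile_forever():
    """Fallback path: keep polling for updates whose callbacks were lost."""
    while True:
        with builds_lock:
            pending = [b for b, rec in builds.items()
                       if rec["status"] not in TERMINAL]
        for build_id in pending:
            try:
                resp = requests.get("%s/builds/%s/status"
                                    % (FACTORY_URL, build_id), timeout=10)
                resp.raise_for_status()
                status = resp.json()["status"]
            except requests.RequestException:
                continue  # factory unreachable, try again next round
            with builds_lock:
                rec = builds.get(build_id)
                # Race guard: a callback may have arrived while we were
                # polling; don't overwrite a terminal state with stale data.
                if rec and rec["status"] not in TERMINAL:
                    rec["status"] = status
                    rec["updated_at"] = time.time()
        time.sleep(POLL_INTERVAL)

Note that the race guard in the loop is exactly the kind of extra care
point 4) is about.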

Can we agree on the assumption that the current callback solution is not
sufficient and that more robustness is required?

IMO we then have 2 options:
1) make the callback system more robust in *each* component (the sketch
below shows roughly what that entails on the sender side). As I said
before: if the general opinion is that this is the preferred solution,
I'm fine with it, though to me it sounds like reinventing the wheel to
some extent.
2) use a message bus instead of callbacks - this brings all the required
features out of the box, and I would expect it to mean less coding,
since the problem is delegated to a third-party system designed exactly
for this purpose; but the feedback so far has been quite negative about
this option.

It's not my intention to vehemently oppose AMQP, and I certainly
don't mean to suggest that it shouldn't be discussed. I just don't
find the current conversation terribly productive at making the
case for why we should switch.

-- Matt

imagefactory started off using QMF. It's what we were told was
decided on when Aeolus was designed. But conductor was having a
difficult time using it because it meant having a separate thread or
process to bridge conductor to the broker. So, in the summer of
2011, there were a number of discussions in which developers on both the
imagefactory and conductor teams came out in favor of imagefactory
offering a REST interface. We did, and near the end of January of
2012, we officially removed the QMF interface from the source tree
when it started to seem like QPID/QMF was failing to gain traction
in the wider community.

I'm not saying the conversation shouldn't happen either. What I am
saying is that it should pick up where it left off 18 months ago, with
the question of whether the challenges of conductor actually connecting
to a broker are easier to deal with now than they were in 2011, when
they were significant enough to make us decide to switch course.


Could someone please send me a link to the discussion (if the discussion
was on the mailing list)? I can't find it, so I can't pick up the
conversation from that point.

-steve



Jan
