For the last day and a half I've been looking at this bug: https://bugs.launchpad.net/juju-core/+bug/1401130
There's a lot of detail attached to the ticket but the short story is that the Joyent cloud often allocates different internal networks to instances, meaning that they can't communicate. From what I can tell from related Launchpad tickets, this has been a problem for a long time (perhaps always). It's very hit and miss - sometimes you get allocated 10 machines in a row that all end up on the same internal network, but more often than not it only takes 2 or 3 machine additions before you run into one that can't talk to the others.

I have found a forum post where someone from Joyent suggests adding a static route for 10.0.0.0/8 to force all internal traffic down the internal network interface. I've tried this out and it does indeed work. We *could* have cloud-init install such a static route as new instances are configured, but that's a pretty gross hack that hardcodes an assumption in Juju about Joyent's network setup, which will no doubt bite us down the track.

Another possible workaround would be to have machines on Joyent communicate via their public addresses, ignoring the internal network. I'm not sure how hard this is.

Andrew has played around with the Joyent API and, curiously, the ListNetworks API returns different networks from those that actually get assigned to the instances. I hacked up the Joyent provisioner to use these networks but that didn't seem to help.

I have opened a support ticket with Joyent to get clarification (no response yet).

Given that this is looking like a problem/feature at Joyent's end that needs clarification from them, may I suggest that this issue no longer be allowed to block CI?

If there are other ideas about what's going on here, please speak up.

- Menno
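
For anyone wanting to see why the 10.0.0.0/8 route makes a difference, here is a rough sketch using Python's ipaddress module. The addresses and prefix lengths are made up for illustration and are not taken from real Joyent instances, and the helper names are hypothetical. It only models the routing decision on the sending machine; whether Joyent's network actually carries the traffic between the two subnets is something only the test on real instances showed.

```python
import ipaddress

# Hypothetical internal interfaces for two instances (not real Joyent values).
MACHINE_A = ipaddress.ip_interface("10.112.4.17/21")
MACHINE_B = ipaddress.ip_interface("10.224.9.30/21")

# The catch-all static route suggested in the forum post.
INTERNAL_ROUTE = ipaddress.ip_network("10.0.0.0/8")


def on_link(src, dst):
    """True if dst's address falls inside the subnet configured on src."""
    return dst.ip in src.network


def uses_internal_interface(src, dst, extra_routes=()):
    """True if src would send traffic for dst out of its internal interface.

    Without extra routes, only the directly connected subnet qualifies;
    anything else would follow the default route instead.
    """
    if on_link(src, dst):
        return True
    return any(dst.ip in route for route in extra_routes)


# Different internal subnets: traffic does not go via the internal interface.
print(uses_internal_interface(MACHINE_A, MACHINE_B))
# With the 10.0.0.0/8 route installed, it does.
print(uses_internal_interface(MACHINE_A, MACHINE_B, [INTERNAL_ROUTE]))
```

With these made-up addresses the first check fails and the second succeeds, which matches the behaviour described above: machines on different internal networks only reach each other internally once the broader route is in place.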
--
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev