Re: [Coder-Com] RFD: Secondary links

Kev Thu, 11 Apr 2002 14:12:32 -0700

> >The idea is to allow servers to establish a secondary network link.  No
> >messages (except PINGs) would go across the secondary link unless the
> >primary link was broken.  This is still a long way from a mesh-linked
> >network, but might be an interesting enhancement.
>
> Might be, but it also opens up the possibility of many very subtle bugs


Of course it does.  This raises the question, though: is it worth it?  I
think so.  The reason is that this would allow us to remove the need for
most netsplits--only catastrophic failures would result in a split.  This
would greatly reduce CPU spikes, and would also reduce the frequency of
buffer allocation errors and other annoyances that crop up because of
netsplits and junctions.

> >We would introduce a new command for unregistered connections to server
> >ports--SECONDARY.  We would also introduce a new server<->server command
> >with the token SC.  SC would be prefixed by the introducing server and
> >would contain the same fields as a SERVER command--lag could lead to some
> >servers receiving the SECONDARY command before the SERVER command.  We
> >need to discuss how to represent destruction of a secondary link--is a
> >SQUIT command enough, or must we introduce another command?  We must also
> >discuss how the fall-back is to occur--if more than one SECONDARY link
> >has been established, how do we select which one to fall back to?

> Well: do servers other than the 2 having a secondary link need to know the
> secondary link exists? If no, you don't need to broadcast the creation and
> destruction of secondary links accross the network.

Consider: they could be on opposite sides of the net.  If we're interested
in reliability, intervening servers do need to know of the link, so they
know where to redirect the message.  If you don't do it this way, you
lose either reliability of message delivery or the benefit of the non-squit
operation.
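
To illustrate why an intervening server needs to know about the link,
here is a minimal sketch in Python.  The server names, the topology, and
the helpers next_hop() and build_links() are all hypothetical; this is
not how ircu stores its routing information, only a model of it.

```python
from collections import deque


def build_links(edges):
    """Build an undirected adjacency map from (server, server) pairs."""
    links = {}
    for a, b in edges:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links


def next_hop(links, source, target):
    """Return the neighbour of `source` on a shortest path to `target`."""
    if source == target:
        return None
    seen = {source}
    queue = deque()
    for neighbour in sorted(links.get(source, ())):
        seen.add(neighbour)
        queue.append((neighbour, neighbour))
    while queue:
        node, first = queue.popleft()
        if node == target:
            return first
        for neighbour in sorted(links.get(node, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, first))
    return None  # target unreachable over the links we know about


# Hypothetical net: leaf1 - hubA - hubB - leaf2, plus a secondary
# link between leaf1 and hubB.
primary = [("leaf1", "hubA"), ("hubA", "hubB"), ("hubB", "leaf2")]
secondary = [("leaf1", "hubB")]

# Normal operation: only the primary links carry traffic.
normal = build_links(primary)

# The leaf1 - hubA primary breaks and the secondary takes over.
surviving = [edge for edge in primary if edge != ("leaf1", "hubA")]
after_break = build_links(surviving + secondary)
```

In this model hubA delivers to leaf1 directly before the break and has
to send via hubB afterwards, which it can only work out if it was told
about the leaf1 - hubB secondary in the first place.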

> Let leafs have only one secondary link, simple and obvious.

I don't really see the need for this assertion.  There really isn't an
impact that I can see on resource allocation, for instance.

> For hubs, it gets more complicated:
> Store a "preferred hub-to-hub routing map" in the hub's .conf, which hubs
> can use to decide/arbitrate which secondary link to use when the need
> arises. This has the benefit of requiring no communication between hubs to
> make the decision. The drawbacks are that the hubs will not be able to make
> decisions based on useful data such as network lag, ping times etc. It also
> requires the routing-preferences stored in the .conf of each hub to be
> consistent. Keeping that map updated in the .conf is also an issue, but a
> minor one since it would only have to be updated when a new hub links or an
> existing one delinks.

Actually, we can have stuff in the .conf to make decisions based on the ping
times, using the same expression processing that does crules--we'd just need
to add a few more functions like ping times.

> You could also establish something along the lines of a routing protocol
> between hubs, which allows them to communicate their status (lag, pings,
> etc); and would allow them to choose the best secondary link to use on the
> fly without the need for any human input (either from an oper or a .conf
> file). Obviously this would have to exist outside of IRC-based
> server-server links (as in, via UDP).

Given the priority queue system implemented in u2.10.11, it should be
possible to manage this in-band, even.  If we try to use UDP, we will
lose packets even when the network conditions are fine, but that may
be acceptable for a routing protocol.  However, my original intent did
not (yet) include on-the-fly rerouting--this might be a prelude to such
a system, though.

> >My idea for this is that each link--both the primary and secondaries--
> >have an associated timestamp.  If the primary is lost, the secondary
> >with the oldest timestamp is promoted to the primary.
>
> But, would you also still allow humans (opers) to overrule and let them set
> up routing manually (i.e., /squit and /connect)? For all practical purposes
> routing is, for the most part, a policy-based decision and not a technical
> one.

I suppose we could add a command that would tickle the secondary link system
to permit on-the-fly rerouting of this sort.  My primary goal is
splitlessness, so that should be kept in mind.
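
The oldest-timestamp rule quoted above could be sketched roughly as
follows, in Python.  The Link record and promote_on_loss() are
hypothetical names, and breaking ties on the peer name is an added
assumption (the proposal does not say what happens when two timestamps
are equal).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Link:
    peer: str
    timestamp: int          # when the link was established
    primary: bool = False


def promote_on_loss(links: List[Link], lost_peer: str) -> Optional[Link]:
    """Remove the link to `lost_peer`; if it was the primary, promote the
    secondary with the oldest timestamp and return it."""
    lost = next((l for l in links if l.peer == lost_peer), None)
    if lost is None:
        return None
    links.remove(lost)
    if not lost.primary:
        return None         # losing a secondary needs no fall-back
    candidates = [l for l in links if not l.primary]
    if not candidates:
        return None         # no secondary left: this is a real netsplit
    # Assumption: ties are broken on peer name so that the choice is
    # deterministic.
    chosen = min(candidates, key=lambda l: (l.timestamp, l.peer))
    chosen.primary = True
    return chosen
```

An oper-driven reroute would then amount to overriding the choice made
by min() here, rather than changing the mechanism.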

> >I can think of three more problems that must be solved.  The first is
> >what to do with en-route messages: messages that cross with the SQUIT.
> >The parser drops messages that are going the wrong direction.  We should
> >then remove this restriction, but that leads to potentially infinite
> >message loops under certain types of desynchronization.
>
> Give messages a TTL?

A TTL wouldn't solve the multiple message receipt problem.  Imagine a server
with two links, one of which is lagged.  A user on a remote server JOINs a
channel and then leaves the channel.  The JOIN arrives over the fast link,
followed by the PART.  Then the same JOIN arrives over the lagged link, and
the server is briefly desynched and thinks the user has rejoined the channel!

One of the possible solutions that occurred to me today is that we can add
a per-server sequence number to database messages of this form.  Then we
keep a record of the last sequence number we saw from that server.  If we
assert reliability, and are only concerned about ordering, that should be
sufficient.  We can allow the sequence number to wrap around, and it can
be encoded with the same base64 encoding we use for numeric nicks, to make
it more compact.  In fact, if it's the same length as a user's numeric nick,
that should be sufficient for our purposes.
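
A minimal sketch of the sequence number idea, in Python.  Several things
here are assumptions: the 64-character alphabet (A-Z, a-z, 0-9, "[" and
"]"), the three-character width, the "newer if less than half the
sequence space ahead" comparison used to cope with wrap-around, and the
names encode(), decode(), is_newer() and SequenceFilter.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789[]"
WIDTH = 3                    # assumed: same length as a user numeric
MODULUS = 64 ** WIDTH        # sequence numbers wrap at this value


def encode(seq):
    """Encode a sequence number as WIDTH base64 characters."""
    chars = []
    for _ in range(WIDTH):
        chars.append(ALPHABET[seq % 64])
        seq //= 64
    return "".join(reversed(chars))


def decode(text):
    """Inverse of encode()."""
    value = 0
    for ch in text:
        value = value * 64 + ALPHABET.index(ch)
    return value


def is_newer(seq, last):
    """True if `seq` is ahead of `last`, allowing for wrap-around."""
    diff = (seq - last) % MODULUS
    return 0 < diff < MODULUS // 2


class SequenceFilter:
    """Remember the last sequence number seen from each server."""

    def __init__(self):
        self.last_seen = {}

    def accept(self, server, encoded_seq):
        seq = decode(encoded_seq)
        last = self.last_seen.get(server)
        if last is not None and not is_newer(seq, last):
            return False     # stale or duplicate: drop it
        self.last_seen[server] = seq
        return True
```

In the JOIN/PART example, the JOIN (sequence 1) and the PART (sequence
2) are accepted from the fast link, and the second copy of the JOIN is
rejected when it finally arrives over the lagged link.  Note that this
only works under the reliability assumption stated above: a message that
is genuinely late, rather than duplicated, would be dropped too.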

> >  The second
> >problem is what to do with messages waiting in the sendq for the primary
> >link--they should be resent to the secondary, but might that require
> >re-parsing them?
>
> Have servers send an ACK every 2000 (or whatever number) messages, keep all
> messages sent since the last ACK in a buffer, and resend them via the new
> link when it comes online? This will keep the number of duplicates down and
> within predictable limits.

Unfortunately, due to the way the protocol works, we can't have *any*
duplicates.  We certainly can't try to simply repropagate the global
messages back to the new link due to the desync possibility.  We can
probably manage it with some additional flags in the msgq system--when
we clear the queue, we look for the flag on the buffer, and if it's there,
we then add it to the msgq for the next hop to the server's new location.
That still doesn't perfectly deal with messages directed to a channel
(say), but it's a step in the right direction...
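
The flag idea might look something like the following sketch, in
Python.  MsgBuf, its reroute flag, and clear_sendq() are hypothetical
and do not reflect the actual msgq interface; next_hop_for stands in
for whatever lookup gives the next hop toward a server's new location.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class MsgBuf:
    text: str
    destination: str         # server the message is ultimately for
    reroute: bool = False    # hypothetical "requeue on link loss" flag


def clear_sendq(sendq, next_hop_for, queues):
    """Empty the sendq of a dead link.

    Buffers carrying the reroute flag are appended to the queue for the
    next hop toward their destination; everything else is discarded.
    Returns the number of buffers discarded.
    """
    dropped = 0
    while sendq:
        buf = sendq.popleft()
        hop = next_hop_for(buf.destination) if buf.reroute else None
        if hop is None:
            dropped += 1     # unflagged, or no route to the destination
            continue
        queues.setdefault(hop, deque()).append(buf)
    return dropped
```

Because each buffer is moved rather than copied, nothing is sent twice
by this server.  A message addressed to a channel has no single
destination server, which is the case this does not cover.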

> >The third problem is what to do about certain global
> >messages required for maintaining network synchronization--some could get
> >lost during the transition from the primary to the secondary, but we
> >can't simply have the server deal with multiple copies of the same change
> >to the database.
>
> How many messages are we talking about? If the number is manageably low, a
> simple acknowledgement-based system can be used.

I'd say 50% of a server<->server link is database changes, but that's not
even an educated guess.  Most of the IRC protocol is in fact database
change notifications: MODE, NICK, SERVER, SECONDARY, JOIN, PART, GLINE,
JUPE, AWAY, etc.  Only messages such as PRIVMSG, NOTICE, numerics, and
various requests (WHOIS, STATS, VERSION, etc.) don't fall into this
category.  An ACK-based system wouldn't work.  Now if we could pull the
database out-of-band somehow, like using an external database, that
would greatly reduce bandwidth requirements, decrease the possibility of
desyncs, and solve this little problem :)

> >So.  Discuss :)
>
> Could you detail some of the "interesting enhancements" you mentioned?
> Considering the amount of work this requires (basically it's irc3), they
> better be good =P

Basically, splitless IRC and an incremental movement towards a mesh-linked
net.  Personally, I feel the problems are soluble, and quite easily; it's
just a matter of figuring out how to go about it.  Of course, I may be
wrong (which is why I invited discussion), but I think it's an idea worth
pursuing.  Certainly this would not come before .12 or a subsequent
release.
-- 
Kevin L. Mitchell <[EMAIL PROTECTED]>
