On Fri, Mar 25, 2016 at 1:00 PM, Eric Dumazet <eric.duma...@gmail.com> wrote:
> On Fri, 2016-03-25 at 12:31 -0400, Craig Gallek wrote:
>
>> I believe the issue here is that closing the listen sockets will drop
>> any connections that are in the listen queue but have not been
>> accepted yet.  In the case of reuseport, you could in theory drain
>> those queues into the non-closed sockets, but that probably has some
>> interesting consequences...
>
> It is more complicated than this.
>
> Ideally, no TCP connection should be dropped during a server change.
>
> The idea is to let the old program keep running as long as:
> 1) It has established TCP sessions
> 2) Some SYN_RECV pseudo requests are still around
>
> Once 3WHS completes for these SYN_RECV, children are queued into
> listener accept queues.
>
> But the idea is to direct all new SYN packets to the 'new' process and
> its listeners. (New SYN_RECV should be created on behalf of the new
> listeners only)
>
>
> In some environments, the listeners are simply transferred via FD
> passing, from the 'old process' to the new one.

Right. Comparatively, one of the nice features of the BPF variant is
that the sockets in the old process can passively enter listen_off
state solely through changes initiated by the new process (by changing
the BPF filter for the group).

By the way, if I read correctly, the listen_off feature was already
possible without kernel changes prior to fast reuseport by changing
SO_BINDTODEVICE on the old process's sockets to effectively segment
them into a separate reuseport group. With fast reuseport,
sk_bound_dev_if state equivalence is checked on joining a group, but
the socket is not removed from the array when that syscall is made, so
this does not work.
