On 22.11.21 at 00:04, Tom Lane wrote:
> Do we know that that actually happens in an arm's-length connection
> (ie two separate machines)?  I wonder if the data loss is strictly
> an artifact of a localhost connection.  There'd be a lot more pressure
> on them to make cross-machine TCP work per spec, one would think.
> But in any case, if we can avoid sending RST in this situation,
> it seems mostly moot for our usage.

Sorry it took some days to get a setup to check this!

The result is as expected:

1. Windows client to Linux server works without dropping the error message
2. Linux client to Windows server works without dropping the error message
3. Windows client to remote Windows server drops the error message,
   depending on the timing of the event loop

In case 1, the Linux server doesn't end the connection with a RST packet, so the Windows client enqueues the error message properly and doesn't drop it.

In case 2, the Linux client doesn't care about the RST packet from the Windows server and properly enqueues and raises the error message.

In case 3, the combination of the bad RST behavior of client and server leads to data loss, depending on the network timing. A delay of 0.5 ms in the event loop was enough in a localhost setup as well as in a LAN setup. In contrast, over a slower WLAN connection a delay of less than 15 ms did not lose data, but higher delays still did.

The idea of running a second process, passing the socket handle to it, and having it observe the parent process and close the socket once the parent has exited, could work, but I guess it's overly complicated and creates more issues than it solves. The same probably applies if the master process handles the socket closing.

So I still think it's best to close the socket as proposed in the patch.
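
To illustrate the general idea only (this is not the patch, and it does not reproduce the Windows-specific behavior), here is a minimal sketch of a sender that writes a final message and then closes its socket explicitly, so that a reader that is late to call recv() still receives the message. It uses Python for brevity and runs on localhost; the names ERROR_MESSAGE, serve_once, read_after_delay and demo are made up for this example, and the delay stands in for the event loop delay mentioned above.

```python
import socket
import threading
import time

# Hypothetical message; stands in for an error sent right before exit.
ERROR_MESSAGE = b"FATAL: hypothetical error sent just before the server exits\n"


def serve_once(listener):
    """Accept one connection, send the message, then close explicitly."""
    conn, _ = listener.accept()
    conn.sendall(ERROR_MESSAGE)
    # Half-close the sending side first, so the peer sees the queued data
    # followed by an orderly end of stream, then release the socket.
    conn.shutdown(socket.SHUT_WR)
    conn.close()


def read_after_delay(port, delay):
    """Connect, wait `delay` seconds, then read until end of stream."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        time.sleep(delay)  # simulates a client that is slow to read
        chunks = []
        while True:
            chunk = client.recv(4096)
            if not chunk:  # empty result means the peer closed its side
                break
            chunks.append(chunk)
    return b"".join(chunks)


def demo(delay=0.05):
    """Run sender and reader on localhost and return what the reader got."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()
    try:
        return read_after_delay(port, delay)
    finally:
        server.join()
        listener.close()


if __name__ == "__main__":
    print(demo())
```

The reader here never sends anything, so the sender has no unread input when it closes; that keeps the example focused on the orderly-close sequence rather than on the RST cases described above.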

--

Regards,
Lars Kanis



