On 2013-11-20 07:04, Vladislav Bogdanov wrote:
19.11.2013 13:48, Lars Ellenberg wrote:
On Wed, Nov 13, 2013 at 09:02:47AM +0300, Vladislav Bogdanov wrote:
13.11.2013 04:46, Jefferson Ogata wrote:
...
3. portblock preventing TCP Send-Q from draining, causing tgtd
connections to hang. I modified portblock to reverse the sense of the
iptables rules it was adding: instead of blocking traffic from the
initiator on the INPUT chain, it now blocks traffic from the target on
the OUTPUT chain with a tcp-reset response. With this setup, as soon as
portblock goes active, the next packet tgtd attempts to send to a given
initiator will get a TCP RST response, causing tgtd to hang up the
connection immediately. This configuration allows the connections to
terminate promptly under load.
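
For illustration, the reversed rule amounts to something like this (3260 is the standard iSCSI port; $TARGET_VIP is just a placeholder for the portal address being blocked, not an actual portblock variable):

    # roughly what portblock normally adds: drop the initiator's inbound traffic
    #   iptables -I INPUT -p tcp -d $TARGET_VIP --dport 3260 -j DROP
    # reversed sense: reject whatever tgtd tries to send out, answering the
    # local socket with an immediate RST
    iptables -I OUTPUT -p tcp -s $TARGET_VIP --sport 3260 -j REJECT --reject-with tcp-reset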

I'm not totally satisfied with this workaround. It means
acknowledgements of operations tgtd has actually completed never make it
back to the initiator. I suspect this could cause problems in some
scenarios. I don't think it causes a problem the way I'm using it, with
each LUN as backing store for a distinct VM--when the LUN is back up on
the other node, the outstanding operations are re-sent by the initiator.
Maybe with a clustered filesystem this would cause problems; it
certainly would cause problems if the target device were, for example, a
tape drive.

Maybe only block "new" incoming connection attempts?

That may cause issues on the initiator side in some circumstances (IIRC):
* connection is established
* pacemaker fires target move
* target is destroyed, connection breaks (TCP RST is sent to initiator)
* initiator connects again
* target is not available at the iSCSI level (but the portals answer, on
either the old or the new node), or the portals are not available at all
* initiator *returns an error* to the upper layer <- this one is important
* only then is the target configured on the other node

I was hit by this, but that was several years ago, so I may be missing
some details.

Indeed, using iptables REJECT with tcp-reset seems to piss off the initiators, creating immediate I/O errors. But one can use DROP on incoming SYN packets and let established connections drain (see the sketch below). I've been trying to get this to work but am finding that it takes so long for some connections to drain that something times out. I haven't given up on this approach, though. Testing this stuff can be tricky, because if I make one mistake, stonith kicks in and I end up having to wait 5-10 minutes for the machine to reboot and resync its DRBD devices.
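
For reference, the "only block new connections" variant looks roughly like this ($TARGET_VIP again being a placeholder for the portal address):

    # let established iSCSI sessions keep draining, but drop new connection attempts
    iptables -I INPUT -p tcp -d $TARGET_VIP --dport 3260 --syn -j DROP
    # equivalent, matching on conntrack state instead of the SYN flag:
    #   iptables -I INPUT -p tcp -d $TARGET_VIP --dport 3260 -m conntrack --ctstate NEW -j DROP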

My experience with IET and LIO shows it is better (safer) to block all
iSCSI traffic to the target's portals, in both directions (a rule sketch
follows the list):
* connection is established
* pacemaker fires target move
* both directions are blocked (DROP) on both target nodes
* target is destroyed, connection stays "established" on the initiator
side, the TCP packets just time out
* target is configured on other node (VIPs are moved too)
* firewall rules are removed
* initiator (re)sends request
* target sends RST (?) back - it doesn't have that connection
* initiator reconnects and continues to use target
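
A minimal sketch of that both-direction block, assuming a dedicated portal VIP and the default iSCSI port ($PORTAL_VIP is a placeholder):

    # silently drop everything to and from the portal while the target moves
    iptables -I INPUT  -p tcp -d $PORTAL_VIP --dport 3260 -j DROP
    iptables -I OUTPUT -p tcp -s $PORTAL_VIP --sport 3260 -j DROP
    # ... and delete the same rules once the target is up on the other node
    #   iptables -D INPUT  -p tcp -d $PORTAL_VIP --dport 3260 -j DROP
    #   iptables -D OUTPUT -p tcp -s $PORTAL_VIP --sport 3260 -j DROP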

As already noted, this approach doesn't work with TGT because it refuses to tear down its config until it has drained all of its connections, and they can't drain if ACK packets can't come in. The only reliable solution I've found so far is to send RSTs to tgtd (but leave the initiators in the dark).

I'm also using VIPs. They don't have to be bound to a specific target in a tgt configuration; you just have each initiator connect to a given target only using its unique VIP.
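
For example, with one VIP per target, each initiator only ever discovers and logs in through "its" address (the addresses and IQN below are made up):

    # discover and log in via the target-specific VIP only
    iscsiadm -m discovery -t sendtargets -p 192.168.100.11:3260
    iscsiadm -m node -T iqn.2013-11.example.com:vm01 -p 192.168.100.11:3260 --login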
