• Vitaliy Makkoveev [2023-10-04 23:38]:
On 5 Oct 2023, at 00:31, Alexander Bluhm <alexander.bl...@gmx.net> wrote:
On Wed, Oct 04, 2023 at 11:03:27PM +0300, Vitaliy Makkoveev wrote:
On Wed, Oct 04, 2023 at 09:13:59PM +0200, Alexander Bluhm wrote:
On Wed, Oct 04, 2023 at 08:42:48PM +0200, Kirill Miazine wrote:
If it happns again, could you send an 'ps axlww | grep ifconifg'
output? Then we see the wait channel where it hangs in the kernel.
$ ps axlww
UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TT TIME COMMAND
Here it happened again:
0 75339 23922 0 10 0 360 296 wg_ifq D+U p0 0:00.00
ifconfig wg1 destroy
wg_peer_destroy()
...
NET_LOCK();
while (!ifq_empty(&sc->sc_if.if_snd)) {
NET_UNLOCK();
tsleep_nsec(sc, PWAIT, "wg_ifq", 1000);
NET_LOCK();
}
NET_UNLOCK();
This net lock dance looks fishy. And the sleep has a timeout of 1
milli second. But that is may be per packet. So if you have a
long queue or the queue refills somehow, it will take forever.
I think the difference in the usage is constant traffic that keeps
the send queue full. The timeout hides the problem when there are
only a few packets.
This should ensure wg_qstart() will not dereference dying `peer'. Looks
crappy and potentially could block forever, but should work. However
netlock it unnecessary here. netlocked wg_output() could fill `if_snd'
while netlock released before tsleep(), so it serializes nothing but
stops packets processing.
Kirill, does this diff help?
I doubt that it changes much. When netlock is not taken, the queue
can still be filled with packets.
Removing this ugly netlock makes sense anyway. But without any
synchronisation just reading a variable feels wrong. Can we add a
read once like for mq_len in sys/mbuf.h? And the ifq_set_maxlen()
also looks very unsafe. For mbuf queues I added a mutex, interface
queues should do the same.
ok?
I guess this is uniprocessor machine, so synchronisation is not
related.
new diff doesn't prevent hang in test scenario either.
wg destroy would hang on both UP and MP machines -- the fresh Vultr test
machine is MP.
diff is ok mvs.
bluhm
Index: net/ifq.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/net/ifq.c,v
retrieving revision 1.50
diff -u -p -r1.50 ifq.c
--- net/ifq.c 30 Jul 2023 05:39:52 -0000 1.50
+++ net/ifq.c 4 Oct 2023 21:04:20 -0000
@@ -529,6 +529,14 @@ ifq_hdatalen(struct ifqueue *ifq)
return (len);
}
+void
+ifq_set_maxlen(struct ifqueue *ifq, unsigned int maxlen)
+{
+ mtx_enter(&ifq->ifq_mtx);
+ ifq->ifq_maxlen = maxlen;
+ mtx_leave(&ifq->ifq_mtx);
+}
+
unsigned int
ifq_purge(struct ifqueue *ifq)
{
Index: net/ifq.h
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/net/ifq.h,v
retrieving revision 1.38
diff -u -p -r1.38 ifq.h
--- net/ifq.h 30 Jul 2023 05:39:52 -0000 1.38
+++ net/ifq.h 4 Oct 2023 21:09:04 -0000
@@ -435,6 +435,7 @@ void ifq_deq_commit(struct ifqueue *,
void ifq_deq_rollback(struct ifqueue *, struct mbuf *);
struct mbuf *ifq_dequeue(struct ifqueue *);
int ifq_hdatalen(struct ifqueue *);
+void ifq_set_maxlen(struct ifqueue *, unsigned int);
void ifq_mfreem(struct ifqueue *, struct mbuf *);
void ifq_mfreeml(struct ifqueue *, struct mbuf_list *);
unsigned int ifq_purge(struct ifqueue *);
@@ -448,9 +449,8 @@ int ifq_deq_sleep(struct ifqueue *, st
const char *, volatile unsigned int *,
volatile unsigned int *);
-#define ifq_len(_ifq) ((_ifq)->ifq_len)
-#define ifq_empty(_ifq) (ifq_len(_ifq) == 0)
-#define ifq_set_maxlen(_ifq, _l) ((_ifq)->ifq_maxlen = (_l))
+#define ifq_len(_ifq) READ_ONCE((_ifq)->ifq_len)
+#define ifq_empty(_ifq) (ifq_len(_ifq) == 0)
static inline int
ifq_is_priq(struct ifqueue *ifq)
@@ -490,8 +490,8 @@ int ifiq_input(struct ifiqueue *, stru
int ifiq_enqueue(struct ifiqueue *, struct mbuf *);
void ifiq_add_data(struct ifiqueue *, struct if_data *);
-#define ifiq_len(_ifiq) ml_len(&(_ifiq)->ifiq_ml)
-#define ifiq_empty(_ifiq) ml_empty(&(_ifiq)->ifiq_ml)
+#define ifiq_len(_ifiq) READ_ONCE(ml_len(&(_ifiq)->ifiq_ml))
+#define ifiq_empty(_ifiq) (ifiq_len(_ifiq) == 0)
#endif /* _KERNEL */