On Thu, May 06, 2021 at 06:10:39PM +0200, Alexandr Nedvedicky wrote:
> Hello,
> 
> </snip>
> > > to be honest I have no idea what could be causing problems on those two 
> > > fairly
> > > distinct machines. The strange thing is that pf_test() currently does not 
> > > run in
> > > parallel. I don't quite understand why reverting my earlier change helps 
> > > here.
> > 
> > it could be two differents ways to trigger a bug somewhere else that
> > your commit expose.
> > 
> > the panic doesn't trigger in the same way on both machines:
> > - Olivier's machine seems to trigger it quickly (after some minutes)
> > - mine relatively slowly (~ once a day)
> 
>     Olivier's machine acts as AP, so it forwards packets between interfaces.
> 
>     If I remember correctly your machine is laptop/workstation, which
>     does not forward traffic. 

it doesn't forward, but it acts as a bridge: 2 physical networks cards
grouped in a bridge(4) with only few traffic (a network printer is on
other side, the bridge(4) is here because I had a sparse network card
and no physical-switch to put here).

>     the function, which we change back and forth here is
>     pf_state_key_link_reverse(), which is being called from pf_find_state() 
> here:
> 
> 1085 
> 1086         if (sk == NULL) {
> 1087                 if ((sk = RB_FIND(pf_state_tree, &pf_statetbl,
> 1088                     (struct pf_state_key *)key)) == NULL)
> 1089                         return (PF_DROP);
> 1090                 if (pd->dir == PF_OUT && pkt_sk &&
> 1091                     pf_compare_state_keys(pkt_sk, sk, pd->kif, pd->dir) 
> == 0)
> 1092                         pf_state_key_link_reverse(sk, pkt_sk);
> 1093                 else if (pd->dir == PF_OUT && pd->m->m_pkthdr.pf.inp &&
> 1094                     !pd->m->m_pkthdr.pf.inp->inp_pf_sk && !sk->inp)
> 1095                         pf_state_key_link_inpcb(sk, 
> pd->m->m_pkthdr.pf.inp);
> 1096         }
> 1097 
> 
>     the story in human words goes as follows:
> 
>       sk == NULL -> no matching state key was attached to packet, Thus we
>       have to search state key in state tree using RB_FIND()
> 
>       if we could find state key for packet in table, then we will try
>       to set up a 'shortcut', which can save us RB_FIND() later.
> 
>       1090 - 1092
>       the shortcut can be set up for outbound packet only (pd->dir PF_OUT),
>       which is also being forwarded (pkt_sk != NULL, indicates we are seeing
>       the packet for the second time pkt_sk holds state key for inbound
>       direction).  pf_compare_state_keys() is sanity check, it leaves a
>       debug message on system console on failure.
> 
>       So if it is outbound forwarded packet, we've seen earlier, we
>       set up a reverse link to save one RB_FIND() operation on next
>       forwarded packet, which matches the same state.
> 
>       1093 - 1095
>       creates similar shortcut for local bound packets. We put reference 
>       to state key into PCB linked to socket. This will save us RB_FIND()
>       operation for next local outbound, which matches the same state.
> 
> 
> given the bug seems to be triggered/uncovered by pf_state_key_link_reverse()
> is there any chance your laptop/workstation occasionally forwards packets?
> like doing NAT for vmd/qemu virtual machine?

I don't know bridge(4) internals, but it could make sense that it is
using such functions.

> if it is not the case then the question is how does it come we run
> pf_state_key_link_reverse()? which same as why pkt_sk is not NULL at line 
> 1090.
> 
> > 
> > I could try to run with your commit and see if I could trigger it more
> > easily or found some elements influencing it. I could try with GENERIC
> > for example to see if I still trigger the same assert() or if it is
> > more like Olivier.
> 
>     I need to think of how to further debug the thing.
> 
> > 
> > my LAN was several hosts with the same kernel and only this machine
> > trigger the panic, so it shouldn't be strictly linked to the
> > environment.
> > 

Thanks
-- 
Sebastien Marie

Reply via email to