On Mon, Feb 13, 2023 at 08:39:39AM +0100, Alexandr Nedvedicky wrote:
> this bug has been found and reported by Hrvoje@ [1].
> I took my chance and asked Hrvoje to test a small diff [2].
> I would like to ask for OK to commit this fix which makes
> Hrvoje's test box happy. Diff below is same to one found
> at bugs@. The thing is that pfsync_bulk_update() function
> must check first if there is anything to update, otherwise
> we may day due to NULL pointer dereference.
Your check makes sense, OK bluhm@
But who is protecting write access to sc->sc_bulk_next ?
I think it is exclusive netlock. This works as pfsync_input() is
a protocol input function which is still not running in parallel.
rw_enter_read(&pf_state_list.pfs_rwl) does not protect sc->sc_bulk_next
it is a read lock. mtx_enter(&pf_state_list.pfs_mtx) is not taken
for all writers to sc->sc_bulk_next.
Do you have plans to relax this code from exclusive to shared
netlock?
bluhm
> --------8<---------------8<---------------8<------------------8<--------
> diff --git a/sys/net/if_pfsync.c b/sys/net/if_pfsync.c
> index e2c86971336..1fa58f6fab9 100644
> --- a/sys/net/if_pfsync.c
> +++ b/sys/net/if_pfsync.c
> @@ -2464,6 +2464,11 @@ pfsync_bulk_update(void *arg)
> st = sc->sc_bulk_next;
> sc->sc_bulk_next = NULL;
>
> + if (st == NULL) {
> + rw_exit_read(&pf_state_list.pfs_rwl);
> + goto out;
> + }
> +
> for (;;) {
> if (st->sync_state == PFSYNC_S_NONE &&
> st->timeout < PFTM_MAX &&