On 9.4.2022. 1:51, Alexandr Nedvedicky wrote: > Hello, > > thank you for taking a look at it. > > </snip> >> I think there is a use after free in you diff. After you return >> from pfsync_delete_tdb() you must not access the TDB again. >> >> Comments inline. >> > </snip> >>> TAILQ_INIT(&sn->sn_upd_req_list); >>> - TAILQ_CONCAT(&sn->sn_upd_req_list, &sc->sc_upd_req_list, ur_entry); >>> + while (!TAILQ_EMPTY(&sc->sc_upd_req_list)) { >>> + ur = TAILQ_FIRST(&sc->sc_upd_req_list); >> Other loops have this idiom >> while ((ur = TAILQ_FIRST(&sc->sc_upd_req_list) != NULL) { >> > new diff version uses 'TAILQ_FIRST()' > > </snip> >>> @@ -1827,18 +1853,20 @@ pfsync_sendout(void) >>> subh = (struct pfsync_subheader *)(m->m_data + offset); >>> offset += sizeof(*subh); >>> >>> - mtx_enter(&sc->sc_tdb_mtx); >>> count = 0; >>> while ((t = TAILQ_FIRST(&sn.sn_tdb_q)) != NULL) { >>> - TAILQ_REMOVE(&sn.sn_tdb_q, t, tdb_sync_entry); >>> + TAILQ_REMOVE(&sn.sn_tdb_q, t, tdb_sync_snap); >> I think the TDB in tdb_sync_snap list may be freed. Maybe >> you should grab a refcount if you store them into a list. >> > I see. pfsync must grab the reference count in pfsync_update_tdb(), > where tdb entry is inserted into queue. new diff fixes that. > >>> pfsync_out_tdb(t, m->m_data + offset); >>> offset += sizeof(struct pfsync_tdb); >>> +#ifdef PFSYNC_DEBUG >>> + KASSERT(t->tdb_snapped == 1); >>> +#endif >>> + t->tdb_snapped = 0; >> This may also be use after free. > new diffs drops the reference as soon as pfsync(4) dispatches > the tdb into pfsync packet. > > </snip> >>> @@ -2525,7 +2565,17 @@ pfsync_delete_tdb(struct tdb *t) >>> >>> mtx_enter(&sc->sc_tdb_mtx); >>> >>> + /* >>> + * if tdb entry is just being processed (found in snapshot), >>> + * then it can not be deleted. we just came too late >>> + */ >>> + if (t->tdb_snapped) { >>> + mtx_leave(&sc->sc_tdb_mtx); >> You must not access the TDB after this point. I thnik you cannot >> guarantee that. The memory will freed after return. > new diff fixes that > > </snip> >>> diff --git a/sys/netinet/ip_ipsp.h b/sys/netinet/ip_ipsp.h >>> index c697994047b..ebdb6ada1ae 100644 >>> --- a/sys/netinet/ip_ipsp.h >>> +++ b/sys/netinet/ip_ipsp.h >>> @@ -405,6 +405,7 @@ struct tdb { /* tunnel >>> descriptor block */ >>> u_int8_t tdb_wnd; /* Replay window */ >>> u_int8_t tdb_satype; /* SA type (RFC2367, PF_KEY) */ >>> u_int8_t tdb_updates; /* pfsync update counter */ >>> + u_int8_t tdb_snapped; /* dispatched by pfsync(4) */ >> u_int8_t is not atomic. I you want to change tdb_snapped, you need >> a mutex that also protects the othere fields in the same 32 bit >> word everywhere. I think a new flag TDBF_PFSYNC_SNAPPED in tdb_flags >> would be easier. The tdb_flags are protected by tdb_mtx. >> > I like your idea. > > updated diff is below. > > > thanks and > regards > sashan
Hi, I'm running this diff with bluhm@ pfsync mpfloor diff for 4 days on production firewalls where panics were found and for now everything seems normal ..