Re: IPsec tdb ref counting

2021-11-25 Thread Vitaliy Makkoveev
On Thu, Nov 25, 2021 at 09:52:54AM +0100, Alexander Bluhm wrote: > On Sat, Nov 13, 2021 at 06:04:07PM +0100, Alexander Bluhm wrote: > > When testing, please check for tdb leaks. > > The diff below was running on my performance setup for more than > 10 hours. iked SA lifetime was about 10

Re: IPsec tdb ref counting

2021-11-25 Thread Hrvoje Popovski
On 25.11.2021. 9:52, Alexander Bluhm wrote: > On Sat, Nov 13, 2021 at 06:04:07PM +0100, Alexander Bluhm wrote: >> When testing, please check for tdb leaks. > The diff below was running on my performance setup for more than > 10 hours. iked SA lifetime was about 10 seconds. ipsecctl -F; > vmstat

Re: IPsec tdb ref counting

2021-11-25 Thread Alexander Bluhm
On Sat, Nov 13, 2021 at 06:04:07PM +0100, Alexander Bluhm wrote: > When testing, please check for tdb leaks. The diff below was running on my performance setup for more than 10 hours. iked SA lifetime was about 10 seconds. ipsecctl -F; vmstat -m showed no leak. Running regress passed. Hrvoje

Re: IPsec tdb ref counting

2021-11-24 Thread Vitaliy Makkoveev
> On 24 Nov 2021, at 22:14, Tobias Heider wrote: > > On Wed, Nov 24, 2021 at 03:52:26PM +0100, Alexander Bluhm wrote: >> On Wed, Nov 24, 2021 at 05:12:36PM +0300, Vitaliy Makkoveev wrote: >>> Understood. But his means we encoded double unref when we calling >>> tdb_unref() just after

Re: IPsec tdb ref counting

2021-11-24 Thread Vitaliy Makkoveev
It was not clean for me because we has no extra reference for parallel tdb_delete(). Now I understand how it should work and why ‘TDBF_DELETED’ fixed this. Thanks for explanation. > On 24 Nov 2021, at 17:52, Alexander Bluhm wrote: > > On Wed, Nov 24, 2021 at 05:12:36PM +0300, Vitaliy Makkoveev

Re: IPsec tdb ref counting

2021-11-24 Thread Tobias Heider
On Wed, Nov 24, 2021 at 03:52:26PM +0100, Alexander Bluhm wrote: > On Wed, Nov 24, 2021 at 05:12:36PM +0300, Vitaliy Makkoveev wrote: > > Understood. But his means we encoded double unref when we calling > > tdb_unref() just after tdb_delete(tdb). To me it looks better to avoid > > this and rework

Re: IPsec tdb ref counting

2021-11-24 Thread Alexander Bluhm
On Wed, Nov 24, 2021 at 05:12:36PM +0300, Vitaliy Makkoveev wrote: > Understood. But his means we encoded double unref when we calling > tdb_unref() just after tdb_delete(tdb). To me it looks better to avoid > this and rework handlers like below: I have tried this before. It creates a tdb leak.

Re: IPsec tdb ref counting

2021-11-24 Thread Vitaliy Makkoveev
On Wed, Nov 24, 2021 at 02:12:15PM +0100, Alexander Bluhm wrote: > On Wed, Nov 24, 2021 at 02:48:06AM +0300, Vitaliy Makkoveev wrote: > > >> ddb{2}> show panic > > >> *cpu2: pool_do_get: tdb free list modified: page 0x8801; > > >> item addr 0 > > >> x8801c998; offset

Re: IPsec tdb ref counting

2021-11-24 Thread Alexander Bluhm
On Wed, Nov 24, 2021 at 02:48:06AM +0300, Vitaliy Makkoveev wrote: > >> ddb{2}> show panic > >> *cpu2: pool_do_get: tdb free list modified: page 0x8801; item > >> addr 0 > >> x8801c998; offset 0x28=0xdeadbeee > >> > >> ddb{2}> show tdb /f 0x8801c998 > >> tdb at

Re: IPsec tdb ref counting

2021-11-23 Thread Vitaliy Makkoveev
> On 23 Nov 2021, at 18:16, Tobias Heider wrote: > > On Tue, Nov 23, 2021 at 02:18:26PM +0100, Alexander Bluhm wrote: >> On Tue, Nov 23, 2021 at 06:54:59AM +0100, Hrvoje Popovski wrote: >>> after 24 hours hitting sasyncd setup one box panic >> >> Thanks for testing. >> >> I have reduced my

Re: IPsec tdb ref counting

2021-11-23 Thread Hrvoje Popovski
On 23.11.2021. 14:18, Alexander Bluhm wrote: > On Tue, Nov 23, 2021 at 06:54:59AM +0100, Hrvoje Popovski wrote: >> after 24 hours hitting sasyncd setup one box panic > > Thanks for testing. > > I have reduced my iked lifetime to about 10 seconds and got the > same panic on my new 8 core test

Re: IPsec tdb ref counting

2021-11-23 Thread Tobias Heider
On Tue, Nov 23, 2021 at 02:18:26PM +0100, Alexander Bluhm wrote: > On Tue, Nov 23, 2021 at 06:54:59AM +0100, Hrvoje Popovski wrote: > > after 24 hours hitting sasyncd setup one box panic > > Thanks for testing. > > I have reduced my iked lifetime to about 10 seconds and got the > same panic on

Re: IPsec tdb ref counting

2021-11-23 Thread Alexander Bluhm
On Tue, Nov 23, 2021 at 06:54:59AM +0100, Hrvoje Popovski wrote: > after 24 hours hitting sasyncd setup one box panic Thanks for testing. I have reduced my iked lifetime to about 10 seconds and got the same panic on my new 8 core test machine. ddb{2}> trace db_enter() at db_enter+0x10

Re: IPsec tdb ref counting

2021-11-23 Thread Vitaliy Makkoveev
On Tue, Nov 23, 2021 at 06:54:59AM +0100, Hrvoje Popovski wrote: > On 21.11.2021. 23:36, Alexander Bluhm wrote: > > Updated tdb refcounting diff after merging with mvs@'s commit. > > Hi, > > after 24 hours hitting sasyncd setup one box panic > > r620-2# panic: pool_do_get: tdb free list

Re: IPsec tdb ref counting

2021-11-22 Thread Hrvoje Popovski
On 21.11.2021. 23:36, Alexander Bluhm wrote: > Updated tdb refcounting diff after merging with mvs@'s commit. Hi, after 24 hours hitting sasyncd setup one box panic r620-2# panic: pool_do_get: tdb free list modified: page 0x812e; item addr 0 x812e2a88; offset 0x28=0xdead410f

Re: IPsec tdb ref counting

2021-11-21 Thread Alexander Bluhm
Updated tdb refcounting diff after merging with mvs@'s commit. Index: net/if_bridge.c === RCS file: /data/mirror/openbsd/cvs/src/sys/net/if_bridge.c,v retrieving revision 1.358 diff -u -p -r1.358 if_bridge.c --- net/if_bridge.c

Re: IPsec tdb ref counting

2021-11-20 Thread Alexander Bluhm
On Sun, Nov 21, 2021 at 03:17:08AM +0300, Vitaliy Makkoveev wrote: > Can I propose to commit the diff [1] to expose the TDB hardlimit excess > statistics and then test this diff? I want to see stable systems with > non null "TDBs with hard limit excess" couter in the statistics like > below. Yes

Re: IPsec tdb ref counting

2021-11-20 Thread Vitaliy Makkoveev
On Sat, Nov 20, 2021 at 05:31:33PM +0100, Alexander Bluhm wrote: > New tdb refcounting diff. > > I delete and unref the timeouts earlier and fixed some leaks. At > least on my machine it does not crash and tcp InUse is 0 after > ipsecctl -F . > > Please test. > > mvs: Are some of your

Re: IPsec tdb ref counting

2021-11-20 Thread Alexander Bluhm
New tdb refcounting diff. I delete and unref the timeouts earlier and fixed some leaks. At least on my machine it does not crash and tcp InUse is 0 after ipsecctl -F . Please test. mvs: Are some of your concerns resolved by deleting the tdb_reaper() ? bluhm Index: net/if_bridge.c

Re: IPsec tdb ref counting

2021-11-19 Thread Vitaliy Makkoveev
I want to note all the people who testing this diff. Please be sure your test exceeds the lifetime of the `tdb’ and some rekeying cycles have been made. > On 19 Nov 2021, at 20:09, Alexander Bluhm wrote: > > Hi, > > I got a new hardware in my testlab and could panic this machine > easily with

Re: IPsec tdb ref counting

2021-11-19 Thread Alexander Bluhm
Hi, I got a new hardware in my testlab and could panic this machine easily with the tdb refcount diff. Adding tdb refcounts for timeout fixes this for me. This allows to get rid of the timeout reaper. I refcount all successful timeout_add() and decrease in timeout handler. There are no

Re: IPsec tdb ref counting

2021-11-17 Thread Hrvoje Popovski
On 16.11.2021. 0:08, Hrvoje Popovski wrote: > On 15.11.2021. 16:44, Hrvoje Popovski wrote: >> On 15.11.2021. 15:04, Vitaliy Makkoveev wrote: >>> On Mon, Nov 15, 2021 at 02:51:16PM +0100, Hrvoje Popovski wrote: >>> >>> And you don'n see "> tdb_free() killing ourself" in dmesg >>> output? >>

Re: IPsec tdb ref counting

2021-11-15 Thread Hrvoje Popovski
On 15.11.2021. 16:44, Hrvoje Popovski wrote: > On 15.11.2021. 15:04, Vitaliy Makkoveev wrote: >> On Mon, Nov 15, 2021 at 02:51:16PM +0100, Hrvoje Popovski wrote: >> >> And you don'n see "> tdb_free() killing ourself" in dmesg >> output? > > > I couldn't find that message anywhere >

Re: IPsec tdb ref counting

2021-11-15 Thread Hrvoje Popovski
On 15.11.2021. 15:04, Vitaliy Makkoveev wrote: > On Mon, Nov 15, 2021 at 02:51:16PM +0100, Hrvoje Popovski wrote: > > And you don'n see "> tdb_free() killing ourself" in dmesg > output? I couldn't find that message anywhere

Re: IPsec tdb ref counting

2021-11-15 Thread Vitaliy Makkoveev
On Sun, Nov 14, 2021 at 10:50:34PM +0100, Alexander Bluhm wrote: > On Sat, Nov 13, 2021 at 06:04:07PM +0100, Alexander Bluhm wrote: > > It passes regress but there are setups that are not covered. Bridge > > and pfsync with IPsec and TCP signature need special care. > > > > When testing, please

Re: IPsec tdb ref counting

2021-11-15 Thread Vitaliy Makkoveev
On Mon, Nov 15, 2021 at 02:51:16PM +0100, Hrvoje Popovski wrote: And you don'n see "> tdb_free() killing ourself" in dmesg output? > On 15.11.2021. 13:11, Vitaliy Makkoveev wrote: > > Hi, > > > > Could you try this diff? It should still panic, but I suspect to see > > ">

Re: IPsec tdb ref counting

2021-11-15 Thread Hrvoje Popovski
On 15.11.2021. 13:11, Vitaliy Makkoveev wrote: > Hi, > > Could you try this diff? It should still panic, but I suspect to see > "> tdb_free() killing ourself" string. panic with your diff r620-1# panic: kernel diagnostic assertion "refcnt != ~0" failed: file "/sys/kern/kern_synch.c",

Re: IPsec tdb ref counting

2021-11-15 Thread Vitaliy Makkoveev
On Mon, Nov 15, 2021 at 12:35:13PM +0100, Hrvoje Popovski wrote: > On 14.11.2021. 22:50, Alexander Bluhm wrote: > > New diff with fix from mvs@. Please continue testing with this one. > > Hi, > > i've applied this diff on sasyncd setup with two ipsec sessions and i'm > getting this panic. Box

Re: IPsec tdb ref counting

2021-11-15 Thread Hrvoje Popovski
On 14.11.2021. 22:50, Alexander Bluhm wrote: > New diff with fix from mvs@. Please continue testing with this one. Hi, i've applied this diff on sasyncd setup with two ipsec sessions and i'm getting this panic. Box didn't panic instantly but after some time. I will leave ddb console active...

Re: IPsec tdb ref counting

2021-11-14 Thread Alexander Bluhm
On Sat, Nov 13, 2021 at 06:04:07PM +0100, Alexander Bluhm wrote: > It passes regress but there are setups that are not covered. Bridge > and pfsync with IPsec and TCP signature need special care. > > When testing, please check for tdb leaks. The vmstat -m tdb in use > culumn must be 0. It it

Re: IPsec tdb ref counting

2021-11-14 Thread Stuart Henderson
On 2021/11/14 03:12, Vitaliy Makkoveev wrote: > Hi, > > Do you have panics with this diff? Running now, I don't hit the splassert traces with this one. I will let you know if there is a panic. > Index: sys/net/if_bridge.c > === >

Re: IPsec tdb ref counting

2021-11-13 Thread Vitaliy Makkoveev
Hi, Do you have panics with this diff? Index: sys/net/if_bridge.c === RCS file: /cvs/src/sys/net/if_bridge.c,v retrieving revision 1.358 diff -u -p -r1.358 if_bridge.c --- sys/net/if_bridge.c 11 Nov 2021 18:08:17 - 1.358

Re: IPsec tdb ref counting

2021-11-13 Thread Vitaliy Makkoveev
On Sat, Nov 13, 2021 at 09:49:31PM +, Stuart Henderson wrote: > On 2021/11/13 18:04, Alexander Bluhm wrote: > > Hi, > > > > To make IPsec MP safe we need refcounting for the tdb. The diff > > below is part of something bigger we have at genua. Although it > > does not cover timeouts and the

Re: IPsec tdb ref counting

2021-11-13 Thread Stuart Henderson
On 2021/11/13 23:05, Stuart Henderson wrote: > On 2021/11/13 22:41, Stuart Henderson wrote: > > On 2021/11/13 21:49, Stuart Henderson wrote: > > > On 2021/11/13 18:04, Alexander Bluhm wrote: > > > > Hi, > > > > > > > > To make IPsec MP safe we need refcounting for the tdb. The diff > > > > below

Re: IPsec tdb ref counting

2021-11-13 Thread Stuart Henderson
On 2021/11/13 22:41, Stuart Henderson wrote: > On 2021/11/13 21:49, Stuart Henderson wrote: > > On 2021/11/13 18:04, Alexander Bluhm wrote: > > > Hi, > > > > > > To make IPsec MP safe we need refcounting for the tdb. The diff > > > below is part of something bigger we have at genua. Although it

Re: IPsec tdb ref counting

2021-11-13 Thread Stuart Henderson
On 2021/11/13 21:49, Stuart Henderson wrote: > On 2021/11/13 18:04, Alexander Bluhm wrote: > > Hi, > > > > To make IPsec MP safe we need refcounting for the tdb. The diff > > below is part of something bigger we have at genua. Although it > > does not cover timeouts and the tdb reaper yet, I

Re: IPsec tdb ref counting

2021-11-13 Thread Stuart Henderson
On 2021/11/13 18:04, Alexander Bluhm wrote: > Hi, > > To make IPsec MP safe we need refcounting for the tdb. The diff > below is part of something bigger we have at genua. Although it > does not cover timeouts and the tdb reaper yet, I want to get this > in as a frist step. > > It passes

IPsec tdb ref counting

2021-11-13 Thread Alexander Bluhm
Hi, To make IPsec MP safe we need refcounting for the tdb. The diff below is part of something bigger we have at genua. Although it does not cover timeouts and the tdb reaper yet, I want to get this in as a frist step. It passes regress but there are setups that are not covered. Bridge and