Re: npf lock issue?

2017-05-29 Thread Anthony Mallet
On Friday 21 Apr 2017, at 23:41, Anthony Mallet wrote:
> Yes. npf does not seem to be involved at all. I kept the same message
> subject for consistency but I guess a new thread should be started.

Closing this thread: npf is not involved at all.
For the record, the issue is discussed in PR/52252, with an
explanation/possible fix.


Re: npf lock issue?

2017-04-21 Thread Anthony Mallet
On Friday 21 Apr 2017, at 22:15, Mindaugas Rasiukevicius wrote:
| > #11 0x804b3075 in panic (
| > fmt=fmt@entry=0x806b6790 "uvm_km_check_empty: va %p
| > has pa 0x%llx") at /usr/src/sys/kern/subr_prf.c:258
| > #12 0x8044ed05 in uvm_km_check_empty (
| > map=map@entry=0x8081c780 ,
| > start=, end=18446744071572586496) at
| > /usr/src/sys/uvm/uvm_km.c:563
| > #13 0x8045268f in uvm_map (
| > map=map@entry=0x8081c780 ,
| > startp=startp@entry=0xfe80cc383918, size=size@entry=65536,
| > uobj=, uoffset=uoffset@entry=-1,
| > align=, flags=,
| > flags@entry=5927) at /usr/src/sys/uvm/uvm_map.c:1096
| > #14 0x8044ee4f in uvm_km_alloc (
| > map=0x8081c780 ,
| > size=size@entry=65536, align=align@entry=4096,
| > flags=flags@entry=49) at /usr/src/sys/uvm/uvm_km.c:621
| > #15 0x80240a4d in alloc_chunk (size=65536)
| > at
| > /usr/src/sys/external/bsd/sljit/dist/sljit_src/sljitExecAllocator.c:110
| > #16 sljit_malloc_exec (size=)
| > at
| > /usr/src/sys/external/bsd/sljit/dist/sljit_src/sljitExecAllocator.c:221
| > 221 header = (struct block_header*)alloc_chunk(chunk_size);
| >
| > Does this ring a bell to anyone?
|
| This looks like a bug in sljit rather than NPF per se.  The panic
| message suggests some kind of KVA leak.  I suspect it might be a
| result of e.g. a free_chunk() call with an incorrect size in the
| sljitExecAllocator.c code.

Yes. npf does not seem to be involved at all. I kept the same message
subject for consistency but I guess a new thread should be started.

Actually, sljit does not seem involved either. After my post, I noticed
that anything that tries to uvm_km_alloc() from module_map has the
same failure. On the failing machine, options MODULAR is used but I
have no modules. So npf happens to be the first piece of code
allocating memory from module_map (sljit actually). But, for instance,
a modload(1) right after boot also fails in exactly the same way.

The page that uvm_km_check_empty() finds already mapped is the very
first one of module_map. I checked the latest commits in uvm since
January without noticing anything suspect (and I don't have
UVM_HOTPLUG defined either). But my knowledge of uvm is
nearly zero ... so I did not really progress on this.


Re: npf lock issue?

2017-04-21 Thread Mindaugas Rasiukevicius
Anthony Mallet wrote:
> | Trying to upgrade from 7.99.44 to today's -current, I have a panic
> | right away when starting npf. The boot with npf disabled is fine (see
> | note below), then when manually running `npfctl reload` the machine
> | reboots right away with absolutely no diagnostic. This is an issue
> | that I have been experiencing consistently since something like last
> | January or so.
> 
> I got a useful backtrace, it's actually failing in sljit:
> 
> #11 0x804b3075 in panic (
> fmt=fmt@entry=0x806b6790 "uvm_km_check_empty: va %p has pa 0x%llx")
> at /usr/src/sys/kern/subr_prf.c:258
> #12 0x8044ed05 in uvm_km_check_empty (
> map=map@entry=0x8081c780 , 
> start=, end=18446744071572586496)
> at /usr/src/sys/uvm/uvm_km.c:563
> #13 0x8045268f in uvm_map (
> map=map@entry=0x8081c780 , 
> startp=startp@entry=0xfe80cc383918, size=size@entry=65536, 
> uobj=, uoffset=uoffset@entry=-1, align=, 
> flags=, flags@entry=5927) at 
> /usr/src/sys/uvm/uvm_map.c:1096
> #14 0x8044ee4f in uvm_km_alloc (
> map=0x8081c780 , size=size@entry=65536, 
> align=align@entry=4096, flags=flags@entry=49)
> at /usr/src/sys/uvm/uvm_km.c:621
> #15 0x80240a4d in alloc_chunk (size=65536)
> at /usr/src/sys/external/bsd/sljit/dist/sljit_src/sljitExecAllocator.c:110
> #16 sljit_malloc_exec (size=)
> at /usr/src/sys/external/bsd/sljit/dist/sljit_src/sljitExecAllocator.c:221
> 221 header = (struct block_header*)alloc_chunk(chunk_size);
> 
> Does this ring a bell to anyone?

This looks like a bug in sljit rather than NPF per se.  The panic message
suggests some kind of KVA leak.  I suspect it might be a result of e.g. a
free_chunk() call with an incorrect size in the sljitExecAllocator.c code.

Alex -- do you want to have a look into this?

-- 
Mindaugas