Re: Problem with autounload of nfsserver module
m...@eterna.com.au (matthew green) writes: i've long thought we should not auto unload by default. users can cause modules to be loaded but they cannot unload them. How do you solve that without auto unload?
Compat module auto-bounce
ttioctl() in sys/tty.c ends by calling

(void)module_autoload("compat", MODULE_CLASS_ANY);

for ioctls that it doesn't know about. This causes the compat module to auto-bounce in and out a lot. Note that this happens even for an up-to-date userland that doesn't need the compat code. E.g. ttyname(3) uses TIOCPTSNAME, which ttioctl() doesn't handle. Running vi on the console seems to cause compat.mod to be autoloaded twice.

This seems rather wasteful.

-uwe
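To make the cost concrete, here is a minimal userland sketch, not kernel code, that compares the behaviour described above with one possible mitigation that is not proposed in the message: only requesting the module when the command is one the compat code could handle. All names here (the CMD_* values, fake_autoload_compat(), is_compat_cmd()) are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command numbers; these are not real ioctl values. */
enum {
	CMD_MODERN_UNHANDLED = 1,	/* e.g. something like TIOCPTSNAME */
	CMD_OLD_COMPAT_A = 100,
	CMD_OLD_COMPAT_B = 101
};

/* Counts simulated autoload requests. */
static int autoload_requests;

/* Stand-in for the real autoload call; it only counts the request. */
static void
fake_autoload_compat(void)
{
	autoload_requests++;
}

/* Hypothetical table of commands that the compat code implements. */
static bool
is_compat_cmd(unsigned long cmd)
{
	static const unsigned long compat_cmds[] = {
		CMD_OLD_COMPAT_A, CMD_OLD_COMPAT_B
	};
	size_t i;

	for (i = 0; i < sizeof(compat_cmds) / sizeof(compat_cmds[0]); i++) {
		if (compat_cmds[i] == cmd)
			return true;
	}
	return false;
}

/* Behaviour described above: any unknown command requests the module. */
static void
unknown_ioctl_always(unsigned long cmd)
{
	(void)cmd;
	fake_autoload_compat();
}

/* Assumed alternative: request the module only for compat commands. */
static void
unknown_ioctl_filtered(unsigned long cmd)
{
	if (is_compat_cmd(cmd))
		fake_autoload_compat();
}
```

With the filtered variant, a modern but unhandled command never touches the autoload path, while the old commands still do. Whether such a table is practical to maintain in the real tty code is an open question.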
Re: low memory problem - pool calls vmem calls uvm calls pool etc.
On 12/09/2013 10:05 PM, David Sainty wrote:

On 04/12/13 01:07, Lars Heidieker wrote:

The recursion happened due to freeing to the wrong arena. Take a look at the pool_allocator_meta and how vmem recurses. It's one vmem arena (kmem_va_arena) using quantum caches and one (actually two kmem_meta_arena(s) stacked on top of kmem_meta_arena) for meta allocations, e.g. pool_allocator_meta. Those arenas are stacked up on one arena without quantum caching (kmem_arena). So normal allocations utilize this cache, but no pool_caches are involved for meta allocations. So it might recurse once while switching to allocate via the kmem_meta_arena, and from there on no pool_caches are involved. Allocations via vmem_alloc must be freed via vmem_free, and vmem_xalloc via vmem_xfree; mixing them, at least with quantum caches involved, will lead to inconsistencies.

It occurs to me that disabling the quantum cache on the meta arenas may be specifically to avoid this problem. But this problem could also, I think, be resolved by ensuring that quantum caching is inhibited in vmem_free's in, and only in, ENOMEM paths. Then quantum caching could be used for meta arenas too. Is it worth going to that effort? Would it be much of a win?

Quantum caching must be used symmetrically. I'm not sure if the vmem_free paths are ok with quantum caching for the meta arenas... But the vmem_alloc paths aren't really: you have to keep a reserve to allocate new boundary tags, and this reserve is hard to calculate if allocation recurses more than once due to cache group allocations for the pool_cache. This will (if possible at all) require serializing the allocation path right through the pool_cache for meta allocations. Allocations from the meta arenas don't happen frequently, so I don't think there will be any difference.
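The symmetry rule above (vmem_alloc pairs with vmem_free, vmem_xalloc pairs with vmem_xfree) can be illustrated with a toy model. This is only a sketch under strong simplifying assumptions: it tracks two counters instead of real segments, and struct toy_arena and the toy_* functions are hypothetical names, not the vmem(9) API.

```c
#include <stdbool.h>

/*
 * Toy model: an arena fronted by a quantum cache.  The cache and the
 * arena each keep their own count of what they have handed out.
 */
struct toy_arena {
	int cache_outstanding;	/* objects handed out via the quantum cache */
	int arena_outstanding;	/* segments handed out directly by the arena */
};

/* Cached path, playing the role of vmem_alloc for small sizes. */
static void
toy_alloc(struct toy_arena *a)
{
	a->cache_outstanding++;
}

/* Cached free, playing the role of vmem_free for small sizes. */
static void
toy_free(struct toy_arena *a)
{
	a->cache_outstanding--;
}

/* Direct path that bypasses the cache, playing the role of vmem_xalloc. */
static void
toy_xalloc(struct toy_arena *a)
{
	a->arena_outstanding++;
}

/* Direct free, playing the role of vmem_xfree. */
static void
toy_xfree(struct toy_arena *a)
{
	a->arena_outstanding--;
}

/* Neither layer may have more objects returned than it handed out. */
static bool
toy_consistent(const struct toy_arena *a)
{
	return a->cache_outstanding >= 0 && a->arena_outstanding >= 0;
}
```

Pairing toy_alloc with toy_free and toy_xalloc with toy_xfree keeps both counters at zero. Pairing toy_xalloc with toy_free leaves the arena believing a segment is still allocated while the cache has received an object it never handed out, which is the kind of inconsistency described above.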
Re: BPF memstore and bpf_validate_ext()
Alexander Nasonov al...@yandex.ru wrote:

All your recent changes to adapt bpfjit for npf show that you're hitting sljit limitations all the time. Your expectations of sljit are probably too high.

We are discussing the external memory store, which is useful regardless of whether the BPF program is JIT-compiled or not. It is a win in both cases. Also, it was you who proposed sljit. It can optimise *most* practical cases (80-20 rule) and I am happy with that. I do not understand why you are concerned about those rare/unusual cases. Do you have some particular application in mind? Something else than in our tree?

...

When I later was implementing copfunc calls in bpfjit, I didn't know how you were going to use external memory and I felt that it wasn't necessary (and I still do), so I changed your structs.

We can pass the memstore pointer as a separate argument (it would be three arguments, fine for sljit), but what's the point..

...

By not making memory external I avoided doing a lot of changes in bpfjit to adapt to a new execution model. Now you want to have *both* external memory and copfuncs for a faster execution of your code without considering other uses of bpfjit. To be honest, I'm not very interested in implementing external memory in bpfjit. Quite the opposite, after all these discussions I want to improve my index check optimization and make filters like 'host xxx' run a bit faster. Needless to say that this optimization is the most effective if bpf memory isn't visible from outside.

Why are you ignoring the fact that your optimisations can still be added and be effective? I already suggested - we can add a flag to indicate that the caller does not care about the result in the memory store. Moreover, the usual byte-code produced by tcpdump/pcap does not even use the memory store, so your optimisations would most of the time be applicable anyway!

This discussion is going nowhere. At this point, someone from outside should review our opinions and help us to make a decision.
Alex -- Mindaugas
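For readers following the argument, the suggested flag can be sketched roughly as below. This is a toy stand-in, not the bpf(4) or bpfjit API: struct toy_args, TOY_F_MEM_DISCARD and toy_filter() are hypothetical names, and the "program" is hard-coded (store the packet length in M[0], return M[0] + 1). The only detail taken from classic BPF is that the scratch memory store has 16 words.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MEMWORDS		16	/* classic BPF has 16 scratch words */
#define TOY_F_MEM_DISCARD	0x1	/* hypothetical: caller ignores M[] */

/* Hypothetical argument pack carrying an optional external memory store. */
struct toy_args {
	uint32_t	*mem;	/* external store, TOY_MEMWORDS words, or NULL */
	unsigned	 flags;
};

/*
 * Toy "filter": M[0] = pktlen; return M[0] + 1.
 * If the caller passes no external store, or sets TOY_F_MEM_DISCARD,
 * the scratch words live in a private local array.  That is the case
 * in which an optimiser would be free to assume nobody else observes
 * the memory store.
 */
static uint32_t
toy_filter(uint32_t pktlen, const struct toy_args *args)
{
	uint32_t local[TOY_MEMWORDS];
	uint32_t *m;

	if (args == NULL || args->mem == NULL ||
	    (args->flags & TOY_F_MEM_DISCARD) != 0) {
		memset(local, 0, sizeof(local));
		m = local;
	} else {
		m = args->mem;
	}

	m[0] = pktlen;
	return m[0] + 1;
}
```

The return value is the same in both modes; only the visibility of M[] to the caller differs. The objection raised in the reply is about the cost of maintaining both code-generation modes in a JIT, which a toy like this does not capture.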
Re: BPF memstore and bpf_validate_ext()
Mindaugas Rasiukevicius wrote:

Also, it was you who proposed sljit.

Proposed for what? I implemented bpfjit using sljit, if that's what you mean. I offered you help with implementing a jit compiler for npfcode. It was your idea to add COP/COPX and I agreed to implement support for it in bpfjit. I never agreed on implementing external memory.

It can optimise *most* practical cases (80-20 rule) and I am happy with that. I do not understand why you are concerned about those rare/unusual cases. Do you have some particular application in mind? Something else than in our tree?

I don't have any application in mind, but I don't understand why you are pushing two extensions to bpf solely to get a performance benefit for your cases, and you don't care that bpf loses performance even if there are no cop instructions in a program at all.

We can pass the memstore pointer as a separate argument (it would be three arguments, fine for sljit), but what's the point..

My point is that you mix the argument pack with something else. They should be separated.

Why are you ignoring the fact that your optimisations can still be added and be effective? I already suggested - we can add a flag to indicate that the caller does not care about the result in the memory store.

I already offered to support SLJIT_FAST_CALL copfuncs in bpfjit. They're much faster than regular copfuncs. But that means you will need to emit sljit code, and you will have a limited number of sljit registers and all the other limitations of sljit. You still should be able to copy data from the auxiliary argument to the memstore, and you can do it quite fast. You didn't respond to me about it. If you ignored it because you don't want to deal with sljit, then you're pulling a blanket.

It's easy to suggest having a flag, but it's actually a lot of work. You need to write several lines of C code to generate a single instruction. I don't want to maintain two different modes of code generation.
If you want this flag, go ahead, write the code, write the tests and everyone will be happy.

Moreover, the usual byte-code produced by tcpdump/pcap does not even use the memory store, so your optimisations would most of the time be applicable anyway!

Maybe in this case. I don't know all use cases. There are some IDS/IPS that use bpf but I never looked at them. In any case, this functionality will have to be tested.

Alex