Mindaugas Rasiukevicius wrote: > Now that the BPF memory store can be external and can be provided by the > caller of the BPF program, we can use to it pass some values or reuse it > as a cache.
I don't think that external memory is needed. You added an auxiliary agrument, what is it for if it's not for passing some values? External memory disables several optimizations in bpfjit for most filter programs even if they don't use BPF_COP. 1. sljit has a limited number of registers especially memory addressable registers. I will have to allocate a register to a memstore pointer. Because all registers are already assigned, I will have to start moving data. The best I can think of is assigning BPFJIT_TMP2 to a new pointer but I will need to switch to EREG register in 32bit BPF_LD. This register is emulated on some arches, for instance on i386. 2. I have (or plan to have) some simple optimizations which aren't possible by rewriting a bpf program. For instance, if there are multiple loads in a linear block, I will only generate one index check. Here is an example: LD+W(0) ST(0) LD+W(4) ST(1) LD+W(8) ST(2) ... It loads packet words with increasing offsets and stores them in memwords for later processing. I generate only only one index check for the first LD instruction (return 0 is packet is shorter than 12 bytes). I can exit early without storing any word because I know that memory will not be available after that bpf program returns. If you make it external, I will have to generate three index checks to make sure that stores are visible to a caller for all possible packet lengths. > However, this has to be supported by bpf_validate() - we have > to tell it which words are going to be initialised by the caller, since it > currently checks and prevents loads of the uninitialised memstore words. > > Hence, here is the proposed API function for this: > > int bpf_set_initwords(bpf_ctx_t *bc, uint8_t *wordmask, size_t len); Your description of the new API is too terse to be absolutely certain but it doesn't look like my proposal I sent to you privately. I think you can add a mask to each copfunc which indicates which of the words it loads and which words are stored by the copfunc. For instance, your npf_cop_l3 will look like { .copfunc = &npf_cop_l3, .loads = BPF_COP_LOAD_NONE, .stores = BPF_COP_STORE(BPF_MW_IPVER) | BPF_COP_STORE(BPF_MW_L4OFF) | BPF_COP_STORE(BPF_MW_L4PROTO) }, /* ... other copfuncs ... */ Validation of BPF_COP instruction will look very natural: /* inside BPF_COP in bpf_validate_ext */ if (bc->copfuncs) { if (bc->copfuncs[i].inwords & invalid) goto out; invalid &= ~bc->copfuncs[i].outwords; } It's a bit more trickier for BPF_COPX because you need to pre-calculate loads/stores masks but it's doable. At the moment, bpf_filter_ext() resets the "invalid" mask. This translates to .stores = BPF_COP_STORE_ALL. Alex