On 16-11-29 09:37 PM, Alexei Starovoitov wrote: > On Tue, Nov 29, 2016 at 06:52:36PM -0800, John Fastabend wrote: >> On 16-11-29 04:15 PM, Alexei Starovoitov wrote: >>> On Tue, Nov 29, 2016 at 02:21:22PM +0100, Thomas Graf wrote: >>>> Registers new BPF program types which correspond to the LWT hooks: >>>> - BPF_PROG_TYPE_LWT_IN => dst_input() >>>> - BPF_PROG_TYPE_LWT_OUT => dst_output() >>>> - BPF_PROG_TYPE_LWT_XMIT => lwtunnel_xmit() >>>> >>>> The separate program types are required to differentiate between the >>>> capabilities each LWT hook allows: >>>> >>>> * Programs attached to dst_input() or dst_output() are restricted and >>>> may only read the data of an skb. This prevent modification and >>>> possible invalidation of already validated packet headers on receive >>>> and the construction of illegal headers while the IP headers are >>>> still being assembled. >>>> >>>> * Programs attached to lwtunnel_xmit() are allowed to modify packet >>>> content as well as prepending an L2 header via a newly introduced >>>> helper bpf_skb_push(). This is safe as lwtunnel_xmit() is invoked >>>> after the IP header has been assembled completely. >>>> >>>> All BPF programs receive an skb with L3 headers attached and may return >>>> one of the following error codes: >>>> >>>> BPF_OK - Continue routing as per nexthop >>>> BPF_DROP - Drop skb and return EPERM >>>> BPF_REDIRECT - Redirect skb to device as per redirect() helper. >>>> (Only valid in lwtunnel_xmit() context) >>>> >>>> The return codes are binary compatible with their TC_ACT_ >>>> relatives to ease compatibility. >>>> >>>> Signed-off-by: Thomas Graf <tg...@suug.ch> >>> ... >>>> +#define LWT_BPF_MAX_HEADROOM 128 >>> >>> why 128? >>> btw I'm thinking for XDP to use 256, so metadata can be stored in there. >>> >> >> hopefully not too off-topic but for XDP I would like to see this get > > definitely off-topic. lwt->headroom is existing concept. Too late > to do anything about it. > >> passed down with the program. It would be more generic and drivers could >> configure the headroom on demand and more importantly verify that a >> program pushing data is not going to fail at runtime. > > For xdp I think it will be problematic, since we'd have to check for > that value at prog array access to make sure tailcalls are not broken. > Mix and match won't be possible. > So what does 'configure the headroom on demand' buys us? > Isn't it much easier to tell all drivers "always reserve this much" ? > We burn the page anyway. > If it's configurable per driver, then we'd need an api > to retrieve it. Yet the program author doesn't care what the value is. > If program needs to do udp encap, it will try do it. No matter what. > If xdp_adjust_head() helper fails, the program will likely decide > to drop the packet. In some cases it may decide to punt to stack > for further processing, but for high performance dataplane code > it's highly unlikely. > If it's configurable to something that is not cache line boundary > hw dma performance may suffer and so on. > So I see only cons in such 'configurable headroom' and propose > to have fixed 256 bytes headroom for XDP > which is enough for any sensible encap and metadata. >
OK I'm convinced let it be fixed at some conservative value.