On Fri, Aug 3, 2018 at 11:28 AM, Dominique Martinet
<asmad...@codewreck.org> wrote:
> I've been playing with KCM on a 4.18.0-rc7 kernel and I'm running into a
> problem where the iovec filled by recvmsg() is mangled: it is filled
> to the length of one packet, but contains (truncated) data from another
> packet, rendering KCM unusable.
>
> (I haven't tried old kernels to see for how long this has been broken or
> tried to bisect; I might if there's no progress, but this might be simpler
> than I think)
>
>
> I've attached a reproducer, a simple program that forks, creates a TCP
> server/client, attaches the server socket to a KCM socket, and in an
> infinite loop sends varying-length messages from the client to the
> server.
> The loop stops when the server gets a message whose length is not the
> length indicated in the packet header, rather fast (I can make it run
> for a while if I slow down emission, or if I run a verbose tcpdump for
> example)
>
> From the reproducer:
>
> struct my_proto {
>         struct _hdr {
>                 uint32_t len;
>         } hdr;
>         char data[32];
> } __attribute__((packed));
>
> // use htons to use LE header size, since load_half does a first conversion
> // from network byte order
> const char *bpf_prog_string = " \
> ssize_t bpf_prog1(struct __sk_buff *skb) \
> { \
>         return bpf_htons(load_half(skb, 0)) + 4; \
> }";

The length in hdr is uint32_t above, but this looks like it's being
read as a short.
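To make the width mismatch concrete, here is a minimal user-space sketch of what the parser expression computes. It assumes a little-endian host (so bpf_htons() is a plain byte swap) and a client that writes hdr.len in little-endian order. The helper names emul_load_half, swap16, put_len_le32 and parsed_msg_len are hypothetical stand-ins for illustration, not part of the reproducer or of any BPF API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the BPF load_half() helper: reads 16 bits
 * at `off` and returns them converted from network (big-endian) order. */
static uint16_t emul_load_half(const uint8_t *pkt, size_t off)
{
	return (uint16_t)((pkt[off] << 8) | pkt[off + 1]);
}

/* Plain 16-bit byte swap, which is what bpf_htons() amounts to on a
 * little-endian host (assumption: the reproducer runs on one). */
static uint16_t swap16(uint16_t v)
{
	return (uint16_t)((v >> 8) | (v << 8));
}

/* Stores a 32-bit length in little-endian order, as the client is
 * assumed to place hdr.len on the wire. */
static void put_len_le32(uint8_t *pkt, uint32_t len)
{
	pkt[0] = (uint8_t)(len & 0xff);
	pkt[1] = (uint8_t)((len >> 8) & 0xff);
	pkt[2] = (uint8_t)((len >> 16) & 0xff);
	pkt[3] = (uint8_t)((len >> 24) & 0xff);
}

/* Mirrors the return expression of bpf_prog1: only the low 16 bits of
 * the 32-bit length are ever looked at. */
static uint32_t parsed_msg_len(const uint8_t *pkt)
{
	return (uint32_t)swap16(emul_load_half(pkt, 0)) + 4;
}
```

Under these assumptions a header length of 27 parses as 31 (payload plus the 4-byte header), so lengths below 65536 still come out as intended. A length of 0x12345 would parse as 0x2349, because the upper two bytes of the field are never read.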
Tom

> In the quiet version on a VM on my laptop, I get this output:
> [root@f2 ~]# gcc -g -l bcc -o kcm kcm.c
> [root@f2 ~]# ./kcm
> client is starting
> server is starting
> server is receiving data
> Got 14, expected 27 on 1th message: 22222222222222; flags: 80
>
> The client sends messages deterministically: the first one is 14 bytes
> filled with 1, the second one is 27 bytes filled with 2, the third one is
> 9 bytes filled with 3, etc. (the final digit is actually a \0 instead)
>
> As we can see, the server received 14 '2', and the header size matches
> the second message header, so something went wrong™.
> Flags 0x80 is MSG_EOR, meaning recvmsg copied the full message.
>
>
>
> This happens even if I reduce the VM's CPU count to 1, so I was thinking
> some irq messes with the sock between skb_peek and the actual copy of the
> data (as this does work if I send slowly!), but even disabling
> irq/preempt doesn't seem to help so I'm not sure what to try next.
>
> Any idea?
>
>
> Thanks,
> --
> Dominique Martinet