> -----Original Message-----
> From: Parthasarathy Bhuvaragan
> Sent: Monday, 05 December, 2016 15:11
> To: Jon Maloy <ma...@donjonn.com>; Jon Maloy <jon.ma...@ericsson.com>;
> tipc-discussion@lists.sourceforge.net; Ying Xue <ying....@windriver.com>
> Cc: thompa....@gmail.com
> Subject: Re: [PATCH net-next v2 0/3] tipc: improve interaction socket-link
>
> Hi Jon,
>
> Sorry for the delay, I could not work due to a sick child.
>
> The crash occurs due to the last commit:
> "tipc: reduce risk of user starvation during link congestion"
>
> I examined the crash today; it is caused by an out-of-bounds write to the
> skb->cb[48] array.
> The max size allowed for the callback area is 48 bytes, whereas the new struct
> tipc_skb_cb is 64 bytes.
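The size mismatch described above can be illustrated with a small layout sketch. This is only a model under stated assumptions: `MockSkb` is a hypothetical, heavily reduced stand-in for struct sk_buff (four fields, fixed 64-bit widths), `overflow_bytes()`, `write_control_block()` and `demo()` are made-up helper names, and the marker is placed in the last 8 bytes of the block purely for illustration. It says nothing about the real field layout of tipc_skb_cb.

```python
import ctypes
import sys

SKB_CB_SIZE = 48  # size of skb->cb[] as stated in the thread


class MockSkb(ctypes.Structure):
    """Hypothetical, heavily reduced stand-in for struct sk_buff.

    Only four fields are modelled, with fixed 64-bit widths so that the
    layout is identical on every platform.
    """
    _fields_ = [
        ("dev", ctypes.c_uint64),
        ("cb", ctypes.c_ubyte * SKB_CB_SIZE),
        ("_skb_refdst", ctypes.c_uint64),
        ("destructor", ctypes.c_uint64),
    ]


def overflow_bytes(cb_struct_size):
    """Bytes that a control block of this size would spill past cb[]."""
    return max(0, cb_struct_size - SKB_CB_SIZE)


def write_control_block(skb, payload):
    """Copy payload over cb[] with no check against SKB_CB_SIZE,
    much as casting skb->cb to an oversized struct would allow in C."""
    start = MockSkb.cb.offset
    # Stay inside the mock object so the demo itself is memory-safe.
    if start + len(payload) > ctypes.sizeof(MockSkb):
        raise ValueError("payload would leave the mock object")
    ctypes.memmove(ctypes.addressof(skb) + start, payload, len(payload))


def demo(cb_struct_size):
    """Write a control block of the given size whose last 8 bytes hold a
    marker, then return what the destructor field contains afterwards."""
    marker = 0x1000000000000  # arbitrary marker for the illustration
    payload = bytes(cb_struct_size - 8) + marker.to_bytes(8, sys.byteorder)
    skb = MockSkb()
    write_control_block(skb, payload)
    return skb.destructor


if __name__ == "__main__":
    for size in (40, 64):
        print(size, "byte block: overflow", overflow_bytes(size),
              "bytes, destructor =", hex(demo(size)))
```

In this model a 40-byte block leaves the destructor field untouched, while a 64-byte block spills 16 bytes past cb[] and lands on the two fields that follow it. In kernel code the usual guard is a compile-time size check on the control block struct against the size of skb->cb.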

Weird. I did of course test this, and on my system sizeof(tipc_skb_cb) yields 40, and everything works flawlessly.

> This overwrites the skb->destructor callback lying below 'skb->cb'.
> The sizeof struct sk_buff_head itself is 48 bytes.
>
> crash> p *(struct sk_buff*)0xffff88003f007600
> :
> dev = 0xffff88003f985000,
> cb = "\000\00\000",
> _skb_refdst = 0,
> destructor = 0x1000000000000, << insane function pointer >>
>
> I think the simpler way is to place these packets 'pkts' into the backlogq,
> allow temporary over-committing, and keep the wakeup mechanism as it is.

You are right. The end result will be the same. I'll change it and recommit.

///jon

>
> This way, we transmit the packet in tipc_link_advance_backlog() instead of
> doing it in link_prepare_wakeup(). It's misleading that
> link_prepare_wakeup() transmits packets.
>
> /Partha
>
>
> On 11/30/2016 07:48 PM, Jon Maloy wrote:
> > Weird. Looks like a corrupted incoming buffer directly at startup,
> > before any of my new code is active. Is this repeatable?
> >
> > ///jon
> >
> >
> > On 11/30/2016 08:52 AM, Parthasarathy Bhuvaragan wrote:
> >> Hi Jon,
> >>
> >> With your patches, I get the following crash when loading the tipc
> >> module. Leaving home now, so couldn't investigate further.
> >>
> >> [ 58.201114] tipc: Started in single node mode
> >> [ 58.212991] Started in network mode
> >> [ 58.213796] Own node address <1.1.1>, network identity 4711
> >> [ 58.238416] 8021q: adding VLAN 0 to HW filter on device data0
> >> [ 58.252217] 8021q: adding VLAN 0 to HW filter on device data1
> >> [ 58.270822] Enabled bearer <eth:data0>, discovery domain <1.1.0>, priority 10
> >> [ 58.571114] general protection fault: 0000 [#1] SMP
> >> [ 58.572031] Modules linked in: tipc ip6_udp_tunnel udp_tunnel 9pnet_virtio 9p 9pnet virtio_net virtio_pci virtio_ring virtio
> >> [ 58.572031] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-rc6+ #15
> >> [ 58.572031] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> >> [ 58.572031] task: ffffffff81c0d540 task.stack: ffffffff81c00000
> >> [ 58.572031] RIP: 0010:[<ffffffff8162f10d>] [<ffffffff8162f10d>] skb_release_head_state+0x4d/0xa0
> >> [ 58.572031] RSP: 0018:ffff880037c03ba0 EFLAGS: 00010246
> >> [ 58.572031] RAX: 0001000000000000 RBX: ffff880033fffa00 RCX: 00000000000000ff
> >> [ 58.572031] RDX: 0000000000000000 RSI: ffff880037c03bca RDI: ffff880033fffa00
> >> [ 58.572031] RBP: ffff880037c03ba8 R08: ffffffffa005f2c0 R09: 0000000000000000
> >> [ 58.572031] R10: ffff880035b0f0a0 R11: ffffea0000000000 R12: ffff880033fffa00
> >> [ 58.572031] R13: ffffffffa0048fd4 R14: ffffffff81cfbec0 R15: ffff880033718000
> >> [ 58.572031] FS: 0000000000000000(0000) GS:ffff880037c00000(0000) knlGS:0000000000000000
> >> [ 58.572031] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> [ 58.572031] CR2: 0000000000851bf0 CR3: 0000000035b00000 CR4: 00000000000006f0
> >> [ 58.572031] Stack:
> >> [ 58.572031] ffff880033fffa00 ffff880037c03bc0 ffffffff8162f2b2 ffff880033fffa00
> >> [ 58.572031] ffff880037c03be8 ffffffff8162f327 ffff880033fffa00 0000000000000000
> >> [ 58.572031] ffff880035b32540 ffff880037c03c68 ffffffffa0048fd4 0000000000000082
> >> [ 58.572031] Call Trace:
> >> [ 58.572031] <IRQ> [ 58.572031] [<ffffffff8162f2b2>] skb_release_all+0x12/0x30
> >> [ 58.572031] [<ffffffff8162f327>] kfree_skb+0x37/0xa0
> >> [ 58.572031] [<ffffffffa0048fd4>] tipc_disc_rcv+0x84/0x1d0 [tipc]
> >> [ 58.572031] [<ffffffffa0053ddc>] tipc_rcv+0x3ac/0x3c0 [tipc]
> >> [ 58.572031] [<ffffffff81093457>] ? find_busiest_group+0x117/0x940
> >> [ 58.572031] [<ffffffffa0043088>] tipc_l2_rcv_msg+0x48/0x60 [tipc]
> >> [ 58.572031] [<ffffffff81641245>] __netif_receive_skb_core+0x2e5/0xa60
> >> [ 58.572031] [<ffffffff816360ba>] ? __build_skb+0x2a/0xe0
> >> [ 58.572031] [<ffffffff816360ba>] ? __build_skb+0x2a/0xe0
> >> [ 58.572031] [<ffffffff81643a8b>] __netif_receive_skb+0x1b/0x70
> >> [ 58.572031] [<ffffffff81643b0d>] netif_receive_skb_internal+0x2d/0x90
> >> [ 58.572031] [<ffffffff81644494>] napi_gro_receive+0x94/0x130
> >> [ 58.572031] [<ffffffffa0019405>] virtnet_receive+0x1a5/0x8a0 [virtio_net]
> >> [ 58.572031] [<ffffffffa0019b1d>] virtnet_poll+0x1d/0x80 [virtio_net]
> >> [ 58.572031] [<ffffffff81644c2e>] net_rx_action+0x20e/0x390
> >> [ 58.572031] [<ffffffff8178358b>] __do_softirq+0x9b/0x2a2
> >> [ 58.572031] [<ffffffff81062d00>] irq_exit+0x60/0x70
> >> [ 58.572031] [<ffffffff81783324>] do_IRQ+0x54/0xd0
> >> [ 58.572031] [<ffffffff817817ff>] common_interrupt+0x7f/0x7f
> >> [ 58.572031] <EOI> [ 58.572031] [<ffffffff817805c0>] ? default_idle+0x20/0xe0
> >> [ 58.572031] [<ffffffff8114d439>] ? next_zone+0x29/0x30
> >> [ 58.572031] [<ffffffff8102769f>] arch_cpu_idle+0xf/0x20
> >> [ 58.572031] [<ffffffff81780a0c>] default_idle_call+0x2c/0x30
> >> [ 58.572031] [<ffffffff8109a4d7>] cpu_startup_entry+0x177/0x1e0
> >> [ 58.572031] [<ffffffff8177a7f7>] rest_init+0x77/0x80
> >> [ 58.572031] [<ffffffff81d5deb5>] start_kernel+0x40e/0x41b
> >> [ 58.572031] [<ffffffff81d5d42f>] x86_64_start_reservations+0x2a/0x2c
> >> [ 58.572031] [<ffffffff81d5d51b>] x86_64_start_kernel+0xea/0xed
> >> [ 58.572031] Code: 00 00 48 8b 7b 68 48 85 ff 74 05 f0 ff 0f 74 36 48 8b 43 60 48 85 c0 74 14 65 8b 15 96 d3 9d 7e 81 e2 00 00 0f 00 75 30 48 89 df <ff> d0 48 8b 7b 70 48 85 ff 74 05 f0 ff 0f 74 03 5b 5d c3 e8 bb
> >> [ 58.572031] RIP [<ffffffff8162f10d>] skb_release_head_state+0x4d/0xa0
> >> [ 58.572031] RSP <ffff880037c03ba0>
> >> [ 58.662814] ---[ end trace fa57695d3ce8757f ]---
> >> [ 58.663875] Kernel panic - not syncing: Fatal exception in interrupt
> >> [ 58.664872] Kernel Offset: disabled
> >> [ 58.664872] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
> >>
> >> regards
> >> Partha
> >>
> >> On 11/29/2016 06:07 PM, Jon Maloy wrote:
> >>> Ying, Partha,
> >>> It would be very nice if I could get "acked" or "reviewed" on this so I
> >>> can send it to David before net-next closes.
> >>>
> >>> ///jon
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Jon Maloy [mailto:jon.ma...@ericsson.com]
> >>>> Sent: Tuesday, 29 November, 2016 12:04
> >>>> To: tipc-discussion@lists.sourceforge.net; Parthasarathy Bhuvaragan
> >>>> <parthasarathy.bhuvara...@ericsson.com>; Ying Xue
> >>>> <ying....@windriver.com>; Jon Maloy <jon.ma...@ericsson.com>
> >>>> Cc: ma...@donjonn.com; thompa....@gmail.com
> >>>> Subject: [PATCH net-next v2 0/3] tipc: improve interaction socket-link
> >>>>
> >>>> We fix a very real starvation problem that may occur when the link
> >>>> level runs into send buffer congestion.
> >>>> At the same time we make the
> >>>> interaction between the socket and link layer simpler and more
> >>>> consistent.
> >>>>
> >>>> v2: - Simplified link congestion check to only check against own
> >>>>       importance limit. This reduces the risk of higher levels
> >>>>       starving out lower levels.
> >>>>
> >>>> Jon Maloy (3):
> >>>>   tipc: unify tipc_wait_for_sndpkt() and tipc_wait_for_sndmsg()
> >>>>     functions
> >>>>   tipc: modify struct tipc_plist to be more versatile
> >>>>   tipc: reduce risk of user starvation during link congestion
> >>>>
> >>>>  net/tipc/bcast.c      |   2 +-
> >>>>  net/tipc/link.c       |  81 ++++-----
> >>>>  net/tipc/msg.h        |   8 +-
> >>>>  net/tipc/name_table.c | 100 +++++++----
> >>>>  net/tipc/name_table.h |  21 +--
> >>>>  net/tipc/node.c       |   2 +-
> >>>>  net/tipc/socket.c     | 450 ++++++++++++++++++++++--------------------
> >>>>  7 files changed, 327 insertions(+), 337 deletions(-)
> >>>>
> >>>> --
> >>>> 2.7.4
> >>>
> >
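The two ideas discussed in this thread, checking congestion only against the sender's own importance limit and letting the backlog over-commit temporarily, can be sketched as a toy model. Everything here is an assumption made for illustration: the `LinkBacklog` class, its method names, and the limits in `DEMO_LIMITS` are hypothetical and are not taken from the patch or from the TIPC source.

```python
from collections import deque

# Hypothetical importance levels and backlog limits, chosen only for the demo.
LOW, MEDIUM, HIGH, CRITICAL = range(4)
DEMO_LIMITS = {LOW: 2, MEDIUM: 4, HIGH: 6, CRITICAL: 8}


class LinkBacklog:
    """Toy model of a link backlog with one length counter per importance."""

    def __init__(self, limits):
        self.limits = dict(limits)
        self.length = {imp: 0 for imp in self.limits}
        self.queue = deque()

    def congested(self, imp):
        # Compare the sender's own level against its own limit only.
        return self.length[imp] >= self.limits[imp]

    def xmit(self, imp, pkt):
        """Queue pkt and report whether the sender should now wait.

        The packet is accepted even when its level is already at the
        limit (temporary over-commit); a True return value tells the
        caller to block until it is woken up.
        """
        must_wait = self.congested(imp)
        self.queue.append((imp, pkt))
        self.length[imp] += 1
        return must_wait

    def advance(self):
        """Take the oldest packet off the backlog, if there is one."""
        if not self.queue:
            return None
        imp, pkt = self.queue.popleft()
        self.length[imp] -= 1
        return pkt


if __name__ == "__main__":
    backlog = LinkBacklog(DEMO_LIMITS)
    for seq in range(3):
        print("LOW packet", seq, "must wait:", backlog.xmit(LOW, seq))
    print("HIGH congested:", backlog.congested(HIGH))
```

In this model a congested low-importance level does not affect senders at other levels, and the over-committed packet leaves the backlog through the normal `advance()` path rather than through the wakeup path.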