Thanks for the comments >> To fix it, this patch adds a dev field to struct ipoib_neigh which is used >> instead of the struct neighbour dev one. > > It seems that in this design, if multiple ipoib interfaces are present, we > might > get an skb such that skb->dev will be different from the new dev field in > struct > ipoib_neigh. > > It seems that the result will be that the packet will be sent on a wrong > interface. > Right? > I don't see how. The field dev in ipoib_neigh doesn't take part in interface selection. As I see it, skb travels this path: 1. Passed to bond_dev->hard_start_xmit 2. bond_dev->hard_start_xmit chooses the current active interface, changes skb->dev and enqueues it back for xmittig.
>> In addition, if an IPoIB device is removed before bonding is unloaded it may >> cause bond0 neighbours (neighbours that point to bond0) to exist after the >> IPoIB >> device no longer exist. This is why a neighbour cleanup is required during >> device >> cleanup. This cleanup scans the arp cache and the ndisc cache to find there >> neighbours of bond0 which refer also to the relevant ibX. Also, when >> ib_ipoib module is >> unloaded, the neighbour destructor must be set to NULL because the neighbour >> function is in >> ib_ipoib. >> For this neigh table cleanup, it is required to export the symbol nd_tbl >> just like the symbol arp_tbl is. > > I wonder about this: is it really true that any allocated neighbour is always > in > either arp_tbl or nd_tbl? For example, could some code have called neigh_hold > and retained a neighbour that is not in either one of these tables? > I got the assumption about neighbours living in one of these 2 tables from observation and code reading. I preferred that that on keeping track of all ipoib_neighs and putting them in a list. However, I could do that instead of neigh_table scanning. Do you think it's better? For the example... I didn't understand it. Could you please explain? >> During my tests I found that when running >> >> 1. modprobe -r ib_mthca (to delete IPoIB interfaces) >> 2. ping somewhere on the subnet of bond0 >> >> I get this stack dump (which ends with kernel death) >> [<ffffffff8037ff32>] skb_under_panic+0x5c/0x60 >> [<ffffffff882e00c2>] :ib_ipoib:ipoib_hard_header+0xa6/0xc0 >> [<ffffffff803c3c98>] arp_create+0x120/0x226 >> [<ffffffff803c3dc3>] arp_send+0x25/0x3b >> [<ffffffff803c466a>] arp_solicit+0x186/0x195 >> [<ffffffff8038c0ac>] neigh_timer_handler+0x2b5/0x309 >> [<ffffffff8038bdf7>] neigh_timer_handler+0x0/0x309 >> [<ffffffff80239599>] run_timer_softirq+0x130/0x19e >> [<ffffffff80235fcc>] __do_softirq+0x55/0xc3 >> [<ffffffff8020acac>] call_softirq+0x1c/0x28 >> [<ffffffff8020c02b>] do_softirq+0x2c/0x7d >> [<ffffffff8021864a>] smp_apic_timer_interrupt+0x57/0x6a >> [<ffffffff80208e19>] mwait_idle+0x0/0x45 >> [<ffffffff8020a756>] apic_timer_interrupt+0x66/0x70 >> <EOI> [<ffffffff80208e5b>] mwait_idle+0x42/0x45 >> [<ffffffff80208db1>] cpu_idle+0x8b/0xae >> [<ffffffff80217d60>] start_secondary+0x47f/0x48f >> >> The only way I found to avoid this (for now) is to check skb headroom in >> ipoib_hard_header. I guess that this safety check doesn't harm regular IPoIB >> operation and it seems to solve my problem. However, I would be happy to >> hear what >> others think of this last issue. > > As I said, this seems to indicate a problem in the bonding code. > But what will happen after you error out in ipoib_hard_header? > Is the packet dropped? What might break as a result? > I will check the hard_header_len issue in the bonding code more carefully. From first look it seems that bonding does borrow the hard_header_len. Also, my checks show that it is safe to return with error from hard_header(). For example, in neigh_connected_output: err = dev->hard_header(skb, dev, ntohs(skb->protocol), neigh->ha, NULL, skb->len); read_unlock_bh(&neigh->lock); if (err >= 0) err = neigh->ops->queue_xmit(skb); else { err = -EINVAL; kfree_skb(skb); >> I would really appreciate comments. >> >> thanks >> >> -MoniS > _______________________________________________ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general