Ok, I have tried for a couple of hours but in vain. Problems:
- I don't really understand how to probe for paths not using qdiscs
using skb as a source (skb->list check if list is empty?), which I
would really like to do in ip_finish_output2(). Something along the
lines of:
    if (list_empty(&skb->list)) {
        nf_reset(skb);
    }
That's not exactly what I meant. I meant to manually follow the codepath
down the stack from ip_finish_output2() and put nf_resets everywhere
but in paths leading to the qdisc (dev_queue_xmit). This means in the
neighbour code, packet sockets, ...
Pardon my language, but this is just damn fugly :). Isn't there a point
where we can say for sure that an skb is not needed anymore by the
netfilter engine? Also, from reading the code for SO_ORIGINAL_DST,
I do not understand why this has to be in the netfilter code as it's a
socket option. If this was not separated from the rest of the stack we
couldn't care less that the skb has no reference entry to conntrack
anymore and boldly get the original IP information for transparent
proxying in NAT. OTOH, we don't have fast NAT anymore, so the rest of the
stack does not need such functionality.
- The net/* tree seems to be randomly sprinkled with nf_reset() calls
without clear documentation as to why (ipv4/ip_input.c:ip_call_ra_chain()
for the 2.6 kernel, maybe for skb_clone()'d packets), which makes it
very hard to understand.
- Where exactly do we need to nf_conntrack_put()/nf_reset() skbs?
The reasons to call nf_reset() are:
- A packet is queued and the reference no longer needed
(ip_local_deliver_finish, ip_call_ra_chain)
- A packet is encapsulated and passed to the output functions
(ipip.c, ip_gre.c)
- A packet is decapsulated and passed to netif_rx (ipip.c, ip_gre.c)
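To make the three cases above concrete, here is a minimal userspace sketch of
what nf_reset() amounts to at such a hand-off point: drop the conntrack
reference and clear the pointer. The mock_* structures and functions are
hypothetical stand-ins invented for illustration, not the kernel's real
definitions, and the refcount is a plain int rather than an atomic_t.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the conntrack entry; not the real layout. */
struct mock_conntrack {
    int use;                        /* reference count */
};

/* Hypothetical stand-in for sk_buff; only the field we care about. */
struct mock_skb {
    struct mock_conntrack *nfct;    /* conntrack reference held by the skb */
};

/* Models nf_conntrack_put(): drop one reference if there is one. */
static void mock_conntrack_put(struct mock_conntrack *ct)
{
    if (ct != NULL) {
        assert(ct->use > 0);
        ct->use--;
    }
}

/* Models nf_reset(): release the reference and forget the pointer. */
static void mock_nf_reset(struct mock_skb *skb)
{
    mock_conntrack_put(skb->nfct);
    skb->nfct = NULL;
}

/*
 * Models a hand-off point such as local delivery: once the packet is
 * queued, netfilter no longer needs the reference, so it is dropped
 * before the skb leaves the netfilter-controlled path.
 */
static void mock_local_deliver(struct mock_skb *skb)
{
    mock_nf_reset(skb);
    /* ... the skb would be queued to a socket here ... */
}
```

Calling mock_nf_reset() a second time on the same skb is harmless in this
sketch because the pointer has already been cleared.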
What about packets which are not queued and are still in the processing
chain of netfilter while we call rmmod ip_conntrack? Or is the combo
br_write_lock_bh(BR_NETPROTO_LOCK);
br_write_unlock_bh(BR_NETPROTO_LOCK);
guaranteeing that such a condition does not occur?
I wonder if the remaining references should just be zeroed once we are
in the i_see_dead_people part and the call stack reveals rmmod
ip_conntrack? :)
If you can suggest a good way to find them .. :)
Bluntly speaking, my naive take on this is the following:
ip_conntrack_cleanup gets called only when we basically rmmod a module or
if an init fails at some point. So why not introduce an atomic counter in
ip_conntrack_cleanup which gets incremented for each rerun of
i_see_dead_people _and_ if the list query using get_next_corpse() in
ip_ct_iterate_cleanup() returns NULL. After x HZ () of endless
schedule() we decrement ip_conntrack_untracked.ct_general.use or
ip_conntrack_count if the atomic counter > 1.
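A rough userspace sketch of that idea, just to show the control flow. All
mock_* names and MOCK_MAX_RERUNS are hypothetical; the limit stands in for
the "x HZ" timeout, the counters are plain ints instead of atomics, and the
loop only imitates the shape of the i_see_dead_people retry, not the real
code.

```c
#include <assert.h>

#define MOCK_MAX_RERUNS 3   /* hypothetical limit standing in for "x HZ" */

struct mock_state {
    int conntrack_count;    /* entries still accounted for */
    int corpses_left;       /* entries still reachable on the lists */
};

/*
 * Models ip_ct_iterate_cleanup(): kill everything still on the lists.
 * Returns the number of entries killed; 0 corresponds to
 * get_next_corpse() returning NULL right away.
 */
static int mock_iterate_cleanup(struct mock_state *s)
{
    int killed = s->corpses_left;

    s->conntrack_count -= killed;
    s->corpses_left = 0;
    return killed;
}

/*
 * Models the cleanup loop with the proposed escape hatch.
 * Returns how many leftover references were forcibly dropped.
 */
static int mock_conntrack_cleanup(struct mock_state *s)
{
    int empty_reruns = 0;   /* reruns that found nothing to kill */
    int forced = 0;

    for (;;) {
        int killed = mock_iterate_cleanup(s);

        if (s->conntrack_count == 0)
            break;                      /* normal exit: nothing left */
        if (killed == 0)
            empty_reruns++;             /* count still > 0, lists empty */
        if (empty_reruns > MOCK_MAX_RERUNS) {
            forced = s->conntrack_count;
            s->conntrack_count = 0;     /* give up on stale references */
            break;
        }
        /* the kernel loop would schedule() here before retrying */
    }
    return forced;
}
```

With 5 counted entries of which only 3 are on the lists, the loop gives up
after the limit and reports 2 forced drops; with everything on the lists it
exits on the first pass.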
Don't laugh, please ;).
Cheers,
Roberto Nibali, ratz
--
-------------------------------------------------------------
addr://Kasinostrasse 30, CH-5001 Aarau tel://++41 62 823 9355
http://www.terreactive.com fax://++41 62 823 9356
-------------------------------------------------------------
terreActive AG Wir sichern Ihren Erfolg
-------------------------------------------------------------