Re: [Q: 2.4 vs. 2.6] nf_conntrack_get() semantics in copy_skb_header()

Roberto Nibali Tue, 29 Nov 2005 02:10:42 -0800

Ok, I have tried for a couple of hours but in vain. Problems:


- I don't really understand how to probe for paths not using qdiscs
  using skb as a source (skb->list check if list is empty?), which I
  would really like to do in ip_finish_output2(). Something along the
  lines of:

  if (list_empty(&skb->list)) {
     nf_reset(skb);
  }

That's not exactly what I meant. I meant to manually follow the codepath
down the stack from ip_finish_output2() and put nf_resets everywhere
but in paths leading to the qdisc (dev_queue_xmit). This means in the
neighbour code, packet sockets, ...

Pardon my language, but this is just damn fugly :). Isn't there a point
where we can say for sure that an skb is not needed anymore by the
netfilter engine? Also, from reading the code for SO_ORIGINAL_DST I do
not understand why this has to be in the netfilter code as it's a
socket option. If this was not separated from the rest of the stack we
couldn't care less that the skb has no reference entry to conntrack
anymore and boldly get the original IP information for transparent
proxying in NAT. OTOH we don't have fast NAT anymore so the rest of the
stack does not need such a functionality.

- The net/* seems to be randomly sprinkled with nf_reset()'s without
  a clear documentation as to why (ipv4/ip_input.c:ip_call_ra_chain()
  for 2.6 kernel, maybe for skb_clone()'d packets), which makes it
  very hard to understand.
- Where exactly do we need to nf_conntrack_put()/nf_reset() skb's?


The reasons to call nf_reset() are:

- A packet is queued and the reference no longer needed
  (ip_local_deliver_finish, ip_call_ra_chain)
- A packet is encapsulated and passed to the output functions
  (ipip.c, ip_gre.c)
- A packet is decapsulated and passed to netif_rx (ipip.c, ip_gre.c)
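
To make the reference counting behind these cases concrete, here is a
small user-space sketch. It is only loosely modelled on the 2.6 helpers:
the structs are stripped-down stand-ins, the counter is a plain int
instead of an atomic_t, the "destroyed" flag replaces the destroy
callback, and copy_nfct() is a hypothetical helper that mimics the
nf_conntrack_get() step in copy_skb_header(). It is not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption, not the
 * real definitions). */
struct nf_conntrack {
    int use;                    /* reference count (atomic_t in the kernel) */
    int destroyed;              /* set once the last reference is dropped */
};

struct sk_buff {
    struct nf_conntrack *nfct;  /* conntrack entry this packet refers to */
};

/* Take a reference on the conntrack entry, if there is one. */
static void nf_conntrack_get(struct nf_conntrack *nfct)
{
    if (nfct)
        nfct->use++;
}

/* Drop a reference; mark the entry destroyed when the count hits zero. */
static void nf_conntrack_put(struct nf_conntrack *nfct)
{
    if (!nfct)
        return;
    assert(nfct->use > 0);      /* catch an unbalanced put in this toy model */
    if (--nfct->use == 0)
        nfct->destroyed = 1;
}

/* Detach the packet from conntrack: drop the reference, clear the pointer. */
static void nf_reset(struct sk_buff *skb)
{
    nf_conntrack_put(skb->nfct);
    skb->nfct = NULL;
}

/* Hypothetical helper: share src's conntrack entry with dst and take a
 * reference for the new holder. */
static void copy_nfct(struct sk_buff *dst, const struct sk_buff *src)
{
    dst->nfct = src->nfct;
    nf_conntrack_get(dst->nfct);
}
```

In this model every skb that still carries the pointer keeps the entry
alive, which is why a missing nf_reset() on some path shows up as a
count that never reaches zero.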

What about packets which are not queued and are still in the processing
chain of netfilter while we call rmmod ip_conntrack? Or is the combo

        br_write_lock_bh(BR_NETPROTO_LOCK);
        br_write_unlock_bh(BR_NETPROTO_LOCK);

guaranteeing that such a condition does not occur?

I wonder if the remaining references should just be zeroed once we are
in the i_see_dead_people part and the call stack reveals rmmod
ip_conntrack? :)


If you can suggest a good way to find them .. :)


Bluntly speaking, my naive take on this is the following:

ip_conntrack_cleanup gets called only when we basically rmmod a module
or if an init fails at some point. So why not introduce an atomic
counter in ip_conntrack_cleanup which gets incremented for each rerun of
i_see_dead_people _and_ if the list query using get_next_corpse() in
ip_ct_iterate_cleanup() returns NULL. After x HZ () of endless
schedule() we decrement ip_conntrack_untracked.ct_general.use or
ip_conntrack_count if the atomic counter > 1.
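
A rough user-space sketch of what I have in mind, purely to illustrate
the control flow. Everything here is made up for the example: struct
ct_state, iterate_cleanup() and cleanup_with_bailout() are hypothetical
names, the counters are plain ints rather than atomics, and a pass limit
stands in for the "x HZ" timeout.

```c
#include <assert.h>

/* Toy state standing in for the conntrack bookkeeping (assumption). */
struct ct_state {
    int live_entries;   /* entries a get_next_corpse()-style scan can find */
    int count;          /* stands in for ip_conntrack_count */
};

/* Hypothetical stand-in for one ip_ct_iterate_cleanup() pass: kill every
 * entry that can still be found and return how many were killed. */
static int iterate_cleanup(struct ct_state *st)
{
    int killed = st->live_entries;

    st->count -= killed;
    st->live_entries = 0;
    return killed;
}

/* Cleanup loop with a bail-out: count the reruns in which the scan found
 * nothing, and once that exceeds max_empty_reruns, drop one stuck
 * reference per pass. Returns the number of forced drops. */
static int cleanup_with_bailout(struct ct_state *st, int max_empty_reruns)
{
    int empty_reruns = 0;
    int forced = 0;

    assert(max_empty_reruns >= 0);

    while (st->count > 0) {             /* the i_see_dead_people loop */
        if (iterate_cleanup(st) == 0)
            empty_reruns++;             /* nothing found, count still > 0 */
        if (empty_reruns > max_empty_reruns) {
            st->count--;                /* give up on one stuck reference */
            forced++;
        }
        /* the kernel loop would schedule() here before retrying */
    }
    return forced;
}
```

With 3 findable entries, a count of 5 and a limit of 2 empty reruns, the
loop ends after forcing 2 drops. Note that this only fixes up the
counter; the sketch says nothing about what happens when an skb that
still holds the reference is freed later.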


Don't laugh, please ;).

Cheers,
Roberto Nibali, ratz
--
-------------------------------------------------------------
addr://Kasinostrasse 30, CH-5001 Aarau tel://++41 62 823 9355
http://www.terreactive.com             fax://++41 62 823 9356
-------------------------------------------------------------
terreActive AG                       Wir sichern Ihren Erfolg
-------------------------------------------------------------
