Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-12-18 Thread YOSHIFUJI Hideaki / 吉藤英明
In article [EMAIL PROTECTED] (at Sun, 18 Dec 2005 15:27:01 +0100), Patrick 
McHardy [EMAIL PROTECTED] says:

 YOSHIFUJI Hideaki wrote:
  In article [EMAIL PROTECTED] (at Tue, 22 Nov 2005 02:14:26 +0100), 
  Patrick McHardy [EMAIL PROTECTED] says:
  
  
 The easiest way would be to store nhoff somewhere in the skb and
 use it to continue at the next header. But I still hope there is
 a way without keeping data in the skb.
  
  
  We've coded up this.
 
 How about this patch instead? It eliminates the nhoffp argument
 to IPv6 protocol handlers by storing it in the IP6CB, which allows
 to call ip6_input_finish a second time and have it skip already
 parsed headers and also gets rid of the manual hopopts skipping.

The idea to store IP6CB itself seems sane to me.

BTW, we're now using full of skb-cb
(and we are even exceeding it w/ mobile-ipv6 extensions)...

--yoshfuji
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-12-18 Thread YOSHIFUJI Hideaki / 吉藤英明
In article [EMAIL PROTECTED] (at Sun, 18 Dec 2005 23:59:35 +0100), Patrick 
McHardy [EMAIL PROTECTED] says:

 YOSHIFUJI Hideaki wrote:
:
   BTW, we're now using full of skb-cb
  (and we are even exceeding it w/ mobile-ipv6 extensions)...
 
 Not in mainline so far, so maybe we can fit your extensions
 and my patches without the mobile extensions, that apparently
 exceed the CB anyway, in there for now. Can I look at those
 patches somewhere? BTW, other fields in the IP6CB seem to
 store offsets in u16 fields, is this OK for nhoff too? I
 thought with jumbo options I need to use a u32 field.

Well, don't mind too much.
I just wanted to note that we're about to exceed skb-cb.
I will probably enlarge skb-cb if needed.

And, yes, good point.
I think nhoff should be u32 because it is critical.
In theory, u32 is definitely needed for others as well...

--yoshfuji
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-12-04 Thread Herbert Xu
On Sun, Dec 04, 2005 at 11:06:02PM +0100, Patrick McHardy wrote:

 I'm worried about this bit.  This looks like it'll go back to the top
 of the IP stack with the existing call chain.  So could grow as the
 number of transforms increase.
 
 Its not so bad. It adds ip_xfrm_transport_hook and
 ip_local_deliver_finish to the call stack, but since two subsequent
 transport mode SAs are always processed at once it can't take this
 path again without calling netif_rx in between.

If there is a DNAT in the way, this will jump to the very start of
the stack.  So if we have a hostile IPsec peer, and the DNAT rules
are such that this can occur, then we could be in trouble (especially
because policy/selector verification does not occur until all IPsec
has been done so we can't check inner address validitiy at this point).
 
 Besides the double counting, packets also appear on the packet sockets
 after transport mode decapsulation with the original approach. For
 IPv6 there's also the double-parsing of extension header issue.

Having the packets appear twice on AF_PACKET is probably desirable :)

I'll need to think about the double-parsing though.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-12-04 Thread Patrick McHardy

Herbert Xu wrote:

On Sun, Dec 04, 2005 at 11:06:02PM +0100, Patrick McHardy wrote:

If there is a DNAT in the way, this will jump to the very start of
the stack.  So if we have a hostile IPsec peer, and the DNAT rules
are such that this can occur, then we could be in trouble (especially
because policy/selector verification does not occur until all IPsec
has been done so we can't check inner address validitiy at this point).


We could return NET_XMIT_BYPASS from ip_xfrm_transport_hook(), although
it looks a bit ugly to use NET_XMIT* on the input path.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-30 Thread Herbert Xu
On Sun, Nov 20, 2005 at 04:31:36PM +, Patrick McHardy wrote:

 @@ -145,7 +149,17 @@ int xfrm4_rcv_encap(struct sk_buff *skb,
   netif_rx(skb);
   return 0;
   } else {
 +#ifdef CONFIG_NETFILTER
 + __skb_push(skb, skb-data - skb-nh.raw);
 + skb-nh.iph-tot_len = htons(skb-len);
 + ip_send_check(skb-nh.iph);
 +
 + NF_HOOK(PF_INET, NF_IP_PRE_ROUTING, skb, skb-dev, NULL,
 + ip_xfrm_transport_hook);
 + return 0;
 +#else
   return -skb-nh.iph-protocol;
 +#endif

I'm worried about this bit.  This looks like it'll go back to the top
of the IP stack with the existing call chain.  So could grow as the
number of transforms increase.

Perhaps we need to play a dst_input/netif_rx trick here.

Actually, was there a problem with your original netif_rx approach
apart from the issue with double counting?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-23 Thread YOSHIFUJI Hideaki / 吉藤英明
Hello.

In article [EMAIL PROTECTED] (at Tue, 22 Nov 2005 02:14:26 +0100), Patrick 
McHardy [EMAIL PROTECTED] says:

 The easiest way would be to store nhoff somewhere in the skb and
 use it to continue at the next header. But I still hope there is
 a way without keeping data in the skb.

We've coded up this.

Though we have still another issue (call chain issue) to resolve,
we're getting closer to the goal.
i.e. we should continue the loop for common case.

Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
Signed-off-by: Yasuyuki Kozawai [EMAIL PROTECTED]

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 8c5d600..1101851 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -272,6 +272,9 @@ struct sk_buff {
void(*destructor)(struct sk_buff *skb);
 #ifdef CONFIG_NETFILTER
__u32   nfmark;
+#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
+   unsigned intnf_nhoff;
+#endif
struct nf_conntrack *nfct;
 #if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
struct sk_buff  *nfct_reasm;
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 44b979a..0531d0a 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -463,6 +463,8 @@ extern int  ip6_dst_lookup(struct sock 
 extern int ip6_output(struct sk_buff *skb);
 extern int ip6_forward(struct sk_buff *skb);
 extern int ip6_input(struct sk_buff *skb);
+extern int ip6_input_finish2(struct sk_buff *skb,
+ unsigned int nhoff);
 extern int ip6_mc_input(struct sk_buff *skb);
 
 /*
diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index e84b3cd..cd0606a 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -134,31 +134,13 @@ out:
  * Deliver the packet to the host
  */
 
-
-static inline int ip6_input_finish(struct sk_buff *skb)
+int ip6_input_finish2(struct sk_buff *skb, unsigned int nhoff)
 {
struct inet6_protocol *ipprot;
struct sock *raw_sk;
-   unsigned int nhoff;
-   int nexthdr;
+   unsigned int nexthdr;
u8 hash;
 
-   skb-h.raw = skb-nh.raw + sizeof(struct ipv6hdr);
-
-   /*
-*  Parse extension headers
-*/
-
-   nexthdr = skb-nh.ipv6h-nexthdr;
-   nhoff = offsetof(struct ipv6hdr, nexthdr);
-
-   /* Skip hop-by-hop options, they are already parsed. */
-   if (nexthdr == NEXTHDR_HOP) {
-   nhoff = sizeof(struct ipv6hdr);
-   nexthdr = skb-h.raw[0];
-   skb-h.raw += (skb-h.raw[1]+1)3;
-   }
-
rcu_read_lock();
 resubmit:
if (!pskb_pull(skb, skb-h.raw - skb-data))
@@ -221,6 +203,26 @@ discard:
return 0;
 }
 
+static inline int ip6_input_finish(struct sk_buff *skb)
+{
+   unsigned int nhoff;
+
+   skb-h.raw = skb-nh.raw + sizeof(struct ipv6hdr);
+
+   /*
+*  Parse extension headers
+*/
+
+   nhoff = offsetof(struct ipv6hdr, nexthdr);
+
+   /* Skip hop-by-hop options, they are already parsed. */
+   if (skb-nh.ipv6h-nexthdr == NEXTHDR_HOP) {
+   nhoff = sizeof(struct ipv6hdr);
+   skb-h.raw += (skb-h.raw[1]+1)3;
+   }
+
+   return ip6_input_finish2(skb, nhoff);
+}
 
 int ip6_input(struct sk_buff *skb)
 {
diff --git a/net/ipv6/ipv6_syms.c b/net/ipv6/ipv6_syms.c
index 1648278..6051783 100644
--- a/net/ipv6/ipv6_syms.c
+++ b/net/ipv6/ipv6_syms.c
@@ -15,6 +15,7 @@ EXPORT_SYMBOL(ndisc_mc_map);
 EXPORT_SYMBOL(register_inet6addr_notifier);
 EXPORT_SYMBOL(unregister_inet6addr_notifier);
 EXPORT_SYMBOL(ip6_route_output);
+EXPORT_SYMBOL(ip6_input_finish2);
 EXPORT_SYMBOL(addrconf_lock);
 EXPORT_SYMBOL(ipv6_setsockopt);
 EXPORT_SYMBOL(ipv6_getsockopt);
diff --git a/net/ipv6/xfrm6_input.c b/net/ipv6/xfrm6_input.c
index 9987416..2e3b28d 100644
--- a/net/ipv6/xfrm6_input.c
+++ b/net/ipv6/xfrm6_input.c
@@ -9,6 +9,7 @@
  * IPv6 support
  */
 
+#include linux/config.h
 #include linux/module.h
 #include linux/string.h
 #include linux/netfilter.h
@@ -17,6 +18,7 @@
 #include net/inet_ecn.h
 #include net/ip.h
 #include net/ipv6.h
+#include net/ip6_route.h
 #include net/xfrm.h
 
 static inline void ipip6_ecn_decapsulate(struct sk_buff *skb)
@@ -28,6 +30,25 @@ static inline void ipip6_ecn_decapsulate
IP6_ECN_set_ce(inner_iph);
 }
 
+#ifdef CONFIG_NETFILTER
+static inline int xfrm6_rcv_spi_finish2(struct sk_buff *skb)
+{
+   __skb_pull(skb, skb-h.raw - skb-nh.raw);
+   return ip6_input_finish2(skb, skb-nf_nhoff);
+}
+
+static inline int xfrm6_rcv_spi_finish(struct sk_buff *skb)
+{
+   if (skb-dst == NULL) {
+   ip6_route_input(skb);
+   return dst_input(skb);
+   }
+
+   return NF_HOOK(PF_INET6, NF_IP6_LOCAL_IN, skb, skb-dev, NULL,
+  

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-21 Thread Kazunori Miyazawa
Hello Patrick,

I have a comment about the patch on the IPv6 input process.
The kernel applied your patch will always call ip6_rcv_finish
when enabling netfilter and receiving a esp packet so that
it will always look up the routing table and parse
extention headers twice a packet.
It will probably cost.

Your ip_xfrm_transport_hook is a good idea, I think.

We could call ip6_rcv_finish if the netfilter changed the addresses
or otherwise we can continue the loop to avoid the cost in a similar
way because we can know the change with checking skb-dst.

If you fix the point, your change will resolve our issues which were
mentioned in the previous mail from Kozakai-san.

Patrick McHardy wrote:
 [IPV4/6]: Netfilter IPsec input hooks
 
 When the innermost transform uses transport mode the decapsulated packet
 is not visible to netfilter. Pass the packet through the PRE_ROUTING and
 LOCAL_IN hooks again before handing it to upper layer protocols to make
 netfilter-visibility symetrical to the output path.
 
 Signed-off-by: Patrick McHardy [EMAIL PROTECTED]
 
 ---
 commit 08cf39d5d7d8b942431a6529daa3ab69ecfb34b5
 tree 6f2a1a85f915b1b6f89ad50cf3d8855f21a561b6
 parent b847425c693f43a63137d18e36e5c8cf0187c175
 author Patrick McHardy [EMAIL PROTECTED] Sat, 19 Nov 2005 21:50:22 +0100
 committer Patrick McHardy [EMAIL PROTECTED] Sat, 19 Nov 2005 21:50:22 +0100
 
  include/linux/netfilter_ipv4.h |2 +-
  include/net/ipv6.h |2 ++
  net/ipv4/netfilter.c   |   20 
  net/ipv4/xfrm4_input.c |   14 ++
  net/ipv6/ip6_input.c   |2 +-
  net/ipv6/xfrm6_input.c |   13 +
  6 files changed, 51 insertions(+), 2 deletions(-)
 
 diff --git a/include/linux/netfilter_ipv4.h b/include/linux/netfilter_ipv4.h
 index fdc4a95..e9103fe 100644
 --- a/include/linux/netfilter_ipv4.h
 +++ b/include/linux/netfilter_ipv4.h
 @@ -79,7 +79,7 @@ enum nf_ip_hook_priorities {
  
  #ifdef __KERNEL__
  extern int ip_route_me_harder(struct sk_buff **pskb);
 -
 +extern int ip_xfrm_transport_hook(struct sk_buff *skb);
  #endif /*__KERNEL__*/
  
  #endif /*__LINUX_IP_NETFILTER_H*/
 diff --git a/include/net/ipv6.h b/include/net/ipv6.h
 index 6addb4d..4fbfe43 100644
 --- a/include/net/ipv6.h
 +++ b/include/net/ipv6.h
 @@ -414,6 +414,8 @@ extern intipv6_rcv(struct sk_buff 
 *sk
struct packet_type *pt,
struct net_device *orig_dev);
  
 +extern int   ip6_rcv_finish(struct sk_buff *skb);
 +
  /*
   *   upper-layer output functions
   */
 diff --git a/net/ipv4/netfilter.c b/net/ipv4/netfilter.c
 index b93e7cd..3c39296 100644
 --- a/net/ipv4/netfilter.c
 +++ b/net/ipv4/netfilter.c
 @@ -105,6 +105,26 @@ int ip_dst_output(struct sk_buff *skb)
   return dst_output(skb);
  }
  EXPORT_SYMBOL(ip_dst_output);
 +
 +/*
 + * okfn for transport mode xfrm_input.c hook. Basically a copy of
 + * ip_rcv_finish without statistics and option parsing.
 + */
 +int ip_xfrm_transport_hook(struct sk_buff *skb)
 +{
 + struct iphdr *iph = skb-nh.iph;
 +
 + if (likely(skb-dst == NULL)) {
 + int err = ip_route_input(skb, iph-daddr, iph-saddr, iph-tos,
 +  skb-dev);
 + if (unlikely(err))
 + goto drop;
 + }
 + return dst_input(skb);
 +drop:
 + kfree_skb(skb);
 + return NET_RX_DROP;
 +}
  #endif /* CONFIG_XFRM */
  
  /*
 diff --git a/net/ipv4/xfrm4_input.c b/net/ipv4/xfrm4_input.c
 index 2d3849c..d90cd93 100644
 --- a/net/ipv4/xfrm4_input.c
 +++ b/net/ipv4/xfrm4_input.c
 @@ -11,6 +11,8 @@
  
  #include linux/module.h
  #include linux/string.h
 +#include linux/netfilter.h
 +#include linux/netfilter_ipv4.h
  #include net/inet_ecn.h
  #include net/ip.h
  #include net/xfrm.h
 @@ -137,6 +139,8 @@ int xfrm4_rcv_encap(struct sk_buff *skb,
   memcpy(skb-sp-x+skb-sp-len, xfrm_vec, xfrm_nr*sizeof(struct 
 sec_decap_state));
   skb-sp-len += xfrm_nr;
  
 + nf_reset(skb);
 +
   if (decaps) {
   if (!(skb-dev-flagsIFF_LOOPBACK)) {
   dst_release(skb-dst);
 @@ -145,7 +149,17 @@ int xfrm4_rcv_encap(struct sk_buff *skb,
   netif_rx(skb);
   return 0;
   } else {
 +#ifdef CONFIG_NETFILTER
 + __skb_push(skb, skb-data - skb-nh.raw);
 + skb-nh.iph-tot_len = htons(skb-len);
 + ip_send_check(skb-nh.iph);
 +
 + NF_HOOK(PF_INET, NF_IP_PRE_ROUTING, skb, skb-dev, NULL,
 + ip_xfrm_transport_hook);
 + return 0;
 +#else
   return -skb-nh.iph-protocol;
 +#endif
   }
  
  drop_unlock:
 diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
 index 33d3b0e..e84b3cd 100644
 --- a/net/ipv6/ip6_input.c
 +++ b/net/ipv6/ip6_input.c
 @@ -48,7 +48,7 @@
  
  
  
 -static inline int ip6_rcv_finish( struct sk_buff *skb) 
 +inline int 

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-21 Thread YOSHIFUJI Hideaki / 吉藤英明
Hello.

In article [EMAIL PROTECTED] (at Mon, 21 Nov 2005 17:31:41 +0900), Kazunori 
Miyazawa [EMAIL PROTECTED] says:

 Your ip_xfrm_transport_hook is a good idea, I think.
 
 We could call ip6_rcv_finish if the netfilter changed the addresses
 or otherwise we can continue the loop to avoid the cost in a similar
 way because we can know the change with checking skb-dst.

Well, I agree.

In article [EMAIL PROTECTED] (at Sun, 20 Nov 2005 17:31:36 +0100), Patrick 
McHardy [EMAIL PROTECTED] says:

 diff --git a/net/ipv4/netfilter.c b/net/ipv4/netfilter.c
 index b93e7cd..3c39296 100644
 --- a/net/ipv4/netfilter.c
 +++ b/net/ipv4/netfilter.c
 @@ -105,6 +105,26 @@ int ip_dst_output(struct sk_buff *skb)
   return dst_output(skb);
  }
  EXPORT_SYMBOL(ip_dst_output);
 +
 +/*
 + * okfn for transport mode xfrm_input.c hook. Basically a copy of
 + * ip_rcv_finish without statistics and option parsing.
 + */
 +int ip_xfrm_transport_hook(struct sk_buff *skb)
 +{
 + struct iphdr *iph = skb-nh.iph;
 +
 + if (likely(skb-dst == NULL)) {
 + int err = ip_route_input(skb, iph-daddr, iph-saddr, iph-tos,
 +  skb-dev);
 + if (unlikely(err))
 + goto drop;
 + }
 + return dst_input(skb);
 +drop:
 + kfree_skb(skb);
 + return NET_RX_DROP;
 +}
  #endif /* CONFIG_XFRM */
  
:
 @@ -129,7 +133,16 @@ int xfrm6_rcv_spi(struct sk_buff **pskb,
   netif_rx(skb);
   return -1;
   } else {
 +#ifdef CONFIG_NETFILTER
 + skb-nh.ipv6h-payload_len = htons(skb-len);
 + __skb_push(skb, skb-data - skb-nh.raw);
 +
 + NF_HOOK(PF_INET6, NF_IP6_PRE_ROUTING, skb, skb-dev, NULL,
 + ip6_rcv_finish);
 + return -1;
 +#else
   return 1;
 +#endif
   }
  

Probably, we can do similarly for ipv6; e.g.:

int ip6_xfrm_transport_hook(struct sk_buff *skb)
{
#if 0 /* We NEVER support NAT. :-) */
 if (likely(skb-dst == NULL)) {
int err = ip6_route_input()
if (unlikely(err))
 goto drop;
 }
#endif
 __skb_pull(skb, skb-h.raw - skb-nh.raw);
 return NET_RX_SUCCESS;
drop:
 kfree_skb(skb);
 return NET_RX_DROP;
}

:

  } else {
#ifdef CONFIG_NETFILTER
 skb-nh.ipv6h-payload_len = htons(skb-len);
 skb-h.raw = skb-data;
 __skb_push(skb, skb-data - skb-nh.raw);

 if (NF_HOOK(PF_INET6, NF_IP6_PRE_ROUTING, skb, skb-dev, NULL,
 ip6_xfrm_transport_hook) == NET_RX_DROP)
 return -1;
#endif
 return 1;
  }

Then, we can continue parsing extension headers, I think.

-- 
YOSHIFUJI Hideaki @ USAGI Project  [EMAIL PROTECTED]
GPG-FP  : 9022 65EB 1ECF 3AD1 0BDF  80D8 4807 F894 E062 0EEA
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-21 Thread Yasuyuki KOZAKAI

Hi,

From: Patrick McHardy [EMAIL PROTECTED]
Date: Mon, 21 Nov 2005 07:52:36 +0100

 I don't see why it is confusing. Plain text packets are visible before
 encapsulation (and they have to be because we don't necessarily know
 if packets will be encapsulated at the time the hooks are called in
 case the policy lookup after NAT returns a policy), plain text packets
 are visible after decapsulation. With different hooks we can't have
 symetrical behaviour because of the case I mentioned above, and that
 would be confusing IMO.

Well, what I worried about was just ease to use, not internal processing.
I suspected that users can correctly configure IPsec and packet filtering.
Just doing iptables -P INPUT -j DROP; iptables -A INPUT -p esp -j ACCEPT
will drop all input packets and this is different behavior with current
kernel, for example. So I just imagined many people would say Why ?.

But as you said in other mail, probably this is my needles fear, isn't this ?
As you know, I'm basically worrier. :)

  And this can be said about IPv6 input path. If packets have not been mangled
  (this is ordinary case because ip6tables doesn't have neither NAT nor
  target module to mangle addresses directly), they don't have to re-route
  and don't have to re-visit ip6_input_finish().
  
  In the other way, if their addresses have been mangled, it's necessary to
  re-route. I agree re-visiting ip6_input_finish() in this case.
 
 Same goes for ip6_input_finish as for ip_local_deliver_finish(),
 the packet would continue its path there anyway. Do you actually
 mean ip6_rcv_finish()?

No. I mean ip6_input_finish(). calling ip6_input_finish() twice causes
problem at processing of IPv6 extension headers. This is different point
between IPv4 and IPv6.

Please note that I don't mean IPv4 processing in your patch has problem.
I think it will work. What I wanted to do was just avoiding double processing
of extension headers and synchronizing IPv4/IPv6 behavior as possible.

  Then, why don't we make xfrm{4,6}_rcv_spi() return 0 (1 if IPv6)
  so that ip_local_deliver_finish()/ip6_input_finish() can continue to process
  headers if packets have not been mangled ? Is this difficult or impossible 
  to
  implement ?
 
 I'm not sure I understand. Do you propose to check if the packet was
 mangled after the PRE_ROUTING hook (if it was not stolen or queued)
 and if not return directly to ip6_input_finish()?

Yes.

Where would the
 LOCAL_IN hook be called?

Ah, indeed. But we can add it just before return directly to
ip6_input_finish().

  This can solve the issue about twice processing of statistics and IPv6
  extension headers. Because ip_local_deliver_finish()/ip6_input_finish() can
  continue to process extension headers after ESP/AH in ordinary case.
 
 AFAICT statistics are not affected by my patches, except for the
 iptables counters. The double parsing of extension headers is indeed
 a problem I forgot about, it looks like we need to carry nhoff around
 if it can't be derived from the packet.

Of cause that is one solution.

-- Yasuyuki Kozakai
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-21 Thread Patrick McHardy
Kazunori Miyazawa wrote:
 Hello Patrick,
 
 I have a comment about the patch on the IPv6 input process.
 The kernel applied your patch will always call ip6_rcv_finish
 when enabling netfilter and receiving a esp packet so that
 it will always look up the routing table and parse
 extention headers twice a packet.
 It will probably cost.
 
 Your ip_xfrm_transport_hook is a good idea, I think.

Yes, not passing the packets through the entire stack seems like
the right thing to do.

 We could call ip6_rcv_finish if the netfilter changed the addresses
 or otherwise we can continue the loop to avoid the cost in a similar
 way because we can know the change with checking skb-dst.
 
 If you fix the point, your change will resolve our issues which were
 mentioned in the previous mail from Kozakai-san.

The problem is that netfilter hooks take ownership of the skb, so the
caller can't touch it afterwards. It would be possible, but it would
become very ugly. Can I assume that you would also be satisfied if
the double-parsing of extension headers is fixed in some other way?
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-21 Thread Patrick McHardy

Yasuyuki KOZAKAI wrote:

I don't see why it is confusing. Plain text packets are visible before
encapsulation (and they have to be because we don't necessarily know
if packets will be encapsulated at the time the hooks are called in
case the policy lookup after NAT returns a policy), plain text packets
are visible after decapsulation. With different hooks we can't have
symetrical behaviour because of the case I mentioned above, and that
would be confusing IMO.


Well, what I worried about was just ease to use, not internal processing.
I suspected that users can correctly configure IPsec and packet filtering.
Just doing iptables -P INPUT -j DROP; iptables -A INPUT -p esp -j ACCEPT
will drop all input packets and this is different behavior with current
kernel, for example. So I just imagined many people would say Why ?.

But as you said in other mail, probably this is my needles fear, isn't this ?
As you know, I'm basically worrier. :)


:) I think its OK since in tunnel mode the decapsulated packets already
need to be allowed by the ruleset, so I think its not too confusing to
expect the same in transport mode.


Then, why don't we make xfrm{4,6}_rcv_spi() return 0 (1 if IPv6)
so that ip_local_deliver_finish()/ip6_input_finish() can continue to process
headers if packets have not been mangled ? Is this difficult or impossible to
implement ?


I'm not sure I understand. Do you propose to check if the packet was
mangled after the PRE_ROUTING hook (if it was not stolen or queued)
and if not return directly to ip6_input_finish()?


Yes.


Where would the LOCAL_IN hook be called?


Ah, indeed. But we can add it just before return directly to
ip6_input_finish().


Please see my mail to Kazunori. The hooks take ownership of the
skb, changing this would become pretty ugly because of queued
or stolen packets. And it would still leave one path where packets
are not directly returned, so I think the double-parsing should
be handled otherwise.


AFAICT statistics are not affected by my patches, except for the
iptables counters. The double parsing of extension headers is indeed
a problem I forgot about, it looks like we need to carry nhoff around
if it can't be derived from the packet.


Of cause that is one solution.


I'm going to try that if I can't come up with something better.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-21 Thread Patrick McHardy

David S. Miller wrote:

I've read over Patrick's two most recent postings of these patches
and I think they are generally sane and I cannot find any holes in
them.  Herbert brought up the legitimate concern about defragmentation,
but I think that's a detail and does not take away from the structural
soundness of Patrick's approach.


I think we implicitly agreed on moving the POST_ROUTING hook before
fragmentation and change the user-visible behaviour of the mangle
POSTROUTING chain. At least neither Harald not Rusty objected to
the patch :)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-21 Thread Kazunori Miyazawa
Patrick McHardy wrote:
 Kazunori Miyazawa wrote:
 
Hello Patrick,

I have a comment about the patch on the IPv6 input process.
The kernel applied your patch will always call ip6_rcv_finish
when enabling netfilter and receiving a esp packet so that
it will always look up the routing table and parse
extention headers twice a packet.
It will probably cost.

Your ip_xfrm_transport_hook is a good idea, I think.
 
 
 Yes, not passing the packets through the entire stack seems like
 the right thing to do.
 
 
We could call ip6_rcv_finish if the netfilter changed the addresses
or otherwise we can continue the loop to avoid the cost in a similar
way because we can know the change with checking skb-dst.

If you fix the point, your change will resolve our issues which were
mentioned in the previous mail from Kozakai-san.
 
 
 The problem is that netfilter hooks take ownership of the skb, so the
 caller can't touch it afterwards. It would be possible, but it would
 become very ugly. Can I assume that you would also be satisfied if
 the double-parsing of extension headers is fixed in some other way?

My concern is cost to look up routing table and parse extention headers twice.
I think the latter will be critical issue if some extention header makes
some state in the stack.

IMHO, we will call the loop part of ip6_input_finish to inject the skb
to upper layer directly in ip6_xfrm_transport_hook.
But it may make duplicate codes and raise other issues...

I'm also consider the issue.

Thank you Patrick,

--
Kazunori Miyazawa
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-21 Thread Patrick McHardy
Kazunori Miyazawa wrote:
 Patrick McHardy wrote:
 
The problem is that netfilter hooks take ownership of the skb, so the
caller can't touch it afterwards. It would be possible, but it would
become very ugly. Can I assume that you would also be satisfied if
the double-parsing of extension headers is fixed in some other way?
 
 
 My concern is cost to look up routing table and parse extention headers twice.
 I think the latter will be critical issue if some extention header makes
 some state in the stack.

The routing lookup is not done unless someone resets skb-dst (i.e.
nfqueue). I'm looking into fixing the extension header problem,
so I hope that will resolve all issues.

 IMHO, we will call the loop part of ip6_input_finish to inject the skb
 to upper layer directly in ip6_xfrm_transport_hook.
 But it may make duplicate codes and raise other issues...

The easiest way would be to store nhoff somewhere in the skb and
use it to continue at the next header. But I still hope there is
a way without keeping data in the skb.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-20 Thread Yasuyuki KOZAKAI

Hi, Patrick,

From: Patrick McHardy [EMAIL PROTECTED]
Date: Sun, 20 Nov 2005 17:31:36 +0100

 [IPV4/6]: Netfilter IPsec input hooks
 
 When the innermost transform uses transport mode the decapsulated packet
 is not visible to netfilter. Pass the packet through the PRE_ROUTING and
 LOCAL_IN hooks again before handing it to upper layer protocols to make
 netfilter-visibility symetrical to the output path.
 
 Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

At first, now I could agree to use same name for hooks before/after xfrm
processing, if it's important to keep compatibility than to avoid difficulty
to use. Even now I think it's confusing to pass packets before/after xfrm to
same hook, and believe it's ideal to use different name for them.
But people can configure correctly according to you, then my concern might be
needless fear.

Next is about re-visiting stack, I'm beginning to understand your patch.
It looks natural to re-route after DNAT on input path of IPv4. That's really
needed if packets have been mangled.

But is there any reason that we have to allow packets to re-visit
ip_local_deliver_finish() in the case that they have not been mangled
at PRE_ROUTING ? I think no. This situation is like ip_nat_out().

And this can be said about IPv6 input path. If packets have not been mangled
(this is ordinary case because ip6tables doesn't have neither NAT nor
target module to mangle addresses directly), they don't have to re-route
and don't have to re-visit ip6_input_finish().

In the other way, if their addresses have been mangled, it's necessary to
re-route. I agree re-visiting ip6_input_finish() in this case.

Then, why don't we make xfrm{4,6}_rcv_spi() return 0 (1 if IPv6)
so that ip_local_deliver_finish()/ip6_input_finish() can continue to process
headers if packets have not been mangled ? Is this difficult or impossible to
implement ?

This can solve the issue about twice processing of statistics and IPv6
extension headers. Because ip_local_deliver_finish()/ip6_input_finish() can
continue to process extension headers after ESP/AH in ordinary case.

In special case, if some codes mangle IPv6 addresses, that's the codes
to take care of the possibility of re-visiting ip6_input_finish().

What do you think ?

# Please note that these are just my opinions and other USAGI guys might have
# other opinions.

Regards,

-- Yasuyuki Kozakai
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-20 Thread Patrick McHardy

Yasuyuki KOZAKAI wrote:

At first, now I could agree to use same name for hooks before/after xfrm
processing, if it's important to keep compatibility than to avoid difficulty
to use. Even now I think it's confusing to pass packets before/after xfrm to
same hook, and believe it's ideal to use different name for them.
But people can configure correctly according to you, then my concern might be
needless fear.


I don't see why it is confusing. Plain text packets are visible before
encapsulation (and they have to be because we don't necessarily know
if packets will be encapsulated at the time the hooks are called in
case the policy lookup after NAT returns a policy), plain text packets
are visible after decapsulation. With different hooks we can't have
symetrical behaviour because of the case I mentioned above, and that
would be confusing IMO.


Next is about re-visiting stack, I'm beginning to understand your patch.
It looks natural to re-route after DNAT on input path of IPv4. That's really
needed if packets have been mangled.

But is there any reason that we have to allow packets to re-visit
ip_local_deliver_finish() in the case that they have not been mangled
at PRE_ROUTING ? I think no. This situation is like ip_nat_out().


My patches don't change when and if packets will reach
ip_local_deliver_finish(), they just add a possibility for rerouting.
Currently the transforms are called from ip_local_deliver_finish() and
in transport mode the decapsulated packet continues its path in
ip_local_deliver_finish(), with my patches they will go through two
netfilter hooks and continue the exact same codepath, given that
they are not NATed to a foreign destination IP on their way.


And this can be said about IPv6 input path. If packets have not been mangled
(this is ordinary case because ip6tables doesn't have neither NAT nor
target module to mangle addresses directly), they don't have to re-route
and don't have to re-visit ip6_input_finish().

In the other way, if their addresses have been mangled, it's necessary to
re-route. I agree re-visiting ip6_input_finish() in this case.


Same goes for ip6_input_finish as for ip_local_deliver_finish(),
the packet would continue its path there anyway. Do you actually
mean ip6_rcv_finish()?


Then, why don't we make xfrm{4,6}_rcv_spi() return 0 (1 if IPv6)
so that ip_local_deliver_finish()/ip6_input_finish() can continue to process
headers if packets have not been mangled ? Is this difficult or impossible to
implement ?


I'm not sure I understand. Do you propose to check if the packet was
mangled after the PRE_ROUTING hook (if it was not stolen or queued)
and if not return directly to ip6_input_finish()? Where would the
LOCAL_IN hook be called?


This can solve the issue about twice processing of statistics and IPv6
extension headers. Because ip_local_deliver_finish()/ip6_input_finish() can
continue to process extension headers after ESP/AH in ordinary case.


AFAICT statistics are not affected by my patches, except for the
iptables counters. The double parsing of extension headers is indeed
a problem I forgot about, it looks like we need to carry nhoff around
if it can't be derived from the packet.

Regards
Patrick

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-20 Thread David S. Miller
From: Patrick McHardy [EMAIL PROTECTED]
Date: Mon, 21 Nov 2005 07:52:36 +0100

 I don't see why it is confusing. Plain text packets are visible before
 encapsulation (and they have to be because we don't necessarily know
 if packets will be encapsulated at the time the hooks are called in
 case the policy lookup after NAT returns a policy), plain text packets
 are visible after decapsulation. With different hooks we can't have
 symetrical behaviour because of the case I mentioned above, and that
 would be confusing IMO.

I think this is a very important point.

I can see no serious argument against this behavior, especially for
output.  On input, there is an argument of paranoia about seeing
plaintext packets, but administrator could do this anyways with
tcpdump or custom kernel module if this system is the decapsulation
point.

I've read over Patrick's two most recent postings of these patches
and I think they are generally sane and I cannot find any holes in
them.  Herbert brought up the legitimate concern about defragmentation,
but I think that's a detail and does not take away from the structural
soundness of Patrick's approach.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-20 Thread Herbert Xu
David S. Miller [EMAIL PROTECTED] wrote:
 
 I've read over Patrick's two most recent postings of these patches
 and I think they are generally sane and I cannot find any holes in
 them.  Herbert brought up the legitimate concern about defragmentation,
 but I think that's a detail and does not take away from the structural
 soundness of Patrick's approach.

Yes I agree completely.  The new IPsec stack has been around for three
years and it's about time that we have proper netfilter support for it.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html