Re: Session ID 0 with PPPoE

2007-03-07 Thread Michal Ostrowski
This change can be made; the unbinding behavior can be removed and SID 0
can be made valid. I hope I was clear in my previous e-mail that I
didn't object to this.

PPPoE connections are unstable.  Ethernet frames get dropped.  Things
die randomly. And yes, you typically want to have a cron job or script
re-spawning pppd on failure.  Making this change won't increase the
reliability of these connections in any meaningful way, it won't break
pppd either.  

The only question I have is why is this important to you?  I'm simply
curious as to what you are trying to accomplish; is this related to some
other work you are doing or is it correctness as a virtue?

-- 
Michal Ostrowski <[EMAIL PROTECTED]>



On Wed, 2007-03-07 at 15:32 +0100, Florian Zumbiehl wrote:
> Hi,
> 
> > In the current  code SID 0 indicates that the socket is to be un-bound.
> 
> That are the semantics used by the kernel code, yes - but even pppd uses
> different semantics (which can't quite work, of course ...).
> 
> > Supporting unbinding of the socket was intended to permit the PPPoE
> > session to be reconnected without closing/reopening the socket; which
> > would mean that you'd have to re-bind the PPPoE/PPP channel bindings.
> > Thus it is conceivable to swap or renegotiate PPPoE connection
> > underneath a PPP connection, hypothetically if anyone ever considered
> > doing so.  Is that worth it?  I don't know.  One could eliminate that
> > disconnect behavior and I don't think anyone would care.
> 
> Well, if _you_ don't even know ... =:-)
> 
> Anyway, if it is not to be eliminated, it should be represented some way
> differently, I guess. Which probably would break backwards compatibility
> at the userspace API completely, of course, which is a bad thing[tm], so
> possibly simply changing the semantics to what is already assumed by most
> userspace applications might be the way to go.

> > I'll conceed that a SID of 0 could appear from outer space.   I've never
> > seen that happening.
> 
> Now, the question is, of course: How many samples is this based on
> (both, number of connection attempts and number of different peer
> implementations)?
> 
> > The only way I see this being an issue is if a
> > PPPoE server insists on giving you SID 0 and only SID 0 repeatedly.  And
> > I've never seen *that* happening.
> 
> So, what you are saying is that pppd/kernel-PPP/-PPPoE is so unstable that
> you shouldn't ever be using it without some cron job/wrapper that takes
> care of restarts anyway, so some additional bug that causes pppd to exit
> unexpectedly doesn't matter?

> That is to say: If I didn't overlook anything, pppd (the rp-pppoe plugin,
> that is) warns you if the PADS' session id field is 0 that the AC was
> somehow violating the RFC, but then goes on, using 0 as the session id
> anyway - thus calling connect() with a session id field of 0, obviously
> assuming that this will bind to session id 0 - which it doesn't.
> 
> Now, when it invokes the PPPIOCGCHAN ioctl on that seemingly successfully
> connect()ed socket, it gets an ENOTCONN - and exits with a fatal error.
> And exiting with a fatal error isn't quite what I'd expect pppd to do
> when the peer is behaving correctly according to the RFC, since that will
> not just cause a retry that might be likely to give a different session
> id and thus would cause the connection to finally be established with
> some delay, but rather will break network connectivity until manual
> intervention.
> 
> Of course, you could argue that this is somehow a bug in pppd, as it
> doesn't use the kernel API correctly. But I assume that fixing the
> kernel to support sid 0 rather than fixing pppd to "support" sid 0
> (which would mean implementing an automatic retry when assigned
> sid 0 by the AC) is somehow the better solution.
> 
> > If you'd really like to pursue this, I'll be happy to review and ack
> > patches in this regard.  However, I don't see what there is to be
> > actually gained by pursuing this.   I'm open to being convinced; what is
> > the motivation behind this?  If there is a real problem here I'll be
> > glad to get involved in fixing it myself.
> 
> Now, what is a "real problem"? I haven't noticed this being a problem for
> me yet, no. But I have been using userspace RP-PPPOE until recently and
> I am using a cron job that takes care of restarts in case pppd exits for
> some reason, so it's rather unlikely that I'd notice if it did happen.
> 
> But isn't a problem that does occur seldom, is difficult to reproduce,
> and has more than just temporar

Re: [PATCH 4/4] PPPoE: race between interface going down and release()

2007-03-11 Thread Michal Ostrowski
I need some more time to think about this one; there are some problem
I'm seeing here but I can't look at it right now.  I'll send out my
version of a patch for this tomorrow and we can discuss more.

Regarding the previous three patches, they seem fine after a first pass.
However, I'd like to ask you to please use a "Signed-off-by" statement.
That why I can ack it and push it upstream without hesitation.



-- 
Michal Ostrowski <[EMAIL PROTECTED]>



On Sun, 2007-03-11 at 05:41 +0100, Florian Zumbiehl wrote:
> Hi,
> 
> below you find the last patch for now. It (hopefully) fixes a race
> between a socket being release()d and the interface it's using going
> down. As pppoe_release() didn't lock the socket, and pppoe_flush_dev()
> did the locking in the wrong place, pppoe_flush_dev() could set
> po->pppoe_dev to NULL, which would then cause pppoe_release() to not
> dev_put() the interface, but to still mark the socket as DEAD,
> which in turn would cause pppoe_flush_dev() to not dev_put() the
> interface, effectively leaking one reference to the device, thus making it
> impossible to remove (ignoring the possibility of overflowing the reference
> counter by repeated use of this race ;-).
> 
> The thing I'm not quite sure about is whether the "outer"
> 
> |if (po->pppoe_dev == dev) {
> 
> is actually reliable this way on SMP systems, as far as cache consistency
> is concerned. I left it that way for now, as any alternative locking
> strategies that would lock the socket before doing this comparison seemed
> to be pretty complicated to implement because of the need to drop the
> hash table lock before trying to acquire the socket lock, so I'd rather
> be sure that this actually is a problem before I try to solve it ;-)
> 
> Florian
> 
> ---
> diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
> index 1aeac2c..f5abfff 100644
> --- a/drivers/net/pppoe.c
> +++ b/drivers/net/pppoe.c
> @@ -253,7 +253,6 @@ static void pppoe_flush_dev(struct net_device *dev)
>   struct sock *sk = sk_pppox(po);
>  
>   sock_hold(sk);
> - po->pppoe_dev = NULL;
>  
>   /* We hold a reference to SK, now drop the
>* hash table lock so that we may attempt
> @@ -263,12 +262,15 @@ static void pppoe_flush_dev(struct net_device *dev)
>  
>   lock_sock(sk);
>  
> - if (sk->sk_state &
> - (PPPOX_CONNECTED | PPPOX_BOUND)) {
> - pppox_unbind_sock(sk);
> + if(po->pppoe_dev==dev){
>   dev_put(dev);
> - sk->sk_state = PPPOX_ZOMBIE;
> - sk->sk_state_change(sk);
> + po->pppoe_dev = NULL;
> + if (sk->sk_state &
> + (PPPOX_CONNECTED | PPPOX_BOUND)) {
> + pppox_unbind_sock(sk);
> + sk->sk_state = PPPOX_ZOMBIE;
> + sk->sk_state_change(sk);
> + }
>   }
>  
>   release_sock(sk);
> @@ -504,8 +506,11 @@ static int pppoe_release(struct socket *sock)
>   if (!sk)
>   return 0;
>  
> - if (sock_flag(sk, SOCK_DEAD))
> + lock_sock(sk);
> + if (sock_flag(sk, SOCK_DEAD)){
> + release_sock(sk);
>   return -EBADF;
> + }
>  
>   pppox_unbind_sock(sk);
>  
> @@ -526,6 +531,7 @@ static int pppoe_release(struct socket *sock)
>   sock->sk = NULL;
>  
>   skb_queue_purge(&sk->sk_receive_queue);
> + release_sock(sk);
>   sock_put(sk);
>  
>   return 0;
> 

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] PPPoE: race between interface going down and release()

2007-03-11 Thread Michal Ostrowski
Attached below is my take on how to address this problem.  
This addresses any concerns you may have had about checking po->pppoe_dev==NULL,
because accesses to this field are now synchronized with pppoe_hash_lock.

Once we can settle on a fix for this, I'll deal with the SID==0 issue 
(trying to do that now would just cause patch conflicts).

-- 
Michal Ostrowski <[EMAIL PROTECTED]>



[PATCH] PPPOE Fix device tear-down notification.

pppoe_flush_dev() kicks all sockets bound to a device that is going down.
In doing so, locks must be taken in the right order consistently (sock lock,
followed by the pppoe_hash_lock).  However, the scan process is based on
us holding the sock lock.  So, when something is found in the scan we must
release the lock we're holding and grab the sock lock.

This patch fixes race conditions between this code and pppoe_release(),
both of which perform similar functions but would naturally prefer to grab
locks in opposing orders.  Both code paths are now going after these locks
in a consistent manner.

pppoe_hash_lock protects the contents of the "pppox_sock" objects that reside
inside the hash.  Thus, NULL'ing out the pppoe_dev field should be done
under the protection of this lock.

Signed-off-by: Michal Ostrowski <[EMAIL PROTECTED]>
---
 drivers/net/pppoe.c |   93 +++---
 1 files changed, 50 insertions(+), 43 deletions(-)

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index e097688..0961bf9 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -241,56 +241,53 @@ static inline struct pppox_sock *delete_item(unsigned 
long sid, char *addr, int
 static void pppoe_flush_dev(struct net_device *dev)
 {
int hash;
-
BUG_ON(dev == NULL);
 
-   read_lock_bh(&pppoe_hash_lock);
+   write_lock_bh(&pppoe_hash_lock);
for (hash = 0; hash < PPPOE_HASH_SIZE; hash++) {
struct pppox_sock *po = item_hash_table[hash];
 
while (po != NULL) {
-   if (po->pppoe_dev == dev) {
-   struct sock *sk = sk_pppox(po);
-
-   sock_hold(sk);
-
-   /* We hold a reference to SK, now drop the
-* hash table lock so that we may attempt
-* to lock the socket (which can sleep).
-*/
-   read_unlock_bh(&pppoe_hash_lock);
-
-   lock_sock(sk);
-
-   if(po->pppoe_dev==dev){
-   dev_put(dev);
-   po->pppoe_dev = NULL;
-   if (sk->sk_state &
-   (PPPOX_CONNECTED | PPPOX_BOUND)) {
-   pppox_unbind_sock(sk);
-   sk->sk_state = PPPOX_ZOMBIE;
-   sk->sk_state_change(sk);
-   }
-   }
-
-   release_sock(sk);
-
-   sock_put(sk);
-
-   read_lock_bh(&pppoe_hash_lock);
-
-   /* Now restart from the beginning of this
-* hash chain.  We always NULL out pppoe_dev
-* so we are guaranteed to make forward
-* progress.
-*/
-   po = item_hash_table[hash];
+   struct sock *sk = sk_pppox(po);
+   if (po->pppoe_dev != dev) {
+   po = po->next;
continue;
}
-   po = po->next;
+   po->pppoe_dev = NULL;
+   dev_put(dev);
+   
+
+   /* We always grab the socket lock, followed by the
+* pppoe_hash_lock, in that order.  Since we should
+* hold the sock lock while doing any unbinding, 
+* we need to release the lock we're holding.  
+* Hold a reference to the sock so it doesn't disappear
+* as we're jumping between locks.
+*/
+
+   sock_hold(sk);
+
+   write_unlock_bh(&pppoe_hash_lock);
+   lock_sock(sk);
+
+   if (sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND)) {
+   pppox_unbind_sock(sk);
+   sk->sk_state = PPPOX_ZOMBIE;
+ 

[PATCH 1/4] PPPoE: miscellaneous smaller cleanups

2007-03-13 Thread Michal Ostrowski
below is a patch that just removes dead code/initializers without any
effect (first access is an assignment) that I stumbled accross while
reading the source.

Signed-off-by: Florian Zumbiehl <[EMAIL PROTECTED]>
Acked-by: Michal Ostrowski <[EMAIL PROTECTED]>
---
 drivers/net/pppoe.c |   21 -
 1 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index ebfa296..ec4e67d 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -207,7 +207,7 @@ static inline struct pppox_sock *get_item(unsigned long sid,
 
 static inline struct pppox_sock *get_item_by_addr(struct sockaddr_pppox *sp)
 {
-   struct net_device *dev = NULL;
+   struct net_device *dev;
int ifindex;
 
dev = dev_get_by_name(sp->sa_addr.pppoe.dev);
@@ -222,9 +222,6 @@ static inline int set_item(struct pppox_sock *po)
 {
int i;
 
-   if (!po)
-   return -EINVAL;
-
write_lock_bh(&pppoe_hash_lock);
i = __set_item(po);
write_unlock_bh(&pppoe_hash_lock);
@@ -344,7 +341,7 @@ static struct notifier_block pppoe_notifier = {
 static int pppoe_rcv_core(struct sock *sk, struct sk_buff *skb)
 {
struct pppox_sock *po = pppox_sk(sk);
-   struct pppox_sock *relay_po = NULL;
+   struct pppox_sock *relay_po;
 
if (sk->sk_state & PPPOX_BOUND) {
struct pppoe_hdr *ph = (struct pppoe_hdr *) skb->nh.raw;
@@ -514,7 +511,6 @@ static int pppoe_release(struct socket *sock)
 {
struct sock *sk = sock->sk;
struct pppox_sock *po;
-   int error = 0;
 
if (!sk)
return 0;
@@ -543,7 +539,7 @@ static int pppoe_release(struct socket *sock)
skb_queue_purge(&sk->sk_receive_queue);
sock_put(sk);
 
-   return error;
+   return 0;
 }
 
 
@@ -762,10 +758,10 @@ static int pppoe_ioctl(struct socket *sock, unsigned int 
cmd,
 static int pppoe_sendmsg(struct kiocb *iocb, struct socket *sock,
  struct msghdr *m, size_t total_len)
 {
-   struct sk_buff *skb = NULL;
+   struct sk_buff *skb;
struct sock *sk = sock->sk;
struct pppox_sock *po = pppox_sk(sk);
-   int error = 0;
+   int error;
struct pppoe_hdr hdr;
struct pppoe_hdr *ph;
struct net_device *dev;
@@ -929,10 +925,10 @@ static int pppoe_recvmsg(struct kiocb *iocb, struct 
socket *sock,
  struct msghdr *m, size_t total_len, int flags)
 {
struct sock *sk = sock->sk;
-   struct sk_buff *skb = NULL;
+   struct sk_buff *skb;
int error = 0;
int len;
-   struct pppoe_hdr *ph = NULL;
+   struct pppoe_hdr *ph;
 
if (sk->sk_state & PPPOX_BOUND) {
error = -EIO;
@@ -949,7 +945,6 @@ static int pppoe_recvmsg(struct kiocb *iocb, struct socket 
*sock,
m->msg_namelen = 0;
 
if (skb) {
-   error = 0;
ph = (struct pppoe_hdr *) skb->nh.raw;
len = ntohs(ph->length);
 
@@ -991,7 +986,7 @@ out:
 
 static __inline__ struct pppox_sock *pppoe_get_idx(loff_t pos)
 {
-   struct pppox_sock *po = NULL;
+   struct pppox_sock *po;
int i = 0;
 
for (; i < PPPOE_HASH_SIZE; i++) {
-- 
1.5.0.g78e90

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/4] PPPOE: memory leak when socket is release()d before PPPIOCGCHAN has been called on it

2007-03-13 Thread Michal Ostrowski
below you find a patch that fixes a memory leak when a PPPoE socket is
release()d after it has been connect()ed, but before the PPPIOCGCHAN ioctl
ever has been called on it.

This is somewhat of a security problem, too, since PPPoE sockets can be
created by any user, so any user can easily allocate all the machine's
RAM to non-swappable address space and thus DoS the system.

Is there any specific reason for PPPoE sockets being available to any
unprivileged process, BTW? After all, you need a packet socket for the
discovery stage anyway, so it's unlikely that any unprivileged process
will ever need to create a PPPoE socket, no? Allocating all session IDs
for a known AC is a kind of DoS, too, after all - with Juniper ERXes,
this is really easy, actually, since they don't ever assign session ids
above 8000 ...

Signed-off-by: Florian Zumbiehl <[EMAIL PROTECTED]>
Acked-by: Michal Ostrowski <[EMAIL PROTECTED]>
---
 drivers/net/pppox.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/pppox.c b/drivers/net/pppox.c
index 9315046..3f8115d 100644
--- a/drivers/net/pppox.c
+++ b/drivers/net/pppox.c
@@ -58,7 +58,7 @@ void pppox_unbind_sock(struct sock *sk)
 {
/* Clear connection to ppp device, if attached. */
 
-   if (sk->sk_state & (PPPOX_BOUND | PPPOX_ZOMBIE)) {
+   if (sk->sk_state & (PPPOX_BOUND | PPPOX_CONNECTED | PPPOX_ZOMBIE)) {
ppp_unregister_channel(&pppox_sk(sk)->chan);
sk->sk_state = PPPOX_DEAD;
}
-- 
1.5.0.g78e90

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/4] PPPOE: race between interface going down and connect()

2007-03-13 Thread Michal Ostrowski
below you find a patch that (hopefully) fixes a race between an interface
going down and a connect() to a peer on that interface. Before,
connect() would determine that an interface is up, then the interface
could go down and all entries referring to that interface in the
item_hash_table would be marked as ZOMBIEs and their references to
the device would be freed, and after that, connect() would put a new
entry into the hash table referring to the device that meanwhile is
down already - which also would cause unregister_netdevice() to wait
until the socket has been release()d.

This patch does not suffice if we are not allowed to accept connect()s
referring to a device that we already acked a NETDEV_GOING_DOWN for
(that is: all references are only guaranteed to be freed after
NETDEV_DOWN has been acknowledged, not necessarily after the
NETDEV_GOING_DOWN already). And if we are allowed to, we could avoid
looking through the hash table upon NETDEV_GOING_DOWN completely and
only do that once we get the NETDEV_DOWN ...

mostrows:
pppoe_flush_dev is called on NETDEV_GOING_DOWN and NETDEV_DOWN to deal with
this "late connect" issue.  Ideally one would hope to notify users at the
"NETDEV_GOING_DOWN" phase (just to pretend to be nice).  However, it is the
NETDEV_DOWN scan that takes all the responsibility for ensuring nobody is
hanging around at that time.

Signed-off-by: Florian Zumbiehl <[EMAIL PROTECTED]>
Acked-by: Michal Ostrowski <[EMAIL PROTECTED]>
---
 drivers/net/pppoe.c |   19 ++-
 1 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index ec4e67d..4e878c9 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -218,17 +218,6 @@ static inline struct pppox_sock *get_item_by_addr(struct 
sockaddr_pppox *sp)
return get_item(sp->sa_addr.pppoe.sid, sp->sa_addr.pppoe.remote, 
ifindex);
 }
 
-static inline int set_item(struct pppox_sock *po)
-{
-   int i;
-
-   write_lock_bh(&pppoe_hash_lock);
-   i = __set_item(po);
-   write_unlock_bh(&pppoe_hash_lock);
-
-   return i;
-}
-
 static inline struct pppox_sock *delete_item(unsigned long sid, char *addr, 
int ifindex)
 {
struct pppox_sock *ret;
@@ -595,14 +584,18 @@ static int pppoe_connect(struct socket *sock, struct 
sockaddr *uservaddr,
po->pppoe_dev = dev;
po->pppoe_ifindex = dev->ifindex;
 
-   if (!(dev->flags & IFF_UP))
+   write_lock_bh(&pppoe_hash_lock);
+   if (!(dev->flags & IFF_UP)){
+   write_unlock_bh(&pppoe_hash_lock);
goto err_put;
+   }
 
memcpy(&po->pppoe_pa,
   &sp->sa_addr.pppoe,
   sizeof(struct pppoe_addr));
 
-   error = set_item(po);
+   error = __set_item(po);
+   write_unlock_bh(&pppoe_hash_lock);
if (error < 0)
goto err_put;
 
-- 
1.5.0.g78e90

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/4] PPPOE: Fix device tear-down notification.

2007-03-13 Thread Michal Ostrowski
pppoe_flush_dev() kicks all sockets bound to a device that is going down.
In doing so, locks must be taken in the right order consistently (sock lock,
followed by the pppoe_hash_lock).  However, the scan process is based on
us holding the sock lock.  So, when something is found in the scan we must
release the lock we're holding and grab the sock lock.

This patch fixes race conditions between this code and pppoe_release(),
both of which perform similar functions but would naturally prefer to grab
locks in opposing orders.  Both code paths are now going after these locks
in a consistent manner.

pppoe_hash_lock protects the contents of the "pppox_sock" objects that reside
inside the hash.  Thus, NULL'ing out the pppoe_dev field should be done
under the protection of this lock.

Signed-off-by: Michal Ostrowski <[EMAIL PROTECTED]>
---
 drivers/net/pppoe.c |   93 +--
 1 files changed, 53 insertions(+), 40 deletions(-)

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 4e878c9..0961bf9 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -241,54 +241,53 @@ static inline struct pppox_sock *delete_item(unsigned 
long sid, char *addr, int
 static void pppoe_flush_dev(struct net_device *dev)
 {
int hash;
-
BUG_ON(dev == NULL);
 
-   read_lock_bh(&pppoe_hash_lock);
+   write_lock_bh(&pppoe_hash_lock);
for (hash = 0; hash < PPPOE_HASH_SIZE; hash++) {
struct pppox_sock *po = item_hash_table[hash];
 
while (po != NULL) {
-   if (po->pppoe_dev == dev) {
-   struct sock *sk = sk_pppox(po);
-
-   sock_hold(sk);
-   po->pppoe_dev = NULL;
-
-   /* We hold a reference to SK, now drop the
-* hash table lock so that we may attempt
-* to lock the socket (which can sleep).
-*/
-   read_unlock_bh(&pppoe_hash_lock);
-
-   lock_sock(sk);
-
-   if (sk->sk_state &
-   (PPPOX_CONNECTED | PPPOX_BOUND)) {
-   pppox_unbind_sock(sk);
-   dev_put(dev);
-   sk->sk_state = PPPOX_ZOMBIE;
-   sk->sk_state_change(sk);
-   }
-
-   release_sock(sk);
+   struct sock *sk = sk_pppox(po);
+   if (po->pppoe_dev != dev) {
+   po = po->next;
+   continue;
+   }
+   po->pppoe_dev = NULL;
+   dev_put(dev);
+   
+
+   /* We always grab the socket lock, followed by the
+* pppoe_hash_lock, in that order.  Since we should
+* hold the sock lock while doing any unbinding, 
+* we need to release the lock we're holding.  
+* Hold a reference to the sock so it doesn't disappear
+* as we're jumping between locks.
+*/
 
-   sock_put(sk);
+   sock_hold(sk);
 
-   read_lock_bh(&pppoe_hash_lock);
+   write_unlock_bh(&pppoe_hash_lock);
+   lock_sock(sk);
 
-   /* Now restart from the beginning of this
-* hash chain.  We always NULL out pppoe_dev
-* so we are guaranteed to make forward
-* progress.
-*/
-   po = item_hash_table[hash];
-   continue;
+   if (sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND)) {
+   pppox_unbind_sock(sk);
+   sk->sk_state = PPPOX_ZOMBIE;
+   sk->sk_state_change(sk);
}
-   po = po->next;
+
+   release_sock(sk);
+   sock_put(sk);
+   
+   /* Restart scan at the beginning of this hash chain.
+* While the lock was dropped the chain contents may
+* have changed.
+*/
+   write_lock_bh(&pppoe_hash_lock);
+   po = item_hash_table[hash];
}
}
-   read_unlock_bh(&pp

Re: + ppp_generic-fix-lockdep-warning.patch added to -mm tree

2007-04-17 Thread Michal Ostrowski
The "xmit" function of a PPP channel is a synchronous operation.  If the 
transmission fails, we must notify the caller and let them re-submit the 
skb later.  The return status of dev_queue_xmit is needed to determine 
the return code passed back to the caller and thus the call is made 
synchronously and not in a tasklet.


Looking at the stack traces earlier in this thread, it seems to me that 
even if the PPPoE call was made in a tasklet, this same warning could be 
generated.


--
Michal Ostrowski
[EMAIL PROTECTED]



Jarek Poplawski wrote:

On Wed, Apr 11, 2007 at 12:52:28PM +0400, Yuriy N. Shkandybin wrote:
...
On Wed, 11 Apr 2007 09:57:33 +0400 "Yuriy N. Shkandybin" <[EMAIL PROTECTED]> 
wrote:



I've tested  2.6.21-rc6-mm1
Linux vpn1 2.6.21-rc6-mm1 #4 SMP Wed Apr 11 03:34:26 MSD 2007 x86_64
Intel(R) Pentium(R) D CPU 2.80GHz GenuineIntel GNU/Linux

warn appeares upon first pppoe connection to rp-pppoe server in kernel 
mode



Thanks.  So you're saying that
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc6/2.6.21-rc6-mm1/broken-out/ppp_generic-fix-lockdep-warning.patch
did not fix anything?
As i understand this patch already in -mm tree, so I've booted into last mm 
kernel and received this locked warning.

Or i've mistaked and should apply this patch manually?


Hi!

Yuriy - thanks for testing my patch ...(pause) Not!

It seems this patch is not visible in this version - probably
is overpatched by something else. But your new log shows there
is another connection between these locks (ppp_xmit_process
and ppp_push instead of ppp_channel_push in "-> #0"), so the
patch is not sufficient (and could be dumped).

I don't know your vlans configuration, but it seems the real
lockup isn't very probable here - it's rather lockdep question.
I think vlan's too broad lockdep class is the main "guilty"
here, but probably pppoe also could be enhanced: it's making
the things unnecessarily complicated by calling dev_queue_xmit
under ppp_generic's xmit locks. I wonder if there is any reason
against using a tasklet here.

I'll try to find more time to untie this yet - or maybe some
maintainer will find this interesting, too...

Regards,
Jarek P.

PS: sorry for late responding (vacations).



-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][BUG][SECURITY] Re: Weird problem with PPPoE on tap interface

2007-03-04 Thread Michal Ostrowski
Sorry for the late reply I've been on the road the past few days.

I ACK the patch.

I'll need to think about it some more, but we could probably go a step
further and eliminate the MAC address from the hash as well.


-- 
Michal Ostrowski <[EMAIL PROTECTED]>

On Sat, 2007-03-03 at 21:08 -0800, David Miller wrote:
> From: Florian Zumbiehl <[EMAIL PROTECTED]>
> Date: Sun, 4 Mar 2007 02:55:16 +0100
> 
> > Below you find a slightly changed version of the patch
> 
> I already applied your first patch, so if you have any
> fixes to submit please provide them as relative patches
> to your original change.
> 
> Thank you.
> 


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Session ID 0 with PPPoE

2007-03-04 Thread Michal Ostrowski
>From the RFC:

5.4 The PPPoE Active Discovery Session-confirmation (PADS) packet

   When the Access Concentrator receives a PADR packet, it prepares to
   begin a PPP session.  It generates a unique SESSION_ID for the PPPoE
   session and replies to the Host with a PADS packet.  The
   DESTINATION_ADDR field is the unicast Ethernet address of the Host
   that sent the PADR.  The CODE field is set to 0x65 and the SESSION_ID
   MUST be set to the unique value generated for this PPPoE session.

   The PADS packet contains exactly one TAG of TAG_TYPE Service-Name,
   indicating the service under which Access Concentrator has accepted
   the PPPoE session, and any number of other TAG types.

   If the Access Concentrator does not like the Service-Name in the
   PADR, then it MUST reply with a PADS containing a TAG of TAG_TYPE
   Service-Name-Error (and any number of other TAG types).  In this case
   the SESSION_ID MUST be set to 0x.



As you can see from the last paragraph, a session id of 0 implies a
rejection of the PADR.  Thus, you can't possibly get a PADS packet that
completes and initiates a valid session if the session id is 0.

Note that the RFC does not prohibit all other aspects of the PADS to be
structured as if it were a valid success response; the only condition
and requirement of a failure mode here is the session id.

Also 0x is reserved for future use. Thus it cannot be used as a
sentinel value to indicate an invalid session id.

Changing this code would require that the user-space component be
synchronized with this change; as the socket interface implies that 0 is
an invalid/unbound session id. 

Lots of badness will occur if 0 is allowed as a session id, and nothing
will be gained because it can't possibly be a valid session id.

-- 
Michal Ostrowski <[EMAIL PROTECTED]>


On Sat, 2007-03-03 at 21:07 -0800, David Miller wrote:
> From: Florian Zumbiehl <[EMAIL PROTECTED]>
> Date: Sun, 4 Mar 2007 03:30:00 +0100
> 
> > I noticed that the PPPoE code doesn't allow session id 0x to be used
> > for an actual session but rather considers 0 a special value denoting
> > that the socket is unbound. Now, when reading RFC 2516, I couldn't really
> > find anything that would forbid 0x as a session id. Only 0x "is
> > reserved for future use and MUST NOT be used", while 0x is specified
> > as the only allowed value for the session id field on certain types of
> > packets, but neither can I find any statement that forbids 0x as
> > an ordinary session identifier, nor can I find any reasons that would
> > prevent PPPoE from functioning properly with a session id of 0x.
> > 
> > Does anyone of you see any reason why a server would not be allowed to
> > select 0x as the session id for a PPPoE session?
> 
> I can't, feel free to provide a patch to remove this limitation
> if it's important to you.
> 


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [EMAIL PROTECTED]: Mail delivery failed: returning message to sender]

2007-03-04 Thread Michal Ostrowski

In the current  code SID 0 indicates that the socket is to be un-bound.
Supporting unbinding of the socket was intended to permit the PPPoE
session to be reconnected without closing/reopening the socket; which
would mean that you'd have to re-bind the PPPoE/PPP channel bindings.
Thus it is conceivable to swap or renegotiate PPPoE connection
underneath a PPP connection, hypothetically if anyone ever considered
doing so.  Is that worth it?  I don't know.  One could eliminate that
disconnect behavior and I don't think anyone would care.

I'll conceed that a SID of 0 could appear from outer space.   I've never
seen that happening. The only way I see this being an issue is if a
PPPoE server insists on giving you SID 0 and only SID 0 repeatedly.  And
I've never seen *that* happening.

If you'd really like to pursue this, I'll be happy to review and ack
patches in this regard.  However, I don't see what there is to be
actually gained by pursuing this.   I'm open to being convinced; what is
the motivation behind this?  If there is a real problem here I'll be
glad to get involved in fixing it myself.


-- 
Michal Ostrowski <[EMAIL PROTECTED]>


On Sun, 2007-03-04 at 15:56 +0100, Florian Zumbiehl wrote:
> - Forwarded message from Mail Delivery System <[EMAIL PROTECTED]> -
> 
> Date: Sun, 04 Mar 2007 15:52:51 +0100
> From: Mail Delivery System <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: Mail delivery failed: returning message to sender
> Delivery-date: Sun, 04 Mar 2007 15:54:05 +0100
> Auto-Submitted: auto-generated
> 
> This message was created automatically by mail delivery software.
> 
> A message that you sent could not be delivered to one or more of its
> recipients. This is a permanent error. The following address(es) failed:
> 
>   [EMAIL PROTECTED]
> SMTP error from remote mailer after MAIL FROM:<[EMAIL PROTECTED]> 
> SIZE=5299:
> host mx1.earthlink.net [209.86.93.226]: 550 550 Dynamic/zombied/spam IPs 
> blocked. Write [EMAIL PROTECTED]
> 
> -- This is a copy of the message, including all the headers. --
> 
> Return-path: <[EMAIL PROTECTED]>
> Received: from florz.florz.dyndns.org ([192.168.0.121])
>   by rain.florz.dyndns.org with esmtp (Exim 4.50)
>   id 1HNs4n-0006c7-E0; Sun, 04 Mar 2007 15:52:41 +0100
> Received: from florz by florz.florz.dyndns.org with local (Exim 3.35 #1 
> (Debian))
>   id 1HNs4k-xd-00; Sun, 04 Mar 2007 15:52:38 +0100
> Date: Sun, 4 Mar 2007 15:52:38 +0100
> From: Florian Zumbiehl <[EMAIL PROTECTED]>
> To: Michal Ostrowski <[EMAIL PROTECTED]>
> Cc: David Miller <[EMAIL PROTECTED]>, netdev@vger.kernel.org,
>   [EMAIL PROTECTED]
> Subject: Re: Session ID 0 with PPPoE
> Message-ID: <[EMAIL PROTECTED]>
> References: <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]>
> Mime-Version: 1.0
> Content-Type: text/plain; charset=us-ascii
> Content-Disposition: inline
> In-Reply-To: <[EMAIL PROTECTED]>
> User-Agent: Mutt/1.5.9i
> 
> Hi,
> 
> > >>From the RFC:
> > 
> > 5.4 The PPPoE Active Discovery Session-confirmation (PADS) packet
> > 
> >When the Access Concentrator receives a PADR packet, it prepares to
> >begin a PPP session.  It generates a unique SESSION_ID for the PPPoE
> >session and replies to the Host with a PADS packet.  The
> >DESTINATION_ADDR field is the unicast Ethernet address of the Host
> >that sent the PADR.  The CODE field is set to 0x65 and the SESSION_ID
> >MUST be set to the unique value generated for this PPPoE session.
> > 
> >The PADS packet contains exactly one TAG of TAG_TYPE Service-Name,
> >indicating the service under which Access Concentrator has accepted
> >the PPPoE session, and any number of other TAG types.
> > 
> >If the Access Concentrator does not like the Service-Name in the
> >PADR, then it MUST reply with a PADS containing a TAG of TAG_TYPE
> >Service-Name-Error (and any number of other TAG types).  In this case
> >the SESSION_ID MUST be set to 0x.
> > 
> > 
> > 
> > As you can see from the last paragraph, a session id of 0 implies a
> > rejection of the PADR.  Thus, you can't possibly get a PADS packet that
> > completes and initiates a valid session if the session id is 0.
> > 
> > Note that the RFC does not prohibit all other aspects of the PADS to be
> > structured as if it were a valid success response; the only condition
> > and requirement of a failure mode here is the session id.
> 
> | [...] then it MUST reply with a PADS containing a TAG of TAG_TYPE
> | Service-Name

Re: PPPoE dropping packets?

2005-07-15 Thread Michal Ostrowski
The connection between these two hosts is via your ISP right?

I'd like to see a third trace, synchronized to the other two, performed
on 213.139.163.144: tcpdump -n -i eth0

This will show me what PPPoE packets are being sent (on the other side
of the ppp0 device).

-- 
Michal Ostrowski



On Fri, 15 Jul 2005 13:48:29 +0300
Antti Salmela <[EMAIL PROTECTED]> wrote:

> I'm running 2.6.13-rc3 now but have been suffering from weird stalls with my
> PPPoE connection ever since my ISP made it a requirement, about two months
> ago. I found out by listening outgoing ethernet interface with tcpdump that
> when a stall occurs, it's not sending anything. Listening ppp interface
> shows packets being sent.
> 
> Here are two tcpdump logs taken when a stall occured and while a ping to ppp 
> peer
> was running. My ip is 213.139.163.144, peer is 10.10.9.1. 
> 
> tcpdump -i ppp0 -n:
> 
> 11:13:55.710515 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 1, 
> length 64
> 11:13:56.709827 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 2, 
> length 64
> 11:13:57.709842 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 3, 
> length 64
> 11:13:58.709848 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 4, 
> length 64
> 11:13:58.874515 IP 213.139.169.82.6668 > 213.139.163.144.58747: . 
> 1689088981:1689089019(38) ack 1926745953 win 8499  4460534>
> 11:13:58.874610 IP 213.139.163.144.58747 > 213.139.169.82.6668: . ack 38 win 
> 12424 
> 11:13:59.709864 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 5, 
> length 64
> 11:14:00.709869 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 6, 
> length 64
> 11:14:01.709884 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 7, 
> length 64
> 11:14:02.588471 IP 213.139.163.144.48587 > 207.46.0.35.1863: P 
> 3889301623:3889301628(5) ack 519932196 win 2264  20203750>
> 11:14:02.709889 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 8, 
> length 64
> 11:14:03.024512 IP 213.139.163.144.57511 > 207.46.0.63.1863: P 
> 3897420679:3897420684(5) ack 1231695246 win 2264  20038047>
> 11:14:03.709905 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 9, 
> length 64
> 11:14:04.505177 IP 216.239.59.99.80 > 213.139.163.144.44690: F 
> 2225588483:2225588483(0) ack 4125655007 win 8190
> 11:14:04.544579 IP 213.139.163.144.44690 > 216.239.59.99.80: . ack 1 win 14300
> 11:14:04.709928 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 10, 
> length 64
> 11:14:05.512052 IP 216.239.59.99.80 > 213.139.163.144.44690: F 0:0(0) ack 1 
> win 8190
> 11:14:05.512256 IP 213.139.163.144.44690 > 216.239.59.99.80: . ack 1 win 14300
> 11:14:05.709934 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 11, 
> length 64
> 11:14:06.372979 IP 213.139.163.144.44690 > 216.239.59.99.80: F 1:1(0) ack 1 
> win 14300
> 11:14:06.628684 IP 213.139.163.144.44690 > 216.239.59.99.80: F 1:1(0) ack 1 
> win 14300
> 11:14:06.709947 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 12, 
> length 64
> 11:14:07.140722 IP 213.139.163.144.44690 > 216.239.59.99.80: F 1:1(0) ack 1 
> win 14300
> 11:14:07.522432 IP 216.239.59.99.80 > 213.139.163.144.44690: F 0:0(0) ack 1 
> win 8190
> 11:14:07.522663 IP 213.139.163.144.44690 > 216.239.59.99.80: . ack 1 win 14300
> 11:14:07.710004 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 13, 
> length 64
> 11:14:08.164751 IP 213.139.163.144.44690 > 216.239.59.99.80: F 1:1(0) ack 1 
> win 14300
> 11:14:08.709979 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 14, 
> length 64
> 11:14:08.879946 IP 213.139.163.144.58747 > 213.139.169.82.6668: P 1:19(18) 
> ack 38 win 12424 
> 11:14:09.710183 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 15, 
> length 64
> 11:14:10.212868 IP 213.139.163.144.44690 > 216.239.59.99.80: F 1:1(0) ack 1 
> win 14300
> 11:14:10.710024 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 16, 
> length 64
> 11:14:11.532770 IP 216.239.59.99.80 > 213.139.163.144.44690: F 0:0(0) ack 1 
> win 8190
> 11:14:11.532938 IP 213.139.163.144.44690 > 216.239.59.99.80: . ack 1 win 14300
> 11:14:11.710029 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 17, 
> length 64
> 11:14:12.710065 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 18, 
> length 64
> 11:14:12.710896 IP 10.10.9.1 > 213.139.163.144: ICMP echo reply seq 18, 
> length 64
> 11:14:13.720235 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 19, 
> length 64
> 11:14:13.721049 IP 10.10.9.1 > 213.139.163.144: ICMP echo reply seq 19, 
> length 64
> 11:14:14.309102 IP 213.139.163.144.44690 > 216.239.59.99.80: F 1:1(0) ack 1 
> win 14300
> 11:14:

Re: [PATCH] Advertise PPPoE MTU / avoid memory leak.

2006-09-24 Thread Michal Ostrowski
On Sat, 2006-09-23 at 14:56 -0700, David Miller wrote:
> From: [EMAIL PROTECTED]
> Date: Sat, 23 Sep 2006 12:30:23 -0500
> 
> > __pppoe_xmit must free any skb it allocates if there is an error
> > submitting the skb downstream.
> 
> This isn't right, dev_queue_xmit() can return -ENETDOWN and still
> free the SKB, so your change will cause the SKB to be freed up
> twice in that case, from dev_queue_xmit():
> 
>   rc = -ENETDOWN;
>   rcu_read_unlock_bh();
> 
> out_kfree_skb:
>   kfree_skb(skb);
>   return rc;
> 
> dev_queue_xmit() is basically expected to consume the packet,
> error or not.
> 
> What case of calling dev_queue_xmit() did you discover that did not
> kfree the SKB on error?  We should fix that.  On a quick scan on the
> entire dev_queue_xmit() implmentation, I cannot find such a case.
> 

I think the call path via dev->hard_start_xmit, if it fails, may result
in an skb not being freed.  This appears to be the case with the e100.c
driver.  The qdisc_restart path to dev->hard_start_xmit also appears
susceptible to this.  It appears that not all devices agree as to who
should clean-up an skb on error.

-- 
Michal Ostrowski <[EMAIL PROTECTED]>

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Advertise PPPoE MTU / avoid memory leak.

2006-09-25 Thread Michal Ostrowski
On Sun, 2006-09-24 at 18:41 -0700, David Miller wrote:
> From: Michal Ostrowski <[EMAIL PROTECTED]>
> Date: Sun, 24 Sep 2006 07:29:25 -0500
> 
> > I think the call path via dev->hard_start_xmit, if it fails, may result
> > in an skb not being freed.  This appears to be the case with the e100.c
> > driver.  The qdisc_restart path to dev->hard_start_xmit also appears
> > susceptible to this.  It appears that not all devices agree as to who
> > should clean-up an skb on error.
> 
> There is a well defined policy about who frees the SKB or has
> ownership of it based upon dev->hard_start_xmit() return values.
> 
> Any driver deviating from this set of rules should simply be
> audited and fixed, as needed.
> 
> But, no matter, your change is buggy and we can't apply your
> patch (even if it does fix a leak in some legitimate case)
> because it introduces an obvious double-free bug.
> 

Yup.  I'll resubmit a fixed one.


-- 
Michal Ostrowski <[EMAIL PROTECTED]>

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Advertise PPPoE MTU / avoid memory leak.

2006-09-27 Thread Michal Ostrowski
This is not currently supported.

Looking at the RFC, this is an issue that must be solved in the pppoe
plugin that performs the PPPoE negotiation/discovery.  The kernel code
that runs the regular session traffic should be able to adjust (with
minimal or no changes) as it does not depend on a hardcoded MTU.



-- 
Michal Ostrowski <[EMAIL PROTECTED]>

On Tue, 2006-09-26 at 09:32 +0300, Pekka Savola wrote:
> Speaking of PPPoE and MTU, does Linux support recently-published RFC 
> 4638:
> 
>Accommodating a Maximum Transit Unit/Maximum Receive Unit (MTU/MRU)
> Greater Than 1492 in the
>   Point-to-Point Protocol over Ethernet (PPPoE)
> 
> On Sat, 23 Sep 2006, [EMAIL PROTECTED] wrote:
> > PPPoE must advertise the underlying device's MTU via the ppp channel
> > descriptor structure, as multilink functionality depends on it.
> >
> > __pppoe_xmit must free any skb it allocates if there is an error
> > submitting the skb downstream.
> >
> > Signed-off-by: Michal Ostrowski <[EMAIL PROTECTED]>
> > ---
> > drivers/net/pppoe.c |5 -
> > 1 files changed, 4 insertions(+), 1 deletions(-)
> >
> > diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
> > index 475dc93..b4dc516 100644
> > --- a/drivers/net/pppoe.c
> > +++ b/drivers/net/pppoe.c
> > @@ -600,6 +600,7 @@ static int pppoe_connect(struct socket *
> > po->chan.hdrlen = (sizeof(struct pppoe_hdr) +
> >dev->hard_header_len);
> >
> > +   po->chan.mtu = dev->mtu - sizeof(struct pppoe_hdr);
> > po->chan.private = sk;
> > po->chan.ops = &pppoe_chan_ops;
> >
> > @@ -831,7 +832,7 @@ static int __pppoe_xmit(struct sock *sk,
> > struct pppoe_hdr *ph;
> > int headroom = skb_headroom(skb);
> > int data_len = skb->len;
> > -   struct sk_buff *skb2;
> > +   struct sk_buff *skb2 = NULL;
> >
> > if (sock_flag(sk, SOCK_DEAD) || !(sk->sk_state & PPPOX_CONNECTED))
> > goto abort;
> > @@ -887,6 +888,8 @@ static int __pppoe_xmit(struct sock *sk,
> > return 1;
> >
> > abort:
> > +   if (skb2)
> > +   kfree_skb(skb2);
> > return 0;
> > }
> >
> >
> 

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html