Re: pppoe repeated disconnects
Hoi once more, Closure on this. 2013/2/24 Stuart Henderson s...@spacehopper.org: On 2013-02-24, Pim van Pelt p...@ipng.nl wrote: .. , or the DSL has untrained It does seem like the line could well be dropping out then. .. and that was it. I have since replaced the DSL modem with a Fritz!Box 7390 and the line has been up for 6 days without pretty much a single packet dropped. As a bonus, it now trains at 50/10 where it used to be 30/6 or something significantly weaker. That is sufficient for me to prove that (a) the OpenBSD pppoe(4) and pppoe(8) setups were fine, and (b) DSL modem vendors make a difference :) Thanks Stuart, for helping me think this through, and others on this list for lurking. groet, Pim -- Pim van Pelt p...@ipng.nl PBVP1-RIPE - http://www.ipng.nl/
Re: pppoe repeated disconnects
Hoi, Thanks for helping me think about this, Stu. 2013/2/22 Stuart Henderson s...@spacehopper.org: At 00:27:30.706636 I see ingress to .86, which is replied at 00:27:30.708050 egress. Then, ingress halts. I send three echos at 00:27:52.564297, 00:28:02.563980 and 00:28:12.563665, and then give up at 00:28:22.563420 and send a Terminate-Request. I then send three Initiation requests, which are answered finally at 00:29:27.771647 by the ISP. The connection re-establishes and the cycle repeats. Your log only shows the last incoming packet and a few unanswered outgoing packets, so it's not enough to tell, does the ISP reply to any of your earlier LCP echo requests or does it ignore all of them? You are right, I only pasted the tail end of the data, but I do have the whole file still: http://www.ipng.nl/pppoe-tcpdump/pppoe4.tcpdump (kernel) http://www.ipng.nl/pppoe-tcpdump/pppoe8.tcpdump (userland) In the call I was referring to above (pppoe4.tcpdump, the kernel version), a whole session lasts from 00:25:32.588726 to 00:28:22.583343. I took another look and it seems that my side does not send LCP echos, but it does reply to them. Typically, the ISP (mac ending in 8d:20) sends LCP echos every 10s, and my side (mac ending in 79:79) responds to them, for example: 00:26:33.656302 00:90:1a:a4:8d:20 00:60:e0:53:79:79 8864 60: PPPoE-Session code Session, version 1, type 1, id 0x124a, length 14 LCP: Echo-Request, Magic-Number=373307203 00:26:33.656321 00:60:e0:53:79:79 00:90:1a:a4:8d:20 8864 34: PPPoE-Session code Session, version 1, type 1, id 0x124a, length 14 LCP: Echo-Reply, Magic-Number=1979569634 You mentioned that I would/should be sending echos as well, but my side definitely does not do that every N seconds. It does, however, when the connection did not see packets for a while, for example at 00:24:42.570312, 00:24:52.569996, 00:25:02.569679 (none responded by ISP), and it then sends terminate-request at 00:25:12.569442. I counted the Echo-Requests, I saw 20 from them, and replied 20 times. I sent 6 to them, they replied 0 times. In the userland pppoe(8) I also saw 20 of them, replied 20 times. I sent 0 to them, they replied 0 times. But I think it's relevant to note that I only sent LCP Echo-Requests just before the line was disconnected (The one I mentioned before is 00:24:42.570312 but there's another example at 00:27:52.564297) - they could very well not be replied because the ISP stopped forwarding traffic to me, or the DSL has untrained (i mean, while the LCPs are are being sent, I am also not seeing any IP traffic coming in, whlie I have a ping running in the background, I should be seeing 2 packets per second, for 40secs I see nothing and then see the line disconnect). Normally if there is a period of 30 seconds with either no LCP keepalives or no user data, pppoe(4) will drop the session and log LCP keepalive timeout, which it looks like should show up at the bottom of dmesg output. Confirmed. I can see those 3 LCP packets with 10s interval, and then my side sending Terminate-Request and immediately thereafter Terminate. So, if they do normally respond to LCP, it looks like the link is dropping out somehow.. I don't think we can conclude if they normally respond to LCP or not, because at it seems I am not normally sending LCP Echos .. and I don't see how to set this in ifconfig(8) or sppp(4) or pppoe(4). Do you have any idea why my side is not sending LCP echos, or how I can configure it to? Or, if they don't normally respond to LCP, you may either need some other way to generate data to keep the session alive, disable the timeout (by hacking the driver), or ideally see if they can start responding (easier if it's an ISP compliant with xkcd 806 ;) There's ingress and egress data ~always. In my examples, it's a running ping from internet to me. The ping just times out after a while, for up to a minute or so while my side is sending (sometimes multiple!) Initiation frames, for example 00:28:22.583356, 00:28:27.583185, 00:29:27.581285. Then the ISP responds with an Offer and the cycle restarts. The only other thing I can think of is that actually my DSL modem is dropping the connection. I should easily be able to find a pocket of downtime and inspect the DSL line status because the LCP keepalive timeouts occur more than 100 times per day (there can be 8 hours between them, or as little as 3 minutes) - http://pastebin.com/1VikG019 groet, Pim -- Pim van Pelt p...@ipng.nl PBVP1-RIPE - http://www.ipng.nl/
Re: pppoe repeated disconnects
On 2013-02-24, Pim van Pelt p...@ipng.nl wrote: Your log only shows the last incoming packet and a few unanswered outgoing packets, so it's not enough to tell, does the ISP reply to any of your earlier LCP echo requests or does it ignore all of them? You are right, I only pasted the tail end of the data, but I do have the whole file still: http://www.ipng.nl/pppoe-tcpdump/pppoe4.tcpdump (kernel) http://www.ipng.nl/pppoe-tcpdump/pppoe8.tcpdump (userland) In the call I was referring to above (pppoe4.tcpdump, the kernel version), a whole session lasts from 00:25:32.588726 to 00:28:22.583343. I took another look and it seems that my side does not send LCP echos, but it does reply to them. Ah... this might indeed be normal then, the line I was looking at has echoes every second sent by my ISP (http://aa.net.uk/kb-broadband-cqm.html) so I wouldn't have noticed whether my side was sending them (I was looking at tcpdump on the pppoe interface where it's not clear who sends them). I suspect I was wrong then and we only send them if the line is quiet. You mentioned that I would/should be sending echos as well, but my side definitely does not do that every N seconds. It does, however, when the connection did not see packets for a while, for example at 00:24:42.570312, 00:24:52.569996, 00:25:02.569679 (none responded by ISP), and it then sends terminate-request at 00:25:12.569442. I counted the Echo-Requests, I saw 20 from them, and replied 20 times. I sent 6 to them, they replied 0 times. In the userland pppoe(8) I also saw 20 of them, replied 20 times. I sent 0 to them, they replied 0 times. But I think it's relevant to note that I only sent LCP Echo-Requests just before the line was disconnected (The one I mentioned before is 00:24:42.570312 but there's another example at 00:27:52.564297) - they could very well not be replied because the ISP stopped forwarding traffic to me, or the DSL has untrained (i mean, while the LCPs are are being sent, I am also not seeing any IP traffic coming in, whlie I have a ping running in the background, I should be seeing 2 packets per second, for 40secs I see nothing and then see the line disconnect). It does seem like the line could well be dropping out then. Are you able to setup syslog on the DSL modem and get the logs sent to the box running pppoe? Having the two logs together would make it easier to correlate any problems logged by the modem with those logged by pppoe. (it will need 'syslog_flags=-u' in rc.conf.local to accept from the network, and /etc/rc.d/syslogd restart). Maybe you could also leave a ping running to the modem to spot if there's a problem with the ethernet link or something?
pppoe repeated disconnects
Hoi, I have recently switched from a Zyxel modem terminated pppoe connection, to an OpenBSD based termination of the pppoe connection. I think I have a reasonable configuration, and both kernel as well as userland pppoe configs connect fine, but after a few minutes the ingress traffic halts and the connection drops. I am running 5.2, with the following /etc/ppp/ppp.conf ([1] for userland) and /etc/hostname.pppoe0 ([2] for kernel). Both exhibit the same issue. When the connection establishes, I have the following device: $ ifconfig pppoe0 pppoe0: flags=8851UP,POINTOPOINT,RUNNING,SIMPLEX,MULTICAST mtu 1492 priority: 0 dev: vlan5 state: session sid: 0x1268 PADI retries: 8 PADR retries: 0 time: 00:04:02 sppp: phase network authproto chap authname u280...@dsl.green.ch groups: pppoe status: active inet6 fe80::260:e0ff:fe53:7978%pppoe0 - prefixlen 64 scopeid 0x1e inet 81.6.62.85 -- 80.254.161.242 netmask 0x The ISP routes me 81.6.62.84/30 and when the connection is up, this works (I can ping all 4 IP addresses). There's a netscreen at 81.6.62.86 behind this machine and it is responding to ping's from the internet just fine. A few minutes later, traffic stops flowing and about one minute later, pppoe0 disconnects and then reconnects a few minutes later, to rinse and repeat. So I tcpdump'ed the kernel pppoe, and noticed the following snippet (see [3] for details). I am MAC 00:60:e0:53:79:79 , the ISP is MAC 00:90:1a:a4:8d:20. At 00:27:30.706636 I see ingress to .86, which is replied at 00:27:30.708050 egress. Then, ingress halts. I send three echos at 00:27:52.564297, 00:28:02.563980 and 00:28:12.563665, and then give up at 00:28:22.563420 and send a Terminate-Request. I then send three Initiation requests, which are answered finally at 00:29:27.771647 by the ISP. The connection re-establishes and the cycle repeats. When I try the same thing using userland ppp (to benefit from debugging), things look pretty good to me, but also after a few minutes the physical link goes silent and I disconnect. The logs[4] show that the connection establishes (and on the ppp command line, 'show ipcp' and 'show lcp' look fine to me), but the problem persists. I've read up on ppp/pppd/pppoe for both userspace and kernel, but I'm stumped and at this point I believe the problem is with the ISP (I say this because traffic stops flowing while I'm sending LCP echos, and I always reply the ISPs echos, but after a few failed echos pppoe on my machine gives up). Can somebody please confirm this before I open a problem ticket with my ISP, or help me find a way to gather more information to diagnose this issue? Kind regards, Pim/ Zurich [1] cat /etc/ppp/ppp.conf default: set log Phase Chat LCP IPCP CCP tun command set speed sync disable acfcomp protocomp shortseq vjcomp deny acfcomp green: set device !/usr/sbin/pppoe -i vlan5 set mtu max 1492 set mru max 1492 set authname x...@dsl.green.ch set authkey Y set ifaddr 0.0.0.0 0.0.0.0 disable echo lqr deny lqr [2] cat /etc/hostname.pppoe0 inet 0.0.0.0 255.255.255.255 NONE pppoedev vlan5 authproto chap authname 'x...@dsl.green.ch' authkey 'Y' up dest 0.0.0.1 [3] http://pastebin.com/ph6xrNaW [4] http://pastebin.com/grifHJ41 -- Pim van Pelt p...@ipng.nl PBVP1-RIPE - http://www.ipng.nl/
Re: pppoe repeated disconnects
On 2013-02-21, Pim van Pelt p...@ipng.nl wrote: Hoi, I have recently switched from a Zyxel modem terminated pppoe connection, to an OpenBSD based termination of the pppoe connection. I think I have a reasonable configuration, and both kernel as well as userland pppoe configs connect fine, but after a few minutes the ingress traffic halts and the connection drops. I am running 5.2, with the following /etc/ppp/ppp.conf ([1] for userland) and /etc/hostname.pppoe0 ([2] for kernel). Both exhibit the same issue. When the connection establishes, I have the following device: $ ifconfig pppoe0 pppoe0: flags=8851UP,POINTOPOINT,RUNNING,SIMPLEX,MULTICAST mtu 1492 priority: 0 dev: vlan5 state: session sid: 0x1268 PADI retries: 8 PADR retries: 0 time: 00:04:02 sppp: phase network authproto chap authname u280...@dsl.green.ch groups: pppoe status: active inet6 fe80::260:e0ff:fe53:7978%pppoe0 - prefixlen 64 scopeid 0x1e inet 81.6.62.85 -- 80.254.161.242 netmask 0x The ISP routes me 81.6.62.84/30 and when the connection is up, this works (I can ping all 4 IP addresses). There's a netscreen at 81.6.62.86 behind this machine and it is responding to ping's from the internet just fine. A few minutes later, traffic stops flowing and about one minute later, pppoe0 disconnects and then reconnects a few minutes later, to rinse and repeat. So I tcpdump'ed the kernel pppoe, and noticed the following snippet (see [3] for details). I am MAC 00:60:e0:53:79:79 , the ISP is MAC 00:90:1a:a4:8d:20. At 00:27:30.706636 I see ingress to .86, which is replied at 00:27:30.708050 egress. Then, ingress halts. I send three echos at 00:27:52.564297, 00:28:02.563980 and 00:28:12.563665, and then give up at 00:28:22.563420 and send a Terminate-Request. I then send three Initiation requests, which are answered finally at 00:29:27.771647 by the ISP. The connection re-establishes and the cycle repeats. Your log only shows the last incoming packet and a few unanswered outgoing packets, so it's not enough to tell, does the ISP reply to any of your earlier LCP echo requests or does it ignore all of them? Normally if there is a period of 30 seconds with either no LCP keepalives or no user data, pppoe(4) will drop the session and log LCP keepalive timeout, which it looks like should show up at the bottom of dmesg output. So, if they do normally respond to LCP, it looks like the link is dropping out somehow.. Or, if they don't normally respond to LCP, you may either need some other way to generate data to keep the session alive, disable the timeout (by hacking the driver), or ideally see if they can start responding (easier if it's an ISP compliant with xkcd 806 ;) I've read up on ppp/pppd/pppoe for both userspace and kernel, but I'm stumped and at this point I believe the problem is with the ISP (I say this because traffic stops flowing while I'm sending LCP echos, and I always reply the ISPs echos, but after a few failed echos pppoe on my machine gives up). Can somebody please confirm this before I open a problem ticket with my ISP, or help me find a way to gather more information to diagnose this issue? [1] cat /etc/ppp/ppp.conf default: set log Phase Chat LCP IPCP CCP tun command set speed sync disable acfcomp protocomp shortseq vjcomp deny acfcomp green: set device !/usr/sbin/pppoe -i vlan5 set mtu max 1492 set mru max 1492 set authname x...@dsl.green.ch set authkey Y set ifaddr 0.0.0.0 0.0.0.0 disable echo lqr deny lqr I haven't really used iij ppp since around 1998 so not too sure about syntax here (though it seems ok).. but [2] cat /etc/hostname.pppoe0 inet 0.0.0.0 255.255.255.255 NONE pppoedev vlan5 authproto chap authname 'x...@dsl.green.ch' authkey 'Y' up dest 0.0.0.1 this definitely looks right to me, and I have a bunch of pppoe(4) running over vlans, so there's no problem in that respect. [3] http://pastebin.com/ph6xrNaW [4] http://pastebin.com/grifHJ41