RE: Openvpn tap uses 99% cpu time
Hi Bruce I hope you can still remember this post from way back when. You can read the archives here: http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2007-03/msg00445.html Up until now we have been tracking RELENG_6_2 without any issues but I was forced to look at this again to get a newer NIC working where I needed RELENG_6. So quick recap: A while ago I hit a problem where the openvpn tap service would kill my machine with 99% CPU usage. I could reproduce this on a number of machines. I have now produced a way to prevent and reproduce the problem, so hopefully if someone else can reproduce this we can find a proper fix. I have been loading my tap device using /boot/loader.conf if_tap_load=YES So I removed this from the loader. Rebooted and tried to start openvpn (/usr/local/sbin/openvpn --cd /usr/local/etc/openvpn --config /usr/local/etc/openvpn/tapsvr.conf) It would not start (obviously as there was no tap device). I then loaded the tap device using kldload (kldload if_tap) and tried again. This time is started and surprise, surprise no 99% CPU problem. Not wanting to have to load the device before starting openvpn I then compiled the tap device into my kernel which did the trick. So it seems that there is a problem with loading the tap device from loader.conf on RELENG_6. This works correctly on RELENG_6_2. Do you want to see if you can reproduce this first or should I open a bug? Regards Emile -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Emile Coetzee Posted At: 20 March 2007 14:16 PM Posted To: freebsd-stable Conversation: Openvpn tap uses 99% cpu time Subject: RE: Openvpn tap uses 99% cpu time Emile Coetzee wrote: Okay I finally have a ktrace of the offending process. You can view it here: http://www.clarotech.co.za/dump/openvpn2.txt Thanks for this. If this is the correct trace, of the correct process, then it looks like OpenVPN is hanging immediately on opening the tap device. One thing that does jump out at me is the use of the persist-tun keyword. Can you try removing the use of this keyword? It is something I've never had to use with OpenVPN. I did try it without the persist settings and it seemed to work but then I put it back again and that worked too (i.e. no 100% CPU usage). However after restarting the box and repeating the tests, both produced the 100% CPU usage issue. What I suspect happened is that somehow the tap device was already present and thus openvpn did not need to create one. I then tried to use the cloned_interfaces=tap0 to see if that would create a tap device for me at boot time. But the server hung similarly to when openvpn uses 100% CPU time more or less where I would expect it to initialize the NICs. Unfortunately I did not think to check if the tap device was present before testing it without the persist setting. So it's a bit of mystery. I have lost the box I was testing on (off to a client) and will only be able to setup a new one to do testing on next week. So I will feed back with any new findings. Cheers Emile ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED] ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
RE: Openvpn tap uses 99% cpu time
Emile Coetzee wrote: Okay I finally have a ktrace of the offending process. You can view it here: http://www.clarotech.co.za/dump/openvpn2.txt Thanks for this. If this is the correct trace, of the correct process, then it looks like OpenVPN is hanging immediately on opening the tap device. One thing that does jump out at me is the use of the persist-tun keyword. Can you try removing the use of this keyword? It is something I've never had to use with OpenVPN. I did try it without the persist settings and it seemed to work but then I put it back again and that worked too (i.e. no 100% CPU usage). However after restarting the box and repeating the tests, both produced the 100% CPU usage issue. What I suspect happened is that somehow the tap device was already present and thus openvpn did not need to create one. I then tried to use the cloned_interfaces=tap0 to see if that would create a tap device for me at boot time. But the server hung similarly to when openvpn uses 100% CPU time more or less where I would expect it to initialize the NICs. Unfortunately I did not think to check if the tap device was present before testing it without the persist setting. So it's a bit of mystery. I have lost the box I was testing on (off to a client) and will only be able to setup a new one to do testing on next week. So I will feed back with any new findings. Cheers Emile ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Openvpn tap uses 99% cpu time
Emile Coetzee wrote: Okay I finally have a ktrace of the offending process. You can view it here: http://www.clarotech.co.za/dump/openvpn2.txt Thanks for this. If this is the correct trace, of the correct process, then it looks like OpenVPN is hanging immediately on opening the tap device. In a quick look at the code, nothing jumps out at me which is out of the ordinary about the locking during clone and open. It would have been useful to have the ps output to correlate with in case the process was spinning trying to acquire a spin lock, which would show up in the WCHAN field. One thing that does jump out at me is the use of the persist-tun keyword. Can you try removing the use of this keyword? It is something I've never had to use with OpenVPN. To the best of my knowledge there have been no other reports of this problem on FreeBSD. I'm sorry there is just very little hard information to go on here at the moment. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Openvpn tap uses 99% cpu time
I have done some more investigation. Rolling back to RELENG_6_2 solves the problem. I have now had this problem on 3 boxes in 2 days and have been able to reproduce it on a 4th in our lab. Starting openvpn with full debug it stops just before the point where the TAP devices gets initiated. And then the output from top shows openvpn on 99% cpu time (well almost). RELENG_6 [snip] Thu Mar 15 08:27:39 2007 us=333199 OpenVPN 2.0.6 i386-portbld-freebsd6.2 [SSL] [LZO] built on Mar 14 2007 Thu Mar 15 08:27:39 2007 us=344375 Diffie-Hellman initialized with 1024 bit key Thu Mar 15 08:27:39 2007 us=345355 MTU DYNAMIC mtu=0, flags=1, 0 - 138 Thu Mar 15 08:27:39 2007 us=345376 TLS-Auth MTU parms [ L:1574 D:138 EF:38 EB:0 ET:0 EL:0 ] Thu Mar 15 08:27:39 2007 us=345390 MTU DYNAMIC mtu=1450, flags=2, 1574 - 1450 RELENG_6_2 [snip] Thu Mar 15 07:18:23 2007 us=437582 OpenVPN 2.0.6 i386-portbld-freebsd6.2 [SSL] [LZO] built on Mar 14 2007 Thu Mar 15 07:18:23 2007 us=448672 Diffie-Hellman initialized with 1024 bit key Thu Mar 15 07:18:23 2007 us=449653 MTU DYNAMIC mtu=0, flags=1, 0 - 138 Thu Mar 15 07:18:23 2007 us=449676 TLS-Auth MTU parms [ L:1574 D:138 EF:38 EB:0 ET:0 EL:0 ] Thu Mar 15 07:18:23 2007 us=449692 MTU DYNAMIC mtu=1450, flags=2, 1574 - 1450 Thu Mar 15 07:18:23 2007 us=449759 TUN/TAP device /dev/tap0 opened Thu Mar 15 07:18:23 2007 us=449832 Data Channel MTU parms [ L:1574 D:1450 EF:42 EB:135 ET:32 EL:0 AF:3/1 ] Thu Mar 15 07:18:23 2007 us=450475 GID set to nobody Thu Mar 15 07:18:23 2007 us=450504 UID set to nobody Thu Mar 15 07:18:23 2007 us=450532 Socket Buffers: R=[42080-65536] S=[9216-65536] Thu Mar 15 07:18:23 2007 us=450552 UDPv4 link local (bound): [undef]:1195 Thu Mar 15 07:18:23 2007 us=450567 UDPv4 link remote: [undef] Thu Mar 15 07:18:23 2007 us=450586 MULTI: multi_init called, r=256 v=256 Thu Mar 15 07:18:23 2007 us=450623 IFCONFIG POOL: base=192.168.251.200 size=11 Thu Mar 15 07:18:23 2007 us=450649 IFCONFIG POOL LIST Thu Mar 15 07:18:23 2007 us=450670 PO_INIT maxevents=4 flags=0x0002 Thu Mar 15 07:18:23 2007 us=450688 Initialization Sequence Completed Thu Mar 15 07:18:23 2007 us=450702 SCHEDULE: schedule_find_least NULL Thu Mar 15 07:18:23 2007 us=450718 PO_CTL rwflags=0x0001 ev=5 arg=0x080891c0 Thu Mar 15 07:18:23 2007 us=450733 PO_CTL rwflags=0x0001 ev=6 arg=0x080891c4 Thu Mar 15 07:18:23 2007 us=450751 I/O WAIT TR|Tw|SR|Sw [10/0] last pid: 1066; load averages: 0.49, 0.12, 0.04up 0+00:48:00 08:28:16 31 processes: 2 running, 29 sleeping CPU states: 0.0% user, 0.0% nice, 98.5% system, 1.5% interrupt, 0.0% idle Mem: 11M Active, 7956K Inact, 17M Wired, 11M Buf, 448M Free Swap: 1024M Total, 1024M Free PID USERNAME THR PRI NICE SIZERES STATETIME WCPU COMMAND 1066 root1 1220 2984K 2436K RUN 0:36 96.72% openvpn 786 root1 960 3056K 1900K select 0:00 0.00% ntpd [snip] Vlad Galu: As this is a tap server there is no remote end point to connect to. It is waiting for a windows client to connect. As you can see from the openvpn logs the tap device is not even getting initiated. Luigi Rizzo: I'm no programmer, so suggestions on how to fix it are welcome ;) Regards Emile ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Openvpn tap uses 99% cpu time
Emile Coetzee wrote: I have done some more investigation. Rolling back to RELENG_6_2 solves the problem. I have now had this problem on 3 boxes in 2 days and have been able to reproduce it on a 4th in our lab. Starting openvpn with full debug it stops just before the point where the TAP devices gets initiated. And then the output from top shows openvpn on 99% cpu time (well almost). The recent changes to tap(4) were tested with openvpn and ppp before committing, and this is an isolated report at the moment. Other users on the list have suggested that the problem you are reporting does not affect FreeBSD alone. Please consider doing the following: 1. Use ktrace(8) to extract a kdump of openvpn as it runs into this problem, so that you may determine exactly which syscalls are invoked when you see this problem, and post it on this list. Consider using the ktrace pid mode to attach to openvpn as it blocks, as opposed to running openvpn under ktrace as a child process. 2. Implementing the usleep() fix as discussed by luigi, as the information in point 1 will give clues as to where the code is spinning. Regards, BMS ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Openvpn tap uses 99% cpu time
Hi Bruce I have run a ktrace as you suggested I put the following in a script /usr/local/sbin/openvpn --daemon --config /usr/local/etc/openvpn/tapserver.conf and called it tapserver.sh and then used ktrace -d -f openvpn.dump ./tapserver.sh This captured all the information for the parent PID 1082 I then watched top in a second session and identified the PID (1083) of the openvpn process that was using up the CPU time. I then ran ktrace -p 1083 -f openvpn2.dump This produced no data at all. The dump is fairly large so I have dropped it here: http://pastebin.com/899318 If required/allowed I can post it to the list. Regards Emile ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Openvpn tap uses 99% cpu time
Emile Coetzee wrote: I then ran ktrace -p 1083 -f openvpn2.dump This produced no data at all. Shouldn't. :-) You need to run ktrace then kdump to dump the trace. Correlate the cpu spin with the wallclock time in the kdump output, and paste an excerpt of what you get -- it sounds like it is spinning on a syscall. I can't seem to get the stuff from pastebin, it just times out, so ok to post an excerpt here. Thanks, BMS ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Openvpn tap uses 99% cpu time
I then ran ktrace -p 1083 -f openvpn2.dump This produced no data at all. Shouldn't. :-) You need to run ktrace then kdump to dump the trace. I figured that part out on my own thanks ;) The second file is 0kb in size running a kdump on it produces no output. Correlate the cpu spin with the wallclock time in the kdump output, and paste an excerpt of what you get -- it sounds like it is spinning on a syscall. I have no idea what exactly you want me to do here. Point me at some reading material if you like or show me some examples. I can't seem to get the stuff from pastebin, it just times out, so ok to post an excerpt here. No worries below is the last part of the first ktrace. I have dropped a full copy on our webserver: http://www.clarotech.co.za/dump/openvpn.txt 1082 sh RET read 4096/0x1000 1082 sh CALL mmap(0,0xea000,0x5,0x20002,0x3,0,0,0) 1082 sh RET mmap 672038912/0x280e8000 1082 sh CALL mprotect(0x281b5000,0x1000,0x7) 1082 sh RET mprotect 0 1082 sh CALL mprotect(0x281b5000,0x1000,0x5) 1082 sh RET mprotect 0 1082 sh CALL mmap(0x281b6000,0x6000,0x3,0x12,0x3,0,0xcd000,0) 1082 sh RET mmap 672882688/0x281b6000 1082 sh CALL mmap(0x281bc000,0x16000,0x3,0x1012,0x,0,0,0) 1082 sh RET mmap 672907264/0x281bc000 1082 sh CALL close(0x3) 1082 sh RET close 0 1082 sh CALL access(0x28092000,0) 1082 sh NAMI /lib/libncurses.so.6 1082 sh RET access 0 1082 sh CALL sysarch(0xa,0xbfbfec60) 1082 sh RET sysarch 0 1082 sh CALL mmap(0,0x440,0x3,0x1000,0x,0,0,0) 1082 sh RET mmap 672997376/0x281d2000 1082 sh CALL munmap(0x281d2000,0x440) 1082 sh RET munmap 0 1082 sh CALL mmap(0,0x3f8,0x3,0x1000,0x,0,0,0) 1082 sh RET mmap 672997376/0x281d2000 1082 sh CALL munmap(0x281d2000,0x3f8) 1082 sh RET munmap 0 1082 sh CALL mmap(0,0x1208,0x3,0x1000,0x,0,0,0) 1082 sh RET mmap 672997376/0x281d2000 1082 sh CALL munmap(0x281d2000,0x1208) 1082 sh RET munmap 0 1082 sh CALL mmap(0,0x58d8,0x3,0x1000,0x,0,0,0) 1082 sh RET mmap 672997376/0x281d2000 1082 sh CALL munmap(0x281d2000,0x58d8) 1082 sh RET munmap 0 1082 sh CALL sigprocmask(0x1,0x28087980,0xbfbfec30) 1082 sh RET sigprocmask 0 1082 sh CALL sigprocmask(0x3,0x28087990,0) 1082 sh RET sigprocmask 0 1082 sh CALL getpid 1082 sh RET getpid 1082/0x43a 1082 sh CALL geteuid 1082 sh RET geteuid 0 1082 sh CALL getppid 1082 sh RET getppid 1006/0x3ee 1082 sh CALL readlink(0x281ade4b,0xbfbfe810,0x3f) 1082 sh NAMI /etc/malloc.conf 1082 sh RET readlink -1 errno 2 No such file or directory 1082 sh CALL issetugid 1082 sh RET issetugid 0 1082 sh CALL mmap(0,0x1000,0x3,0x1002,0x,0,0,0) 1082 sh RET mmap 672997376/0x281d2000 1082 sh CALL break(0x8066000) 1082 sh RET break 0 1082 sh CALL break(0x8067000) 1082 sh RET break 0 1082 sh CALL break(0x8068000) 1082 sh RET break 0 1082 sh CALL stat(0x8067004,0xbfbfebb0) 1082 sh NAMI /var/mail/support 1082 sh RET stat 0 1082 sh CALL getuid 1082 sh RET getuid 0 1082 sh CALL geteuid 1082 sh RET geteuid 0 1082 sh CALL getgid 1082 sh RET getgid 0 1082 sh CALL getegid 1082 sh RET getegid 0 1082 sh CALL open(0xbfbfee07,0,0x28066b71) 1082 sh NAMI ./tapserver.sh 1082 sh RET open 3 1082 sh CALL fcntl(0x3,0,0xa) 1082 sh RET fcntl 10/0xa 1082 sh CALL close(0x3) 1082 sh RET close 0 1082 sh CALL fcntl(0xa,0x2,0x1) 1082 sh RET fcntl 0 1082 sh CALL sigaction(0x2,0,0xbfbfebb0) 1082 sh RET sigaction 0 1082 sh CALL sigaction(0x2,0xbfbfebb0,0xbfbfeb90) 1082 sh RET sigaction 0 1082 sh CALL sigaction(0x2,0,0xbfbfebb0) 1082 sh RET sigaction 0 1082 sh CALL sigaction(0x2,0xbfbfebb0,0) 1082 sh RET sigaction 0 1082 sh CALL sigaction(0x3,0,0xbfbfebb0) 1082 sh RET sigaction 0 1082 sh CALL sigaction(0x3,0xbfbfebb0,0xbfbfeb90) 1082 sh RET sigaction 0 1082 sh CALL sigaction(0x3,0,0xbfbfebb0) 1082 sh RET sigaction 0 1082 sh CALL sigaction(0x3,0xbfbfebb0,0) 1082 sh RET sigaction 0 1082 sh CALL sigaction(0xf,0,0xbfbfebb0) 1082 sh RET sigaction 0 1082 sh CALL sigaction(0xf,0xbfbfebb0,0xbfbfeb90) 1082 sh RET sigaction 0 1082 sh CALL sigaction(0x1c,0,0xbfbfebb0) 1082 sh RET sigaction 0 1082 sh CALL
Re: Openvpn tap uses 99% cpu time
Emile Coetzee wrote: No worries below is the last part of the first ktrace. I have dropped a full copy on our webserver: http://www.clarotech.co.za/dump/openvpn.txt Hi, Still no diagnostic info I'm afraid. What is needed is for the problem to be reproduced, with the openvpn process which is spinning under ktrace. You might need to hack your script to do this i.e. run openvpn directly from ktrace. Let it run for a bit, let it spin, then kill it. Check that you have kdump output from the openvpn process itself. Use the -T switch of kdump to get absolute timestamps, watch the wallclock time on your system when the spin happens, and post an excerpt of the kdump output at that time, which should tell us where openvpn is spinning. Thanks, BMS ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
RE: Openvpn tap uses 99% cpu time
Emile Coetzee wrote: No worries below is the last part of the first ktrace. I have dropped a full copy on our webserver: http://www.clarotech.co.za/dump/openvpn.txt Hi, Still no diagnostic info I'm afraid. What is needed is for the problem to be reproduced, with the openvpn process which is spinning under ktrace. You might need to hack your script to do this i.e. run openvpn directly from ktrace. Let it run for a bit, let it spin, then kill it. Check that you have kdump output from the openvpn process itself. Use the -T switch of kdump to get absolute timestamps, watch the wallclock time on your system when the spin happens, and post an excerpt of the kdump output at that time, which should tell us where openvpn is spinning. The problem is that I can't kill openvpn once the spin starts. I can do very little else. Even trying to login on the console does not go any further than letting me enter a username and password and then it just sits. Even Ctrl-C on running openvpn interactively on the command line does not work. I will try some of your suggestions and post back. Regards Emile ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
RE: Openvpn tap uses 99% cpu time
Still no diagnostic info I'm afraid. What is needed is for the problem to be reproduced, with the openvpn process which is spinning under ktrace. You might need to hack your script to do this i.e. run openvpn directly from ktrace. Let it run for a bit, let it spin, then kill it. Check that you have kdump output from the openvpn process itself. Use the -T switch of kdump to get absolute timestamps, watch the wallclock time on your system when the spin happens, and post an excerpt of the kdump output at that time, which should tell us where openvpn is spinning. Okay I finally have a ktrace of the offending process. You can view it here: http://www.clarotech.co.za/dump/openvpn2.txt Regards Emile ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Openvpn tap uses 99% cpu time
Since the latest updates to sys/net/if_tap.c (I suspect) in 6.2-STABLE my openvpn tap server is using up all available CPU time (99%) effectively killing the box. I replicated this on a second machine which was about 3 weeks behind with a cvsup to the latest from RELENG_6 and after rebuilding the kernel it does the same thing. I have portupgraded openvpn to the latest release (openvpn-2.0.6_7), but this had not made any difference. Is anyone else experiencing similar problems? Regards Emile ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Openvpn tap uses 99% cpu time
On 3/14/07, Emile Coetzee [EMAIL PROTECTED] wrote: Since the latest updates to sys/net/if_tap.c (I suspect) in 6.2-STABLE my openvpn tap server is using up all available CPU time (99%) effectively killing the box. I replicated this on a second machine which was about 3 weeks behind with a cvsup to the latest from RELENG_6 and after rebuilding the kernel it does the same thing. I have portupgraded openvpn to the latest release (openvpn-2.0.6_7), but this had not made any difference. Is anyone else experiencing similar problems? It usually happens when OpenVPN is told to retry connecting to the remote endpoint and the remote endpoint isn't accessible due to network conditions. It happens the same even on Windows. Regards Emile ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED] -- If it's there, and you can see it, it's real. If it's not there, and you can see it, it's virtual. If it's there, and you can't see it, it's transparent. If it's not there, and you can't see it, you erased it. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Openvpn tap uses 99% cpu time
On Wed, Mar 14, 2007 at 03:11:58PM +0200, Emile Coetzee wrote: Since the latest updates to sys/net/if_tap.c (I suspect) in 6.2-STABLE my openvpn tap server is using up all available CPU time (99%) effectively killing the box. I replicated this on a second machine which was about 3 weeks behind with a cvsup to the latest from RELENG_6 and after rebuilding the kernel it does the same thing. I have portupgraded openvpn to the latest release (openvpn-2.0.6_7), but this had not made any difference. Is anyone else experiencing similar problems? I haven't seen this on either end of our OpenVPN setup. Ours: endpoint A: FreeBSD eos.sc1.parodius.com 6.2-STABLE FreeBSD 6.2-STABLE #0: Thu Mar 8 10:41:09 PST 2007 openvpn-2.0.6_7 Secure IP/Ethernet tunnel daemon bridge0: flags=8843UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST mtu 1500 ether 2e:2e:ec:9c:76:02 id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15 maxage 20 holdcnt 6 proto stp maxaddr 100 timeout 1200 root id 00:00:00:00:00:00 priority 0 ifcost 0 port 0 member: tap0 flags=143LEARNING,DISCOVER,AUTOEDGE,AUTOPTP member: bge1 flags=143LEARNING,DISCOVER,AUTOEDGE,AUTOPTP tap0: flags=8943UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST mtu 1500 ether 00:bd:22:14:00:00 Opened by PID 36044 endpoint B: FreeBSD medusa.parodius.com 6.1-STABLE FreeBSD 6.1-STABLE #0: Sat Aug 5 09:36:02 PDT 2006 openvpn-2.0.6_4 Secure IP/Ethernet tunnel daemon bridge0: flags=8043UP,BROADCAST,RUNNING,MULTICAST mtu 1500 ether e6:7a:00:c2:48:e5 priority 32768 hellotime 2 fwddelay 15 maxage 20 member: tap0 flags=3LEARNING,DISCOVER member: em1 flags=3LEARNING,DISCOVER tap0: flags=8943UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST mtu 1500 ether 00:bd:a4:27:00:00 Opened by PID 849 One thing I am noticing, however, is that on endpoint A the pid tap0 claims to be associated with is incorrect. My guess is that this is due to some fork() stuff that's going on. tap0 claims the wrong pid, but the pidfile has the correct pid. eos# ps -auxw | grep 36044 eos# eos# ps -auxw | grep openvpn root36048 0.0 0.1 3248 2644 ?? Ss7:03AM 0:00.02 /usr/local/sbin/openvpn --cd ... eos# cat /var/run/openvpn.pid 36048 While on endpoint B: medusa# ps -auxw | grep 849 root 849 0.0 0.2 3604 2268 ?? Ss3Feb07 2:18.63 /usr/local/sbin/openvpn --cd ... medusa# cat /var/run/openvpn.pid 849 -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networkinghttp://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Openvpn tap uses 99% cpu time
On Wed, Mar 14, 2007 at 03:48:38PM +0200, Vlad GALU wrote: On 3/14/07, Emile Coetzee [EMAIL PROTECTED] wrote: Since the latest updates to sys/net/if_tap.c (I suspect) in 6.2-STABLE my openvpn tap server is using up all available CPU time (99%) effectively killing the box. ... It usually happens when OpenVPN is told to retry connecting to the remote endpoint and the remote endpoint isn't accessible due to network conditions. It happens the same even on Windows. something that often does the job is putting a usleep(1) right after the offending syscall (e.g. a select() returning with an error, or the like). This deschedules the current process for at least one tick, which is normally enough to change the busy-wait loop into one that still is busy-wait but only takes a tiny fraction of the cpu. Of course you want to usleep() only if there is a real error condition. cheers luigi ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]