On Fri, Jun 26, 2015 at 04:59:32AM +0300, Sergey Ryazanov wrote: > Hello, > > during building l2tp tunnel with xl2tpd-1.3.1 I was faced with its too > low upload performance. When download, the speed is 20 mbit/s at > nearly 100% CPU utilization. CPU is Pentium D 930 3 GHz. When upload, > the speed is below 2 mbit/s at nearly zero CPU utilization. > > First, I examined xl2tpd code and did not find any potential issues. > Then I compiled it with -pg option and do a quick test with help of > iperf(1): 4 TCP flows, direction is toward the L2TP server, 2 min > test. Then I run gprof and got pretty strange output: > > Full output: http://pastebin.com/dm4wz6VK > granularity: each sample hit covers 4 byte(s) for 11.11% of 0.09 seconds > % cumulative self self total > time seconds seconds calls ms/call ms/call name > 77.8 0.07 0.07 24311 0.00 0.00 write [4] > 22.2 0.09 0.02 35897 0.00 0.00 _thread_sys_sendmsg [7] > 0.0 0.09 0.00 109136 0.00 0.00 memset [46] > 0.0 0.09 0.00 107997 0.00 0.00 ___errno [926] > 0.0 0.09 0.00 71740 0.00 0.00 read_packet [47] > 0.0 0.09 0.00 71726 0.00 0.00 _thread_sys_read [927] > 0.0 0.09 0.00 60154 0.00 0.00 > _thread_sys_gettimeofday [928] > 0.0 0.09 0.00 60114 0.00 0.00 process_signal [32] > 0.0 0.09 0.00 60113 0.00 0.00 build_fdset [29] > 0.0 0.09 0.00 60113 0.00 0.00 process_schedule [26] > 0.0 0.09 0.00 60113 0.00 0.00 select [48] > 0.0 0.09 0.00 35897 0.00 0.00 udp_xmit [6] > > Another one run also gives something strange: > > Full output: http://pastebin.com/vXvHPuyj > granularity: each sample hit covers 4 byte(s) for 14.29% of 0.07 seconds > % cumulative self self total > time seconds seconds calls ms/call ms/call name > 71.4 0.05 0.05 24348 0.00 0.00 write [4] > 14.3 0.06 0.01 60265 0.00 0.00 select [6] > 14.3 0.07 0.01 10 1.00 1.00 calloc [7] > 0.0 0.07 0.00 109324 0.00 0.00 memset [55] > 0.0 0.07 0.00 108285 0.00 0.00 ___errno [925] > 0.0 0.07 0.00 71997 0.00 0.00 read_packet [56] > 0.0 0.07 0.00 71980 0.00 0.00 _thread_sys_read [926] > 0.0 0.07 0.00 60282 0.00 0.00 > _thread_sys_gettimeofday [927] > 0.0 0.07 0.00 60266 0.00 0.00 process_signal [50] > 0.0 0.07 0.00 60265 0.00 0.00 build_fdset [49] > 0.0 0.07 0.00 60265 0.00 0.00 process_schedule [57] > 0.0 0.07 0.00 36020 0.00 0.00 _thread_sys_sendmsg > [928] > 0.0 0.07 0.00 36020 0.00 0.00 udp_xmit [58] > > For comparison I changed direction and did a download test: > > Full output: http://pastebin.com/FFF6xGcy > granularity: each sample hit covers 4 byte(s) for 0.01% of 114.30 seconds > % cumulative self self total > time seconds seconds calls ms/call ms/call name > 93.7 107.06 107.06 370804 0.29 0.29 write [5] > 1.8 109.08 2.02 370770 0.01 0.01 recvmsg [6] > 1.4 110.67 1.59 370799 0.00 0.00 > _thread_sys_gettimeofday [8] > 1.1 111.90 1.23 370770 0.00 0.30 handle_packet [4] > 1.0 113.01 1.11 370761 0.00 0.00 add_fcs [9] > 0.7 113.81 0.81 370781 0.00 0.00 select [10] > 0.2 114.03 0.21 1 210.00 114172.55 network_thread [3] > 0.1 114.14 0.12 mcount [11] > 0.0 114.19 0.05 370781 0.00 0.00 build_fdset [12] > 0.0 114.23 0.04 370771 0.00 0.00 get_call [13] > 0.0 114.27 0.03 1112701 0.00 0.00 memset [14] > 0.0 114.28 0.02 370781 0.00 0.00 process_schedule [7] > 0.0 114.30 0.01 29 0.34 0.34 _thread_sys_sendmsg [20] > 0.0 114.30 0.01 8 0.62 0.74 inet_ntoa [24] > 0.0 114.30 0.00 370785 0.00 0.00 swaps [110] > 0.0 114.30 0.00 370782 0.00 0.00 process_signal [41] > 0.0 114.30 0.00 370770 0.00 0.00 recycle_buf [111] > > During upload tests, everything looks like if xl2tpd doesn't perform > any work and stucks somewhere in I/O operation. > > May be there are some options, what could be tuned to speedup ppp(4) > I/O performance or did I missed something during my tests? I am in > doubts. Any clues? >
Can you get a ktrace output to figure out what write is doing? Could it be that it busy loops with EINTR or EAGAIN? It sure smells like something is going on there. -- :wq Claudio