Hi Thomas,

Hui's analysis looks right. I'll try and test it myself later this week. (Sorry, replied privately).

Cheers,
Matt

On 25 April 2016 11:15:58 pm AWST, Thomas De Schampheleire <patrickdeping...@gmail.com> wrote:
>ZHANG Hui P <Hui.P.Zhang <at> alcatel-sbell.com.cn> writes:
>
>> Hi:
>>
>> I am a software engineer at Alcatel-Lucent. In our product we use
>> dropbear v071 under Linux version 3.4.24. Most of the time it works
>> perfectly, but recently we hit a problem: sometimes a child process
>> of dropbear occupies nearly 100% CPU (we use an ARM1176, single-core).
>> After investigating, I found it is caused by a misuse of
>> KEX_REKEY_TIMEOUT.
>>
>> KEX_REKEY_TIMEOUT is defined as 8 hours. That means that when a
>> session lasts more than 8 hours, the server and client re-exchange
>> their keys for security reasons. The timestamp of the last key
>> exchange is recorded in the variable "ses.kexstate.lastkextime".
>>
>> The child dropbear process decides the "timeout" parameter of the
>> "select" call by calling "select_timeout". It checks the timeout
>> events KEX_REKEY_TIMEOUT, AUTH_TIMEOUT and keepalive_secs. If a
>> timeout has already occurred, the "update_timeout" function yields a
>> negative value, and "select_timeout" then clamps it to ZERO:
>>
>>     /* clamp negative timeouts to zero - event has already triggered */
>>     return MAX(timeout, 0);
>>
>> If "select_timeout" returns ZERO, the next "select" call (in
>> "session_loop") returns immediately. The loop then checks the timeout
>> events:
>>
>>     /* check for auth timeout, rekeying required etc */
>>     checktimeouts();
>>
>> In "checktimeouts", when it finds that the timeout has been reached
>> or that too much data has been sent, it sends an SSH_MSG_KEXINIT
>> message to the peer. Normally this message triggers a new key
>> exchange.
>>
>> However, when a network problem prevents the peer from receiving the
>> message, this bug occurs: the timestamp ses.kexstate.lastkextime is
>> only updated via "switch_keys" --> "kexinitialise", and unfortunately
>> that call sequence is driven by incoming SSH messages, either
>> SSH_MSG_KEXDH_INIT or SSH_MSG_NEWKEYS. When no SSH message is
>> received, the child dropbear process spins in a busy loop, calling
>> "select" with a ZERO timeout because of KEX_REKEY_TIMEOUT.
>>
>> So there is a very simple way to reproduce this bug: first define
>> KEX_REKEY_TIMEOUT as small as possible (I set it to 8 seconds), then
>> start an ssh session so that the child dropbear process is forked.
>> Then unplug the network cable; after 8 seconds the child dropbear
>> process will occupy 100% CPU. Could you kindly check it? Thanks.
>
>Any feedback regarding this reported issue?
>
>Thanks,
>Thomas
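
For reference, a minimal sketch of the timeout arithmetic described in the quoted report. This is not dropbear's source: the struct, the DEFAULT_IDLE_TIMEOUT bound and both function names are hypothetical stand-ins, and the "guarded" variant is only one possible mitigation (skip the rekey deadline while our own KEXINIT is still unanswered), not necessarily the fix that ends up upstream.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are illustrative assumptions, not dropbear identifiers,
 * except KEX_REKEY_TIMEOUT, which is shortened to 8 s as in the repro. */
#define KEX_REKEY_TIMEOUT    8   /* seconds between forced rekeys */
#define DEFAULT_IDLE_TIMEOUT 20  /* assumed upper bound passed to select() */

struct kex_state {
    long lastkextime;  /* time of the last completed key exchange */
    int  sentkexinit;  /* nonzero while our KEXINIT awaits a reply */
};

/* Seconds until the rekey deadline; negative once it has passed. */
static long rekey_remaining(const struct kex_state *k, long now)
{
    assert(k != NULL);
    return k->lastkextime + KEX_REKEY_TIMEOUT - now;
}

/* Behaviour as reported: a passed deadline is clamped to zero, so
 * select() returns immediately on every iteration until lastkextime
 * is refreshed by an incoming message. */
static long select_timeout_reported(const struct kex_state *k, long now)
{
    long t = rekey_remaining(k, now);
    if (t > DEFAULT_IDLE_TIMEOUT) {
        t = DEFAULT_IDLE_TIMEOUT;
    }
    return t < 0 ? 0 : t;
}

/* Possible mitigation (assumption): once KEXINIT has been sent there is
 * nothing more to do for the rekey deadline, so stop letting it drive
 * the timeout and fall back to the idle bound. */
static long select_timeout_guarded(const struct kex_state *k, long now)
{
    if (k->sentkexinit) {
        return DEFAULT_IDLE_TIMEOUT;
    }
    return select_timeout_reported(k, now);
}
```

With lastkextime = 100 and the cable unplugged, select_timeout_reported() keeps returning 0 for every now past 108, which matches the 100% CPU symptom; the guarded variant sleeps for the idle bound instead once sentkexinit is set.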