Hi,

The first thing I'd try would be to build with -O0 compilation flags to rule out compiler optimisations doing something strange.
Cheers,
Matt

> On Thu 19/3/2020, at 3:42 pm, Horshack <horsh...@live.com> wrote:
>
> Update - I cloned and built the dbclient source so I could enable the debug
> tracing facility to get more information about the 'Bad hostkey signature'.
> The intermittent failure is detected in recv_msg_kexdh_reply() ->
> buf_rsa_verify() -> mp_cmp(). If I bypass the buf_rsa_verify() call then the
> session proceeds normally without issue, which indicates everything else in
> the key exchange is working 100% of the time. I'll dig deeper to see why the
> signed host key sent by the server is wrong.
>
> From: Horshack
> Sent: Wednesday, March 18, 2020 9:36 AM
> To: dropbear@ucc.asn.au
> Subject: SSH key exchange fails 30-70% of the time on Netgear X4S R7800
>
> Hi,
>
> I have a strange issue on my Netgear X4S R7800. Running either DD-WRT or
> OpenWrt, approximately 30-70% of my SSH login attempts fail. For OpenSSH
> clients the error reported is "error in libcrypto". For the PuTTY client the
> error is more descriptive - "Signature from server's host key is invalid".
> The failure occurs even when using the OpenSSH client built in to OpenWrt
> itself (i.e., SSH'ing into the router from the router via an existing remote
> SSH session).
>
> The failure appears to be at the tail end of the key exchange, before
> authentication. I've tried varying the cipher (aes128-ctr / aes256-ctr), the
> MAC (hmac-sha1 / hmac-sha2-256), and the key exchange algo (curve25519-sha256
> / curve25519-sha...@libssh.org / diffie-hellman-group14-sha256 /
> diffie-hellman-group14-sha1) but the intermittent failure still occurs. The
> frequency of failure is about the same for all these configuration options
> except for diffie-hellman-group14-sha256, which fails much more frequently -
> it sometimes takes hundreds of attempts to succeed. Perhaps that will provide
> a clue to the underlying cause.
>
> Once an SSH login succeeds the connection is stable. However, if I initiate a
> manual rekey operation via ~R then the key re-exchange fails. The router is
> otherwise very stable with no noticeable issues.
>
> I'm an embedded firmware engineer but have never worked on DD-WRT/OpenWrt
> firmware or dropbear. I have a conceptual understanding of the key exchange
> algo but haven't looked at the actual code of any implementation, including
> Dropbear's. I'm seeking ideas on how to troubleshoot this issue. Considering
> the problem is intermittent, I'm thinking it's some variant in the key
> generation/exchange algorithm that's failing due to some issue with the
> router, or a more remote possibility, an issue with the Dropbear
> implementation.
>
> Here are pastebin links to the PuTTY full debug logs (w/raw data dumps) for
> both the failure and success cases:
> Failure Case: https://pastebin.com/MS2BtFmW
> Success Case: https://pastebin.com/c4j66Ga9
>
> The only message I see from dropbear for a failed connection attempt is:
>
> authpriv.info dropbear[15948]: Child connection from 192.168.1.249:54819
> authpriv.info dropbear[15948]: Exit before auth: Disconnect received
>
> Thanks!
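
Since the failure is intermittent and its frequency seems to depend on the key exchange algorithm, it may help to put numbers on it before and after a change such as the -O0 rebuild. The following is only a rough sketch, not something from the thread: it assumes key-based login already works non-interactively, so any non-zero exit from ssh is counted as a failure (which would also include unrelated network errors). The host root@192.168.1.1 and the function names are hypothetical; the -o options used are standard OpenSSH client options.

```python
import subprocess
from typing import Callable, Dict, Iterable


def ssh_attempt(host: str, kex: str) -> bool:
    """Make one login attempt; True if ssh exits 0.

    Assumption: key-based auth is set up, so a clean exit means the key
    exchange and login both succeeded.
    """
    cmd = [
        "ssh",
        "-o", "BatchMode=yes",         # never prompt; fail instead
        "-o", "ConnectTimeout=5",
        "-o", f"KexAlgorithms={kex}",  # force a single key exchange algorithm
        host,
        "true",                        # trivial remote command
    ]
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def failure_rates(
    host: str,
    kex_algos: Iterable[str],
    attempts: int,
    attempt: Callable[[str, str], bool] = ssh_attempt,
) -> Dict[str, float]:
    """Return the fraction of failed attempts for each key exchange algorithm."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    rates: Dict[str, float] = {}
    for kex in kex_algos:
        failures = sum(1 for _ in range(attempts) if not attempt(host, kex))
        rates[kex] = failures / attempts
    return rates


if __name__ == "__main__":
    # Hypothetical router address; algorithm names are taken from the report.
    algos = [
        "curve25519-sha256",
        "diffie-hellman-group14-sha256",
        "diffie-hellman-group14-sha1",
    ]
    for kex, rate in failure_rates("root@192.168.1.1", algos, attempts=100).items():
        print(f"{kex}: {rate:.0%} failed")
```

Running the same measurement against the stock build and the -O0 build would show whether the optimisation level changes the failure rate at all.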