Hello Tim,

Thanks for your answer. It is indeed a very plausible explanation. In my case, I do have some clients that very frequently establish and abort connections to all 5 nodes, which increases the odds of running into the race condition and underflow issues.
However, the underflow scenario only seems possible if the peers send relative values rather than absolute ones. Apparently both cases (absolute and offset values) exist.

I am looking at src/peers.c to understand how the peer protocol works, and maybe to create the patch you proposed (do not decrement the counter if it is already 0). However, it seems that a real fix would require some big changes to the protocol itself.

One potential implementation I could imagine: rather than broadcasting absolute values or offsets, each neighbor peer would report only the number of connections it holds locally, and it would be up to the local node to resolve the actual value by adding up the values received from all neighbors.

I am not even sure my understanding is correct, and this task is currently out of my reach. Should I file a bug report somewhere? :)

BR,
Emerson

On Thu, 3 Jan 2019 at 15:49, Tim Düsterhus <[email protected]> wrote:

> Emerson,
>
> On 03.01.19 at 16:19, Emerson Gomes wrote:
> > This works fine most of the time, but every now and then, when I check the
> > stick table contents, one or more IPs show up with an absurd number of
> > conn_cur - often around 4 billion entries - a number very close to
> > the 32-bit unsigned int data type limit.
> >
>
> That looks like an integer underflow and a limitation of the peer
> protocol. If I understand it correctly, the peer protocol always sends an
> absolute value to its peers, instead of a relative modification
> operation such as "value++".
> While I'm not able to cause a connection to
> be decremented twice, I am able to cause some connections to never be
> decremented because of a race condition:
>
> - Connect to peer A (A=1, B=0)
> - Peer A sends 1 to B (A=1, B=1)
> - Connect to peer B (A=1, B=2)
> - Peer B sends 2 to A (A=2, B=2)
> - Kill both connections at the same time (A=1, B=1)
> - Peer A sends 1 to B (A=1, B=1)
> - Peer B sends 1 to A (A=1, B=1)
>
> There are no connections remaining, but both peers believe that there
> still is one connection.
>
> To cause the underflow you are seeing I imagine the following happens,
> but I don't manage to get the timing right.
>
> - Connect to peer A (A=1, B=0)
> - Peer A sends 1 to B (A=1, B=1)
> - Kill connection to A (A=0, B=1)
> - Connect to peer B (A=0, B=2)
> - Peer A sends 0 to B (A=0, B=0)
> - Peer B sends 0/2 to A (A=?, B=0)
> - Kill connection to B (A=?, B=-1)
> - Peer B sends -1 to A (A=-1, B=-1)
>
> An easy fix would probably be skipping the decrement if the value is
> already 0. The counter will be off either way, though.
>
> Best regards
> Tim Düsterhus
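
For illustration only, here is a minimal sketch in C of the two ideas discussed in this thread. It is not taken from src/peers.c, and the function names (dec_wrapping, dec_saturating, cluster_conn_cur) are hypothetical. It assumes the counter is a 32-bit unsigned integer, which matches the "around 4 billion" values observed. The first two functions contrast a plain decrement with the "skip the decrement at 0" guard; the third sketches the idea of each node reporting only its local count, with the receiver summing them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain decrement on a 32-bit unsigned counter (assumed type).
 * When v is already 0, the result wraps around to UINT32_MAX
 * (4294967295), which is the "around 4 billion" symptom. */
static uint32_t dec_wrapping(uint32_t v)
{
    return v - 1u;
}

/* Guarded decrement: leave the counter at 0 instead of wrapping.
 * This hides the underflow, but the counter may still be inaccurate. */
static uint32_t dec_saturating(uint32_t v)
{
    return v > 0 ? v - 1u : 0u;
}

/* Hypothetical alternative: every node reports only the connections
 * it holds locally, and the receiver derives the cluster-wide value
 * by summing the per-node counts. A 64-bit total avoids overflow
 * when adding several 32-bit values. */
static uint64_t cluster_conn_cur(const uint32_t *local_counts, size_t n)
{
    uint64_t total = 0;

    assert(local_counts != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += local_counts[i];
    return total;
}
```

With these definitions, dec_wrapping(0) yields UINT32_MAX while dec_saturating(0) stays at 0. In the summing variant, no node ever overwrites or decrements another node's value, so the races described above would not corrupt the total, although a real implementation would still need to expire the counts of nodes that disappear.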

