Re[2]: Issues with TCP Timestamps allocation
Thanks for following through and making the patch! Kudos!

17 July 2019, 21:23:33, by "Michael Tuexen" :

> > On 17. Jul 2019, at 09:42, Vitalij Satanivskij wrote:
> >
> > Hello.
> >
> > Are there any changes about this problem?
> Please find a patch in https://reviews.freebsd.org/D20980
>
> If possible, please test and report.
>
> Best regards
> Michael
>
> > I'm using FreeBSD 12 on my desktop and can confirm the problem occurs
> > with some hosts.
> >
> > Michael Tuexen wrote:
> > MT> > On 9. Jul 2019, at 14:58, Paul wrote:
> > MT> >
> > MT> > Hi Michael,
> > MT> >
> > MT> > 9 July 2019, 15:34:29, by "Michael Tuexen" :
> > MT> >
> > MT> [... the rest of the earlier exchange (the original problem
> > report and the proposed tcp_new_ts_offset() patch) is quoted in
> > full; trimmed here. The complete text appears in the messages
> > below. ...]
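The allocation strategy discussed in this thread (a keyed hash over the full connection tuple, with the proposed fix zeroing the ports first) can be sketched as below. This is a toy illustration only: the `struct conninfo` and the multiply-xor mixer are invented for demonstration and are not the kernel's `struct in_conninfo` or its SipHash-based `tcp_keyed_hash()`.

```c
#include <stdint.h>

/* Hypothetical connection tuple; the real kernel uses struct in_conninfo. */
struct conninfo {
	uint32_t laddr;		/* local address */
	uint32_t faddr;		/* foreign address */
	uint16_t lport;		/* local port */
	uint16_t fport;		/* foreign port */
};

/* Toy stand-in for the kernel's tcp_keyed_hash() (which is SipHash based):
 * a 64-bit multiply-xor mixer over the tuple and a secret key. */
static uint32_t
toy_keyed_hash(const struct conninfo *inc, uint64_t secret)
{
	uint64_t h = secret;

	h ^= ((uint64_t)inc->laddr << 32) | inc->faddr;
	h *= 0x9e3779b97f4a7c15ULL;	/* odd constant: invertible mod 2^64 */
	h ^= ((uint64_t)inc->lport << 16) | inc->fport;
	h *= 0x9e3779b97f4a7c15ULL;
	return ((uint32_t)(h >> 32));
}

/* The proposed variant: zero both ports before hashing, so every stream
 * between the same two addresses shares one timestamp offset. */
static uint32_t
toy_ts_offset_per_host(const struct conninfo *inc, uint64_t secret)
{
	struct conninfo copy = *inc;

	copy.lport = 0;
	copy.fport = 0;
	return (toy_keyed_hash(&copy, secret));
}
```

Two connections differing only in the ephemeral source port get unrelated per-tuple offsets but identical per-host offsets, which is the monotonicity property the remote SYN-filtering implementations appear to rely on.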
Re[2]: Issues with TCP Timestamps allocation
Hi Michael,

9 July 2019, 15:34:29, by "Michael Tuexen" :

> > On 8. Jul 2019, at 17:22, Paul wrote:
> >
> > 8 July 2019, 17:12:21, by "Michael Tuexen" :
> >
> [... the earlier exchange (the original problem report, the proposed
> tcp_new_ts_offset() patch, and the uptime-leak question) is quoted in
> full; trimmed here. The complete text appears in the messages
> below. ...]
>
> >>> Supposed attacker could run a script that continuously monitors
> >>> timestamps, for example via a periodic TCP connection from a fixed
> >>> local port (e.g. 12345) and a fixed local address to the fixed
> >>> victim's address and port (e.g. 80). Whenever a large discrepancy
> >>> is observed, the attacker can assume that a reboot has happened
> >>> (due to V_ts_offset_secret re-generation); hence the received
> >>> timestamp is considered an approximate point of reboot from which
> >>> the uptime can be calculated, until the
Re[2]: Issues with TCP Timestamps allocation
8 July 2019, 17:12:21, by "Michael Tuexen" :

> > On 8. Jul 2019, at 15:24, Paul wrote:
> >
> > Hi Michael,
> >
> > 8 July 2019, 15:53:15, by "Michael Tuexen" :
> >
> [... the original problem report and the proposed tcp_new_ts_offset()
> patch are quoted in full; trimmed here. The complete text appears in
> the message below. ...]
>
> > Supposed attacker could run a script that continuously monitors
> > timestamps, for example via a periodic TCP connection from a fixed
> > local port (e.g. 12345) and a fixed local address to the fixed
> > victim's address and port (e.g. 80). Whenever a large discrepancy is
> > observed, the attacker can assume that a reboot has happened (due to
> > V_ts_offset_secret re-generation); hence the received timestamp is
> > considered an approximate point of reboot from which the uptime can
> > be calculated, until the next reboot and so on.
> Ahh, I see. The patch we are talking about is not intended to protect
> against continuous monitoring, which is something you can always do.
> You could even watch for service availability and detect reboots. A
> change of the local key would also look similar to a reboot without a
> temporary loss of connectivity.

Thanks for the clarification.

> >>> There is the list of exampl
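The monitoring attack described above, and Michael's caveat that a key change looks just like a reboot, can be sketched as a discontinuity detector over (local clock, observed TSval) samples taken from repeated probes of the same 4-tuple. This is hypothetical illustrative logic, not code from the thread; the 60-second threshold is an arbitrary choice.

```c
#include <stdint.h>

/* Arbitrary illustrative threshold: how far the observed TSval may stray
 * from its extrapolated value before we call it a discontinuity. */
#define REBOOT_JUMP_THRESHOLD	60000u	/* ~60 s at 1 ms ticks */

/*
 * Given the previous probe (local clock in ms, peer TSval) and the current
 * one, return nonzero if the peer's timestamp clock jumped, suggesting a
 * reboot or a regeneration of its timestamp secret (the two are
 * indistinguishable from outside). All arithmetic is modulo 2^32,
 * matching TCP timestamp wraparound.
 */
static int
likely_reboot(uint32_t prev_local_ms, uint32_t prev_tsval,
    uint32_t local_ms, uint32_t tsval)
{
	uint32_t expected = prev_tsval + (local_ms - prev_local_ms);
	uint32_t jump = tsval - expected;

	/* Flag jumps far outside the expected window, in either direction. */
	return (jump > REBOOT_JUMP_THRESHOLD &&
	    jump < (uint32_t)-REBOOT_JUMP_THRESHOLD);
}
```

While the peer stays up, `tsval - local_ms` drifts only by clock skew, so `jump` stays near zero (or near 2^32); a large jump marks the approximate reboot point from which uptime can be counted.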
Re[2]: Issues with TCP Timestamps allocation
Hi Michael,

8 July 2019, 15:53:15, by "Michael Tuexen" :

> > On 8. Jul 2019, at 12:37, Paul wrote:
> >
> > Hi team,
> >
> > Recently we had an upgrade to 12-STABLE. Immediately after, we
> > started seeing some strange connection establishment timeouts to
> > some fixed number of external (world) hosts. The issue was
> > persistent and easy to reproduce. Thanks to the patience and
> > dedication of our system engineer, we have tracked this issue down
> > to a specific commit:
> >
> > https://svnweb.freebsd.org/base?view=revision&revision=338053
> >
> > This patch was also back-ported into 11-STABLE:
> >
> > https://svnweb.freebsd.org/base?view=revision&revision=348435
> >
> > Among other things this patch changes the timestamp allocation
> > strategy by introducing deterministic randomness via a hash function
> > that takes into account a random key as well as source address,
> > source port, destination address and destination port. As a result,
> > timestamp offsets of different tuples (SA,SP,DA,DP) will be wildly
> > different and will jump from small to large numbers and back
> > whenever something in the tuple changes.
> Hi Paul,
>
> this is correct.
>
> Please note that the same happens with the old method if two hosts
> with different uptimes are behind a consumer grade NAT.

If NAT does not replace timestamps then yes, it should be the case.

> >
> > After performing various tests of hosts that produce the above
> > mentioned issue, we came to the conclusion that there are some
> > interesting implementations that drop SYN packets with timestamps
> > smaller than the largest timestamp value from streams of all recent
> > or current connections from a specific address. This looks like some
> > kind of SYN flood protection.
> This also breaks multiple hosts with different uptimes behind a
> consumer level NAT talking to such a server.
> >
> > To ensure that each external host is not going to see wild jumps of
> > timestamp values, I propose a patch that removes the ports from the
> > equation altogether when calculating the timestamp offset:
> >
> > Index: sys/netinet/tcp_subr.c
> > ===================================================================
> > --- sys/netinet/tcp_subr.c	(revision 348435)
> > +++ sys/netinet/tcp_subr.c	(working copy)
> > @@ -2224,7 +2224,22 @@
> >  uint32_t
> >  tcp_new_ts_offset(struct in_conninfo *inc)
> >  {
> > -	return (tcp_keyed_hash(inc, V_ts_offset_secret));
> > +	/*
> > +	 * Some implementations show a strange behaviour when wildly
> > +	 * random timestamps are allocated for different streams. It
> > +	 * seems that only the SYN packets are affected. Observed
> > +	 * implementations drop SYN packets with timestamps smaller
> > +	 * than the largest timestamp value of all recent or current
> > +	 * connections from a specific address. To mitigate this we
> > +	 * are going to ensure that each host will always observe
> > +	 * timestamps as increasing no matter the stream: by dropping
> > +	 * ports from the equation.
> > +	 */
> > +	struct in_conninfo inc_copy = *inc;
> > +
> > +	inc_copy.inc_fport = 0;
> > +	inc_copy.inc_lport = 0;
> > +
> > +	return (tcp_keyed_hash(&inc_copy, V_ts_offset_secret));
> >  }
> >
> >  /*
> >
> > In any case, the solution for the uptime leak implemented in
> > rev338053 is not going to suffer, because a supposed attacker is
> > currently able to use any fixed values of SP and DP (albeit not 0)
> > anyway, to remove them from the equation.
> Can you describe how a peer can compute the uptime from two observed
> timestamps?
> I don't see how you can do that...

Supposed attacker could run a script that continuously monitors
timestamps, for example via a periodic TCP connection from a fixed local
port (e.g. 12345) and a fixed local address to the fixed victim's address
and port (e.g. 80). Whenever a large discrepancy is observed, the attacker
can assume that a reboot has happened (due to V_ts_offset_secret
re-generation); hence the received timestamp is considered an approximate
point of reboot from which the uptime can be calculated, until the next
reboot and so on.

Here is the list of example hosts that we were able to reproduce the
issue with:

curl -v http://88.99.60.171:80
curl -v http://163.172.71.252:80
curl -v http://5.9.242.150:80
curl -v https://185.134.205.105:443
curl -v https://136.243.1.231:443
curl -v https://144.76.196.4:443
curl -v http://94.127.191.194:80

To reproduce, call curl repeatedly with the same URL a number of times.
You are going to see some of the requests stuck in
`* Trying XXX.XXX.XXX.XXX...`

For some reason, the easiest way to reproduce the issue is with nc:

$ echo "foo" | nc -v 88.99.60.171 80

Only a few such calls are required until one of them is stuck on
connect(): issuing SYN packets with an e
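The remote behaviour inferred in this thread (a per-source-address high-water mark on timestamps, with SYNs falling below it being dropped) might look roughly like the sketch below. This is a purely speculative reconstruction of the unidentified implementations' logic, not code from any named system.

```c
#include <stdint.h>

/* Speculative model of the observed SYN-drop heuristic: the server keeps
 * the largest TSval recently seen per source address and drops any new
 * SYN whose TSval falls behind that mark (modulo 2^32 wraparound). */
struct peer_state {
	uint32_t max_tsval;	/* high-water mark for this source address */
	int	 seen;		/* has any segment been recorded yet? */
};

/* Returns nonzero if the SYN would be accepted; updates the mark. */
static int
syn_ts_check(struct peer_state *ps, uint32_t tsval)
{
	/* Signed 32-bit difference handles timestamp wraparound. */
	if (ps->seen && (int32_t)(tsval - ps->max_tsval) < 0)
		return (0);		/* TSval went backwards: drop */
	ps->max_tsval = tsval;
	ps->seen = 1;
	return (1);
}
```

Under this model, a host whose per-tuple offsets jump wildly (as with the keyed hash over the full 4-tuple) will intermittently present a "backwards" TSval and have its SYNs dropped, matching the stuck connect() symptom.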