Re: Fuzz, Numbers
Thanks.

> and without 'limited' on ~5kpps I have 8-10% CPU regardless of whether
> monitoring is enabled/disabled. About 1% on 1000pps.

Is that within reason or worth investigating? 1% times 5 should be 5% rather
than 8-10%, but there may not be enough significant digits in any of the
numbers.

> For those who want to process hundreds of thousands of requests per second
> (like 'national standard' servers) you can use multithreading and multiply
> the power of the server.

The current code isn't set up for threads. I think with a bit of work, we
could get multiple threads on the server side.

On an Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz I can get 330K packets per
second. 258K with AES CMAC. I don't have NTS numbers yet.

-- 
These are my opinions. I hate spam.

___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel
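[Editor's note: the scaling arithmetic above can be written out explicitly. This is not from the thread, just a sketch fitting a line through the two reported load points; the 9% figure below is the midpoint of the reported 8-10% range.]

```python
def fit_linear(pps1, cpu1, pps2, cpu2):
    """Fit cpu% = fixed + slope * pps through two load measurements."""
    slope = (cpu2 - cpu1) / (pps2 - pps1)
    fixed = cpu1 - slope * pps1
    return fixed, slope

# Naive extrapolation from the 1000 pps point predicts 5% at 5000 pps:
print(1.0 * (5000 / 1000))                    # -> 5.0

# Fitting through both reported points, (1000 pps, 1%) and (5000 pps, 9%),
# gives ~2% per 1000 pps and a *negative* intercept -- i.e. per-packet cost
# appears to grow with rate (cache pressure?), or the 1% figure is rounded.
fixed, slope = fit_linear(1000, 1.0, 5000, 9.0)
print(fixed, slope * 1000)                    # -> -1.0 2.0
```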
Re: Fuzz, Numbers
and without 'limited' on ~5kpps I have 8-10% CPU regardless of whether
monitoring is enabled/disabled. About 1% on 1000pps. (Hardware is an old
MS-9258 server, Quad CPU Q940, FreeBSD 12.1)

As I see it, many rate-limited queries are really sourced from behind NAT,
and we cannot determine whether they are legitimate or not. So a production
server is better off without 'limited', or with the limit set to tens of
queries per second. And maybe limit per ip:source-port, not only per ip,
because we see different source ports from NAT but an identical port on
"dumb" clients. But we cannot set such settings..

To protect against participation in DDoS, you can use traffic restriction
with a firewall to 1-5Mbit/s. Every 1k queries takes <1Mbps of bandwidth.

For those who want to process hundreds of thousands of requests per second
(like 'national standard' servers) you can use multithreading and multiply
the power of the server. As far as I know, professional solutions like
Meinberg Lantime can run multithreaded, but no open-source daemons can do it.
The NTP Pool community has posts about good experience with
https://github.com/mlichvar/rsntp (look at
https://community.ntppool.org/t/can-i-incrase-number-of-threads-to-use-in-ntpd-proccess/1159/20).
Maybe when there is absolutely nothing else to do, you can write some
proxy-balancer that solves this task as an official utility :)

Have a nice day!

-- 
Mike Yurlov

09.01.2020 13:52, Mike Yurlov via devel wrote:
> Hi, Hal! I built ntpd from the latest sources tonight. CPU load drops from
> 18-20% average to 5-6% on my ~3-4k pps. Looks perfect!
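[Editor's note: Mike's "every 1k queries takes <1Mbps" figure checks out. A mode-3 request carries 48 bytes of NTP payload plus UDP/IP/Ethernet framing; a back-of-the-envelope sketch, assuming IPv4 with no options:]

```python
NTP_PAYLOAD = 48          # bytes, client mode-3 request (as in the tcpdump traces)
UDP_HDR, IP_HDR, ETH_HDR = 8, 20, 14

def ntp_bandwidth_mbps(pps: float) -> float:
    """Wire bandwidth of an NTP request stream, in Mbit/s."""
    frame = NTP_PAYLOAD + UDP_HDR + IP_HDR + ETH_HDR   # 90 bytes on the wire
    return pps * frame * 8 / 1e6

print(ntp_bandwidth_mbps(1000))   # -> 0.72, i.e. <1 Mbit/s per 1k queries/s
```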
Re: Fuzz, Numbers
Hi, Hal!

I built ntpd from the latest sources tonight. CPU load drops from 18-20%
average to 5-6% on my ~3-4k pps. Looks perfect! If you get a race with "init
before config read", you can create a build option for the initial size of
the mrulist.

Here are the stats from night to 13:00 (GMT+3): received 173 647 480 packets,
3.1kpps average (really from 2.5 to 6kpps as I see on the network interface),
1.8% bad, 21% rate-limited, 77% processed.

ntpq> sysstats
uptime:                55394
sysstats reset:        55394
packets received:      173647480
current version:       76272783
older version:         57692039
control requests:      1516
bad length or format:  3287409
authentication failed: 3955
declined:              3199
restricted:            388
rate limited:          36398991
KoD responses:         0
processed for time:    133953537

ntpq> monstats
enabled:              2
hash slots in use:    158963
addresses in use:     290909
peak addresses:       290909
maximum addresses:    290909
reclaim above count:  600
reclaim maxage:       250
reclaim minage:       240
kilobytes:            25000
maximum kilobytes:    25000
alloc: exists:        133311968
alloc: new:           290909
alloc: recycle old:   35498556
alloc: recycle full:  1162596
alloc: none:          150665
age of oldest slot:   240

Some requests are strange and I don't know whether this is NAT or not.
This one looks like many clients over NAT:

13:17:31.160400 IP 90.188.255.3.42962 > x.x.x.x.123: NTPv4, Client, length 48
13:17:31.312476 IP 90.188.255.3.51241 > x.x.x.x.123: NTPv4, Client, length 48
13:17:31.482878 IP 90.188.255.3.55666 > x.x.x.x.123: NTPv4, Client, length 48
13:17:31.570783 IP 90.188.255.3.38018 > x.x.x.x.123: NTPv4, Client, length 48
13:17:31.596582 IP 90.188.255.3.36581 > x.x.x.x.123: NTPv4, Client, length 48
13:17:31.776522 IP 90.188.255.3.42962 > x.x.x.x.123: NTPv4, Client, length 48
13:17:31.928548 IP 90.188.255.3.51241 > x.x.x.x.123: NTPv4, Client, length 48

But then it looks like a woodpecker :)

13:19:24.257556 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, length 48
13:19:24.917559 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, length 48
13:19:25.533525 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, length 48
13:19:26.157515 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, length 48
13:19:26.769554 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, length 48
13:19:27.381551 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, length 48
13:19:28.001559 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, length 48
13:19:28.617574 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, length 48
13:19:29.237470 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, length 48
13:19:29.853630 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, length 48
13:19:30.469565 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, length 48
13:19:31.081622 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, length 48
13:19:31.705618 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, length 48
13:19:32.321652 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, length 48
13:19:32.945589 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, length 48
13:19:33.025639 IP 90.188.255.3.46163 > x.x.x.x.123: NTPv4, Client, length 48
13:19:33.573548 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, length 48
13:19:33.661612 IP 90.188.255.3.46163 > x.x.x.x.123: NTPv4, Client, length 48
13:19:34.193647 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, length 48
13:19:34.273687 IP 90.188.255.3.46163 > x.x.x.x.123: NTPv4, Client, length 48
13:19:34.809651 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, length 48
13:19:34.897663 IP 90.188.255.3.46163 > x.x.x.x.123: NTPv4, Client, length 48

Many clients look buggy or are installed behind a firewall. They send 3-5
requests once per second, pause for 1-2 seconds, and repeat the cycle. ntpd
rate-limits them and replies once per cycle, but they send requests again and
again. Many such clients make ~100k requests per day. I think answering such
requests is a waste of hardware resources and network bandwidth worldwide.

13:27:02.246352 IP 77.222.101.171.123 > x.x.x.x.123: NTPv4, Client, length 48
13:27:02.246384 IP x.x.x.x.123 > 77.222.101.171.123: NTPv4, Server, length 48
13:27:02.278056 IP 77.222.101.171.123 > x.x.x.x.123: NTPv4, Client, length 48
13:27:03.245720 IP 77.222.101.171.123 > x.x.x.x.123: NTPv4, Client, length 48
13:27:04.246223 IP 77.222.101.171.123 > x.x.x.x.123: NTPv4, Client, length 48
13:27:06.840038 IP 77.222.101.171.123 > x.x.x.x.123: NTPv4, Client, length 48
13:27:06.840064 IP x.x.x.x.123 > 77.222.101.171.123: NTPv4, Server, length 48
13:27:06.869703 IP 77.222.101.171.123 > x.x.x.x.123: NTPv4, Client, length 48
13:27:07.840540 IP 77.222.101.171.123 > x.x.x.x.123: NTPv4, Client, length 48
13:27:08.841967 IP 77.222.101.171.123 > x.x.x.x.123: NTPv4, Client, length 48
13:27:11.440866 IP 77.222.101.171.123 > x.x.x.x.123: NTPv4,
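[Editor's note: ranking the "worst clients" Mike describes can be done straight from a capture. A sketch, assuming the default tcpdump text format shown in the traces above (e.g. `tcpdump -l -n dst port 123`):]

```python
import re
from collections import Counter

# Matches e.g.:
# "13:19:24.257556 IP 90.188.255.3.39114 > 1.2.3.4.123: NTPv4, Client, length 48"
LINE = re.compile(
    r'\d\d:\d\d:\d\d\.\d+ IP (\d+\.\d+\.\d+\.\d+)\.(\d+) > \S+: NTPv4, Client')

def worst_clients(lines, top=10):
    """Count NTP client requests per (src_ip, src_port); return the top talkers."""
    counts = Counter()
    for line in lines:
        m = LINE.search(line)
        if m:
            counts[(m.group(1), int(m.group(2)))] += 1
    return counts.most_common(top)
```

Feeding it `sys.stdin` from a live tcpdump would surface the woodpecker-style clients after a few minutes of capture.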
Re: Fuzz, Numbers
> there are not only DDoS amplifiers. I see many dumb queries with a 0.3-2
> second interval. Looks like the sources are located behind NAT, are not
> NAT'ed correctly, and do not receive my answers. Or they just have a
> "broken" ntp client. Or a DDoS reflection attack; that still exists via
> simple queries with a spoofed source ip. One of my clients sometimes gets
> such a flood at 5-10Gbit/s.

I've seen a few piggy clients where whois indicates that it is likely to be a
NAT box. One was a hotel, the other was an ISP block labeled DHCP clients.
They have been piggy, but at least sane. I've seen a few others that seemed
more like DDoS redirections, but no hard evidence.

> Looks like MRU reduces the reply rate to these queries by 20-25%. I
> typically have 4kpps input and 3-3.2kpps output on the server.

Is the CPU saturated? If not, there should be some counter that indicates why
the packet didn't generate a response. (It wouldn't surprise me if there are
missing cases, but if we find any, I'll fix that.)

-- 
These are my opinions. I hate spam.
Re: Fuzz, Numbers
There are not only DDoS amplifiers. I see many dumb queries with a 0.3-2
second interval. Looks like the sources are located behind NAT, are not
NAT'ed correctly, and do not receive my answers. Or they just have a "broken"
ntp client. Or a DDoS reflection attack; that still exists via simple queries
with a spoofed source ip. One of my clients sometimes gets such a flood at
5-10Gbit/s.

Looks like MRU reduces the reply rate to these queries by 20-25%. I typically
have 4kpps input and 3-3.2kpps output on the server. Also, MRU gives me a
list of the worst clients and I can list them for further action. This is
useful for the network and the routers, which then have to process less
"crap" pps. Not for the ntp service directly.

I will test the current fixed sources and no-fuzz this week.

-- 
Mike
Re: Fuzz, Numbers
>> That turns off monitoring, aka the MRU list.

> I believe that was a security feature to prevent amplification of DDoS-type
> attacks. (for ntp classic) Or doesn't this work this way for ntpsec?

That was fixed in ntp classic long before ntpsec forked.

The old code was for the client to send a request, then the server would send
back a lot of data. If you sent a forged request, that was a nice DDoS
amplifier.

The fix was to add a cookie. The server now needs a cookie along with the
request. You can get the cookie from the server. It depends upon the IP
address. If you are sending forged requests, it's hard to get the cookie for
the target system.

You can also block

-- 
These are my opinions. I hate spam.
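[Editor's note: the cookie idea Hal describes can be sketched as follows. This is an illustration of the concept only, not NTP's actual nonce format; the HMAC construction, field sizes, and function names here are assumptions.]

```python
import hashlib
import hmac
import os

SERVER_KEY = os.urandom(16)   # server-side secret; a real server rotates this

def make_cookie(client_ip):
    """Cookie bound to the requester's IP address (hypothetical scheme)."""
    return hmac.new(SERVER_KEY, client_ip.encode(), hashlib.sha256).digest()[:8]

def handle_big_request(client_ip, cookie=None):
    """Serve the large (amplifying) response only to a client that echoes the
    cookie previously sent to its IP. A spoofer never sees the victim's cookie."""
    if cookie != make_cookie(client_ip):
        return make_cookie(client_ip)      # small reply: just the cookie
    return b"...large MRU list data..."    # full response

# A forged request "from" a victim gets only a small, non-amplifying reply;
# the cookie travels to the victim's real address, not to the attacker.
first = handle_big_request("203.0.113.9")
assert first == make_cookie("203.0.113.9")
assert handle_big_request("203.0.113.9", first) == b"...large MRU list data..."
```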
Re: Fuzz, Numbers
On 03-01-2020 06:06, Hal Murray wrote:
>>> Do you have a "disable monitor" in your ntp.conf?
>> yes
>
> That turns off monitoring, aka the MRU list.

I believe that was a security feature to prevent amplification of DDoS-type
attacks. (for ntp classic) Or doesn't this work this way for ntpsec?

Udo
Re: Fuzz, Numbers
>> Do you have a "disable monitor" in your ntp.conf?
> yes

That turns off monitoring, aka the MRU list. Comment out that line, restart
ntpd, and you should get some data.

The default parameters are OK for home use. If your system is in the pool,
you probably want to give it more memory. Details are in
docs/includes/misc-options.adoc

-- 
These are my opinions. I hate spam.
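[Editor's note: for a pool server, the relevant knobs are in the `mru` directive family. A hedged example; the directive names and units here are from memory and should be verified against docs/includes/misc-options.adoc for your version.]

```conf
# ntp.conf fragment for a busy pool server (values illustrative only)
enable monitor          # the default; make sure "disable monitor" is removed
mru maxmem 25000        # allow the MRU list ~25 MB instead of the small default
mru maxage 250          # entries idle longer than ~250 s become reclaimable
```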
Re: Fuzz, Numbers
On 02-01-2020 20:26, Hal Murray wrote:
> That's not the normal output.
>
> Do you have a "disable monitor" in your ntp.conf?

yes

> What does "ntpq -c monstats" show?

# ntpq -c monstats
enabled:             0
addresses:           0
peak addresses:      0
maximum addresses:   11915
reclaim above count: 600
reclaim maxage:      3600
reclaim minage:      64
kilobytes:           0
maximum kilobytes:   1024
alloc: exists:       0
alloc: new:          0
alloc: recycle old:  0
alloc: recycle full: 0
alloc: none:         0
age of oldest slot:  0

# grep monitor /etc/ntp.conf
disable monitor
#

Udo
Re: Fuzz, Numbers
> I see stuff like this: [no data]

That's not the normal output.

Do you have a "disable monitor" in your ntp.conf?

What does "ntpq -c monstats" show?

-- 
These are my opinions. I hate spam.
Re: Fuzz, Numbers
hmur...@megapathdsl.net said:
> The MRU hash table was limited to 16 bits. I have no idea why. It's
> probably leftover from when even big machines didn't have much memory. I'm
> about to go fix that. Should be simple, but I've said that before.

I just pushed a cleanup of that area. CPU usage should be close to flat as
the table grows.

In hindsight, things were screwed up worse than I expected. I'd noticed a
limit of 16 bits for the hash table, but the setup routine was getting called
before reading the config file, so it allocated the hash table using the
default parameters. That turned out to be 8K slots, which ends up with
hundreds of entries chained off each slot.

---

Is anybody other than me and Mike interested in large MRU lists?

-- 
These are my opinions. I hate spam.
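[Editor's note: the cost of the undersized table is easy to quantify. A back-of-the-envelope sketch, assuming a uniform hash and a chained table, using the entry count from Mike's monstats output.]

```python
def avg_chain_walk(entries: int, slots: int) -> float:
    """Average nodes examined per successful lookup in a chained hash table:
    roughly 1 + (load factor)/2 when entries spread uniformly over slots."""
    return 1 + (entries / slots) / 2

# Mike's server holds ~290k MRU entries. With the old 8K-slot default:
print(avg_chain_walk(290_909, 8_192))     # ~18.8 nodes touched per lookup
# With a table sized closer to the entry count, lookups stay near O(1):
print(avg_chain_walk(290_909, 262_144))   # ~1.55
```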
Re: Fuzz, Numbers
> The synthetic load with only one client is far away from the real
> production load of thousands of requests per second from around the world.
> As the owner of a production server in the ntppool, I am very interested in
> performance.

I think the no-MRU (or early/small MRU) case should be a lot better now. Can
you try git head?

> Warming up, filling and overflowing the monlist plays a big role in CPU
> load. For me, a monlist size that fills in about 5 minutes is optimal.

It shouldn't play a big role. Or at least I can't see any reason it should.
It should be a few cache faults.

The MRU hash table was limited to 16 bits. I have no idea why. It's probably
leftover from when even big machines didn't have much memory. I'm about to go
fix that. Should be simple, but I've said that before.

In case you haven't found it yet, if you turn on usestats, ntpd will log
memory and CPU usage. (I should add packets.)

> I suggest using the following realistic test mode: big source address
> subnet (i.e. /8 or 'all internet'); queries per sec 3-10-20k; duration
> 5-30 minutes.

Something like that would be nice. I don't know enough about filtering
packets and/or what the filter does to CPU usage. Thanks for the pointer.

-- 
These are my opinions. I hate spam.
Re: Fuzz, Numbers
The synthetic load with only one client is far away from the real production
load of thousands of requests per second from around the world. As the owner
of a production server in the ntppool, I am very interested in performance.

I suggest using the following realistic test mode: big source address subnet
(i.e. /8 or 'all internet'); queries per sec 3-10-20k; duration 5-30 minutes.
Warming up, filling and overflowing the monlist plays a big role in CPU load.
For me, a monlist size that fills in about 5 minutes is optimal.

I have not tried it myself, but Google says ntpperf
(https://github.com/mlichvar/ntpperf) can generate such a stream of requests
"from a subnet".

-- 
Mike

29.12.2019 6:52, Hal Murray via devel wrote:
> I found the missing line of code that was breaking no-FUZZ. I found several
> other quirks while browsing the code.
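[Editor's note: the synthetic client stream Mike proposes starts from the 48-byte mode-3 request visible in the tcpdump traces. A minimal sketch of building one such packet; actually spoofing sources across a /8 needs raw sockets or a tool like ntpperf, and that part is omitted. The poll and precision values are arbitrary but plausible.]

```python
import struct
import time

def ntp_client_packet() -> bytes:
    """Build a minimal 48-byte NTPv4 client (mode 3) request."""
    NTP_EPOCH_OFFSET = 2208988800            # 1900-01-01 -> 1970-01-01, seconds
    li_vn_mode = (0 << 6) | (4 << 3) | 3     # LI=0, VN=4, Mode=3 -> 0x23
    transmit = int(time.time() + NTP_EPOCH_OFFSET) << 32   # seconds<<32 | frac
    # Layout: LI/VN/Mode, stratum, poll, precision; root delay, root dispersion,
    # refid; then four 64-bit timestamps (reference, origin, receive, transmit).
    return struct.pack("!BBBb3I4Q", li_vn_mode, 0, 6, -20,
                       0, 0, 0,
                       0, 0, 0, transmit)

pkt = ntp_client_packet()
assert len(pkt) == 48 and pkt[0] == 0x23
```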
Fuzz, Numbers
I found the missing line of code that was breaking no-FUZZ. I found several
other quirks while browsing the code.

I'm starting to work on performance numbers. With no-FUZZ:

A Pi 3 can support 18-19K packets/sec. 22-23K with monitoring turned off.

An Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz is good for 160K packets/second.
225K with monitoring turned off.

The with-monitoring (default) numbers are slightly optimistic since there is
only one client. Lots of clients would increase the working set.

Things were slightly faster a few days ago. I don't know why. I've seen
similar quirks before.

On my to-do list is numbers with NTS.

A major contribution to sloth was ntp_random(). Once upon a time, ntp had its
own pseudo random number generator. Back in 2015, that was changed to use the
one in libsodium. In Jan 2017, it was changed to use OpenSSL and the 31-bit
limit was lost. Anybody object if I get rid of libntp/ntp_random.c? POSIX has
random(). We can use it where we don't need crypto-quality randomness.
Similarly, is there any reason not to use clock_gettime() for timing?

get_systime() has fuzzing which includes a call to randomness. The fuzzing
code doesn't cleanly handle SO_TIMESTAMP (no NS). Suppose you want to fuzz by
1/2 microsecond. Your raw data has already been truncated to a whole
microsecond. Fuzzing might be appropriate if it were limited to the current
time, so we don't get packets leaving before they arrive.

FreeBSD has their own quirky ways to get more accurate time stamps. I don't
know why they didn't implement SO_TIMESTAMPNS. ??? We should investigate.

-- 
These are my opinions. I hate spam.
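[Editor's note: Hal's point about fuzzing below the truncation quantum can be shown numerically. Once SO_TIMESTAMP has reduced the arrival time to whole microseconds, sub-microsecond fuzz cannot recover the lost bits; a sketch with made-up numbers, including the "never into the future" bound he suggests.]

```python
def truncate_us(t_ns: int) -> int:
    """Model SO_TIMESTAMP: kernel delivers the stamp at microsecond resolution."""
    return t_ns - (t_ns % 1000)

true_arrival_ns = 1_234_567_890              # some instant, in nanoseconds
seen = truncate_us(true_arrival_ns)          # ...890 ns already discarded
assert true_arrival_ns - seen == 890         # up to 999 ns of real information lost

# Adding 1/2 us of fuzz on top of the truncated value only adds noise; it
# cannot restore the discarded 890 ns.
fuzzed = seen + 500
assert fuzzed != true_arrival_ns

def bounded_fuzz(stamp_ns: int, fuzz_ns: int, now_ns: int) -> int:
    """Fuzz limited to the current time, so a transmit stamp can never end up
    later than "now" (no packets leaving before they arrive)."""
    return min(stamp_ns + fuzz_ns, now_ns)
```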