Re: Fuzz, Numbers

2020-01-13 Thread Hal Murray via devel
Thanks.

> and without 'limited' at ~5 kpps I have 8-10% CPU regardless of whether
> monitoring is enabled or disabled. About 1% at 1000 pps.

Is that within reason, or worth investigating? 1% times 5 should be 5% rather 
than 8-10%, but there may not be enough significant digits in any of those 
numbers.
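That back-of-the-envelope check can be made explicit. A quick sketch; note the "about 1%" figure has only one significant digit, which is enough to explain the gap:

```python
# If the per-packet cost were constant, CPU load would scale linearly
# with packet rate: load(5 kpps) = 5 * load(1 kpps).
load_at_1kpps = 1.0                       # "about 1%" (one significant digit)
predicted = load_at_1kpps * (5000 / 1000)
print(predicted)                          # 5.0 -- below the reported 8-10%

# But "about 1%" could plausibly be anything up to ~2%, and then the
# linear prediction lands inside the reported 8-10% range:
print(1.8 * (5000 / 1000))                # ~9, inside the reported range
```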



> For those who want to process hundreds of thousands of requests per second
> (like 'national standard' servers) you can use multithreading and multiply
> the power of the server.

The current code isn't set up for threads.  I think with a bit of work, we 
could get multiple threads on the server side.

On an Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
I can get 330K packets per second.
258K with AES CMAC.
I don't have NTS numbers yet.




-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Re: Fuzz, Numbers

2020-01-12 Thread Mike Yurlov via devel
and without 'limited' at ~5 kpps I have 8-10% CPU regardless of whether 
monitoring is enabled or disabled. About 1% at 1000 pps.

(Hardware is an old MS-9258 server with a quad-core Q940 CPU, FreeBSD 12.1)

As far as I can see, many rate-limited queries actually come from behind 
NAT, and we cannot determine whether they are legitimate or not. So a 
production server is better off without 'limited', or with the limit set to 
tens of queries per second. And maybe limit per ip:source-port, not only 
'per ip', because NAT gives us different source ports while "dumb" clients 
keep an identical port. But we cannot set such settings. To protect against 
participation in DDoS, you can restrict traffic with a firewall to 
1-5 Mbit/s. Every 1k queries takes <1 Mbps of bandwidth.
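That bandwidth figure checks out with simple arithmetic, assuming a plain 48-byte NTP request over UDP/IPv4 with no IP options (Ethernet framing adds a few tens of bytes more, but the conclusion holds):

```python
# Rough on-the-wire size of one NTP client request:
# 48-byte NTP payload + 8-byte UDP header + 20-byte IPv4 header.
wire_bytes = 48 + 8 + 20          # 76 bytes per request
pps = 1000
mbps = pps * wire_bytes * 8 / 1e6
print(round(mbps, 3))             # 0.608 -- comfortably under 1 Mbit/s
```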


For those who want to process hundreds of thousands of requests per second 
(like 'national standard' servers), you can use multithreading and multiply 
the power of the server. As far as I know, professional solutions like 
Meinberg Lantime can run multithreaded, but no open-source daemons can do 
it. The NTP Pool community has posts about good experience with 
https://github.com/mlichvar/rsntp (see 
https://community.ntppool.org/t/can-i-incrase-number-of-threads-to-use-in-ntpd-proccess/1159/20).


Maybe, when there is absolutely nothing else to do, you could write a 
proxy/balancer that solves this task as an official utility :)


Have a nice day!

--
Mike Yurlov


09.01.2020 13:52, Mike Yurlov via devel writes:

[quoted copy of the 2020-01-09 message trimmed; the full text appears later 
in this thread]

Re: Fuzz, Numbers

2020-01-09 Thread Mike Yurlov via devel

Hi, Hal!


I built ntpd from the latest sources tonight. CPU load dropped from 18-20% 
average to 5-6% at my ~3-4 kpps. Looks perfect!
If you hit a race with "init before config read", you could add a build 
option for the initial size of the MRU list.


Here are the stats from last night until 13:00 (GMT+3):
received 173,647,480 packets, 3.1 kpps average (I see real rates of 2.5 to 
6 kpps on the network interface):

1.8% bad, 21% rate limited, 77% processed


ntpq> sysstats
uptime:                 55394
sysstats reset:         55394
packets received:       173647480
current version:        76272783
older version:          57692039
control requests:       1516
bad length or format:   3287409
authentication failed:  3955
declined:               3199
restricted:             388
rate limited:           36398991
KoD responses:          0
processed for time:     133953537

ntpq> monstats
enabled:                2
hash slots in use:      158963
addresses in use:       290909
peak addresses:         290909
maximum addresses:      290909
reclaim above count:    600
reclaim maxage:         250
reclaim minage:         240
kilobytes:              25000
maximum kilobytes:      25000
alloc: exists:          133311968
alloc: new:             290909
alloc: recycle old:     35498556
alloc: recycle full:    1162596
alloc: none:            150665
age of oldest slot:     240
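As a quick check, the quoted summary percentages can be recomputed from the sysstats counters (small rounding differences aside):

```python
# Recompute the summary percentages from the sysstats counters above.
received  = 173_647_480
bad       = 3_287_409      # bad length or format
limited   = 36_398_991     # rate limited
processed = 133_953_537    # processed for time

print(round(100 * bad / received, 1))        # 1.9 (quoted as 1.8%)
print(round(100 * limited / received, 1))    # 21.0
print(round(100 * processed / received, 1))  # 77.1
```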


Some requests look strange and I don't know whether this is NAT or not.

This one looks like many clients behind NAT:
13:17:31.160400 IP 90.188.255.3.42962 > x.x.x.x.123: NTPv4, Client, 
length 48
13:17:31.312476 IP 90.188.255.3.51241 > x.x.x.x.123: NTPv4, Client, 
length 48
13:17:31.482878 IP 90.188.255.3.55666 > x.x.x.x.123: NTPv4, Client, 
length 48
13:17:31.570783 IP 90.188.255.3.38018 > x.x.x.x.123: NTPv4, Client, 
length 48
13:17:31.596582 IP 90.188.255.3.36581 > x.x.x.x.123: NTPv4, Client, 
length 48
13:17:31.776522 IP 90.188.255.3.42962 > x.x.x.x.123: NTPv4, Client, 
length 48
13:17:31.928548 IP 90.188.255.3.51241 > x.x.x.x.123: NTPv4, Client, 
length 48


But then it looks like a woodpecker :)
13:19:24.257556 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:24.917559 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:25.533525 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:26.157515 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:26.769554 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:27.381551 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:28.001559 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:28.617574 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:29.237470 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:29.853630 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:30.469565 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:31.081622 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:31.705618 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:32.321652 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:32.945589 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:33.025639 IP 90.188.255.3.46163 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:33.573548 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:33.661612 IP 90.188.255.3.46163 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:34.193647 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:34.273687 IP 90.188.255.3.46163 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:34.809651 IP 90.188.255.3.39114 > x.x.x.x.123: NTPv4, Client, 
length 48
13:19:34.897663 IP 90.188.255.3.46163 > x.x.x.x.123: NTPv4, Client, 
length 48


Many clients look buggy or are installed behind a firewall. They send 3-5 
requests one second apart, pause 1-2 seconds, and repeat the cycle. ntpd 
rate-limits them and replies once per cycle, but they send requests again 
and again. Many such clients make ~100k requests per day. I think answering 
such requests is a waste of hardware resources and network bandwidth 
worldwide.


13:27:02.246352 IP 77.222.101.171.123 > x.x.x.x.123: NTPv4, Client, 
length 48
13:27:02.246384 IP x.x.x.x.123 > 77.222.101.171.123: NTPv4, Server, 
length 48
13:27:02.278056 IP 77.222.101.171.123 > x.x.x.x.123: NTPv4, Client, 
length 48
13:27:03.245720 IP 77.222.101.171.123 > x.x.x.x.123: NTPv4, Client, 
length 48
13:27:04.246223 IP 77.222.101.171.123 > x.x.x.x.123: NTPv4, Client, 
length 48
13:27:06.840038 IP 77.222.101.171.123 > x.x.x.x.123: NTPv4, Client, 
length 48
13:27:06.840064 IP x.x.x.x.123 > 77.222.101.171.123: NTPv4, Server, 
length 48
13:27:06.869703 IP 77.222.101.171.123 > x.x.x.x.123: NTPv4, Client, 
length 48
13:27:07.840540 IP 77.222.101.171.123 > x.x.x.x.123: NTPv4, Client, 
length 48
13:27:08.841967 IP 77.222.101.171.123 > x.x.x.x.123: NTPv4, Client, 
length 48
13:27:11.440866 IP 77.222.101.171.123 > x.x.x.x.123: NTPv4, 
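The reply-once-per-cycle behavior seen in that dump can be sketched with a minimal per-source rate limiter. This is an illustrative model only, not ntpd's actual implementation (ntpd's 'discard minimum' logic lives in the MRU code and is more involved):

```python
import time

class MinIntervalLimiter:
    """Drop a request if it arrives sooner than min_interval seconds
    after the previously answered one from the same source address."""
    def __init__(self, min_interval=2.0):
        self.min_interval = min_interval
        self.last_seen = {}   # source address -> last accept time

    def allow(self, addr, now=None):
        now = time.monotonic() if now is None else now
        last = self.last_seen.get(addr)
        if last is not None and now - last < self.min_interval:
            return False      # rate limited: no reply
        self.last_seen[addr] = now
        return True

lim = MinIntervalLimiter(min_interval=2.0)
# The "woodpecker" pattern: one request every ~0.6 s from one source.
decisions = [lim.allow("90.188.255.3", now=t * 0.6) for t in range(10)]
print(decisions)  # [True, False, False, False, True, False, False, False, True, False]
```

With a 2-second minimum interval and a request every 0.6 seconds, only roughly every fourth request gets an answer, yet the client keeps sending regardless.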

Re: Fuzz, Numbers

2020-01-06 Thread Hal Murray via devel


> there is not only the DDoS amplifier. I see many dumb queries at 0.3-2 second
> intervals. It looks like the sources are located behind NAT, are not NAT'ed
> correctly, and do not receive my answers. Or they just have a "broken" NTP
> client. Or it's a DDoS reflection attack, which still exists via simple
> queries with spoofed source IPs. One of my clients sometimes gets such a
> flood at 5-10 Gbit/s.

I've seen a few piggy clients where whois indicates the source is likely a 
NAT box.  One was a hotel; the other was an ISP block labeled "DHCP clients".  
They have been piggy, but at least sane.

I've seen a few others that seemed more like DDoS redirections, but with no 
hard evidence.


> It looks like MRU reduces the reply rate to these queries by 20-25%. I
> typically have 4 kpps input and 3-3.2 kpps output on the server.

Is the CPU saturated?  If not, there should be some counter that indicates 
why a packet didn't generate a response.  (It wouldn't surprise me if there 
are missing cases, but if we find any, I'll fix that.)


-- 
These are my opinions.  I hate spam.





Re: Fuzz, Numbers

2020-01-06 Thread Mike Yurlov via devel
there is not only the DDoS amplifier. I see many dumb queries at 0.3-2 
second intervals. It looks like the sources are located behind NAT, are not 
NAT'ed correctly, and do not receive my answers. Or they just have a 
"broken" NTP client. Or it's a DDoS reflection attack, which still exists 
via simple queries with spoofed source IPs. One of my clients sometimes 
gets such a flood at 5-10 Gbit/s.


It looks like MRU reduces the reply rate to these queries by 20-25%. I 
typically have 4 kpps input and 3-3.2 kpps output on the server. The MRU 
list also gives me a list of the worst clients, so I can flag them for 
further action. This benefits the network and the routers, which then have 
fewer "crap" pps to process, rather than the NTP service directly.


I will test the current fixed sources with no-fuzz this week.

--
Mike


Re: Fuzz, Numbers

2020-01-02 Thread Hal Murray via devel


>> That turns off monitoring, aka the MRU list.
> I believe that was a security feature to prevent amplification of ddos-type
> attacks. (for ntp classic) Or doesn't this work this way for ntpsec? 

That was fixed in ntp classic long before ntpsec forked.

The old code let a client send a request, and the server would send back a 
lot of data.  If you sent a forged request, that made a nice DDoS amplifier.

The fix was to add a cookie: the server now needs a cookie along with the 
request.  You can get the cookie from the server; it depends upon the IP 
address.  If you are sending forged requests, it's hard to get the cookie 
for the target system.
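The cookie idea can be sketched as follows. This is only an illustration of the principle, not ntpd's actual wire format; the key handling, names, and cookie length here are hypothetical:

```python
import hashlib
import hmac
import os

# The server derives a cookie from the requester's IP address and a
# secret key, so a spoofer forging a victim's source address cannot
# present a valid cookie and gets no amplified reply.
SERVER_KEY = os.urandom(16)   # hypothetical per-boot server secret

def make_cookie(client_ip: str) -> bytes:
    return hmac.new(SERVER_KEY, client_ip.encode(),
                    hashlib.sha256).digest()[:8]

def check_request(claimed_src_ip: str, cookie: bytes) -> bool:
    return hmac.compare_digest(cookie, make_cookie(claimed_src_ip))

# A real client first fetches its cookie, then presents it:
cookie = make_cookie("203.0.113.7")
print(check_request("203.0.113.7", cookie))       # True
# A spoofer doesn't know the victim's cookie:
print(check_request("203.0.113.7", b"\x00" * 8))  # False
```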

You can also block 

-- 
These are my opinions.  I hate spam.





Re: Fuzz, Numbers

2020-01-02 Thread Udo van den Heuvel via devel
On 03-01-2020 06:06, Hal Murray wrote:
> 
>>> Do you have a "disable monitor" in your ntp.conf?
>> yes 
> 
> That turns off monitoring, aka the MRU list.

I believe that was a security feature to prevent amplification of
ddos-type attacks. (for ntp classic)
Or doesn't this work this way for ntpsec?

Udo


Re: Fuzz, Numbers

2020-01-02 Thread Hal Murray via devel


>> Do you have a "disable monitor" in your ntp.conf?
> yes 

That turns off monitoring, aka the MRU list.

Comment out that line, restart ntpd, and you should get some data.  The 
default parameters are OK for home use.  If your system is in the pool, you 
probably want to give it more memory.  Details are in 
docs/includes/misc-options.adoc

-- 
These are my opinions.  I hate spam.





Re: Fuzz, Numbers

2020-01-02 Thread Udo van den Heuvel via devel
On 02-01-2020 20:26, Hal Murray wrote:
> That's not the normal output.
> 
> Do you have a "disable monitor" in your ntp.conf?

yes

> What does "ntpq -c monstats" show?

# ntpq -c monstats
enabled:0
addresses:  0
peak addresses: 0
maximum addresses:  11915
reclaim above count:600
reclaim maxage: 3600
reclaim minage: 64
kilobytes:  0
maximum kilobytes:  1024
alloc: exists:  0
alloc: new: 0
alloc: recycle old: 0
alloc: recycle full:0
alloc: none:0
age of oldest slot: 0
# grep monitor /etc/ntp.conf
disable monitor
#


Udo



Re: Fuzz, Numbers

2020-01-02 Thread Hal Murray via devel
> I see stuff like this:
[no data]

That's not the normal output.

Do you have a "disable monitor" in your ntp.conf?

What does "ntpq -c monstats" show?


-- 
These are my opinions.  I hate spam.





Re: Fuzz, Numbers

2020-01-02 Thread Hal Murray via devel


hmur...@megapathdsl.net said:
> The MRU hash table was limited to 16 bits.  I have no idea why.  It's
> probably  leftover from when even big machines didn't have much memory.  I'm
> about to go  fix that.  Should be simple, but I've said that before 

I just pushed a cleanup of that area.  CPU usage should be close to flat as 
the table grows.

In hindsight, things were screwed up worse than I expected.  I'd noticed a 
limit of 16 bits for the hash table, but the setup routine was getting 
called before the config file was read, so it allocated the hash table 
using the default parameters.  That turned out to be 8K slots, which ends 
up with hundreds of entries chained off each slot.
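The cost of an undersized table is easy to quantify: the average chain length is entries divided by slots, and every MRU lookup walks a chain. Using the "addresses in use" figure from Mike's monstats in this thread, with a hypothetical larger table size for comparison:

```python
# Average hash-chain length = entries / slots.
entries = 290_909            # "addresses in use" on a busy pool server
slots_8k = 8 * 1024          # table allocated before the config was read
print(round(entries / slots_8k))         # ~36 entries per slot

# A table sized after reading the config (256K slots is an assumed
# example, not ntpd's actual sizing) keeps chains near one entry:
slots_256k = 256 * 1024
print(round(entries / slots_256k, 1))    # ~1.1
```

With millions of MRU entries instead of ~291k, the 8K-slot chains would indeed run to hundreds of entries.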

---

Is anybody other than me and Mike interested in large MRU lists?


-- 
These are my opinions.  I hate spam.





Re: Fuzz, Numbers

2019-12-31 Thread Hal Murray via devel


> The synthetic load with only one client is far from the real production
> load of thousands of requests per second from around the world. As the owner
> of a production server in the NTP Pool, I am very interested in performance.

I think the no-MRU (or early/small MRU) case should be a lot better now.  Can 
you try git head?

> Warming up, filling, and overflowing the monlist play a big role in CPU
> load. For me, a monlist size that fills in about 5 minutes is optimal.

It shouldn't play a big role.  Or at least I can't see any reason it should.  
It should be a few cache faults.

The MRU hash table was limited to 16 bits.  I have no idea why; it's 
probably leftover from when even big machines didn't have much memory.  I'm 
about to go fix that.  Should be simple, but I've said that before...

In case you haven't found it yet, if you turn on usestats, ntpd will log 
memory and CPU usage.  (I should add packets.)


> I suggest using the following realistic test mode: a big source address
> subnet (i.e. /8 or 'all internet'); 3k-10k-20k queries per second; 5-30
> minute duration.

Something like that would be nice.  I don't know enough about filtering 
packets and/or what the filter does to CPU usage.  Thanks for the pointer.


-- 
These are my opinions.  I hate spam.





Re: Fuzz, Numbers

2019-12-30 Thread Mike Yurlov via devel
The synthetic load with only one client is far from the real production 
load of thousands of requests per second from around the world. As the 
owner of a production server in the NTP Pool, I am very interested in 
performance.


I suggest using the following realistic test mode: a big source address 
subnet (i.e. /8 or 'all internet'); 3k-10k-20k queries per second; 5-30 
minute duration. Warming up, filling, and overflowing the monlist play a 
big role in CPU load. For me, a monlist size that fills in about 5 minutes 
is optimal.
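That sizing rule can be turned into numbers. Both inputs below are assumptions for illustration: the rate of previously unseen source addresses is hypothetical, and the per-entry memory cost is inferred from the monstats posted elsewhere in this thread (roughly 25 MB for ~291k addresses, i.e. on the order of 90 bytes per entry):

```python
# "Fills in about 5 minutes" -> monlist size, given the arrival rate
# of new (previously unseen) source addresses.
new_addrs_per_sec = 1000     # assumed, varies per server
fill_seconds = 5 * 60
entries_needed = new_addrs_per_sec * fill_seconds
print(entries_needed)                     # 300000 MRU entries

bytes_per_entry = 90                      # inferred, approximate
print(round(entries_needed * bytes_per_entry / 1024))  # ~26000 KB
```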


I have not tried it myself, but Google says ntpperf 
(https://github.com/mlichvar/ntpperf) can generate such a stream of 
requests "from a subnet".


--
Mike



29.12.2019 6:52, Hal Murray via devel writes:

[quoted copy of the 2019-12-28 message trimmed; the full text appears later 
in this thread]




Fuzz, Numbers

2019-12-28 Thread Hal Murray via devel


I found the missing line of code that was breaking no-FUZZ.  I found several 
other quirks while browsing the code.

I'm starting to work on performance numbers.

With no-FUZZ:

A Pi 3 can support 18-19K packets/sec.  22-23K with monitoring turned off.

An Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz is good for 160K packets/second.  
225K with monitoring turned off.

The with-monitoring (default) numbers are slightly optimistic since there is 
only one client.  Lots of clients would increase the working set.

Things were slightly faster a few days ago.  I don't know why.  I've seen 
similar quirks before.

On my to-do list is numbers with NTS.



A major contributor to the sloth was ntp_random().  Once upon a time, ntp 
had its own pseudo-random number generator.  Back in 2015, that was changed 
to use the one in libsodium.  In January 2017, it was changed to use 
OpenSSL, and the 31-bit limit was lost.

Anybody object if I get rid of libntp/ntp_random.c?  POSIX has random().  We 
can use it where we don't need crypto-quality randomness.

Similarly, is there any reason not to use clock_gettime() for timing?  
get_systime() has fuzzing, which includes a call to the randomness source.



The fuzzing code doesn't cleanly handle SO_TIMESTAMP (no nanoseconds).  
Suppose you want to fuzz by 1/2 microsecond: your raw data has already been 
truncated to a whole microsecond.  Fuzzing might be appropriate if it were 
limited to the current time, so we don't get packets leaving before they 
arrive.
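The truncation problem is easy to demonstrate; the timestamp value below is hypothetical:

```python
# SO_TIMESTAMP delivers microsecond resolution: the kernel has already
# truncated the arrival time, so +-0.5 us of fuzz is noise below the
# data's real resolution, and it can push the stamp past "now".
true_arrival_ns = 1_234_567_891_234               # hypothetical, in ns
truncated_ns = (true_arrival_ns // 1000) * 1000   # what SO_TIMESTAMP gives
print(true_arrival_ns - truncated_ns)             # 234 ns already discarded

fuzzed_ns = truncated_ns + 500                    # fuzz upward by 0.5 us
print(fuzzed_ns > true_arrival_ns)                # True: stamp is now in the future
```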

FreeBSD has its own quirky ways to get more accurate timestamps.  I don't 
know why they didn't implement SO_TIMESTAMPNS; we should investigate.

-- 
These are my opinions.  I hate spam.


