Re: pf-badhost-0.3 released

Jordan Geoghegan Wed, 11 Mar 2020 17:21:25 -0700



On 2020-03-11 12:41, Anders Andersson wrote:

On Tue, Mar 10, 2020 at 10:53 PM Jordan Geoghegan <jor...@geoghegan.ca> wrote:

pf-badhost and unbound-adblock are both now at version 0.3, released
earlier today.

Links to the scripts can be found here:

www.geoghegan.ca/pfbadhost.html
www.geoghegan.ca/unbound-adblock.html

Thanks, this looks very interesting! But maybe you can help answer
a question that popped up when I read your page about pf-badhost.

You mention that "Subnet aggregation is used to take the address list
and "aggregate" the addresses into the smallest possible
representation using CIDR blocks.", but I was under the assumption
that pf already did this for its tables to speed up lookups.

Is there anything preventing the aggregation code from running on every
pf table modification? Assuming an already sorted list, it shouldn't take
long to merge a new entry. Perhaps I've missed some use of pf tables
that makes this impossible or not applicable in the general case.


Hi Anders,

I am by no means an expert on the nuts and bolts of pf, but I do know that pf stores table data in a radix tree / radix table. By their nature, radix trees ignore exact duplicates, but I'm not exactly sure how they handle the partial overlapping of ranges. This article gives an easy-to-follow cursory overview of radix trees if you're interested:

https://blog.sqreen.com/demystifying-radix-trees/

As far as I understand, pf makes no modifications to the contents of your tables, all it does is parse the list to confirm the addresses and/or CIDR blocks are valid. When it's looking for matches within ranges, it will look for the most specific match available. For example, if you have a list containing an overlap:

...
192.168.0.0/16
192.168.0.0/22
...

When a packet from 192.168.1.5 arrives and is processed by a rule referencing this table, it will match 192.168.0.0/22. Even though both entries are valid and match the packet, the /22 is more specific, and thus the closest match.
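
To make the "most specific match" idea concrete, here is a minimal sketch of a toy binary prefix trie in Python. It is not pf's radix code, and the PrefixTrie class and its method names are made up for illustration. It only shows the two behaviours described above: an exact duplicate adds nothing, and a lookup returns the longest prefix that covers the address.

```python
import ipaddress


class PrefixTrie:
    """Toy binary trie over IPv4 prefixes (illustration only, not pf's radix code)."""

    def __init__(self):
        # Each node is a dict with optional "0"/"1" children and an optional "net".
        self.root = {}
        self.count = 0

    def insert(self, cidr):
        """Store a prefix; return False if the exact prefix is already present."""
        # strict=False masks off any host bits instead of raising ValueError.
        net = ipaddress.ip_network(cidr, strict=False)
        bits = format(int(net.network_address), "032b")[:net.prefixlen]
        node = self.root
        for bit in bits:
            node = node.setdefault(bit, {})
        if "net" in node:
            return False
        node["net"] = net
        self.count += 1
        return True

    def lookup(self, addr):
        """Return the most specific stored prefix covering addr, or None."""
        bits = format(int(ipaddress.ip_address(addr)), "032b")
        node = self.root
        best = node.get("net")
        for bit in bits:
            node = node.get(bit)
            if node is None:
                break
            # Every stored prefix met further down the path is more specific.
            best = node.get("net", best)
        return best


table = PrefixTrie()
table.insert("192.168.0.0/16")
table.insert("192.168.0.0/22")
table.insert("192.168.0.0/22")        # exact duplicate, ignored

print(table.count)                    # 2
print(table.lookup("192.168.1.5"))    # 192.168.0.0/22
print(table.lookup("192.168.200.1"))  # 192.168.0.0/16
print(table.lookup("10.0.0.1"))       # None
```

Both overlapping entries stay in the structure as separate nodes; nothing is merged or rewritten, and the lookup simply prefers the deeper one.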

pf may do some magic optimizations under the hood that I'm unaware of, but at the end of the day, it does not modify the actual contents of your table.

The use I've found in the subnet aggregation function has been mostly for the purpose of keeping the list clean and tidy. I have a few installations where I have all the lists enabled, including the use of the GeoIP country blacklisting function. On these installations, subnet aggregation can reduce the /etc/pf-badhost.txt file from ~60,000 lines down to ~40,000 lines. For example, when blocking China's netblocks (which pulls an aggregated list of all addresses assigned to China by APNIC, and thus uses massive CIDR blocks of /10's etc), if any addresses from any of the other blocklists come from China, they will be removed from the list as they are already covered by the CIDR block info from APNIC. I run pf-badhost on a bunch of Edgerouter Lites, and I've found them to run better when the lists are tidy.
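
As a rough illustration of what subnet aggregation does to a list, here is a small Python sketch built on the standard library's ipaddress.collapse_addresses. This is not the aggregator pf-badhost itself uses; the aggregate function name and the sample addresses (documentation ranges) are made up for the example, and it assumes an IPv4-only list with one address or CIDR block per line.

```python
import ipaddress


def aggregate(lines):
    """Collapse addresses and CIDR blocks into the fewest covering blocks.

    Assumes IPv4 only; blank lines and '#' comment lines are skipped.
    """
    nets = []
    for line in lines:
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        # A bare address becomes a /32; strict=False masks off stray host bits.
        nets.append(ipaddress.ip_network(entry, strict=False))
    # collapse_addresses drops entries covered by a larger block and merges
    # adjacent siblings (two /25s become one /24), returning sorted networks.
    return [str(net) for net in ipaddress.collapse_addresses(nets)]


blocklist = [
    "198.51.100.0/24",    # large block, e.g. from a country list
    "198.51.100.17",      # single host already covered by the /24
    "198.51.100.200/32",  # also covered
    "203.0.113.0/25",     # two adjacent halves ...
    "203.0.113.128/25",   # ... that merge into one /24
    "192.0.2.1",          # unrelated host, kept as-is
]

for block in aggregate(blocklist):
    print(block)
# 192.0.2.1/32
# 198.51.100.0/24
# 203.0.113.0/24
```

Six input lines shrink to three, which is the same effect as individual addresses disappearing once a large country block already covers them.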

With regards to pf performing aggregation on all tables automatically, it wouldn't make sense to run the full subnet aggregation calculations for every table load or insertion/removal, as it can be quite CPU intensive. While it takes less than a second to load the table on a $5 Vultr VPS, it takes 20-70 seconds to run the subnet aggregation (depending on which lists are enabled). On my Edgerouter Pro with all the lists enabled, it takes ~6 minutes. On my Edgerouter Lite it takes ~15 minutes to run (over 2 hours when using the built-in Perl-based aggregator). I just run the aggregation function with nice and let it do its thing; it's being called by cron in the wee hours, so I'm fine just letting it chug along.


Regards,

Jordan
