Re: Let's launch our own blocklists...

Matthias Fischer Sat, 24 Jan 2026 15:41:41 -0800

On 23.01.2026 17:39, Michael Tremer wrote:
> Hello Matthias,

Hi Michael,


> Thank you very much for testing IPFire DBL.

No problem - I have news:

After taking a closer look at the IPS system logs, unfortunately I found
some parsing errors:

'suricata' complains about missing ";".

***SNIP***
...
00:32:40        suricata:       [13343] <Info> -- Including configuration file
/var/ipfire/suricata/suricata-used-rulesfiles.yaml.
00:32:40        suricata:       [13343] <Error> -- no terminating ";" found
00:32:40        suricata:       [13343] <Error> -- error parsing signature "drop
dns any any -> any any (msg:"IPFire DBL [Advertising] Blocked DNS
Query"; dns.query; domain; dataset:isset,ads,type string,load
datasets/ads.txt; classtype:policy-violation; priority:3; sid:983041;
rev:1; reference:url,https://www.ipfire.org/dbl/ads; metadata:dbl
ads.dbl.ipfire.org)" from file /var/lib/suricata/ipfire_dnsbl-ads.rules
at line 72
00:32:40        suricata:       [13343] <Error> -- no terminating ";" found
...
***SNAP***

I tried, but didn't find the right place for any missing ";".

Can "anyone" confirm?
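
For anyone who wants to check their own rule files, here is a minimal
Python sketch. It is not part of IPFire or Suricata, and the function
name is made up for illustration. It assumes that Suricata expects every
option inside the parentheses, including the last one before the closing
")", to end with ";". Run against the rule from the log above, it points
at the trailing metadata option as a likely candidate.

```python
def unterminated_option(rule):
    """Return the last rule option if it lacks a closing ';', else None."""
    start = rule.find("(")
    end = rule.rfind(")")
    if start == -1 or end < start:
        raise ValueError("rule has no option block in parentheses")
    body = rule[start + 1:end].strip()
    if not body or body.endswith(";"):
        return None
    # Everything after the last ';' is the option that was left open
    return body.rsplit(";", 1)[-1].strip()


# The rule from the log above, joined into a single line
RULE = (
    'drop dns any any -> any any (msg:"IPFire DBL [Advertising] Blocked DNS '
    'Query"; dns.query; domain; dataset:isset,ads,type string,load '
    'datasets/ads.txt; classtype:policy-violation; priority:3; sid:983041; '
    'rev:1; reference:url,https://www.ipfire.org/dbl/ads; metadata:dbl '
    'ads.dbl.ipfire.org)'
)

print(unterminated_option(RULE))
```

The sketch only looks at the end of the option block. It does not
validate the individual keywords.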

Best
Matthias

>> On 23 Jan 2026, at 15:02, Matthias Fischer <[email protected]> 
>> wrote:
>> 
>> On 22.01.2026 12:33, Michael Tremer wrote:
>>> Hello everyone,
>> 
>> Hi,
>> 
>> short feedback from me:
>> 
>> - I activated both the suricata (IPFire DBL - Domain Blocklist) - and
>> the URLfilter lists from 'dbl.ipfire.org'.
> 
> This is an interesting case. What I didn’t manage to test yet is what happens 
> when Suricata blocks the connection first. If URL Filter sees a domain that 
> is being blocked it will either send you an error page if you are using HTTP, 
> or simply close the connection if it is HTTPS. However, when Suricata comes 
> first in the chain (and it will), it might close the connection before URL 
> Filter has received the request. In the case of HTTPS this does not make any 
> difference because the connection will be closed, but in the HTTP case you 
> won’t see an error page any more and instead have the connection closed, too. 
> You are basically losing the explicit error notification which is a little 
> bit annoying.
> 
> We could see the same effect when we are doing this with Unbound and DNS 
> filtering. Potentially we would need to whitelist the local DNS resolver 
> then, but how is Suricata supposed to know that the same categories are 
> activated in both places?
> 
>> - I even took the 'smart-tv' domains from the IPFire DBL blacklist and
>> copied/pasted them in my fritzbox filter lists.
> 
> LOL Why not use IPFire to filter this as well?
> 
>> Everything works as expected. Besides, the download of the IPFire
>> DBL-list loads a lot faster than the list from 'Univ. Toulouse'... ;-)
> 
> Yes, we don’t have much traffic on the server, yet.
> 
>> Functionality is good - no false positives or seen problems. Good work -
>> thanks!
> 
> Nice. We need to distinguish a little between what is a technical issue and 
> what is a false-positive/missing domain on the list. However, testing both at 
> the same time is something we will all cope quite well with :)
> 
> -Michael
> 
>> Best
>> Matthias
>> 
>>> Over the past few weeks I have made significant progress on this all, and I 
>>> think we're getting close to something the community will be really happy 
>>> with. I'd love to get feedback from the team before we finalise things.
>>> 
>>> So what has happened?
>>> 
>>> First of all, the entire project has been renamed. DNSBL is not entirely 
>>> what this is. Although the lists can be thrown into DNS, they have so much 
>>> more use outside of it that I thought we should simply go with DBL, short 
>>> for Domain Blocklist. After all, we are only importing domains. The new 
>>> home of the project therefore is https://www.ipfire.org/dbl
>>> 
>>> I have added a couple more lists that I thought interesting and I have 
>>> added a couple more sources that I considered a good start. Hopefully, we 
>>> will soon gather some more feedback on how well this is all holding up. My 
>>> main focus has however been on the technology that will power this project.
>>> 
>>> One of the bigger challenges was to create Suricata rules from the lists. 
>>> Initially I tried to create a ton of rules but since our lists are so 
>>> large, this quickly became too complicated. I have now settled on using a 
>>> feature that is only available in more recent versions of Suricata (I 
>>> believe 7 and later), but since we are already on Suricata 8 in IPFire this 
>>> won’t be a problem for us. All domains for each list are basically compiled 
>>> into one massively large dataset and one single rule is referring to that 
>>> dataset. This way, we won’t have the option to remove any false-positives, 
>>> but at least Suricata and the GUI won’t die a really bad death when 
>>> loading millions of rules.
>>> 
>>> Suricata will now be able to use our rules to block access to any listed 
>>> domains of each of the categories over DNS, HTTP, TLS or QUIC. Although I 
>>> don’t expect many users to use Suricata to block porn or other things, this 
>>> is a great backstop to enforce any policy like that. For example, if there 
>>> is a user on the network who is trying to circumvent the DNS server that 
>>> might filter out certain domains, even after getting an IP address resolved 
>>> through other means, they won’t be able to open a TLS/QUIC connection or 
>>> send an HTTP request to any blocked domain. Some people have said they were 
>>> interested in blocking DNS-over-HTTPS and this is a perfect way to do this 
>>> and actually be sure that any server that is being blocked on the list will 
>>> actually be completely inaccessible.
>>> 
>>> Those Suricata rules are already available for testing in Core Update 200: 
>>> https://git.ipfire.org/?p=ipfire-2.x.git;a=commitdiff;h=9eb8751487d23dd354a105c28bdbbb0398fe6e85
>>> 
>>> I have chosen various severities for the lists. If someone was to block 
>>> advertising using DBL, this is fine, but not a very severe alert. If 
>>> someone chooses to block malware and there is a system on the network 
>>> trying to access those domains, this is an alert worth being investigated 
>>> by an admin. Our new Suricata Reporter will show those violations in 
>>> different colours based on the severity which helps to identify the right 
>>> alerts to further investigate.
>>> 
>>> Previously I asked you to test the lists using URL Filter. Those rules 
>>> are now available as well in Core Update 200: 
>>> https://git.ipfire.org/?p=ipfire-2.x.git;a=commitdiff;h=db160694279a4b10378447f775dd536fdfcfb02a
>>> 
>>> I talked about a method to remove any dead domains from any sources which 
>>> is a great way to keep our lists smaller. The sheer size of them is a 
>>> problem in so many ways. That check was however a little bit too ambitious 
>>> and I had to make it a little bit less eager. Basically if we are in doubt, 
>>> we need to still list the domain because it might be resolvable by a user.
>>> 
>>>  
>>> https://git.ipfire.org/?p=dbl.git;a=commitdiff;h=bb5b6e33b731501d45dea293505f7d42a61d5ce7
>>> 
>>> So how else could we make the lists smaller without losing any actual data? 
>>> Since we sometimes list a whole TLD (e.g. .xxx or .porn), there is very 
>>> little point in listing any domains of this TLD. They will always be caught 
>>> anyways. So I built a check that marks all domains that don’t need to be 
>>> included on the exported lists because they will never be needed and was 
>>> able to shrink the size of the lists by a lot again.
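
As a rough illustration of the check described above, here is a minimal
Python sketch. It is not the actual DBL code. The function name and the
sample domains are made up, and it assumes that a listed parent domain
(or TLD) covers everything below it.

```python
def split_subsumed(domains):
    """Split domains into (kept, subsumed).

    A domain counts as subsumed when one of its parent domains,
    up to and including its TLD, is itself on the list.
    """
    listed = {d.lower().strip(".") for d in domains}
    kept, subsumed = [], []
    for domain in sorted(listed):
        labels = domain.split(".")
        # All proper parents: "a.b.xxx" -> "b.xxx", "xxx"
        parents = (".".join(labels[i:]) for i in range(1, len(labels)))
        if any(parent in listed for parent in parents):
            subsumed.append(domain)
        else:
            kept.append(domain)
    return kept, subsumed


kept, subsumed = split_subsumed(
    ["xxx", "example.xxx", "www.example.xxx", "example.org"]
)
print(kept)      # ['example.org', 'xxx']
print(subsumed)  # ['example.xxx', 'www.example.xxx']
```

Only the "kept" domains would need to be exported. The "subsumed" ones
are already caught by their listed parent.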
>>> 
>>> The website does not show this data, but the API returns the number of 
>>> “subsumed” domains (I didn’t have a better name):
>>> 
>>>  curl https://api.dbl.ipfire.org/lists | jq .
>>> 
>>> The number shown would normally be added to the total number of domains and 
>>> usually cuts the size of the list by 50-200%.
>>> 
>>> Those stats will now also be stored in a history table so that we will be 
>>> able to track growth of all lists.
>>> 
>>> Furthermore, the application will now send email notifications for any 
>>> incoming reports. This way, we will be able to stay in close touch with the 
>>> reporters and keep them up to date on their submissions as well as inform 
>>> moderators that there is something to have a look at.
>>> 
>>> The search has been refactored as well, so that we can show clearly whether 
>>> something is blocked or not at one glance: 
>>> https://www.ipfire.org/dbl/search?q=github.com. There is detailed 
>>> information available on all domains and what happened to them. In case of 
>>> GitHub.com, this seems to be blocked and unblocked by someone all of the 
>>> time and we can see a clear audit trail of that: 
>>> https://www.ipfire.org/dbl/lists/malware/domains/github.com
>>> 
>>> On the DNS front, I have added some metadata to the zones so that people 
>>> can programmatically request some data, like when it has been last updated 
>>> (in a human-friendly timestamp and not only the serial), license, 
>>> description and so on:
>>> 
>>>  # dig +short ANY _info.ads.dbl.ipfire.org @primary.dbl.ipfire.org
>>>  "total-domains=42226"
>>>  "license=CC BY-SA 4.0"
>>>  "updated-at=2026-01-20T22:17:02.409933+00:00"
>>>  "description=Blocks domains used for ads, tracking, and ad delivery"
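
To read this metadata programmatically, something like the following
Python sketch could work. It only parses text in the "key=value" form
shown in the dig output above. The function name is made up, and fetching
the records (for example by calling dig) is left out on purpose.

```python
from datetime import datetime


def parse_info(dig_output):
    """Turn quoted 'key=value' TXT strings from dig +short into a dict."""
    info = {}
    for line in dig_output.splitlines():
        line = line.strip().strip('"')
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        info[key] = value
    return info


# Sample taken from the dig output quoted above
SAMPLE = '''"total-domains=42226"
"license=CC BY-SA 4.0"
"updated-at=2026-01-20T22:17:02.409933+00:00"
"description=Blocks domains used for ads, tracking, and ad delivery"
'''

info = parse_info(SAMPLE)
total_domains = int(info["total-domains"])
updated_at = datetime.fromisoformat(info["updated-at"])
print(total_domains, info["license"], updated_at.isoformat())
```

Splitting on the first "=" only keeps values intact that contain "="
themselves.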
>>> 
>>> Now, I would like to hear more feedback from you. I know we've all been 
>>> stretched thin lately, so I especially appreciate anyone who has time to 
>>> review and provide input. Share ideas, or just say whether you like it or 
>>> not. Where could this go in the future?
>>> 
>>> Looking ahead, I would like us to start thinking about the RPZ feature that 
>>> has been on the wishlist. IPFire DBL has been a bigger piece of work, and I 
>>> think it's worth having a conversation about sustainability. Resources for 
>>> this need to be allocated and paid for. Open source is about freedom, not 
>>> free beer — and to keep building features like this, we will need to 
>>> explore some funding options. I would be interested to hear any ideas you 
>>> might have that could work for IPFire.
>>> 
>>> Please share your thoughts on the mailing list when you can — even a quick 
>>> 'looks good' or 'I have concerns about X' is valuable. Public discussion 
>>> helps everyone stay in the loop and contribute.
>>> 
>>> I am aiming to move forward with this in a week's time, so if you have 
>>> input, now would be a good time to share it.
>>> 
>>> Best,
>>> -Michael
>>> 
>>>> On 6 Jan 2026, at 10:20, Michael Tremer <[email protected]> wrote:
>>>> 
>>>> Good Morning Adolf,
>>>> 
>>>> I had a look at this problem yesterday and it seems that parsing the 
>>>> format is becoming a little bit difficult this way. Since this is only 
>>>> affecting very few domains, I have simply whitelisted them all manually 
>>>> and duckduckgo.com and others should now be easily reachable again.
>>>> 
>>>> Please let me know if you have any more findings.
>>>> 
>>>> All the best,
>>>> -Michael
>>>> 
>>>>> On 5 Jan 2026, at 11:48, Michael Tremer <[email protected]> wrote:
>>>>> 
>>>>> Hello Adolf,
>>>>> 
>>>>> This is a good find.
>>>>> 
>>>>> But if duckduckgo.com is blocked, we will have to have a source 
>>>>> somewhere that blocks that domain. Not only a sub-domain of it. 
>>>>> Otherwise we have a bug somewhere.
>>>>> 
>>>>> This is most likely as the domain is listed here, but with some stuff 
>>>>> afterwards:
>>>>> 
>>>>> https://raw.githubusercontent.com/mtxadmin/ublock/refs/heads/master/hosts/_malware_typo
>>>>> 
>>>>> We strip everything after a # away because we consider it a comment. 
>>>>> However, that leaves a line with only the domain on it, which causes 
>>>>> the domain to be listed.
>>>>> 
>>>>> The # sign is used as some special character but at the same time it is 
>>>>> being used for comments.
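
The effect described above can be reproduced with a small Python sketch.
It is not the DBL parser. The function names and sample lines are
hypothetical, and treating "#" as a comment marker only at the start of a
line or after whitespace is just one possible rule, assumed here for
illustration.

```python
def naive_strip(line):
    """Drop everything after the first '#', wherever it appears."""
    return line.split("#", 1)[0].strip()


def careful_strip(line):
    """Treat '#' as a comment marker only at line start or after whitespace."""
    for i, char in enumerate(line):
        if char == "#" and (i == 0 or line[i - 1].isspace()):
            return line[:i].strip()
    return line.strip()


# Hypothetical sample lines, not taken from the real source list
samples = [
    "bad.example  # plain comment",
    "good.example##.banner",
]

for line in samples:
    print(repr(naive_strip(line)), repr(careful_strip(line)))
```

With the naive version, the second line is reduced to a bare domain and
would end up on the list. The careful version leaves it intact, so a
later validation step could reject it as "not a plain domain".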
>>>>> 
>>>>> I will fix this and then refresh the list.
>>>>> 
>>>>> -Michael
>>>>> 
>>>>>> On 5 Jan 2026, at 11:31, Adolf Belka <[email protected]> wrote:
>>>>>> 
>>>>>> Hi Michael,
>>>>>> 
>>>>>> 
>>>>>> On 05/01/2026 12:11, Adolf Belka wrote:
>>>>>>> Hi Michael,
>>>>>>> 
>>>>>>> I have found that the malware list includes duckduckgo.com
>>>>>>> 
>>>>>> I have checked through the various sources used for the malware list.
>>>>>> 
>>>>>> The ShadowWhisperer (Tracking) list has improving.duckduckgo.com in its 
>>>>>> list. I suspect that this one is the one causing the problem.
>>>>>> 
>>>>>> The mtxadmin (_malware_typo) list has duckduckgo.com mentioned 3 times 
>>>>>> but not directly as a domain name - looks more like a reference.
>>>>>> 
>>>>>> Regards,
>>>>>> 
>>>>>> Adolf.
>>>>>> 
>>>>>> 
>>>>>>> Regards,
>>>>>>> Adolf.
>>>>>>> 
>>>>>>> 
>>>>>>> On 02/01/2026 14:02, Adolf Belka wrote:
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> On 02/01/2026 12:09, Michael Tremer wrote:
>>>>>>>>> Hello,
>>>>>>>>> 
>>>>>>>>>> On 30 Dec 2025, at 14:05, Adolf Belka <[email protected]> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi Michael,
>>>>>>>>>> 
>>>>>>>>>> On 29/12/2025 13:05, Michael Tremer wrote:
>>>>>>>>>>> Hello everyone,
>>>>>>>>>>> 
>>>>>>>>>>> I hope everyone had a great Christmas and a couple of quiet days to 
>>>>>>>>>>> relax from all the stress that was the year 2025.
>>>>>>>>>> Still relaxing.
>>>>>>>>> 
>>>>>>>>> Very good, so let’s have a strong start into 2026 now!
>>>>>>>> 
>>>>>>>> Starting next week, yes.
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>> Having a couple of quieter days, I have been working on a new, 
>>>>>>>>>>> little (hopefully) side project that has probably been high up on 
>>>>>>>>>>> our radar since the Shalla list has shut down in 2020, or maybe 
>>>>>>>>>>> even earlier. The goal of the project is to provide good lists with 
>>>>>>>>>>> categories of domain names which are usually used to block access 
>>>>>>>>>>> to these domains.
>>>>>>>>>>> 
>>>>>>>>>>> I simply call this IPFire DNSBL which is short for IPFire DNS 
>>>>>>>>>>> Blocklists.
>>>>>>>>>>> 
>>>>>>>>>>> How did we get here?
>>>>>>>>>>> 
>>>>>>>>>>> As stated before, the URL filter feature in IPFire has the problem 
>>>>>>>>>>> that there are not many good blocklists available any more. There 
>>>>>>>>>>> used to be a couple more - most famously the Shalla list - but we 
>>>>>>>>>>> are now down to a single list from the University of Toulouse. It 
>>>>>>>>>>> is a great list, but it is not always the best fit for all users.
>>>>>>>>>>> 
>>>>>>>>>>> Then there has been talk about whether we could implement more 
>>>>>>>>>>> blocking features into IPFire that don’t involve the proxy. Most 
>>>>>>>>>>> famously blocking over DNS. The problem here remains that the blocking 
>>>>>>>>>>> feature is only as good as the data that is fed into it. Some 
>>>>>>>>>>> people have been putting forward a number of lists that were 
>>>>>>>>>>> suitable for them, but they would not have replaced the blocking 
>>>>>>>>>>> functionality as we know it. Their aim is to provide “one list for 
>>>>>>>>>>> everything” but that is not what people usually want. It is 
>>>>>>>>>>> targeted at a classic home user and the only separation that is 
>>>>>>>>>>> being made is any adult/porn/NSFW content which usually is put into 
>>>>>>>>>>> a separate list.
>>>>>>>>>>> 
>>>>>>>>>>> It would have been technically possible to include these lists and 
>>>>>>>>>>> let the users decide, but that is not the aim of IPFire. We want to 
>>>>>>>>>>> do the job for the user so that their job is getting easier. 
>>>>>>>>>>> Including obscure lists that don’t have a clear outline of what 
>>>>>>>>>>> they actually want to block (“bad content” is not a category) and 
>>>>>>>>>>> passing the burden of figuring out whether they need the “Light”, 
>>>>>>>>>>> “Normal”, “Pro”, “Pro++”, “Ultimate” or even a “Venti” list with 
>>>>>>>>>>> cream on top is really not going to work. It is all confusing and 
>>>>>>>>>>> will lead to a bad user experience.
>>>>>>>>>>> 
>>>>>>>>>>> An even bigger problem that is however completely impossible to 
>>>>>>>>>>> solve is bad licensing of these lists. A user has asked the 
>>>>>>>>>>> publisher of the HaGeZi list whether they could be included in 
>>>>>>>>>>> IPFire and under what terms. The response was that the list is 
>>>>>>>>>>> available under the terms of the GNU General Public License v3, but 
>>>>>>>>>>> that does not seem to be true. The list contains data from various 
>>>>>>>>>>> sources. Many of them are licensed under incompatible licenses (CC 
>>>>>>>>>>> BY-SA 4.0, MPL, Apache2, …) and unless there is a non-public 
>>>>>>>>>>> agreement that this data may be redistributed, there is a huge 
>>>>>>>>>>> legal issue here. We would expose our users to potential copyright 
>>>>>>>>>>> infringement which we cannot do under any circumstances. 
>>>>>>>>>>> Furthermore many lists are available under a non-commercial license 
>>>>>>>>>>> which excludes them from being used in any kind of business. Plenty 
>>>>>>>>>>> of IPFire systems are running in businesses, if not even the vast 
>>>>>>>>>>> majority.
>>>>>>>>>>> 
>>>>>>>>>>> In short, these lists are completely unusable for us. Apart from 
>>>>>>>>>>> HaGeZi, I consider OISD to have the same problem.
>>>>>>>>>>> 
>>>>>>>>>>> Enough about all the things that are bad. Let’s talk about the new, 
>>>>>>>>>>> good things:
>>>>>>>>>>> 
>>>>>>>>>>> Many blacklists on the internet are an amalgamation of other lists. 
>>>>>>>>>>> These lists vary in quality with some of them being not that good 
>>>>>>>>>>> and without a clear focus and others being excellent data. Since we 
>>>>>>>>>>> don’t have the manpower to start from scratch, I felt that we can 
>>>>>>>>>>> copy the concept that HaGeZi and OISD have started and simply 
>>>>>>>>>>> create a new list that is based on other lists at the beginning to 
>>>>>>>>>>> have a good starting point. That way, we have much better control 
>>>>>>>>>>> over what is going on these lists and we can shape and mould them 
>>>>>>>>>>> as we need them. Most importantly, we don’t create a single list, 
>>>>>>>>>>> but many lists that have a clear focus and allow users to choose 
>>>>>>>>>>> what they want to block and what not.
>>>>>>>>>>> 
>>>>>>>>>>> So the current experimental stage that I am in has these lists:
>>>>>>>>>>> 
>>>>>>>>>>> * Ads
>>>>>>>>>>> * Dating
>>>>>>>>>>> * DoH
>>>>>>>>>>> * Gambling
>>>>>>>>>>> * Malware
>>>>>>>>>>> * Porn
>>>>>>>>>>> * Social
>>>>>>>>>>> * Violence
>>>>>>>>>>> 
>>>>>>>>>>> The categories have been determined by what source lists we have 
>>>>>>>>>>> available with good data and are compatible with our chosen license 
>>>>>>>>>>> CC BY-SA 4.0. This is the same license that we are using for the 
>>>>>>>>>>> IPFire Location database, too.
>>>>>>>>>>> 
>>>>>>>>>>> The main use-cases for any kind of blocking are to comply with 
>>>>>>>>>>> legal requirements in networks with children (i.e. schools) to 
>>>>>>>>>>> remove any kind of pornographic content, sometimes block social 
>>>>>>>>>>> media as well. Gambling and violence are commonly blocked, too. 
>>>>>>>>>>> Even more common would be filtering advertising and any malicious 
>>>>>>>>>>> content.
>>>>>>>>>>> 
>>>>>>>>>>> The latter is especially difficult because so many source lists 
>>>>>>>>>>> throw phishing, spyware, malvertising, tracking and other things 
>>>>>>>>>>> into the same bucket. Here this is currently all in the malware 
>>>>>>>>>>> list which has therefore become quite large. I am not sure whether 
>>>>>>>>>>> this will stay like this in the future or if we will have to make 
>>>>>>>>>>> some adjustments, but that is exactly why this is now entering some 
>>>>>>>>>>> larger testing.
>>>>>>>>>>> 
>>>>>>>>>>> What has been built so far? In order to put these lists together 
>>>>>>>>>>> properly and track where the data is coming from, I have 
>>>>>>>>>>> built a tool in Python available here:
>>>>>>>>>>> 
>>>>>>>>>>> https://git.ipfire.org/?p=dnsbl.git;a=summary
>>>>>>>>>>> 
>>>>>>>>>>> This tool will automatically update all lists once an hour if there 
>>>>>>>>>>> have been any changes and export them in various formats. The 
>>>>>>>>>>> exported lists are available for download here:
>>>>>>>>>>> 
>>>>>>>>>>> https://dnsbl.ipfire.org/lists/
>>>>>>>>>> The download using dnsbl.ipfire.org/lists/squidguard.tar.gz as the 
>>>>>>>>>> custom url works fine.
>>>>>>>>>> 
>>>>>>>>>> However you need to remember not to put the https:// at the front of 
>>>>>>>>>> the url otherwise the WUI page completes without any error messages 
>>>>>>>>>> but leaves an error message in the system logs saying
>>>>>>>>>> 
>>>>>>>>>> URL filter blacklist - ERROR: Not a valid URL filter blacklist
>>>>>>>>>> 
>>>>>>>>>> I found this out the hard way.
>>>>>>>>> 
>>>>>>>>> Oh yes, I forgot that there is a field on the web UI. If that does 
>>>>>>>>> not accept https:// as a prefix, please file a bug and we will fix it.
>>>>>>>> 
>>>>>>>> I will confirm it and raise a bug.
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> The other thing I noticed is that if you already have the Toulouse 
>>>>>>>>>> University list downloaded and you then change to the ipfire custom 
>>>>>>>>>> url then all the existing Toulouse blocklists stay in the directory 
>>>>>>>>>> on IPFire and so you end up with a huge number of category tick 
>>>>>>>>>> boxes, most of which are the old Toulouse ones, which are still 
>>>>>>>>>> available to select and it is not clear which ones are from Toulouse 
>>>>>>>>>> and which ones from IPFire.
>>>>>>>>> 
>>>>>>>>> Yes, I got the same thing, too. I think this is a bug, too, because 
>>>>>>>>> otherwise you would have a lot of unused categories lying around that 
>>>>>>>>> will never be updated. You cannot even tell which ones are from the 
>>>>>>>>> current list and which ones from the old list.
>>>>>>>>> 
>>>>>>>>> Long-term we could even consider removing the Univ. Toulouse list 
>>>>>>>>> entirely and only have our own lists available which would make the 
>>>>>>>>> problem go away.
>>>>>>>>> 
>>>>>>>>>> I think if the blocklist URL source is changed or a custom url is 
>>>>>>>>>> provided the first step should be to remove the old ones already 
>>>>>>>>>> existing.
>>>>>>>>>> That might be a problem because users can also create their own 
>>>>>>>>>> blocklists and I believe those go into the same directory.
>>>>>>>>> 
>>>>>>>>> Good thought. We of course cannot delete the custom lists.
>>>>>>>>> 
>>>>>>>>>> Without clearing out the old blocklists you end up with a huge 
>>>>>>>>>> number of checkboxes for lists but it is not clear what happens if 
>>>>>>>>>> there is a category that has the same name for the Toulouse list and 
>>>>>>>>>> the IPFire list such as gambling. I will have a look at that and see 
>>>>>>>>>> what happens.
>>>>>>>>>> 
>>>>>>>>>> Not sure what the best approach to this is.
>>>>>>>>> 
>>>>>>>>> I believe it is removing all old content.
>>>>>>>>> 
>>>>>>>>>> Manually deleting all contents of the urlfilter/blacklists/ 
>>>>>>>>>> directory and then selecting the IPFire blocklist url for the custom 
>>>>>>>>>> url I end up with only the 8 categories from the IPFire list.
>>>>>>>>>> 
>>>>>>>>>> I have tested some gambling sites from the IPFire list and the block 
>>>>>>>>>> worked on some. On others the site no longer exists so there is 
>>>>>>>>>> nothing to block or has been changed to an https site and in that 
>>>>>>>>>> case it went straight through. Also if I chose the http version of 
>>>>>>>>>> the link, it was automatically changed to https and went through 
>>>>>>>>>> without being blocked.
>>>>>>>>> 
>>>>>>>>> The entire IPFire infrastructure always requires HTTPS. If you start 
>>>>>>>>> using HTTP, you will be automatically redirected. It is 2026 and we 
>>>>>>>>> don’t need to talk HTTP any more :)
>>>>>>>> 
>>>>>>>> Some of the domains in the gambling list (maybe quite a lot) seem to 
>>>>>>>> only have an http access. If I tried https it came back with the fact 
>>>>>>>> that it couldn't find it.
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I am glad to hear that the list is actually blocking. It would have 
>>>>>>>>> been bad if it didn’t. Now we have the big task to check out the 
>>>>>>>>> “quality” - however that can be determined. I think this is what 
>>>>>>>>> needs some time…
>>>>>>>>> 
>>>>>>>>> In the meantime I have set up a small page on our website:
>>>>>>>>> 
>>>>>>>>> https://www.ipfire.org/dnsbl
>>>>>>>>> 
>>>>>>>>> I would like to run this as a first-class project inside IPFire like 
>>>>>>>>> we are doing with IPFire Location. That means that we need to tell 
>>>>>>>>> people about what we are doing. Hopefully this page is a little start.
>>>>>>>>> 
>>>>>>>>> Initially it has a couple of high-level bullet points about what we 
>>>>>>>>> are trying to achieve. I don’t think the text is very good, yet, but 
>>>>>>>>> it is the best I had in that moment. There is then also a list of the 
>>>>>>>>> lists that we currently offer. For each list, a detailed page will 
>>>>>>>>> tell you about the license, how many domains are listed, when the 
>>>>>>>>> last update has been, the sources, and there is even a history page 
>>>>>>>>> that shows all the changes whenever they have happened.
>>>>>>>>> 
>>>>>>>>> Finally there is a section that explains “How To Use?” the list which 
>>>>>>>>> I would love to extend to include AdGuard Plus and things like that 
>>>>>>>>> as well as Pi-Hole and whatever else could use the list. In a later 
>>>>>>>>> step we should go ahead and talk to any projects to include our 
>>>>>>>>> list(s) into their dropdown so that people can enable them nice and 
>>>>>>>>> easy.
>>>>>>>>> 
>>>>>>>>> Behind the web page there is an API service that is running on the 
>>>>>>>>> host that is running the DNSBL. The frontend web app that is running 
>>>>>>>>> www.ipfire.org is connecting to that API 
>>>>>>>>> service to fetch the current lists, any details and so on. That way, 
>>>>>>>>> we can split the logic and avoid creating a huge monolith of a web 
>>>>>>>>> app. This also means that page could be down a little as I am still 
>>>>>>>>> working on the entire thing and will frequently restart it.
>>>>>>>>> 
>>>>>>>>> The API documentation is available here and the API is publicly 
>>>>>>>>> available: https://api.dnsbl.ipfire.org/docs
>>>>>>>>> 
>>>>>>>>> The website/API allows filing reports for anything that does not 
>>>>>>>>> seem to be right on any of the lists. I would like to keep it as an 
>>>>>>>>> open process, however, long-term, this cannot cost us any time. In 
>>>>>>>>> the current stage, the reports are getting filed and that is about 
>>>>>>>>> it. I still need to build out some way for admins or moderators (I am 
>>>>>>>>> not sure what kind of roles I want to have here) to accept or reject 
>>>>>>>>> those reports.
>>>>>>>>> 
>>>>>>>>> In case of us receiving a domain from a source list, I would rather 
>>>>>>>>> like to submit a report to upstream for them to de-list. That way, we 
>>>>>>>>> don’t have any admin to do and we are contributing back to other 
>>>>>>>>> lists. That would be a very good thing to do. We cannot however throw 
>>>>>>>>> tons of emails at some random upstream projects without co-ordinating 
>>>>>>>>> this first. By not reporting upstream, we will probably over time 
>>>>>>>>> create large whitelists and I am not sure if that is a good thing to 
>>>>>>>>> do.
>>>>>>>>> 
>>>>>>>>> Finally, there is a search box that can be used to find out if a 
>>>>>>>>> domain is listed on any of the lists.
>>>>>>>>> 
>>>>>>>>>>> If you download and open any of the files, you will see a large 
>>>>>>>>>>> header that includes copyright information and lists all sources 
>>>>>>>>>>> that have been used to create the individual lists. This way we 
>>>>>>>>>>> ensure maximum transparency, comply with the terms of the 
>>>>>>>>>>> individual licenses of the source lists and give credit to the 
>>>>>>>>>>> people who help us to put together the most perfect list for our 
>>>>>>>>>>> users.
>>>>>>>>>>> 
>>>>>>>>>>> I would like this to become a project that is not only being used 
>>>>>>>>>>> in IPFire. We can and will be compatible with other solutions like 
>>>>>>>>>>> AdGuard, PiHole so that people can use our lists if they would like 
>>>>>>>>>>> to even though they are not using IPFire. Hopefully, these users 
>>>>>>>>>>> will also feed back to us so that we can improve our lists over 
>>>>>>>>>>> time and make them one of the best options out there.
>>>>>>>>>>> 
>>>>>>>>>>> All lists are available as a simple text file that lists the 
>>>>>>>>>>> domains. Then there is a hosts file available as well as a DNS zone 
>>>>>>>>>>> file and an RPZ file. Each list is individually available to be 
>>>>>>>>>>> used in squidGuard and there is a larger tarball available with all 
>>>>>>>>>>> lists that can be used in IPFire’s URL Filter. I am planning to add 
>>>>>>>>>>> Suricata/Snort signatures whenever I have time to do so. Even 
>>>>>>>>>>> though it is not a good idea to filter pornographic content this 
>>>>>>>>>>> way, I suppose that catching malware and blocking DoH are good 
>>>>>>>>>>> use-cases for an IPS. Time will tell…
>>>>>>>>>>> 
>>>>>>>>>>> As a start, we will make these lists available in IPFire’s URL 
>>>>>>>>>>> Filter and collect some feedback about how we are doing. 
>>>>>>>>>>> Afterwards, we can see where else we can take this project.
>>>>>>>>>>> 
>>>>>>>>>>> If you want to enable this on your system, simply add the URL to 
>>>>>>>>>>> your autoupdate.urls file like here:
>>>>>>>>>>> 
>>>>>>>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=commitdiff;h=bf675bb937faa7617474b3cc84435af3b1f7f45f
>>>>>>>>>> I also tested out adding the IPFire url to autoupdate.urls and that 
>>>>>>>>>> also worked fine for me.
>>>>>>>>> 
>>>>>>>>> Very good. Should we include this already with Core Update 200? I 
>>>>>>>>> don’t think we would break anything, but we might already gain a 
>>>>>>>>> couple more people who are helping us to test this all?
>>>>>>>> 
>>>>>>>> I think that would be a good idea.
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> The next step would be to build and test our DNS infrastructure. In 
>>>>>>>>> the “How To Use?” Section on the pages of the individual lists, you 
>>>>>>>>> can already see some instructions on how to use the lists as an RPZ. 
>>>>>>>>> In comparison to other “providers”, I would prefer if people would be 
>>>>>>>>> using DNS to fetch the lists. This is simply to push out updates in a 
>>>>>>>>> cheap way for us and also do it very regularly.
>>>>>>>>> 
>>>>>>>>> Initially, clients will pull the entire list using AXFR. There is no 
>>>>>>>>> way around this as they need to have the data in the first place. 
>>>>>>>>> After that, clients will only need the changes. As you can see in the 
>>>>>>>>> history, the lists don’t actually change that often. Sometimes only 
>>>>>>>>> once a day and therefore downloading the entire list again would be a 
>>>>>>>>> huge waste of data, both on the client side, but also for us hosting 
>>>>>>>>> them.
>>>>>>>>> 
>>>>>>>>> Some other providers update their lists “every 10 minutes”, and there 
>>>>>>>>> won't be any changes whatsoever. We don’t do that. We will only 
>>>>>>>>> export the lists again when they have actually changed. The 
>>>>>>>>> timestamps on the files that we offer using HTTPS can be checked by 
>>>>>>>>> clients so that they won’t re-download the list again if it has not 
>>>>>>>>> been changed. But using HTTPS still means that we would have to 
>>>>>>>>> re-download the entire list and not only the changes.
>>>>>>>>> 
>>>>>>>>> Using DNS and IXFR will update the lists by only transferring a few 
>>>>>>>>> kilobytes and therefore we can have clients check once an hour if a 
>>>>>>>>> list has actually changed and only send out the raw changes. That 
>>>>>>>>> way, we will be able to serve millions of clients at very cheap cost 
>>>>>>>>> and they will always have a very up to date list.
>>>>>>>>> 
>>>>>>>>> As far as I can see any DNS software that supports RPZs supports 
>>>>>>>>> AXFR/IXFR with exception of Knot Resolver which expects the zone to 
>>>>>>>>> be downloaded externally. There is a ticket for AXFR/IXFR support 
>>>>>>>>> (https://gitlab.nic.cz/knot/knot-resolver/-/issues/195).
>>>>>>>>> 
>>>>>>>>> Initially, some of the lists have been *huge* which is why a simple 
>>>>>>>>> HTTP download is not feasible. The porn list was over 100 MiB. We 
>>>>>>>>> could have spent thousands on just traffic alone which I don’t have 
>>>>>>>>> for this kind of project. It would also be unnecessary money being 
>>>>>>>>> spent. There are simply better solutions out there. But then I built 
>>>>>>>>> something that basically tests the data that we are receiving from 
>>>>>>>>> upstream by simply checking if a listed domain still exists. The 
>>>>>>>>> result was very astonishing to me.
>>>>>>>>> 
>>>>>>>>> So whenever someone adds a domain to the list, we will (eventually, 
>>>>>>>>> but not immediately) check if we can resolve the domain’s SOA record. 
>>>>>>>>> If not, we mark the domain as non-active and will no longer include 
>>>>>>>>> them in the exported data. This brought down the porn list from just 
>>>>>>>>> under 5 million domains to just 421k. On the sources page 
>>>>>>>>> (https://www.ipfire.org/dnsbl/lists/porn/sources) I am listing the 
>>>>>>>>> percentage of dead domains from each of them and the UT1 list has 94% 
>>>>>>>>> dead domains. Wow.
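
The filtering step described above could look roughly like this in
Python. This is a sketch under assumptions, not the DBL implementation.
The function names are made up, and the SOA lookup is passed in as a
callable so that the example runs offline with a stand-in instead of a
real DNS query.

```python
def split_active(domains, has_soa):
    """Split domains into (active, dead) using the given SOA check."""
    active, dead = [], []
    for domain in domains:
        if has_soa(domain):
            active.append(domain)
        else:
            dead.append(domain)
    return active, dead


def dead_ratio(active, dead):
    """Share of dead domains in percent."""
    total = len(active) + len(dead)
    return 100.0 * len(dead) / total if total else 0.0


# Stand-in for a real DNS lookup; only these domains "resolve"
RESOLVABLE = {"alive.example"}


def fake_has_soa(domain):
    return domain in RESOLVABLE


active, dead = split_active(
    ["alive.example", "gone1.example", "gone2.example", "gone3.example"],
    fake_has_soa,
)
print(active, dead, dead_ratio(active, dead))
```

A real check would also have to decide what to do on timeouts or server
failures. As noted elsewhere in this thread, a domain should stay listed
when in doubt.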
>>>>>>>>> 
>>>>>>>>> If we cannot resolve the domain, neither can our users. So we would 
>>>>>>>>> otherwise fill the lists with tons of domains that simply could never 
>>>>>>>>> be reached. And if they cannot be reached, why would we block them? 
>>>>>>>>> We would waste bandwidth and a lot of memory on each single client.
>>>>>>>>> 
>>>>>>>>> The other sources have similarly high ratios of dead domains. Most 
>>>>>>>>> of them are in the 50-80% range. Therefore I am happy that we are 
>>>>>>>>> doing some extra work here to give our users much better data for 
>>>>>>>>> their filtering.
>>>>>>>> 
>>>>>>>> Removing all dead entries sounds like an excellent step.
>>>>>>>> 
>>>>>>>> Regards,
>>>>>>>> 
>>>>>>>> Adolf.
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> So, if you like, please go and check out the RPZ blocking with 
>>>>>>>>> Unbound. Instructions are on the page. I would be happy to hear how 
>>>>>>>>> this is turning out.
>>>>>>>>> 
>>>>>>>>> Please let me know if there are any more questions, and I would be 
>>>>>>>>> glad to answer them.
>>>>>>>>> 
>>>>>>>>> Happy New Year,
>>>>>>>>> -Michael
>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Regards,
>>>>>>>>>> Adolf.
>>>>>>>>>>> This email is just a brain dump from me to this list. I would be 
>>>>>>>>>>> happy to answer any questions about implementation details, etc. if 
>>>>>>>>>>> people are interested. Right now, this email is long enough already…
>>>>>>>>>>> 
>>>>>>>>>>> All the best,
>>>>>>>>>>> -Michael
>>>>>>>>>> 
>>>>>>>>>> -- 
>>>>>>>>>> Sent from my laptop
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> -- 
>>>>>> Sent from my laptop
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
>
