Hi Michael,
On 05/01/2026 12:11, Adolf Belka wrote:
Hi Michael,
I have found that the malware list includes duckduckgo.com
I have checked through the various sources used for the malware list.
The ShadowWhisperer (Tracking) list has improving.duckduckgo.com in its
list. I suspect this is the one causing the problem.
The mtxadmin (_malware_typo) list has duckduckgo.com mentioned 3 times
but not directly as a domain name; it looks more like a reference.
Regards,
Adolf.
Regards,
Adolf.
On 02/01/2026 14:02, Adolf Belka wrote:
Hi,
On 02/01/2026 12:09, Michael Tremer wrote:
Hello,
On 30 Dec 2025, at 14:05, Adolf Belka <[email protected]> wrote:
Hi Michael,
On 29/12/2025 13:05, Michael Tremer wrote:
Hello everyone,
I hope everyone had a great Christmas and a couple of quiet days
to relax from all the stress that was the year 2025.
Still relaxing.
Very good, so let’s have a strong start into 2026 now!
Starting next week, yes.
Having had a couple of quieter days, I have been working on a new,
(hopefully) little side project that has probably been high up on
our radar since the Shalla list shut down in 2020, or maybe
even earlier. The goal of the project is to provide good,
categorised lists of domain names which are usually used to block
access to these domains.
I simply call this IPFire DNSBL which is short for IPFire DNS
Blocklists.
How did we get here?
As stated before, the URL filter feature in IPFire has the problem
that there are not many good blocklists available any more. There
used to be a couple more - most famously the Shalla list - but we
are now down to a single list from the University of Toulouse. It
is a great list, but it is not always the best fit for all users.
Then there has been talk about whether we could implement more
blocking features into IPFire that don’t involve the proxy. Most
famously blocking over DNS. The problem here remains that the
blocking feature is only as good as the data that is fed into it.
Some people have been putting forward a number of lists that were
suitable for them, but they would not have replaced the blocking
functionality as we know it. Their aim is to provide “one list for
everything” but that is not what people usually want. It is
targeted at the classic home user, and the only separation being
made is for adult/porn/NSFW content, which is usually put
into a separate list.
It would have been technically possible to include these lists and
let the users decide, but that is not the aim of IPFire. We want
to do the job for the user so that their job gets easier.
Including obscure lists that don’t have a clear outline of what
they actually want to block (“bad content” is not a category) and
passing the burden of figuring out whether they need the “Light”,
“Normal”, “Pro”, “Pro++”, “Ultimate” or even a “Venti” list with
cream on top is really not going to work. It is all confusing and
will lead to a bad user experience.
An even bigger problem, however, and one that is completely
impossible to solve, is the bad licensing of these lists. A user has asked the
publisher of the HaGeZi list whether they could be included in
IPFire and under what terms. The response was that the list is
available under the terms of the GNU General Public License v3,
but that does not seem to be true. The list contains data from
various sources. Many of them are licensed under incompatible
licenses (CC BY-SA 4.0, MPL, Apache2, …) and unless there is a
non-public agreement that this data may be redistributed, there is
a huge legal issue here. We would expose our users to potential
copyright infringement which we cannot do under any circumstances.
Furthermore many lists are available under a non-commercial
license which excludes them from being used in any kind of
business. Plenty of IPFire systems are running in businesses, if
not even the vast majority.
In short, these lists are completely unusable for us. Apart from
HaGeZi, I consider OISD to have the same problem.
Enough about all the things that are bad. Let’s talk about the
new, good things:
Many blacklists on the internet are an amalgamation of other
lists. These lists vary in quality, with some of them being not
that good and lacking a clear focus, and others being excellent
data. Since we don’t have the manpower to start from scratch, I
felt that we can copy the concept that HaGeZi and OISD have
started and simply create a new list that is based on other lists
at the beginning to have a good starting point. That way, we have
much better control over what is going onto these lists and we can
shape and mould them as we need them. Most importantly, we don’t
create a single list, but many lists that have a clear focus and
allow users to choose what they want to block and what not.
So the current experimental stage that I am in has these lists:
* Ads
* Dating
* DoH
* Gambling
* Malware
* Porn
* Social
* Violence
The categories have been determined by what source lists we have
available with good data that are compatible with our chosen
license, CC BY-SA 4.0. This is the same license that we are using
for the IPFire Location database, too.
The main use-cases for any kind of blocking are to comply with
legal requirements in networks with children (e.g. schools) to
remove any kind of pornographic content, and sometimes to block
social media as well. Gambling and violence are commonly blocked, too.
Even more common would be filtering advertising and any malicious
content.
The latter is especially difficult because so many source lists
throw phishing, spyware, malvertising, tracking and other things
into the same bucket. Here this is currently all in the malware
list which has therefore become quite large. I am not sure whether
this will stay like this in the future or if we will have to make
some adjustments, but that is exactly why this is now entering
some larger testing.
What has been built so far? In order to put these lists together
properly and track where all the data is coming from, I have
built a tool in Python, available here:
https://git.ipfire.org/?p=dnsbl.git;a=summary
This tool will automatically update all lists once an hour if
there have been any changes and export them in various formats.
The exported lists are available for download here:
https://dnsbl.ipfire.org/lists/
The download using dnsbl.ipfire.org/lists/squidguard.tar.gz as the
custom URL works fine.
However, you need to remember not to put https:// at the front of
the URL, otherwise the WUI page completes without any error
messages but leaves an error message in the system logs saying:
URL filter blacklist - ERROR: Not a valid URL filter blacklist
I found this out the hard way.
Oh yes, I forgot that there is a field on the web UI. If that does
not accept https:// as a prefix, please file a bug and we will fix it.
I will confirm it and raise a bug.
The other thing I noticed is that if you already have the Toulouse
University list downloaded and you then change to the IPFire custom
URL, all the existing Toulouse blocklists stay in the directory on
IPFire. You then end up with a huge number of category tick boxes,
most of which are the old Toulouse ones, which are still available
to select, and it is not clear which ones are from Toulouse and
which ones are from IPFire.
Yes, I got the same thing, too. I think this is a bug, too, because
otherwise you would have a lot of unused categories lying around
that will never be updated. You cannot even tell which ones are from
the current list and which ones from the old list.
Long-term we could even consider removing the Univ. Toulouse list
entirely and only have our own lists available, which would make the
problem go away.
I think if the blocklist URL source is changed or a custom URL is
provided, the first step should be to remove the old ones that
already exist.
That might be a problem because users can also create their own
blocklists and I believe those go into the same directory.
Good thought. We of course cannot delete the custom lists.
Without clearing out the old blocklists you end up with a huge
number of checkboxes for lists, and it is not clear what happens if
there is a category with the same name in both the Toulouse list
and the IPFire list, such as gambling. I will have a look at that
and see what happens.
Not sure what the best approach to this is.
I believe it is removing all old content.
After manually deleting all contents of the urlfilter/blacklists/
directory and then selecting the IPFire blocklist URL as the
custom URL, I end up with only the 8 categories from the IPFire list.
I have tested some gambling sites from the IPFire list and the
block worked on some. On others, the site no longer exists so there
is nothing to block, or it has been changed to an https site, and in
that case it went straight through. Also, if I chose the http
version of the link, it was automatically changed to https and went
through without being blocked.
The entire IPFire infrastructure always requires HTTPS. If you start
using HTTP, you will be automatically redirected. It is 2026 and we
don’t need to talk HTTP any more :)
Some of the domains in the gambling list (maybe quite a lot) seem to
only be accessible over http. If I tried https, it came back saying
it couldn't find it.
I am glad to hear that the list is actually blocking. It would have
been bad if it didn’t. Now we have the big task to check out the
“quality” - however that can be determined. I think this is what
needs some time…
In the meantime I have set up a small page on our website:
https://www.ipfire.org/dnsbl
I would like to run this as a first-class project inside IPFire like
we are doing with IPFire Location. That means that we need to tell
people about what we are doing. Hopefully this page is a little start.
Initially it has a couple of high-level bullet points about what we
are trying to achieve. I don’t think the text is very good, yet, but
it is the best I had in that moment. There is then also a list of
the lists that we currently offer. For each list, a detailed page
will tell you about the license, how many domains are listed, when
the last update was, and the sources; there is even a history
page that shows all the changes whenever they have happened.
Finally there is a section that explains “How To Use?” the list,
which I would love to extend to include AdGuard Plus and things like
that, as well as Pi-Hole and whatever else could use the list. In a
later step we should go ahead and talk to these projects about
including our list(s) in their dropdowns so that people can enable
them nice and easy.
Behind the web page there is an API service that is running on the
host that is running the DNSBL. The frontend web app that is running
www.ipfire.org is connecting to that API service to fetch the
current lists, any details and so on. That way, we can split the
logic and avoid creating a huge monolith of a web app. This also
means that the page could be down a little as I am still working on
the entire thing and will frequently restart it.
The API documentation is available here and the API is publicly
available: https://api.dnsbl.ipfire.org/docs
The website/API allows filing reports for anything that does not
seem to be right on any of the lists. I would like to keep it as an
open process; however, long-term, this cannot cost us any time. In
the current stage, the reports are getting filed and that is about
it. I still need to build out some way for admins or moderators (I
am not sure what kind of roles I want to have here) to accept or
reject those reports.
In case we received a domain from a source list, I would rather
like to submit a report upstream for them to de-list it. That way,
we don’t have any admin work to do and we are contributing back to
the other lists. That would be a very good thing to do. We cannot,
however, throw tons of emails at some random upstream projects
without co-ordinating this first. By not reporting upstream, we will
probably over time create large whitelists, and I am not sure if
that is a good thing to do.
Finally, there is a search box that can be used to find out if a
domain is listed on any of the lists.
If you download and open any of the files, you will see a large
header that includes copyright information and lists all sources
that have been used to create the individual lists. This way we
ensure maximum transparency, comply with the terms of the
individual licenses of the source lists and give credit to the
people who help us to put together the most perfect list for our
users.
I would like this to become a project that is not only being used
in IPFire. We can and will be compatible with other solutions like
AdGuard, PiHole so that people can use our lists if they would
like to even though they are not using IPFire. Hopefully, these
users will also feed back to us so that we can improve our lists
over time and make them one of the best options out there.
All lists are available as a simple text file that lists the
domains. Then there is a hosts file available as well as a DNS
zone file and an RPZ file. Each list is individually available to
be used in squidGuard and there is a larger tarball available with
all lists that can be used in IPFire’s URL Filter. I am planning
to add Suricata/Snort signatures whenever I have time to do so.
Even though it is not a good idea to filter pornographic content
this way, I suppose that catching malware and blocking DoH are
good use-cases for an IPS. Time will tell…
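To make the difference between the formats a little more tangible,
here is roughly how a single (made-up) entry could look in each of
them; the exact output of the exporter may of course differ
slightly:

    # plain domain list
    ads.example.com

    # hosts file
    0.0.0.0 ads.example.com

    # RPZ zone data (rewriting the answer to NXDOMAIN)
    ads.example.com    CNAME .
    *.ads.example.com  CNAME .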
As a start, we will make these lists available in IPFire’s URL
Filter and collect some feedback about how we are doing.
Afterwards, we can see where else we can take this project.
If you want to enable this on your system, simply add the URL to
your autoupdate.urls file like here:
https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=commitdiff;h=bf675bb937faa7617474b3cc84435af3b1f7f45f
I also tested out adding the IPFire url to autoupdate.urls and that
also worked fine for me.
Very good. Should we include this already with Core Update 200? I
don’t think we would break anything, but we might already gain a
couple more people who are helping us to test this all?
I think that would be a good idea.
The next step would be to build and test our DNS infrastructure. In
the “How To Use?” section on the pages of the individual lists, you
can already see some instructions on how to use the lists as an RPZ.
In comparison to other “providers”, I would prefer it if people
used DNS to fetch the lists. This simply lets us push out updates
cheaply and do it very regularly.
Initially, clients will pull the entire list using AXFR. There is no
way around this as they need to have the data in the first place.
After that, clients will only need the changes. As you can see in
the history, the lists don’t actually change that often, sometimes
only once a day, and therefore downloading the entire list again
would be a huge waste of data, both on the client side and for
us hosting them.
Some other providers update their lists “every 10 minutes” even
though there won’t be any changes whatsoever. We don’t do that. We will
only export the lists again when they have actually changed. The
timestamps on the files that we offer using HTTPS can be checked by
clients so that they won’t re-download the list again if it has not
been changed. But using HTTPS still means that we would have to
re-download the entire list and not only the changes.
Using DNS and IXFR will update the lists by only transferring a few
kilobytes and therefore we can have clients check once an hour if a
list has actually changed and only send out the raw changes. That
way, we will be able to serve millions of clients at very cheap cost
and they will always have a very up to date list.
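If anyone wants to poke at the transfer side from a script, a
minimal sketch with dnspython could look like the following; note
that the zone and server names below are only placeholders, the real
ones are listed on the individual list pages:

    # Pull one of the lists over AXFR with dnspython (sketch only).
    # ZONE and SERVER are placeholder names - check the list pages.
    import dns.query
    import dns.resolver
    import dns.zone

    ZONE = "malware.dnsbl.ipfire.org"
    SERVER = "ns.dnsbl.ipfire.org"

    # Resolve the server and transfer the whole zone once (AXFR).
    # A real client would keep the zone and only fetch deltas via
    # IXFR afterwards.
    address = dns.resolver.resolve(SERVER, "A")[0].to_text()
    zone = dns.zone.from_xfr(dns.query.xfr(address, ZONE))
    print(f"{ZONE}: {len(zone.nodes)} names transferred")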
As far as I can see, any DNS software that supports RPZs supports
AXFR/IXFR, with the exception of Knot Resolver, which expects the
zone to be downloaded externally. There is a ticket for AXFR/IXFR support
(https://gitlab.nic.cz/knot/knot-resolver/-/issues/195).
Initially, some of the lists have been *huge* which is why a simple
HTTP download is not feasible. The porn list was over 100 MiB. We
could have spent thousands on just traffic alone which I don’t have
for this kind of project. It would also be unnecessary money being
spent. There are simply better solutions out there. But then I built
something that basically tests the data that we are receiving from
upstream by simply checking if a listed domain still exists. The
result was very astonishing to me.
So whenever someone adds a domain to the list, we will (eventually,
but not immediately) check if we can resolve the domain’s SOA
record. If not, we mark the domain as non-active and will no longer
include it in the exported data. This brought down the porn list
from just under 5 million domains to just 421k. On the sources page
(https://www.ipfire.org/dnsbl/lists/porn/sources) I am listing the
percentage of dead domains from each of them and the UT1 list has
94% dead domains. Wow.
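Just to illustrate what “checking the SOA record” means here, the
idea boils down to something like this rough dnspython sketch; this
is only an illustration of the concept, not how the dnsbl tool
actually implements it:

    # A domain counts as inactive when its SOA lookup fails hard.
    import dns.exception
    import dns.resolver

    def is_active(domain):
        try:
            dns.resolver.resolve(domain, "SOA")
            return True
        except dns.resolver.NoAnswer:
            # The name exists, it just is not a zone apex itself
            return True
        except (dns.resolver.NXDOMAIN, dns.resolver.NoNameservers):
            return False
        except dns.exception.Timeout:
            # Unreachable right now - better to re-check later than
            # to drop it
            return True

    print(is_active("example.com"))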
If we cannot resolve the domain, neither can our users. So we would
otherwise fill the lists with tons of domains that simply could
never be reached. And if they cannot be reached, why would we block
them? We would waste bandwidth and a lot of memory on every single
client.
The other sources have similarly high ratios of dead domains. Most
of them are in the 50-80% range. Therefore I am happy that we are
doing some extra work here to give our users much better data for
their filtering.
Removing all dead entries sounds like an excellent step.
Regards,
Adolf.
So, if you like, please go and check out the RPZ blocking with
Unbound. Instructions are on the page. I would be happy to hear how
this is turning out.
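For reference, the relevant part of unbound.conf boils down to
something like the snippet below; the zone name, primary server and
zonefile path are placeholders only, please take the real values
from the instructions on the list page:

    server:
        # RPZ needs the respip module in front of the usual modules
        module-config: "respip validator iterator"

    rpz:
        name: "malware.dnsbl.ipfire.org"
        primary: ns.dnsbl.ipfire.org
        zonefile: "/var/lib/unbound/malware.dnsbl.ipfire.org.zone"
        rpz-log: yes
        rpz-log-name: "ipfire-dnsbl"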
Please let me know if there are any more questions, and I would be
glad to answer them.
Happy New Year,
-Michael
Regards,
Adolf.
This email is just a brain dump from me to this list. I would be
happy to answer any questions about implementation details, etc.
if people are interested. Right now, this email is long enough
already…
All the best,
-Michael
--
Sent from my laptop
--
Sent from my laptop