Good morning Adolf,

I had a look at this problem yesterday, and it seems that parsing the format
is becoming rather difficult this way. Since this only affects very few
domains, I have simply whitelisted them all manually, and duckduckgo.com and
others should now be reachable again.

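To illustrate the parsing problem discussed in this thread (everything after a
# is stripped as a comment, which can leave a bare domain behind), here is a
minimal sketch in Python. It is not the actual dnsbl code: the function names,
the sample lines and the "only treat # as a comment at line start or after
whitespace" rule are all assumptions made for illustration, and it only
handles plain one-domain-per-line input.

```python
from typing import Callable, Optional


def strip_comment_naive(line: str) -> str:
    """Cut the line at the first '#', wherever it appears."""
    return line.split("#", 1)[0].strip()


def strip_comment_careful(line: str) -> str:
    """Treat '#' as a comment marker only at line start or after whitespace.

    This rule is an assumption for the sketch, not the project's behaviour.
    """
    for i, ch in enumerate(line):
        if ch == "#" and (i == 0 or line[i - 1].isspace()):
            return line[:i].strip()
    return line.strip()


def parse_domain(line: str, strip: Callable[[str], str]) -> Optional[str]:
    """Return a bare domain, or None if the line is not a plain domain entry."""
    entry = strip(line)
    # Reject empty lines and anything that still carries rule syntax.
    if not entry or any(c in entry for c in "#/ \t"):
        return None
    return entry


if __name__ == "__main__":
    # Hypothetical input lines; 'example.com##.banner' stands for a rule
    # that merely references a domain rather than listing it.
    for sample in ("example.com##.banner", "example.org # a comment", "# note"):
        print(repr(sample),
              parse_domain(sample, strip_comment_naive),
              parse_domain(sample, strip_comment_careful))
```

With the naive variant the rule line is reduced to the bare domain and would be
listed; the careful variant leaves the rule syntax in place, so the line is
rejected instead.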
Please let me know if you have any more findings.

All the best,
-Michael

> On 5 Jan 2026, at 11:48, Michael Tremer <[email protected]> wrote:
>
> Hello Adolf,
>
> This is a good find.
>
> But if duckduckgo.com is blocked, there must be a source somewhere that
> blocks that domain itself, not only a sub-domain of it. Otherwise we have
> a bug somewhere.
>
> This is most likely because the domain is listed here, but with some other
> content after it:
>
> https://raw.githubusercontent.com/mtxadmin/ublock/refs/heads/master/hosts/_malware_typo
>
> We strip everything after a # away because we consider it a comment. However,
> that leaves a line containing only the domain, which causes it to be listed.
>
> The # sign is used as a special character, but at the same time it is also
> used for comments.
>
> I will fix this and then refresh the list.
>
> -Michael
>
>> On 5 Jan 2026, at 11:31, Adolf Belka <[email protected]> wrote:
>>
>> Hi Michael,
>>
>>
>> On 05/01/2026 12:11, Adolf Belka wrote:
>>> Hi Michael,
>>>
>>> I have found that the malware list includes duckduckgo.com
>>>
>> I have checked through the various sources used for the malware list.
>>
>> The ShadowWhisperer (Tracking) list has improving.duckduckgo.com in its
>> list. I suspect that this one is the one causing the problem.
>>
>> The mtxadmin (_malware_typo) list has duckduckgo.com mentioned 3 times but
>> not directly as a domain name - looks more like a reference.
>>
>> Regards,
>>
>> Adolf.
>>
>>
>>> Regards,
>>> Adolf.
>>>
>>>
>>> On 02/01/2026 14:02, Adolf Belka wrote:
>>>> Hi,
>>>>
>>>> On 02/01/2026 12:09, Michael Tremer wrote:
>>>>> Hello,
>>>>>
>>>>>> On 30 Dec 2025, at 14:05, Adolf Belka <[email protected]> wrote:
>>>>>>
>>>>>> Hi Michael,
>>>>>>
>>>>>> On 29/12/2025 13:05, Michael Tremer wrote:
>>>>>>> Hello everyone,
>>>>>>>
>>>>>>> I hope everyone had a great Christmas and a couple of quiet days to
>>>>>>> relax from all the stress that was the year 2025.
>>>>>> Still relaxing.
>>>>>
>>>>> Very good, so let’s have a strong start into 2026 now!
>>>>
>>>> Starting next week, yes.
>>>>
>>>>>
>>>>>>> Having a couple of quieter days, I have been working on a new, little
>>>>>>> (hopefully) side project that has probably been high up on our radar
>>>>>>> since the Shalla list shut down in 2020, or maybe even earlier. The
>>>>>>> goal of the project is to provide good lists with categories of domain
>>>>>>> names which are usually used to block access to these domains.
>>>>>>>
>>>>>>> I simply call this IPFire DNSBL which is short for IPFire DNS
>>>>>>> Blocklists.
>>>>>>>
>>>>>>> How did we get here?
>>>>>>>
>>>>>>> As stated before, the URL filter feature in IPFire has the problem that
>>>>>>> there are not many good blocklists available any more. There used to be
>>>>>>> a couple more - most famously the Shalla list - but we are now down to
>>>>>>> a single list from the University of Toulouse. It is a great list, but
>>>>>>> it is not always the best fit for all users.
>>>>>>>
>>>>>>> Then there has been talk about whether we could implement more blocking
>>>>>>> features into IPFire that don’t involve the proxy. Most famously
>>>>>>> blocking over DNS. The problem here remains that a blocking feature is
>>>>>>> only as good as the data that is fed into it. Some people have been
>>>>>>> putting forward a number of lists that were suitable for them, but they
>>>>>>> would not have replaced the blocking functionality as we know it. Their
>>>>>>> aim is to provide “one list for everything” but that is not what people
>>>>>>> usually want. It is targeted at a classic home user and the only
>>>>>>> separation that is being made is any adult/porn/NSFW content which
>>>>>>> usually is put into a separate list.
>>>>>>>
>>>>>>> It would have been technically possible to include these lists and let
>>>>>>> the users decide, but that is not the aim of IPFire.
>>>>>>> We want to do the
>>>>>>> job for the user so that their job is getting easier. Including obscure
>>>>>>> lists that don’t have a clear outline of what they actually want to
>>>>>>> block (“bad content” is not a category) and passing the burden of
>>>>>>> figuring out whether they need the “Light”, “Normal”, “Pro”, “Pro++”,
>>>>>>> “Ultimate” or even a “Venti” list with cream on top is really not going
>>>>>>> to work. It is all confusing and will lead to a bad user experience.
>>>>>>>
>>>>>>> An even bigger problem that is however completely impossible to solve
>>>>>>> is bad licensing of these lists. A user has asked the publisher of the
>>>>>>> HaGeZi list whether they could be included in IPFire and under what
>>>>>>> terms. The response was that the list is available under the terms of
>>>>>>> the GNU General Public License v3, but that does not seem to be true.
>>>>>>> The list contains data from various sources. Many of them are licensed
>>>>>>> under incompatible licenses (CC BY-SA 4.0, MPL, Apache2, …) and unless
>>>>>>> there is a non-public agreement that this data may be redistributed,
>>>>>>> there is a huge legal issue here. We would expose our users to
>>>>>>> potential copyright infringement which we cannot do under any
>>>>>>> circumstances. Furthermore many lists are available under a
>>>>>>> non-commercial license which excludes them from being used in any kind
>>>>>>> of business. Plenty of IPFire systems are running in businesses, if not
>>>>>>> even the vast majority.
>>>>>>>
>>>>>>> In short, these lists are completely unusable for us. Apart from
>>>>>>> HaGeZi, I consider OISD to have the same problem.
>>>>>>>
>>>>>>> Enough about all the things that are bad. Let’s talk about the new,
>>>>>>> good things:
>>>>>>>
>>>>>>> Many blacklists on the internet are an amalgamation of other lists.
>>>>>>> These lists vary in quality with some of them being not that good and
>>>>>>> without a clear focus and others being excellent data.
>>>>>>> Since we don’t
>>>>>>> have the manpower to start from scratch, I felt that we can copy the
>>>>>>> concept that HaGeZi and OISD have started and simply create a new list
>>>>>>> that is based on other lists at the beginning to have a good starting
>>>>>>> point. That way, we have much better control over what is going onto
>>>>>>> these lists and we can shape and mould them as we need them. Most
>>>>>>> importantly, we don’t create a single list, but many lists that have a
>>>>>>> clear focus and allow users to choose what they want to block and what
>>>>>>> not.
>>>>>>>
>>>>>>> So the current experimental stage that I am in has these lists:
>>>>>>>
>>>>>>> * Ads
>>>>>>> * Dating
>>>>>>> * DoH
>>>>>>> * Gambling
>>>>>>> * Malware
>>>>>>> * Porn
>>>>>>> * Social
>>>>>>> * Violence
>>>>>>>
>>>>>>> The categories have been determined by which source lists we have
>>>>>>> available that contain good data and are compatible with our chosen
>>>>>>> license CC BY-SA 4.0. This is the same license that we are using for
>>>>>>> the IPFire Location database, too.
>>>>>>>
>>>>>>> The main use-cases for any kind of blocking are to comply with legal
>>>>>>> requirements in networks with children (i.e. schools) to remove any
>>>>>>> kind of pornographic content, and sometimes to block social media as
>>>>>>> well. Gambling and violence are commonly blocked, too. Even more common
>>>>>>> would be filtering advertising and any malicious content.
>>>>>>>
>>>>>>> The latter is especially difficult because so many source lists throw
>>>>>>> phishing, spyware, malvertising, tracking and other things into the
>>>>>>> same bucket. Here this is currently all in the malware list which has
>>>>>>> therefore become quite large. I am not sure whether this will stay like
>>>>>>> this in the future or if we will have to make some adjustments, but
>>>>>>> that is exactly why this is now entering some larger testing.
>>>>>>>
>>>>>>> What has been built so far?
>>>>>>> In order to put these lists together
>>>>>>> properly and track where the data is coming from, I have built a
>>>>>>> tool in Python available here:
>>>>>>>
>>>>>>> https://git.ipfire.org/?p=dnsbl.git;a=summary
>>>>>>>
>>>>>>> This tool will automatically update all lists once an hour if there
>>>>>>> have been any changes and export them in various formats. The exported
>>>>>>> lists are available for download here:
>>>>>>>
>>>>>>> https://dnsbl.ipfire.org/lists/
>>>>>> The download using dnsbl.ipfire.org/lists/squidguard.tar.gz as the
>>>>>> custom url works fine.
>>>>>>
>>>>>> However you need to remember not to put the https:// at the front of the
>>>>>> url otherwise the WUI page completes without any error messages but
>>>>>> leaves an error message in the system logs saying
>>>>>>
>>>>>> URL filter blacklist - ERROR: Not a valid URL filter blacklist
>>>>>>
>>>>>> I found this out the hard way.
>>>>>
>>>>> Oh yes, I forgot that there is a field on the web UI. If that does not
>>>>> accept https:// as a prefix, please file a bug and we will fix it.
>>>>
>>>> I will confirm it and raise a bug.
>>>>
>>>>>
>>>>>> The other thing I noticed is that if you already have the Toulouse
>>>>>> University list downloaded and you then change to the ipfire custom url
>>>>>> then all the existing Toulouse blocklists stay in the directory on
>>>>>> IPFire and so you end up with a huge number of category tick boxes, most
>>>>>> of which are the old Toulouse ones, which are still available to select
>>>>>> and it is not clear which ones are from Toulouse and which ones from
>>>>>> IPFire.
>>>>>
>>>>> Yes, I got the same thing, too. I think this is a bug, too, because
>>>>> otherwise you would have a lot of unused categories lying around that
>>>>> will never be updated. You cannot even tell which ones are from the
>>>>> current list and which ones from the old list.
>>>>>
>>>>> Long-term we could even consider removing the Univ.
>>>>> Toulouse list
>>>>> entirely and only have our own lists available which would make the
>>>>> problem go away.
>>>>>
>>>>>> I think if the blocklist URL source is changed or a custom url is
>>>>>> provided the first step should be to remove the old ones already
>>>>>> existing.
>>>>>> That might be a problem because users can also create their own
>>>>>> blocklists and I believe those go into the same directory.
>>>>>
>>>>> Good thought. We of course cannot delete the custom lists.
>>>>>
>>>>>> Without clearing out the old blocklists you end up with a huge number of
>>>>>> checkboxes for lists but it is not clear what happens if there is a
>>>>>> category that has the same name for the Toulouse list and the IPFire
>>>>>> list such as gambling. I will have a look at that and see what happens.
>>>>>>
>>>>>> Not sure what the best approach to this is.
>>>>>
>>>>> I believe it is removing all old content.
>>>>>
>>>>>> Manually deleting all contents of the urlfilter/blacklists/ directory
>>>>>> and then selecting the IPFire blocklist url for the custom url I end up
>>>>>> with only the 8 categories from the IPFire list.
>>>>>>
>>>>>> I have tested some gambling sites from the IPFire list and the block
>>>>>> worked on some. On others the site no longer exists so there is nothing
>>>>>> to block or has been changed to an https site and in that case it went
>>>>>> straight through. Also if I chose the http version of the link, it was
>>>>>> automatically changed to https and went through without being blocked.
>>>>>
>>>>> The entire IPFire infrastructure always requires HTTPS. If you start
>>>>> using HTTP, you will be automatically redirected. It is 2026 and we don’t
>>>>> need to talk HTTP any more :)
>>>>
>>>> Some of the domains in the gambling list (maybe quite a lot) seem to only
>>>> have an http access. If I tried https it came back with the fact that it
>>>> couldn't find it.
>>>>
>>>>>
>>>>> I am glad to hear that the list is actually blocking.
>>>>> It would have been
>>>>> bad if it didn’t. Now we have the big task to check out the “quality” -
>>>>> however that can be determined. I think this is what needs some time…
>>>>>
>>>>> In the meantime I have set up a small page on our website:
>>>>>
>>>>> https://www.ipfire.org/dnsbl
>>>>>
>>>>> I would like to run this as a first-class project inside IPFire like we
>>>>> are doing with IPFire Location. That means that we need to tell people
>>>>> about what we are doing. Hopefully this page is a little start.
>>>>>
>>>>> Initially it has a couple of high-level bullet points about what we are
>>>>> trying to achieve. I don’t think the text is very good, yet, but it is
>>>>> the best I had in that moment. There is then also a list of the lists
>>>>> that we currently offer. For each list, a detailed page will tell you
>>>>> about the license, how many domains are listed, when the last update
>>>>> was, and the sources, and there is even a history page that shows all
>>>>> the changes whenever they have happened.
>>>>>
>>>>> Finally there is a section that explains “How To Use?” the list which I
>>>>> would love to extend to include AdGuard Plus and things like that as well
>>>>> as Pi-Hole and whatever else could use the list. In a later step we
>>>>> should go ahead and talk to any projects to include our list(s) into
>>>>> their dropdown so that people can enable them nice and easy.
>>>>>
>>>>> Behind the web page there is an API service that is running on the host
>>>>> that is running the DNSBL. The frontend web app that is running
>>>>> www.ipfire.org is connecting to that API service
>>>>> to fetch the current lists, any details and so on. That way, we can split
>>>>> the logic and avoid creating a huge monolith of a web app. This also
>>>>> means that the page could be down now and then as I am still working on
>>>>> the entire thing and will frequently restart it.
>>>>>
>>>>> The API documentation is available here and the API is publicly
>>>>> available: https://api.dnsbl.ipfire.org/docs
>>>>>
>>>>> The website/API allows filing reports for anything that does not seem to
>>>>> be right on any of the lists. I would like to keep it as an open process,
>>>>> however, long-term, this cannot cost us any time. In the current stage,
>>>>> the reports are getting filed and that is about it. I still need to build
>>>>> out some way for admins or moderators (I am not sure what kind of roles I
>>>>> want to have here) to accept or reject those reports.
>>>>>
>>>>> In case of us receiving a domain from a source list, I would rather like
>>>>> to submit a report to upstream for them to de-list. That way, we don’t
>>>>> have any admin to do and we are contributing back to the other lists.
>>>>> That would be a very good thing to do. We cannot however throw tons of
>>>>> emails at some random upstream projects without co-ordinating this
>>>>> first. By not reporting upstream, we will probably over time create
>>>>> large whitelists and I am not sure if that is a good thing to do.
>>>>>
>>>>> Finally, there is a search box that can be used to find out if a domain
>>>>> is listed on any of the lists.
>>>>>
>>>>>>> If you download and open any of the files, you will see a large header
>>>>>>> that includes copyright information and lists all sources that have
>>>>>>> been used to create the individual lists. This way we ensure maximum
>>>>>>> transparency, comply with the terms of the individual licenses of the
>>>>>>> source lists and give credit to the people who help us to put together
>>>>>>> the most perfect list for our users.
>>>>>>>
>>>>>>> I would like this to become a project that is not only being used in
>>>>>>> IPFire. We can and will be compatible with other solutions like
>>>>>>> AdGuard, PiHole so that people can use our lists if they would like to
>>>>>>> even though they are not using IPFire.
>>>>>>> Hopefully, these users will also
>>>>>>> feed back to us so that we can improve our lists over time and make
>>>>>>> them one of the best options out there.
>>>>>>>
>>>>>>> All lists are available as a simple text file that lists the domains.
>>>>>>> Then there is a hosts file available as well as a DNS zone file and an
>>>>>>> RPZ file. Each list is individually available to be used in squidGuard
>>>>>>> and there is a larger tarball available with all lists that can be used
>>>>>>> in IPFire’s URL Filter. I am planning to add Suricata/Snort signatures
>>>>>>> whenever I have time to do so. Even though it is not a good idea to
>>>>>>> filter pornographic content this way, I suppose that catching malware
>>>>>>> and blocking DoH are good use-cases for an IPS. Time will tell…
>>>>>>>
>>>>>>> As a start, we will make these lists available in IPFire’s URL Filter
>>>>>>> and collect some feedback about how we are doing. Afterwards, we can
>>>>>>> see where else we can take this project.
>>>>>>>
>>>>>>> If you want to enable this on your system, simply add the URL to your
>>>>>>> autoupdate.urls file like here:
>>>>>>>
>>>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=commitdiff;h=bf675bb937faa7617474b3cc84435af3b1f7f45f
>>>>>> I also tested out adding the IPFire url to autoupdate.urls and that also
>>>>>> worked fine for me.
>>>>>
>>>>> Very good. Should we include this already with Core Update 200? I don’t
>>>>> think we would break anything, but we might already gain a couple more
>>>>> people who are helping us to test this all?
>>>>
>>>> I think that would be a good idea.
>>>>
>>>>>
>>>>> The next step would be to build and test our DNS infrastructure. In the
>>>>> “How To Use?” Section on the pages of the individual lists, you can
>>>>> already see some instructions on how to use the lists as an RPZ. In
>>>>> comparison to other “providers”, I would prefer if people would be using
>>>>> DNS to fetch the lists.
>>>>> This is simply to push out updates in a cheap way
>>>>> for us and also do it very regularly.
>>>>>
>>>>> Initially, clients will pull the entire list using AXFR. There is no way
>>>>> around this as they need to have the data in the first place. After that,
>>>>> clients will only need the changes. As you can see in the history, the
>>>>> lists don’t actually change that often. Sometimes only once a day and
>>>>> therefore downloading the entire list again would be a huge waste of
>>>>> data, both on the client side, but also for us hosting them.
>>>>>
>>>>> Some other providers update their lists “every 10 minutes”, and there
>>>>> won't be any changes whatsoever. We don’t do that. We will only export
>>>>> the lists again when they have actually changed. The timestamps on the
>>>>> files that we offer using HTTPS can be checked by clients so that they
>>>>> won’t re-download the list again if it has not been changed. But using
>>>>> HTTPS still means that we would have to re-download the entire list and
>>>>> not only the changes.
>>>>>
>>>>> Using DNS and IXFR will update the lists by only transferring a few
>>>>> kilobytes and therefore we can have clients check once an hour if a list
>>>>> has actually changed and only send out the raw changes. That way, we will
>>>>> be able to serve millions of clients at very cheap cost and they will
>>>>> always have a very up to date list.
>>>>>
>>>>> As far as I can see any DNS software that supports RPZs supports
>>>>> AXFR/IXFR with the exception of Knot Resolver which expects the zone to
>>>>> be downloaded externally. There is a ticket for AXFR/IXFR support
>>>>> (https://gitlab.nic.cz/knot/knot-resolver/-/issues/195).
>>>>>
>>>>> Initially, some of the lists have been *huge* which is why a simple HTTP
>>>>> download is not feasible. The porn list was over 100 MiB. We could have
>>>>> spent thousands on just traffic alone which I don’t have for this kind of
>>>>> project. It would also be unnecessary money being spent.
>>>>> There are simply
>>>>> better solutions out there. But then I built something that basically
>>>>> tests the data that we are receiving from upstream by simply checking if
>>>>> a listed domain still exists. The result was very astonishing to me.
>>>>>
>>>>> So whenever someone adds a domain to the list, we will (eventually, but
>>>>> not immediately) check if we can resolve the domain’s SOA record. If not,
>>>>> we mark the domain as non-active and will no longer include it in the
>>>>> exported data. This brought down the porn list from just under 5 million
>>>>> domains to just 421k. On the sources page
>>>>> (https://www.ipfire.org/dnsbl/lists/porn/sources) I am listing the
>>>>> percentage of dead domains from each of them and the UT1 list has 94%
>>>>> dead domains. Wow.
>>>>>
>>>>> If we cannot resolve the domain, neither can our users. So we would
>>>>> otherwise fill the lists with tons of domains that simply could never be
>>>>> reached. And if they cannot be reached, why would we block them? We would
>>>>> waste bandwidth and a lot of memory on each single client.
>>>>>
>>>>> The other sources have similarly high ratios of dead domains. Most of
>>>>> them are in the 50-80% range. Therefore I am happy that we are doing some
>>>>> extra work here to give our users much better data for their filtering.
>>>>
>>>> Removing all dead entries sounds like an excellent step.
>>>>
>>>> Regards,
>>>>
>>>> Adolf.
>>>>
>>>>>
>>>>> So, if you like, please go and check out the RPZ blocking with Unbound.
>>>>> Instructions are on the page. I would be happy to hear how this is
>>>>> turning out.
>>>>>
>>>>> Please let me know if there are any more questions, and I would be glad
>>>>> to answer them.
>>>>>
>>>>> Happy New Year,
>>>>> -Michael
>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Adolf.
>>>>>>> This email is just a brain dump from me to this list. I would be happy
>>>>>>> to answer any questions about implementation details, etc. if people
>>>>>>> are interested.
>>>>>>> Right now, this email is long enough already…
>>>>>>>
>>>>>>> All the best,
>>>>>>> -Michael
>>>>>>
>>>>>> --
>>>>>> Sent from my laptop
>>>>>
>>>>
>>>
>>
>> --
>> Sent from my laptop
>>
>
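
The dead-domain check described in the thread (try to resolve each domain's
SOA record and leave out the domains where that fails) could be sketched in
Python roughly as follows. This is only an illustration under assumptions, not
the code from the dnsbl repository: the function names are made up, the lookup
uses the third-party dnspython package, and every DNS error (including
timeouts) is treated as "dead", which a real tool would probably want to
handle more carefully.

```python
from typing import Callable, Iterable, List, Tuple


def soa_resolves(domain: str, timeout: float = 5.0) -> bool:
    """Return True if an SOA query for the domain succeeds.

    Needs the third-party dnspython package; it is imported lazily so that
    the rest of the sketch can be used with any other check.
    """
    import dns.exception
    import dns.resolver

    try:
        dns.resolver.resolve(domain, "SOA", lifetime=timeout)
        return True
    except dns.exception.DNSException:
        # NXDOMAIN, no answer, timeout, ...: all counted as dead here.
        return False


def split_active(
    domains: Iterable[str],
    is_alive: Callable[[str], bool] = soa_resolves,
) -> Tuple[List[str], List[str]]:
    """Split domains into (active, dead) using the given liveness check."""
    active: List[str] = []
    dead: List[str] = []
    for domain in domains:
        (active if is_alive(domain) else dead).append(domain)
    return active, dead


def dead_ratio(active: List[str], dead: List[str]) -> float:
    """Fraction of dead domains, e.g. for a per-source statistic."""
    total = len(active) + len(dead)
    return len(dead) / total if total else 0.0
```

Only the active part would then go into the exported lists. Passing the check
in as a function keeps the filtering logic testable without any network
access.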
