Hello Matthias,

Are you running this on Core Update 200?

There were some changes required so that we extract the datasets from the 
rules tarballs. I am using a special feature in Suricata so that the lists 
won’t use too much memory and will remain quickly searchable:

  
https://git.ipfire.org/?p=ipfire-2.x.git;a=commitdiff;h=f0b43241a501f7c545e3cb15f6989e945c60b3e2

-Michael

> On 25 Jan 2026, at 17:50, Matthias Fischer <[email protected]> 
> wrote:
> 
> On 25.01.2026 15:40, Michael Tremer wrote:
>> Hello Matthias,
> 
> Hi Michael,
> 
>> Nice catch!
>> 
>> I fixed it here and added the missing “;”:
> 
> Yep. The missing ";" seems to be fixed, but 'suricata' still doesn't
> like our rules. I added 'violence' as an attachment... ;-)
> 
>>  
>> https://git.ipfire.org/?p=dbl.git;a=commitdiff;h=775561e322ceed43e255e5547bd76047b9f8a40b
>> 
>> If you go to the provider settings there is a button to force a ruleset 
>> update which should give you the fixed version. Please let me know if this 
>> works.
> 
> I did that - rules were updated, but no luck:
> 
> ***SNIP***
> ...
> 18:37:37  suricata:  [2577] <Info> -- Including configuration file
> /var/ipfire/suricata/suricata-used-rulesfiles.yaml.
> 18:37:37  suricata:  [2577] <Error> -- failed to set up dataset 'violence'.
> 18:37:37  suricata:  [2577] <Error> -- error parsing signature "drop dns
> any any -> any any (msg:"IPFire DBL [Violence] Blocked DNS Query";
> dns.query; domain; dataset:isset,violence,type string,load
> datasets/violence.txt; classtype:policy-violation; priority:2;
> sid:1048577; rev:1; reference:url,https://www.ipfire.org/dbl/violence;
> metadata:dbl violence.dbl.ipfire.org;)" from file
> /var/lib/suricata/ipfire_dnsbl-violence.rules at line 39
> 18:37:37  suricata:  [2577] <Error> -- failed to set up dataset 'violence'.
> 18:37:37  suricata:  [2577] <Error> -- error parsing signature "drop
> http any any -> any any (msg:"IPFire DBL [Violence] Blocked HTTP
> Request"; http.host; domain; dataset:isset,violence,type string,load
> datasets/violence.txt; classtype:policy-violation; priority:2;
> sid:1048578; rev:1; reference:url,https://www.ipfire.org/dbl/violence;
> metadata:dbl violence.dbl.ipfire.org;)" from file
> /var/lib/suricata/ipfire_dnsbl-violence.rules at line 40
> 18:37:37  suricata:  [2577] <Error> -- failed to set up dataset 'violence'.
> 18:37:37  suricata:  [2577] <Error> -- error parsing signature "drop tls
> any any -> any any (msg:"IPFire DBL [Violence] Blocked TLS Connection";
> tls.sni; domain; dataset:isset,violence,type string,load
> datasets/violence.txt; classtype:policy-violation; priority:2;
> sid:1048579; rev:1; reference:url,https://www.ipfire.org/dbl/violence;
> metadata:dbl violence.dbl.ipfire.org;)" from file
> /var/lib/suricata/ipfire_dnsbl-violence.rules at line 41
> 18:37:37  suricata:  [2577] <Error> -- failed to set up dataset 'violence'.
> 18:37:37  suricata:  [2577] <Error> -- error parsing signature "drop
> quic any any -> any any (msg:"IPFire DBL [Violence] Blocked QUIC
> Connection"; quic.sni; domain; dataset:isset,violence,type string,load
> datasets/violence.txt; classtype:policy-violation; priority:2;
> sid:1048580; rev:1; reference:url,https://www.ipfire.org/dbl/violence;
> metadata:dbl violence.dbl.ipfire.org;)" from file
> /var/lib/suricata/ipfire_dnsbl-violence.rules at line 42
> ...
> ***SNAP***
> 
> For better reading - see attached screenshot.
> 
> Best
> Matthias
> 
>> Best,
>> -Michael
>> 
>>> On 24 Jan 2026, at 23:41, Matthias Fischer <[email protected]> 
>>> wrote:
>>> 
>>> On 23.01.2026 17:39, Michael Tremer wrote:
>>>> Hello Matthias,
>>> 
>>> Hi Michael,
>>> 
>>>> Thank you very much for testing IPFire DBL.
>>> 
>>> No problem - I have news:
>>> 
>>> After taking a closer look at the IPS system logs, I unfortunately found
>>> some parsing errors:
>>> 
>>> 'suricata' complains about missing ";".
>>> 
>>> ***SNIP***
>>> ...
>>> 00:32:40 suricata: [13343] <Info> -- Including configuration file
>>> /var/ipfire/suricata/suricata-used-rulesfiles.yaml.
>>> 00:32:40 suricata: [13343] <Error> -- no terminating ";" found
>>> 00:32:40 suricata: [13343] <Error> -- error parsing signature "drop
>>> dns any any -> any any (msg:"IPFire DBL [Advertising] Blocked DNS
>>> Query"; dns.query; domain; dataset:isset,ads,type string,load
>>> datasets/ads.txt; classtype:policy-violation; priority:3; sid:983041;
>>> rev:1; reference:url,https://www.ipfire.org/dbl/ads; metadata:dbl
>>> ads.dbl.ipfire.org)" from file /var/lib/suricata/ipfire_dnsbl-ads.rules
>>> at line 72
>>> 00:32:40 suricata: [13343] <Error> -- no terminating ";" found
>>> ...
>>> ***SNAP***
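For reference, the error above means the option block inside the parentheses is not terminated properly: in the quoted rule, `metadata:dbl ads.dbl.ipfire.org)` has no ";" before the closing parenthesis. A quick illustrative check for this (a sketch only, not IPFire's actual tooling):

```python
# Suricata expects every option in a rule's option block, including the
# last one, to be terminated with ";" before the closing ")".
# Illustrative sketch only, not the actual IPFire/DBL tooling.
def has_terminating_semicolon(rule: str) -> bool:
    # Extract the option block between the outer parentheses
    body = rule[rule.index("(") + 1 : rule.rindex(")")]
    return body.rstrip().endswith(";")

broken = 'drop dns any any -> any any (msg:"test"; sid:983041; rev:1)'
fixed = 'drop dns any any -> any any (msg:"test"; sid:983041; rev:1;)'
assert not has_terminating_semicolon(broken)
assert has_terminating_semicolon(fixed)
```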
>>> 
>>> I tried, but didn't find the right place for any missing ";".
>>> 
>>> Can "anyone" confirm?
>>> 
>>> Best
>>> Matthias
>>> 
>>>>> On 23 Jan 2026, at 15:02, Matthias Fischer <[email protected]> 
>>>>> wrote:
>>>>> 
>>>>> On 22.01.2026 12:33, Michael Tremer wrote:
>>>>>> Hello everyone,
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> short feedback from me:
>>>>> 
>>>>> - I activated both the suricata (IPFire DBL - Domain Blocklist) - and
>>>>> the URLfilter lists from 'dbl.ipfire.org'.
>>>> 
>>>> This is an interesting case. What I didn’t manage to test yet is what 
>>>> happens when Suricata blocks the connection first. If URL Filter sees a 
>>>> domain that is being blocked it will either send you an error page if you 
>>>> are using HTTP, or simply close the connection if it is HTTPS. However, 
>>>> when Suricata comes first in the chain (and it will), it might close the 
>> connection before URL Filter has received the request. In the case of 
>>>> HTTPS this does not make any difference because the connection will be 
>>>> closed, but in the HTTP case you won’t see an error page any more and 
>>>> instead have the connection closed, too. You are basically losing the 
>>>> explicit error notification which is a little bit annoying.
>>>> 
>> We could see the same issue when we do the same with Unbound and DNS 
>>>> filtering. Potentially we would need to whitelist the local DNS resolver 
>>>> then, but how is Suricata supposed to know that the same categories are 
>>>> activated in both places?
>>>> 
>>>>> - I even took the 'smart-tv' domains from the IPFire DBL blacklist and
>>>>> copied/pasted them in my fritzbox filter lists.
>>>> 
>>>> LOL Why not use IPFire to filter this as well?
>>>> 
>>>>> Everything works as expected. Besides, the download of the IPFire
>>>>> DBL-list loads a lot faster than the list from 'Univ. Toulouse'... ;-)
>>>> 
>>>> Yes, we don’t have much traffic on the server, yet.
>>>> 
>>>>> Functionality is good - no false positives or seen problems. Good work -
>>>>> thanks!
>>>> 
>>>> Nice. We need to distinguish a little between what is a technical issue 
>>>> and what is a false-positive/missing domain on the list. However, testing 
>>>> both at the same time is something we will all cope quite well with :)
>>>> 
>>>> -Michael
>>>> 
>>>>> Best
>>>>> Matthias
>>>>> 
>>>>>> Over the past few weeks I have made significant progress on this all, 
>>>>>> and I think we're getting close to something the community will be 
>>>>>> really happy with. I'd love to get feedback from the team before we 
>>>>>> finalise things.
>>>>>> 
>>>>>> So what has happened?
>>>>>> 
>>>>>> First of all, the entire project has been renamed. DNSBL is not entirely 
>>>>>> what this is. Although the lists can be thrown into DNS, they have so 
>>>>>> much use outside of it that I thought we should simply go with DBL, 
>>>>>> short for Domain Blocklist. After all, we are only importing domains. 
>>>>>> The new home of the project is therefore https://www.ipfire.org/dbl
>>>>>> 
>>>>>> I have added a couple more lists that I thought interesting and I have 
>>>>>> added a couple more sources that I considered a good start. Hopefully, 
>>>>>> we will soon gather some more feedback on how well this is all holding 
>>>>>> up. My main focus has however been on the technology that will power 
>>>>>> this project.
>>>>>> 
>>>>>> One of the bigger challenges was to create Suricata rules from the 
>>>>>> lists. Initially I tried to create a ton of rules but since our lists 
>>>>>> are so large, this quickly became too complicated. I have now settled on 
>>>>>> using a feature that is only available in more recent versions of 
>>>>>> Suricata (I believe 7 and later), but since we are already on Suricata 8 
>>>>>> in IPFire this won’t be a problem for us. All domains for each list are 
>>>>>> basically compiled into one massively large dataset and one single rule 
>>>>>> is referring to that dataset. This way, we won’t have the option to 
>>>>>> remove any false-positives, but at least Suricata and the GUI won’t 
>>>>>> die a really bad death when loading millions of rules.
>>>>>> 
>>>>>> Suricata will now be able to use our rules to block access to any listed 
>>>>>> domains of each of the categories over DNS, HTTP, TLS or QUIC. Although 
>>>>>> I don’t expect many users to use Suricata to block porn or other things, 
>>>>>> this is a great backstop to enforce any policy like that. For example, 
>>>>>> if there is a user on the network who is trying to circumvent the DNS 
>>>>>> server that might filter out certain domains, even after getting an IP 
>>>>>> address resolved through other means, they won’t be able to open a 
>>>>>> TLS/QUIC connection or send an HTTP request to any of the blocked 
>>>>>> domains. Some 
>>>>>> people have said they were interested in blocking DNS-over-HTTPS and 
>>>>>> this is a perfect way to do this and actually be sure that any server 
>>>>>> that is being blocked on the list will actually be completely 
>>>>>> inaccessible.
>>>>>> 
>>>>>> Those Suricata rules are already available for testing in Core Update 
>>>>>> 200: 
>>>>>> https://git.ipfire.org/?p=ipfire-2.x.git;a=commitdiff;h=9eb8751487d23dd354a105c28bdbbb0398fe6e85
>>>>>> 
>>>>>> I have chosen various severities for the lists. If someone were to block 
>>>>>> advertising using DBL, this is fine, but not a very severe alert. If 
>>>>>> someone chooses to block malware and there is a system on the network 
>>>>>> trying to access those domains, this is an alert worth being 
>>>>>> investigated by an admin. Our new Suricata Reporter will show those 
>>>>>> violations in different colours based on the severity which helps to 
>>>>>> identify the right alerts to further investigate.
>>>>>> 
>>>>>> Earlier I asked you to test the lists using URL Filter. Those 
>>>>>> rules are now available as well in Core Update 200: 
>>>>>> https://git.ipfire.org/?p=ipfire-2.x.git;a=commitdiff;h=db160694279a4b10378447f775dd536fdfcfb02a
>>>>>> 
>>>>>> I talked about a method to remove any dead domains from any sources 
>>>>>> which is a great way to keep our lists smaller. The sheer size of them is 
>>>>>> a problem in so many ways. That check was however a little bit too 
>>>>>> ambitious and I had to make it a little bit less eager. Basically if we 
>>>>>> are in doubt, we need to still list the domain because it might be 
>>>>>> resolvable by a user.
>>>>>> 
>>>>>> https://git.ipfire.org/?p=dbl.git;a=commitdiff;h=bb5b6e33b731501d45dea293505f7d42a61d5ce7
>>>>>> 
>>>>>> So how else could we make the lists smaller without losing any actual 
>>>>>> data? Since we sometimes list a whole TLD (e.g. .xxx or .porn), there is 
>>>>>> very little point in listing any domains of this TLD. They will always 
>>>>>> be caught anyway. So I built a check that marks all domains that don’t 
>>>>>> need to be included on the exported lists because they will never be 
>>>>>> needed and was able to shrink the size of the lists by a lot again.
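The subsumption idea described above can be sketched like this (illustrative names only; the real check lives in the dbl tool): a domain whose TLD is already listed as a whole never needs to be exported separately.

```python
# Illustrative sketch of the TLD subsumption check described above:
# if a whole TLD (e.g. "xxx" or "porn") is listed, any domain ending
# in it is already covered and can be dropped from the export.
def is_subsumed(domain: str, listed_tlds: set[str]) -> bool:
    # The TLD is the last dot-separated label of the domain
    return domain.rsplit(".", 1)[-1] in listed_tlds

listed_tlds = {"xxx", "porn"}
assert is_subsumed("example.porn", listed_tlds)
assert not is_subsumed("example.com", listed_tlds)
```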
>>>>>> 
>>>>>> The website does not show this data, but the API returns the number of 
>>>>>> “subsumed” domains (I didn’t have a better name):
>>>>>> 
>>>>>> curl https://api.dbl.ipfire.org/lists | jq .
>>>>>> 
>>>>>> The number shown would normally be added to the total number of domains; 
>>>>>> without this check, the subsumed entries would have inflated the lists by 
>>>>>> another 50-200%.
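A hypothetical example of how those numbers relate (the field names here are assumptions for illustration; check the actual JSON returned by the API for the real schema):

```python
# Hypothetical response entry; the field names are illustrative only,
# not necessarily what https://api.dbl.ipfire.org/lists returns.
entry = {"name": "porn", "total_domains": 100_000, "subsumed_domains": 150_000}

# The subsumed domains would otherwise have been exported as well,
# so the full coverage is the sum of both counts.
full_coverage = entry["total_domains"] + entry["subsumed_domains"]
assert full_coverage == 250_000
```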
>>>>>> 
>>>>>> Those stats will now also be stored in a history table so that we will 
>>>>>> be able to track growth of all lists.
>>>>>> 
>>>>>> Furthermore, the application will now send email notifications for any 
>>>>>> incoming reports. This way, we will be able to stay in close touch with 
>>>>>> the reporters and keep them up to date on their submissions as well as 
>>>>>> inform moderators that there is something to have a look at.
>>>>>> 
>>>>>> The search has been refactored as well, so that we can show clearly 
>>>>>> whether something is blocked or not at a glance: 
>>>>>> https://www.ipfire.org/dbl/search?q=github.com. There is detailed 
>>>>>> information available on all domains and what happened to them. In case 
>>>>>> of GitHub.com, this seems to be blocked and unblocked by someone all of 
>>>>>> the time and we can see a clear audit trail of that: 
>>>>>> https://www.ipfire.org/dbl/lists/malware/domains/github.com
>>>>>> 
>>>>>> On the DNS front, I have added some metadata to the zones so that people 
>>>>>> can programmatically request some data, like when it has been last 
>>>>>> updated (in a human-friendly timestamp and not only the serial), 
>>>>>> license, description and so on:
>>>>>> 
>>>>>> # dig +short ANY _info.ads.dbl.ipfire.org @primary.dbl.ipfire.org
>>>>>> "total-domains=42226"
>>>>>> "license=CC BY-SA 4.0"
>>>>>> "updated-at=2026-01-20T22:17:02.409933+00:00"
>>>>>> "description=Blocks domains used for ads, tracking, and ad delivery"
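The TXT records above are simple key=value strings, so they are easy to consume programmatically. A minimal sketch, using the sample output shown above:

```python
# Parse the key=value TXT strings returned by the _info records
# into a dictionary (sample data taken from the dig output above).
def parse_info_records(records: list[str]) -> dict[str, str]:
    info = {}
    for record in records:
        key, _, value = record.partition("=")
        info[key] = value
    return info

records = [
    "total-domains=42226",
    "license=CC BY-SA 4.0",
    "updated-at=2026-01-20T22:17:02.409933+00:00",
]
info = parse_info_records(records)
assert info["total-domains"] == "42226"
assert info["license"] == "CC BY-SA 4.0"
```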
>>>>>> 
>>>>>> Now, I would like to hear more feedback from you. I know we've all been 
>>>>>> stretched thin lately, so I especially appreciate anyone who has time to 
>>>>>> review and provide input. Share ideas, say whether you like it or not, 
>>>>>> and where you think this could go in the future.
>>>>>> 
>>>>>> Looking ahead, I would like us to start thinking about the RPZ feature 
>>>>>> that has been on the wishlist. IPFire DBL has been a bigger piece of 
>>>>>> work, and I think it's worth having a conversation about sustainability. 
>>>>>> Resources for this need to be allocated and paid for. Open source is 
>>>>>> about freedom, not free beer — and to keep building features like this, 
>>>>>> we will need to explore some funding options. I would be interested to 
>>>>>> hear any ideas you might have that could work for IPFire.
>>>>>> 
>>>>>> Please share your thoughts on the mailing list when you can — even a 
>>>>>> quick 'looks good' or 'I have concerns about X' is valuable. Public 
>>>>>> discussion helps everyone stay in the loop and contribute.
>>>>>> 
>>>>>> I am aiming to move forward with this in a week's time, so if you have 
>>>>>> input, now would be a good time to share it.
>>>>>> 
>>>>>> Best,
>>>>>> -Michael
>>>>>> 
>>>>>>> On 6 Jan 2026, at 10:20, Michael Tremer <[email protected]> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>> Good Morning Adolf,
>>>>>>> 
>>>>>>> I had a look at this problem yesterday and it seems that parsing the 
>>>>>>> format is becoming a little bit difficult this way. Since this is only 
>>>>>>> affecting very few domains, I have simply whitelisted them all manually 
>>>>>>> and duckduckgo.com and others should now be 
>>>>>>> easily reachable again.
>>>>>>> 
>>>>>>> Please let me know if you have any more findings.
>>>>>>> 
>>>>>>> All the best,
>>>>>>> -Michael
>>>>>>> 
>>>>>>>> On 5 Jan 2026, at 11:48, Michael Tremer <[email protected]> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> Hello Adolf,
>>>>>>>> 
>>>>>>>> This is a good find.
>>>>>>>> 
>>>>>>>> But if duckduckgo.com is blocked, there must be a source somewhere 
>>>>>>>> that blocks that exact domain, not only a sub-domain of it. Otherwise 
>>>>>>>> we have a bug somewhere.
>>>>>>>> 
>>>>>>>> This is most likely as the domain is listed here, but with some stuff 
>>>>>>>> afterwards:
>>>>>>>> 
>>>>>>>> https://raw.githubusercontent.com/mtxadmin/ublock/refs/heads/master/hosts/_malware_typo
>>>>>>>> 
>>>>>>>> We strip everything after a # away because we consider it a comment. 
>>>>>>>> However, that leaves a line with only the bare domain, which causes it 
>>>>>>>> to be listed.
>>>>>>>> 
>>>>>>>> The # sign is used here as a special character, but at the same time 
>>>>>>>> it is being used for comments.
>>>>>>>> 
>>>>>>>> I will fix this and then refresh the list.
>>>>>>>> 
>>>>>>>> -Michael
>>>>>>>> 
>>>>>>>>> On 5 Jan 2026, at 11:31, Adolf Belka <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> Hi Michael,
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 05/01/2026 12:11, Adolf Belka wrote:
>>>>>>>>>> Hi Michael,
>>>>>>>>>> 
>>>>>>>>>> I have found that the malware list includes duckduckgo.com
>>>>>>>>>> 
>>>>>>>>> I have checked through the various sources used for the malware list.
>>>>>>>>> 
>>>>>>>>> The ShadowWhisperer (Tracking) list has improving.duckduckgo.com in 
>>>>>>>>> its list. I suspect that this one is the one causing the problem.
>>>>>>>>> 
>>>>>>>>> The mtxadmin (_malware_typo) list has duckduckgo.com mentioned 3 
>>>>>>>>> times but not directly as a domain name - looks more like a reference.
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> 
>>>>>>>>> Adolf.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> Regards,
>>>>>>>>>> Adolf.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On 02/01/2026 14:02, Adolf Belka wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>> 
>>>>>>>>>>> On 02/01/2026 12:09, Michael Tremer wrote:
>>>>>>>>>>>> Hello,
>>>>>>>>>>>> 
>>>>>>>>>>>>> On 30 Dec 2025, at 14:05, Adolf Belka <[email protected]> 
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi Michael,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 29/12/2025 13:05, Michael Tremer wrote:
>>>>>>>>>>>>>> Hello everyone,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I hope everyone had a great Christmas and a couple of quiet days 
>>>>>>>>>>>>>> to relax from all the stress that was the year 2025.
>>>>>>>>>>>>> Still relaxing.
>>>>>>>>>>>> 
>>>>>>>>>>>> Very good, so let’s have a strong start into 2026 now!
>>>>>>>>>>> 
>>>>>>>>>>> Starting next week, yes.
>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>>> Having a couple of quieter days, I have been working on a new, 
>>>>>>>>>>>>>> little (hopefully) side project that has probably been high up 
>>>>>>>>>>>>>> on our radar since the Shalla list has shut down in 2020, or 
>>>>>>>>>>>>>> maybe even earlier. The goal of the project is to provide good 
>>>>>>>>>>>>>> lists with categories of domain names which are usually used to 
>>>>>>>>>>>>>> block access to these domains.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I simply call this IPFire DNSBL which is short for IPFire DNS 
>>>>>>>>>>>>>> Blocklists.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> How did we get here?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> As stated before, the URL filter feature in IPFire has the 
>>>>>>>>>>>>>> problem that there are not many good blocklists available any 
>>>>>>>>>>>>>> more. There used to be a couple more - most famously the Shalla 
>>>>>>>>>>>>>> list - but we are now down to a single list from the University 
>>>>>>>>>>>>>> of Toulouse. It is a great list, but it is not always the best 
>>>>>>>>>>>>>> fit for all users.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Then there has been talk about whether we could implement more 
>>>>>>>>>>>>>> blocking features into IPFire that don’t involve the proxy. Most 
>>>>>>>>>>>>>> famously blocking over DNS. The problem here remains that the 
>>>>>>>>>>>>>> blocking feature is only as good as the data that is fed into 
>>>>>>>>>>>>>> it. Some people have been putting forward a number of lists that 
>>>>>>>>>>>>>> were suitable for them, but they would not have replaced the 
>>>>>>>>>>>>>> blocking functionality as we know it. Their aim is to provide 
>>>>>>>>>>>>>> “one list for everything” but that is not what people usually 
>>>>>>>>>>>>>> want. It is targeted at a classic home user and the only 
>>>>>>>>>>>>>> separation that is being made is any adult/porn/NSFW content 
>>>>>>>>>>>>>> which usually is put into a separate list.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> It would have been technically possible to include these lists 
>>>>>>>>>>>>>> and let the users decide, but that is not the aim of IPFire. We 
>>>>>>>>>>>>>> want to do the job for the user so that their job is getting 
>>>>>>>>>>>>>> easier. Including obscure lists that don’t have a clear outline 
>>>>>>>>>>>>>> of what they actually want to block (“bad content” is not a 
>>>>>>>>>>>>>> category) and passing the burden of figuring out whether they 
>>>>>>>>>>>>>> need the “Light”, “Normal”, “Pro”, “Pro++”, “Ultimate” or even a 
>>>>>>>>>>>>>> “Venti” list with cream on top is really not going to work. It 
>>>>>>>>>>>>>> is all confusing and will lead to a bad user experience.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> An even bigger problem that is however completely impossible to 
>>>>>>>>>>>>>> solve is bad licensing of these lists. A user has asked the 
>>>>>>>>>>>>>> publisher of the HaGeZi list whether they could be included in 
>>>>>>>>>>>>>> IPFire and under what terms. The response was that the list is 
>>>>>>>>>>>>>> available under the terms of the GNU General Public License v3, 
>>>>>>>>>>>>>> but that does not seem to be true. The list contains data from 
>>>>>>>>>>>>>> various sources. Many of them are licensed under incompatible 
>>>>>>>>>>>>>> licenses (CC BY-SA 4.0, MPL, Apache2, …) and unless there is a 
>>>>>>>>>>>>>> non-public agreement that this data may be redistributed, there 
>>>>>>>>>>>>>> is a huge legal issue here. We would expose our users to 
>>>>>>>>>>>>>> potential copyright infringement which we cannot do under any 
>>>>>>>>>>>>>> circumstances. Furthermore many lists are available under a 
>>>>>>>>>>>>>> non-commercial license which excludes them from being used in 
>>>>>>>>>>>>>> any kind of business. Plenty of IPFire systems are running in 
>>>>>>>>>>>>>> businesses, if not even the vast majority.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> In short, these lists are completely unusable for us. Apart from 
>>>>>>>>>>>>>> HaGeZi, I consider OISD to have the same problem.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Enough about all the things that are bad. Let’s talk about the 
>>>>>>>>>>>>>> new, good things:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Many blacklists on the internet are an amalgamation of other 
>>>>>>>>>>>>>> lists. These lists vary in quality with some of them being not 
>>>>>>>>>>>>>> that good and without a clear focus and others being excellent 
>>>>>>>>>>>>>> data. Since we don’t have the manpower to start from scratch, I 
>>>>>>>>>>>>>> felt that we can copy the concept that HaGeZi and OISD have 
>>>>>>>>>>>>>> started and simply create a new list that is based on other 
>>>>>>>>>>>>>> lists at the beginning to have a good starting point. That way, 
>>>>>>>>>>>>>> we have much better control over what is going on these lists 
>>>>>>>>>>>>>> and we can shape and mould them as we need them. Most 
>>>>>>>>>>>>>> importantly, we don’t create a single list, but many lists that 
>>>>>>>>>>>>>> have a clear focus and allow users to choose what they want to 
>>>>>>>>>>>>>> block and what not.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> So the current experimental stage that I am in has these lists:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> * Ads
>>>>>>>>>>>>>> * Dating
>>>>>>>>>>>>>> * DoH
>>>>>>>>>>>>>> * Gambling
>>>>>>>>>>>>>> * Malware
>>>>>>>>>>>>>> * Porn
>>>>>>>>>>>>>> * Social
>>>>>>>>>>>>>> * Violence
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> The categories have been determined by what source lists we have 
>>>>>>>>>>>>>> available with good data and are compatible with our chosen 
>>>>>>>>>>>>>> license CC BY-SA 4.0. This is the same license that we are using 
>>>>>>>>>>>>>> for the IPFire Location database, too.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> The main use-cases for any kind of blocking are to comply with 
>>>>>>>>>>>>>> legal requirements in networks with children (e.g. schools) to 
>>>>>>>>>>>>>> remove any kind of pornographic content, sometimes block social 
>>>>>>>>>>>>>> media as well. Gambling and violence are commonly blocked, too. 
>>>>>>>>>>>>>> Even more common would be filtering advertising and any 
>>>>>>>>>>>>>> malicious content.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> The latter is especially difficult because so many source lists 
>>>>>>>>>>>>>> throw phishing, spyware, malvertising, tracking and other things 
>>>>>>>>>>>>>> into the same bucket. Here this is currently all in the malware 
>>>>>>>>>>>>>> list which has therefore become quite large. I am not sure 
>>>>>>>>>>>>>> whether this will stay like this in the future or if we will 
>>>>>>>>>>>>>> have to make some adjustments, but that is exactly why this is 
>>>>>>>>>>>>>> now entering some larger testing.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> What has been built so far? In order to put these lists together 
>>>>>>>>>>>>>> properly and track where the data is coming from, I have built a 
>>>>>>>>>>>>>> tool in Python, available here:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> https://git.ipfire.org/?p=dnsbl.git;a=summary
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> This tool will automatically update all lists once an hour if 
>>>>>>>>>>>>>> there have been any changes and export them in various formats. 
>>>>>>>>>>>>>> The exported lists are available for download here:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> https://dnsbl.ipfire.org/lists/
>>>>>>>>>>>>> The download using dnsbl.ipfire.org/lists/squidguard.tar.gz as 
>>>>>>>>>>>>> the custom url works fine.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> However, you need to remember not to put https:// at the front 
>>>>>>>>>>>>> of the URL; otherwise the WUI page completes without any error 
>>>>>>>>>>>>> messages but leaves an error message in the system logs saying
>>>>>>>>>>>>> 
>>>>>>>>>>>>> URL filter blacklist - ERROR: Not a valid URL filter blacklist
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I found this out the hard way.
>>>>>>>>>>>> 
>>>>>>>>>>>> Oh yes, I forgot that there is a field on the web UI. If that does 
>>>>>>>>>>>> not accept https:// as a prefix, please file a bug and we will fix 
>>>>>>>>>>>> it.
>>>>>>>>>>> 
>>>>>>>>>>> I will confirm it and raise a bug.
>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>> The other thing I noticed is that if you already have the 
>>>>>>>>>>>>> Toulouse University list downloaded and you then change to the 
>>>>>>>>>>>>> ipfire custom url then all the existing Toulouse blocklists stay 
>>>>>>>>>>>>> in the directory on IPFire and so you end up with a huge number 
>>>>>>>>>>>>> of category tick boxes, most of which are the old Toulouse ones, 
>>>>>>>>>>>>> which are still available to select and it is not clear which 
>>>>>>>>>>>>> ones are from Toulouse and which ones from IPFire.
>>>>>>>>>>>> 
>>>>>>>>>>>> Yes, I saw the same thing. I think this is a bug, too, 
>>>>>>>>>>>> because otherwise you would have a lot of unused categories lying 
>>>>>>>>>>>> around that will never be updated. You cannot even tell which ones 
>>>>>>>>>>>> are from the current list and which ones from the old list.
>>>>>>>>>>>> 
>>>>>>>>>>>> Long-term we could even consider removing the Univ. Toulouse list 
>>>>>>>>>>>> entirely and only have our own lists available which would make 
>>>>>>>>>>>> the problem go away.
>>>>>>>>>>>> 
>>>>>>>>>>>>> I think if the blocklist URL source is changed or a custom url is 
>>>>>>>>>>>>> provided the first step should be to remove the old ones already 
>>>>>>>>>>>>> existing.
>>>>>>>>>>>>> That might be a problem because users can also create their own 
>>>>>>>>>>>>> blocklists and I believe those go into the same directory.
>>>>>>>>>>>> 
>>>>>>>>>>>> Good thought. We of course cannot delete the custom lists.
>>>>>>>>>>>> 
>>>>>>>>>>>>> Without clearing out the old blocklists you end up with a huge 
>>>>>>>>>>>>> number of checkboxes for lists but it is not clear what happens 
>>>>>>>>>>>>> if there is a category that has the same name for the Toulouse 
>>>>>>>>>>>>> list and the IPFire list such as gambling. I will have a look at 
>>>>>>>>>>>>> that and see what happens.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Not sure what the best approach to this is.
>>>>>>>>>>>> 
>>>>>>>>>>>> I believe it is removing all old content.
>>>>>>>>>>>> 
>>>>>>>>>>>>> Manually deleting all contents of the urlfilter/blacklists/ 
>>>>>>>>>>>>> directory and then selecting the IPFire blocklist url for the 
>>>>>>>>>>>>> custom url I end up with only the 8 categories from the IPFire 
>>>>>>>>>>>>> list.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I have tested some gambling sites from the IPFire list and the 
>>>>>>>>>>>>> block worked on some. On others the site no longer exists so 
>>>>>>>>>>>>> there is nothing to block or has been changed to an https site 
>>>>>>>>>>>>> and in that case it went straight through. Also if I chose the 
>>>>>>>>>>>>> http version of the link, it was automatically changed to https 
>>>>>>>>>>>>> and went through without being blocked.
>>>>>>>>>>>> 
>>>>>>>>>>>> The entire IPFire infrastructure always requires HTTPS. If you 
>>>>>>>>>>>> start using HTTP, you will be automatically redirected. It is 2026 
>>>>>>>>>>>> and we don’t need to talk HTTP any more :)
>>>>>>>>>>> 
>>>>>>>>>>> Some of the domains in the gambling list (maybe quite a lot) seem 
>>>>>>>>>>> to only have an http access. If I tried https it came back with the 
>>>>>>>>>>> fact that it couldn't find it.
>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> I am glad to hear that the list is actually blocking. It would 
>>>>>>>>>>>> have been bad if it didn’t. Now we have the big task to check out 
>>>>>>>>>>>> the “quality” - however that can be determined. I think this is 
>>>>>>>>>>>> what needs some time…
>>>>>>>>>>>> 
>>>>>>>>>>>> In the meantime I have set up a small page on our website:
>>>>>>>>>>>> 
>>>>>>>>>>>> https://www.ipfire.org/dnsbl
>>>>>>>>>>>> 
>>>>>>>>>>>> I would like to run this as a first-class project inside IPFire 
>>>>>>>>>>>> like we are doing with IPFire Location. That means that we need to 
>>>>>>>>>>>> tell people about what we are doing. Hopefully this page is a 
>>>>>>>>>>>> little start.
>>>>>>>>>>>> 
>>>>>>>>>>>> Initially it has a couple of high-level bullet points about what 
>>>>>>>>>>>> we are trying to achieve. I don’t think the text is very good, 
>>>>>>>>>>>> yet, but it is the best I had in that moment. There is then also a 
>>>>>>>>>>>> list of the lists that we currently offer. For each list, a 
>>>>>>>>>>>> detailed page will tell you about the license, how many domains 
>>>>>>>>>>>> are listed, when the last update has been, the sources and even 
>>>>>>>>>>>> there is a history page that shows all the changes whenever they 
>>>>>>>>>>>> have happened.
>>>>>>>>>>>> 
>>>>>>>>>>>> Finally there is a section that explains “How To Use?” the list 
>>>>>>>>>>>> which I would love to extend to include AdGuard Plus and things 
>>>>>>>>>>>> like that as well as Pi-Hole and whatever else could use the list. 
>>>>>>>>>>>> In a later step we should go ahead and talk to any projects to 
>>>>>>>>>>>> include our list(s) into their dropdown so that people can enable 
>>>>>>>>>>>> them nice and easy.
>>>>>>>>>>>> 
>>>>>>>>>>>> Behind the web page there is an API service that is running on the 
>>>>>>>>>>>> host that is running the DNSBL. The frontend web app that is 
>>>>>>>>>>>> running www.ipfire.org is connecting to 
>>>>>>>>>>>> that API service to fetch the current lists, any details and so 
>>>>>>>>>>>> on. That way, we can split the logic and avoid creating a huge 
>>>>>>>>>>>> monolith of a web app. This also means that the page could be down a 
>>>>>>>>>>>> little as I am still working on the entire thing and will 
>>>>>>>>>>>> frequently restart it.
>>>>>>>>>>>> 
>>>>>>>>>>>> The API documentation is available here and the API is publicly 
>>>>>>>>>>>> available: https://api.dnsbl.ipfire.org/docs
>>>>>>>>>>>> 
>>>>>>>>>>>> The website/API allows you to file reports for anything that does not 
>>>>>>>>>>>> seem to be right on any of the lists. I would like to keep it as 
>>>>>>>>>>>> an open process, however, long-term, this cannot cost us any time. 
>>>>>>>>>>>> In the current stage, the reports are getting filed and that is 
>>>>>>>>>>>> about it. I still need to build out some way for admins or 
>>>>>>>>>>>> moderators (I am not sure what kind of roles I want to have here) 
>>>>>>>>>>>> to accept or reject those reports.
>>>>>>>>>>>> 
>>>>>>>>>>>> In case we received a domain from a source list, I would rather 
>>>>>>>>>>>> submit a report upstream for them to de-list it. That way, we 
>>>>>>>>>>>> don’t have any admin work to do and we are contributing back to 
>>>>>>>>>>>> the other lists. That would be a very good thing to do. We 
>>>>>>>>>>>> cannot, however, throw tons of emails at some random upstream 
>>>>>>>>>>>> projects without co-ordinating this first. By not reporting 
>>>>>>>>>>>> upstream, we will probably create large whitelists over time, 
>>>>>>>>>>>> and I am not sure if that is a good thing to do.
>>>>>>>>>>>> 
>>>>>>>>>>>> Finally, there is a search box that can be used to find out if a 
>>>>>>>>>>>> domain is listed on any of the lists.
>>>>>>>>>>>> 
>>>>>>>>>>>>>> If you download and open any of the files, you will see a large 
>>>>>>>>>>>>>> header that includes copyright information and lists all the 
>>>>>>>>>>>>>> sources that have been used to create the individual lists. 
>>>>>>>>>>>>>> This way we ensure maximum transparency, comply with the terms 
>>>>>>>>>>>>>> of the individual licenses of the source lists and give credit 
>>>>>>>>>>>>>> to the people who help us put together the best possible lists 
>>>>>>>>>>>>>> for our users.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I would like this to become a project that is not only used in 
>>>>>>>>>>>>>> IPFire. We can and will be compatible with other solutions 
>>>>>>>>>>>>>> like AdGuard and Pi-Hole so that people can use our lists if 
>>>>>>>>>>>>>> they would like to, even though they are not using IPFire. 
>>>>>>>>>>>>>> Hopefully, these users will also feed back to us so that we 
>>>>>>>>>>>>>> can improve our lists over time and make them one of the best 
>>>>>>>>>>>>>> options out there.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> All lists are available as a simple text file that lists the 
>>>>>>>>>>>>>> domains. Then there is a hosts file available as well as a DNS 
>>>>>>>>>>>>>> zone file and an RPZ file. Each list is individually available 
>>>>>>>>>>>>>> to be used in squidGuard and there is a larger tarball available 
>>>>>>>>>>>>>> with all lists that can be used in IPFire’s URL Filter. I am 
>>>>>>>>>>>>>> planning to add Suricata/Snort signatures whenever I have time 
>>>>>>>>>>>>>> to do so. Even though it is not a good idea to filter 
>>>>>>>>>>>>>> pornographic content this way, I suppose that catching malware 
>>>>>>>>>>>>>> and blocking DoH are good use-cases for an IPS. Time will tell…
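For orientation, a single entry would look roughly like this in the different export formats mentioned above. The domain is a placeholder, and the exact RPZ policy action the project uses may differ; `CNAME .` is the standard RPZ action for returning NXDOMAIN:

```text
# Plain domain list
example.com

# hosts file
0.0.0.0 example.com

# RPZ zone data (NXDOMAIN for the domain and its subdomains)
example.com    CNAME .
*.example.com  CNAME .
```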
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> As a start, we will make these lists available in IPFire’s URL 
>>>>>>>>>>>>>> Filter and collect some feedback about how we are doing. 
>>>>>>>>>>>>>> Afterwards, we can see where else we can take this project.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> If you want to enable this on your system, simply add the URL to 
>>>>>>>>>>>>>> your autoupdate.urls file like here:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=commitdiff;h=bf675bb937faa7617474b3cc84435af3b1f7f45f
>>>>>>>>>>>>> I also tested out adding the IPFire url to autoupdate.urls and 
>>>>>>>>>>>>> that also worked fine for me.
>>>>>>>>>>>> 
>>>>>>>>>>>> Very good. Should we include this already with Core Update 200? 
>>>>>>>>>>>> I don’t think we would break anything, and we might already gain 
>>>>>>>>>>>> a couple more people who could help us test this all.
>>>>>>>>>>> 
>>>>>>>>>>> I think that would be a good idea.
>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> The next step would be to build and test our DNS infrastructure. 
>>>>>>>>>>>> In the “How To Use?” section on the pages of the individual 
>>>>>>>>>>>> lists, you can already see some instructions on how to use the 
>>>>>>>>>>>> lists as an RPZ. In comparison to other “providers”, I would 
>>>>>>>>>>>> prefer if people used DNS to fetch the lists. This simply lets 
>>>>>>>>>>>> us push out updates in a cheap way and also do it very regularly.
>>>>>>>>>>>> 
>>>>>>>>>>>> Initially, clients will pull the entire list using AXFR. There is 
>>>>>>>>>>>> no way around this as they need to have the data in the first 
>>>>>>>>>>>> place. After that, clients will only need the changes. As you can 
>>>>>>>>>>>> see in the history, the lists don’t actually change that often. 
>>>>>>>>>>>> Sometimes only once a day, and therefore downloading the entire 
>>>>>>>>>>>> list again would be a huge waste of data, both on the client 
>>>>>>>>>>>> side, but also for us hosting them.
>>>>>>>>>>>> 
>>>>>>>>>>>> Some other providers update their lists “every 10 minutes” even 
>>>>>>>>>>>> when there won’t be any changes whatsoever. We don’t do that. We 
>>>>>>>>>>>> will only export the lists again when they have actually 
>>>>>>>>>>>> changed. The timestamps on the files that we offer using HTTPS 
>>>>>>>>>>>> can be checked by clients so that they won’t re-download the 
>>>>>>>>>>>> list if it has not changed. But using HTTPS still means that we 
>>>>>>>>>>>> would have to re-download the entire list and not only the 
>>>>>>>>>>>> changes.
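The timestamp check described here is what HTTP conditional requests already provide (e.g. curl -z <file>, or an If-Modified-Since header). A minimal sketch of the underlying comparison, with made-up example values:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def needs_download(last_modified_header: str, local_mtime: float) -> bool:
    """Return True if the server copy is newer than our local file.

    last_modified_header: value of the HTTP Last-Modified header.
    local_mtime: POSIX timestamp of the file we already have.
    """
    remote = parsedate_to_datetime(last_modified_header)
    local = datetime.fromtimestamp(local_mtime, tz=timezone.utc)
    return remote > local

# The list was exported after our local copy, so fetch it again.
print(needs_download("Sun, 25 Jan 2026 12:00:00 GMT",
                     datetime(2026, 1, 24, tzinfo=timezone.utc).timestamp()))
# prints: True
```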
>>>>>>>>>>>> 
>>>>>>>>>>>> Using DNS and IXFR will update the lists by only transferring a 
>>>>>>>>>>>> few kilobytes, and therefore we can have clients check once an 
>>>>>>>>>>>> hour whether a list has actually changed and only send out the 
>>>>>>>>>>>> raw changes. That way, we will be able to serve millions of 
>>>>>>>>>>>> clients very cheaply and they will always have a very up-to-date 
>>>>>>>>>>>> list.
>>>>>>>>>>>> 
>>>>>>>>>>>> As far as I can see, any DNS software that supports RPZs 
>>>>>>>>>>>> supports AXFR/IXFR, with the exception of Knot Resolver, which 
>>>>>>>>>>>> expects the zone to be downloaded externally. There is a ticket 
>>>>>>>>>>>> for AXFR/IXFR support 
>>>>>>>>>>>> (https://gitlab.nic.cz/knot/knot-resolver/-/issues/195).
>>>>>>>>>>>> 
>>>>>>>>>>>> Initially, some of the lists were *huge*, which is why a simple 
>>>>>>>>>>>> HTTP download is not feasible. The porn list was over 100 MiB. 
>>>>>>>>>>>> We could have spent thousands on traffic alone, which I don’t 
>>>>>>>>>>>> have for this kind of project. It would also be unnecessary 
>>>>>>>>>>>> money being spent; there are simply better solutions out there. 
>>>>>>>>>>>> But then I built something that basically tests the data that we 
>>>>>>>>>>>> are receiving from upstream by simply checking whether a listed 
>>>>>>>>>>>> domain still exists. The result was very astonishing to me.
>>>>>>>>>>>> 
>>>>>>>>>>>> So whenever someone adds a domain to the list, we will 
>>>>>>>>>>>> (eventually, but not immediately) check if we can resolve the 
>>>>>>>>>>>> domain’s SOA record. If not, we mark the domain as non-active and 
>>>>>>>>>>>> will no longer include it in the exported data. This brought 
>>>>>>>>>>>> down the porn list from just under 5 million domains to just 421k. 
>>>>>>>>>>>> On the sources page 
>>>>>>>>>>>> (https://www.ipfire.org/dnsbl/lists/porn/sources) I am listing the 
>>>>>>>>>>>> percentage of dead domains from each of them and the UT1 list has 
>>>>>>>>>>>> 94% dead domains. Wow.
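A minimal sketch of that liveness filter; the actual SOA lookup is injected as a function, since a real implementation would need a DNS resolver library (resolve_soa and the domains below are made-up stand-ins):

```python
def filter_active(domains, resolve_soa):
    """Split domains into (active, dead) based on whether their
    SOA record still resolves; resolve_soa(domain) -> bool."""
    active, dead = [], []
    for domain in domains:
        (active if resolve_soa(domain) else dead).append(domain)
    return active, dead

# Stub: pretend only example.org still answers with an SOA record.
still_alive = {"example.org"}
active, dead = filter_active(
    ["example.org", "gone.example.net", "parked.example.com"],
    lambda domain: domain in still_alive,
)
print(active)                       # ['example.org']
print(f"{len(dead) / 3:.1%} dead")  # 66.7% dead
```

Injecting the lookup also makes it easy to re-run the check later, so a domain that starts resolving again could be re-activated.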
>>>>>>>>>>>> 
>>>>>>>>>>>> If we cannot resolve the domain, neither can our users. So we 
>>>>>>>>>>>> would otherwise fill the lists with tons of domains that simply 
>>>>>>>>>>>> could never be reached. And if they cannot be reached, why would 
>>>>>>>>>>>> we block them? We would waste bandwidth and a lot of memory on 
>>>>>>>>>>>> each single client.
>>>>>>>>>>>> 
>>>>>>>>>>>> The other sources have similarly high ratios of dead domains. 
>>>>>>>>>>>> Most of them are in the 50-80% range. Therefore I am happy that we 
>>>>>>>>>>>> are doing some extra work here to give our users much better data 
>>>>>>>>>>>> for their filtering.
>>>>>>>>>>> 
>>>>>>>>>>> Removing all dead entries sounds like an excellent step.
>>>>>>>>>>> 
>>>>>>>>>>> Regards,
>>>>>>>>>>> 
>>>>>>>>>>> Adolf.
>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> So, if you like, please go and check out the RPZ blocking with 
>>>>>>>>>>>> Unbound. Instructions are on the page. I would be happy to hear 
>>>>>>>>>>>> how this is turning out.
>>>>>>>>>>>> 
>>>>>>>>>>>> Please let me know if there are any more questions, and I would be 
>>>>>>>>>>>> glad to answer them.
>>>>>>>>>>>> 
>>>>>>>>>>>> Happy New Year,
>>>>>>>>>>>> -Michael
>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Adolf.
>>>>>>>>>>>>>> This email is just a brain dump from me to this list. I would be 
>>>>>>>>>>>>>> happy to answer any questions about implementation details, etc. 
>>>>>>>>>>>>>> if people are interested. Right now, this email is long enough 
>>>>>>>>>>>>>> already…
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> All the best,
>>>>>>>>>>>>>> -Michael
>>>>>>>>>>>>> 
>>>>>>>>>>>>> -- 
>>>>>>>>>>>>> Sent from my laptop
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> -- 
>>>>>>>>> Sent from my laptop
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>> 
> <suricata-log.jpg><ipfire_dnsbl-violence.rules>


