On Sat, Apr 8, 2023 at 2:14 PM Dave Wreski <dwre...@guardiandigital.com.invalid> wrote:
> Hi,
>
> I have an apache-2.4.56 install on fedora37 and trying to block some bots
> from accessing the site, unless they're trying to access our RSS feeds. How
> can I do this?
>
> I'm blocking the bots with SetEnvIf lines in the .htaccess file in the
> document root like:
>
> SetEnvIf user-agent "(?i:libwww)" stayout=1
> deny from env=stayout
> <RequireAll>
> Require all granted
> Require not env stayout
> </RequireAll>
>
> However, creating an entry that explicitly allows access to the XML files
> before or after doesn't seem to take effect:
>
> RewriteRule linuxsecurity_features\.xml$ - [L]
>
> It is still blocked by the user-agent setting above. I understood the file
> was processed from the top down, and when a match is made, it stops
> processing. Is that not the case? Shouldn't the RewriteRule above, if
> placed before the env rule, be enough to stop processing the htaccess file
> and allow access?
>

The [L] flag only stops later RewriteRule directives from being processed.
It does not stop the rest of the .htaccess file from taking effect. Every
module still gets its configuration merged from every matching config
context, then decides what to do with that configuration when it is passed
control at various points in request processing.

mod_setenvif runs very early, so if you can stick with it for manipulating
this variable, the result will be much more intuitive.

> I've also tried adding these RewriteRule entries to the server config
> htaccess with an Include, but it appears the .htaccess in the document root
> is always processed afterwards, even after finding match in the server
> config htaccess.
>

I'd suggest the following:

1. Ditch the "deny", the <RequireAll>, and the "Require all granted",
   leaving just "Require not env stayout".
2. Ditch the RewriteRule and do a second SetEnvIf for the exception:

   SetEnvIf Request_URI "linuxsecurity_features\.xml$" !stayout

-- 
Eric Covener
cove...@gmail.com
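
Put together, the two suggestions above might look roughly like this in the document-root .htaccess. This is only a sketch and has not been tested against the poster's setup. One deliberate difference from suggestion 1: it keeps the <RequireAll> wrapper from the original post, on the assumption that a negated Require needs a positive Require next to it to be accepted. If a bare "Require not env stayout" works in your configuration, the wrapper can go.

```apache
# Flag requests from unwanted user agents (pattern from the original post).
SetEnvIf User-Agent "(?i:libwww)" stayout=1

# Exception: clear the flag again for the RSS feed, whatever the user agent.
# This line has to come after the one above, because SetEnvIf directives
# in the same context are evaluated in the order they appear.
SetEnvIf Request_URI "linuxsecurity_features\.xml$" !stayout

# Deny only the requests that still carry the flag.
# Assumption: the wrapper is kept so the negated Require is not on its own.
<RequireAll>
    Require all granted
    Require not env stayout
</RequireAll>
```

The old "deny from env=stayout" line and the RewriteRule are both left out, so the only thing deciding access is whether stayout is still set once both SetEnvIf lines have run.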