Re: [users@httpd] Redirecting based on IP

2024-06-06 Thread Dave Wreski

Hi,


> > The next step I'd like to take is to redirect anyone not in that RequireAll
> > statement to the production site. Is this possible? Perhaps a RewriteCond
> > that depends upon certain IPs, then otherwise redirects to the production
> > site?
>
> I don't think relying on the IPs is a good idea, since those will
> change, and the proper process to validate them requires two DNS
> lookups, if I'm not mistaken. Just use a RewriteCond + RewriteRule to
> generously check the User-Agent and perform the redirect. You may have
> to set an environment variable in the rewrite rule and check that in
> your RequireAll statement to permit the 301 response to be sent. You
> may want to verify that the Vary: User-Agent response header gets sent
> to the client to prevent cache pollution.


I'm back to trying to work on this, and hoped you could assist further. 
Is this along the lines of what I should be doing?


    SetEnvIf User-Agent "(?i:Googlebot)" stayout=1
    RewriteCond %{HTTP_USER_AGENT} Googlebot
    RewriteRule (.*) https://linuxsecurity.com$1 [E=stayout:1]

I'm also not sure about the Vary: User-Agent header. We are using Cloudflare,
but that header appears to be related to prompting Googlebot to also crawl as
another user agent, such as its mobile bot?
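
For comparison, here is one way the draft above might be arranged if the rules end up in per-directory context (a <Directory> block or .htaccess). This is only a sketch under assumptions: the actual RequireAll block is not shown in this thread, so the authorization section, the /var/www/staging path, and the 203.0.113.0/24 range are hypothetical placeholders. In per-directory context mod_rewrite runs after authorization, so an E= flag set by the RewriteRule would arrive too late for Require to see it. The sketch therefore sets the flag with mod_setenvif and lets the rewrite condition read it back.

```apache
# Sketch only. The path, the IP range and the variable name are placeholders.

# mod_setenvif runs before authorization, so the flag is visible to Require.
SetEnvIfNoCase User-Agent "googlebot" stayout=1

<Directory "/var/www/staging">
    # Per-directory rewriting needs FollowSymLinks (or SymLinksIfOwnerMatch).
    Options +FollowSymLinks

    # Staff addresses are let in; flagged crawlers are let through only so
    # that the redirect below can be sent instead of a 403.
    <RequireAny>
        Require ip 203.0.113.0/24
        Require env stayout
    </RequireAny>

    RewriteEngine On
    RewriteCond %{ENV:stayout} =1
    # In per-directory context the leading slash is stripped from the matched
    # path, so REQUEST_URI is used rather than appending $1 to the hostname.
    RewriteRule ^ https://linuxsecurity.com%{REQUEST_URI} [R=301,L]
</Directory>
```

RequireAny is used here because adding "Require env stayout" inside a RequireAll next to the IP rule would demand both conditions at once. Without an explicit R flag, a substitution pointing at another host is still sent as an external redirect, but with the default 302 status rather than a 301.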


dave



Re: [users@httpd] Redirecting based on IP

2024-05-17 Thread Dave Wreski

Hi,


> > The staging site is even protected with a RequireAll statement for the
> > DocumentRoot based on the IP, which then results in a 404 and other errors
> > in GSC.
>
> That sounds wrong. If your RequireAll was working as advertised, should
> it not return a 403?


Yes, it does - my mistake.


> > The next step I'd like to take is to redirect anyone not in that RequireAll
> > statement to the production site. Is this possible? Perhaps a RewriteCond
> > that depends upon certain IPs, then otherwise redirects to the production
> > site?
>
> I don't think relying on the IPs is a good idea, since those will
> change, and the proper process to validate them requires two DNS
> lookups, if I'm not mistaken. Just use a RewriteCond + RewriteRule to
> generously check the User-Agent and perform the redirect. You may have
> to set an environment variable in the rewrite rule and check that in
> your RequireAll statement to permit the 301 response to be sent. You
> may want to verify that the Vary: User-Agent response header gets sent
> to the client to prevent cache pollution.


I used your RewriteCond + RewriteRule approach, and it worked perfectly in
my tests. Thanks so much.





Re: [users@httpd] Redirecting based on IP

2024-05-16 Thread Rainer Canavan

On Thu, May 16, 2024 at 1:15 AM Dave Wreski wrote:
>
> Hi,
>
[...]
> The staging site is even protected with a RequireAll statement for the 
> DocumentRoot based on the IP, which then results in a 404 and other errors in 
> GSC.

That sounds wrong. If your RequireAll was working as advertised, should
it not return a 403?

[...]
>
> The next step I'd like to take is to redirect anyone not in that RequireAll
> statement to the production site. Is this possible? Perhaps a RewriteCond
> that depends upon certain IPs, then otherwise redirects to the production
> site?

I don't think relying on the IPs is a good idea, since those will
change, and the proper process to validate them requires two DNS
lookups, if I'm not mistaken. Just use a RewriteCond + RewriteRule to
generously check the User-Agent and perform the redirect. You may have
to set an environment variable in the rewrite rule and check that in
your RequireAll statement to permit the 301 response to be sent. You
may want to verify that the Vary: User-Agent response header gets sent
to the client to prevent cache pollution.
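
A minimal sketch of that approach, assuming the rules live in the staging site's virtual host (server context). There mod_rewrite runs during URL translation, before authorization is evaluated, so the redirect should go out without touching the access rules. The explicit Header line is an assumption: it avoids relying on Vary being added automatically to a redirect response, and it needs mod_headers to be loaded. The production hostname is the one used elsewhere in this thread.

```apache
# Sketch only: place inside the staging <VirtualHost>, not in .htaccess.
RewriteEngine On

# Loose, case-insensitive substring match on the User-Agent header.
RewriteCond %{HTTP_USER_AGENT} googlebot [NC]

# Redirect to the same path on production with an explicit 301.
RewriteRule ^ https://linuxsecurity.com%{REQUEST_URI} [R=301,L]

# "always" also applies the header to non-2xx responses such as the 301.
Header always merge Vary "User-Agent"
```

Whether the header actually reaches the client is easy to confirm by requesting a staging URL with a Googlebot User-Agent string and inspecting the response headers.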

Rainer

-
To unsubscribe, e-mail: users-unsubscr...@httpd.apache.org
For additional commands, e-mail: users-h...@httpd.apache.org



Re: [users@httpd] Redirecting based on IP

2024-05-15 Thread gene heskett

On 5/15/24 19:15, Dave Wreski wrote:

> Hi,
>
> Google insists that one of our staging sites needs to be indexed despite
> "disallow" in robots.txt and a half-dozen other methods for preventing
> Google from indexing it (including submitting it for removal from their
> index). The staging site is even protected with a RequireAll statement
> for the DocumentRoot based on the IP, which then results in a 404 and
> other errors in GSC. This impacts our SEO and also causes GSC to stop
> processing the rest of our site.
>
> The next step I'd like to take is to redirect anyone not in that
> RequireAll statement to the production site. Is this possible? Perhaps a
> RewriteCond that depends upon certain IPs, then otherwise redirects to
> the production site?
>
> Thanks,
> Dave

The last time I ran into this was back in the iptables days, 20 years ago.
They were denied based on IP because my site at the time included my
photos and totalled about 13 gigabytes. This was in the days of a
bandwidth allowance of 30 gigs per month. Because Google has so many
machines, they used up all my allocation long before the month was up. I
wound up putting another search engine in that database, MJ12, so I ended
up with an iptables file about 15k lines long. That continued until I had
ported the whole thing to a couple of new Seagate 1 TB drives, both of
which went tits down in the night within 2 weeks, just disappearing off the
SATA III bus. I was so pi$$ed I didn't even warranty them. SSDs are it
today. I have only one spinning rust drive in 8 machines here now, a 250
gig that refuses to die. iptables worked but took about 10 hours a month
to maintain because they kept moving the machines to new addresses. Some of
the iptables rules ended in /16, so I was blocking a goodly share of the
IPv4 space when I had the grand crash. I controlled it most of the time,
but it took several hours a week just to keep even with them. You never get
ahead. I still have a registered name, but all you get is the Apache test
page.


Cheers, Gene Heskett, CET.
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
 - Louis D. Brandeis





[users@httpd] Redirecting based on IP

2024-05-15 Thread Dave Wreski

Hi,

Google insists that one of our staging sites needs to be indexed despite 
"disallow" in robots.txt and a half-dozen other methods for preventing 
Google from indexing it (including submitting it for removal from their 
index). The staging site is even protected with a RequireAll statement 
for the DocumentRoot based on the IP, which then results in a 404 and 
other errors in GSC. This impacts our SEO and also causes GSC to stop 
processing the rest of our site.


The next step I'd like to take is to redirect anyone not in that
RequireAll statement to the production site. Is this possible? Perhaps a
RewriteCond that depends upon certain IPs, then otherwise redirects to
the production site?


Thanks,
Dave
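
The IP-based RewriteCond idea in the question above could look roughly like the following. It is a sketch, assuming the rules sit in the staging virtual host and that 203.0.113.0/24 (a documentation-only range used here as a placeholder) stands in for the addresses allowed by the existing RequireAll block, which is not shown in this thread.

```apache
# Sketch only: staging <VirtualHost> context; the IP range is a placeholder.
RewriteEngine On

# Apply the redirect only when the client is NOT in the allowed range.
RewriteCond %{REMOTE_ADDR} !^203\.0\.113\.

# Temporary redirect while testing; switch to R=301 once it behaves.
RewriteRule ^ https://linuxsecurity.com%{REQUEST_URI} [R=302,L]
```

If the site sits behind a proxy or CDN, REMOTE_ADDR is the proxy's address unless something like mod_remoteip restores the client address, so both this condition and any "Require ip" rule would need that in place first.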