On 6/6/08, André Warnier <[EMAIL PROTECTED]> wrote:
Mohit Anchlia wrote:
On 6/5/08, André Warnier <[EMAIL PROTECTED]> wrote:
Mohit Anchlia wrote:
On 6/5/08, André Warnier <[EMAIL PROTECTED]> wrote:
Mohit Anchlia wrote:
On 6/5/08, André Warnier <[EMAIL PROTECTED]> wrote:
Mohit Anchlia wrote:
On 6/4/08, Dragon <[EMAIL PROTECTED]> wrote:
André Warnier wrote:
Mohit Anchlia wrote:
2. Another question I had: sometimes we don't get the real physical IP
of the machine but the IP of something in between, like a "router".
Is there a way to get the real IP, so that we don't end up blocking
everyone coming from that "router" or "proxy"?
In my opinion, you cannot.
The whole point of such routers and proxies is to make the requests
look like they are coming from the router/proxy, so that is the sender
IP address you are seeing at your server level, and that's it. Your
server never receives the original requester IP address.
---------------- End original message. ---------------------
There are legitimate reasons for this to be done as well, so
indiscriminately blocking such access is a bad idea: it will affect
legitimate users. NAT and IP address sharing are among the reasons.
They allow an organization to have a router with one public IP address
serve a larger internal network with private IP addresses. Without
this, we would have run out of IPv4 addresses a long time ago.
Dragon
If there is no way to get the real IP address, then how would the
router know which machine to direct the response to? It has got to
have some information in the packet. For example: if A sends to router
B and the router sends to C, then when C responds, how would B know
that the response is for A?
You are perfectly right : the router knows the real IP address. But it
will not tell you, haha.
Seriously, this is how it works :
the original system sends out an "open session" packet, through the
router, to the final destination.
The router sees this packet, and analyses it. It extracts the IP
address and port of the original sender, and keeps them in a table.
Then it replaces the IP address by its own, adds some port number, and
also memorises this new port number in the same table entry.
Then it sends the modified packet to the external server (yours).
It knows that the server on the other side is going to respond to this
same IP address and port (the ones of the router).
When the return packet from the server comes back, the router looks at
the port in it, finds the corresponding entry in its table, and now it
knows to whom it should send the packet internally.
And so on.
So :
- the router knows everything
- the internal system thinks it is talking directly to the external
server
- the external server (yours) only sees the router IP and port, so it
thinks that is where the packet comes from.
That's NAT for you, in a nutshell.
Yes ?
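The table lookup described above can be sketched in a few lines of
Python. This is only an illustration of the bookkeeping, not how any
real router is implemented: the class name NatRouter, the starting
port 40000 and the example addresses are all made up for the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


@dataclass
class NatRouter:
    """Toy model of the NAT table; names and port range are hypothetical."""
    public_ip: str
    next_port: int = 40000
    # public port -> internal (ip, port), used for packets coming back
    table: Dict[int, Endpoint] = field(default_factory=dict)
    # internal (ip, port) -> public port, so a sender keeps one mapping
    reverse: Dict[Endpoint, int] = field(default_factory=dict)

    def outbound(self, src: Endpoint) -> Endpoint:
        """Return the source address the external server will see."""
        if src not in self.reverse:
            port = self.next_port
            self.next_port += 1
            self.table[port] = src
            self.reverse[src] = port
        return (self.public_ip, self.reverse[src])

    def inbound(self, dst_port: int) -> Endpoint:
        """Return the internal endpoint a reply must be forwarded to."""
        return self.table[dst_port]


router = NatRouter(public_ip="203.0.113.1")
seen_by_server = router.outbound(("192.168.0.10", 51000))
reply_target = router.inbound(seen_by_server[1])
```

The external server only ever gets seen_by_server, which carries the
router's address; the mapping back to 192.168.0.10 exists only inside
the router's table.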
---
Thanks for the great explanation. But I wonder how people design
applications against Denial of Service attacks. Say computer A uses a
Cox/Time Warner (cable) Internet connection and starts attacking B.
How would a system be configured so that not all the users of
Time Warner/Cox are affected? Should the IP blocking rules be granular
enough to give both IP and source port?
I think that is quite a different case. Not all users of an ISP (like
the one you mention I suppose) are "behind" a NAT router that hides
their IP address. Instead, these ISPs have a large pool of public IP
addresses which they "own", and they attribute them dynamically to
users when they connect (and put the address back in the pool when the
user disconnects).
If a DoS attack came from a router with a fixed IP address, and
everyone knew that this IP address belongs to company xyz, I'm sure
that it would not be long before company xyz would be facing a big
lawsuit.
But in the case of an ISP, with tens of thousands of customers, each
one of which gets a different IP address each time he turns on his
computer (and anyway once per 24 hours in general), finding out who
exactly was "a234d-45hjk-dialin-atlanta.cox-t-warner.net" between
17:45 and 17:53 yesterday is a bit more time-consuming.
But in that case anyway, you do have a real individual sender IP
address when the packet reaches your server, so you can decide to
block it. And keep blocking all packets from this address for the next
24 hours.
And that's exactly what many servers do.
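As a rough illustration of "block this address for the next 24 hours",
here is a minimal Python sketch. The class TemporaryBlocklist and its
methods are invented for this example; real servers and firewalls have
their own mechanisms for this.

```python
import time
from typing import Dict, Optional

BLOCK_SECONDS = 24 * 60 * 60  # the 24-hour window mentioned above


class TemporaryBlocklist:
    """Hypothetical in-memory list of addresses blocked for a fixed time."""

    def __init__(self, duration: float = BLOCK_SECONDS) -> None:
        self.duration = duration
        self._expires: Dict[str, float] = {}  # ip -> time the block ends

    def block(self, ip: str, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._expires[ip] = now + self.duration

    def is_blocked(self, ip: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        expiry = self._expires.get(ip)
        if expiry is None:
            return False
        if now >= expiry:
            # The block has run out: forget the address, so whoever
            # gets it from the ISP pool next is not rejected forever.
            del self._expires[ip]
            return False
        return True
```

The "now" parameter exists only so the behaviour can be checked without
waiting a day; in normal use it is left out and the current time is
taken.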
And that is also why sometimes you may turn on your PC at home (getting
a brand-new IP address) and find out that you cannot connect to some
server because it is rejecting your IP address. Chances are that you
are unlucky enough to have received today the IP address that was used
yesterday by someone else who used it to send out 1M emails.
But isn't this getting a bit off-topic ?
If you want to know more about this, I suggest you Google a bit on
"blacklists", "greylists" and "whitelists" for example,
or start here : http://en.wikipedia.org/wiki/DNSBL
Thanks. It did go off-track a little bit, but it helps me understand
what I should expect when doing such blocking. Thanks for your
explanation.
Now coming back on track, which of the 2 approaches below is better:
1. Use "deny from IP" in <LocationMatch>
2. Use RewriteCond and call a perl script dynamically. This lets me
configure IPs dynamically without having to stop and start servers
every time I change httpd.conf.
Is there any performance impact of using 2 over 1, or any other
issues?
There will be a very big difference : in case (1), the IP addresses or
ranges are pre-processed by Apache at startup time, and the comparison
will be made by an internal (and fast) Apache module, on the basis of
information in memory. In case (2), not only are you using a rewrite
of the URI, but in addition you will be executing a script, which
itself is going to read an external file. That is going to be several
hundred times slower, at least. Thousands of times slower if you
recompile and execute the script with perl each time (if not under
mod_perl).
Now whether it matters or not in your case depends on the load of your
server. If it is doing nothing anyway 90% of the time, it doesn't
matter.
An Apache restart may or may not be such a big problem either, it all
depends on your circumstances.
But rather than using a perl script, I would definitely in that case
use a mod_perl add-on module written as a PerlAccessHandler. But
that's another story, and one more for the mod_perl list.
I would bet that there exists already such a mod_perl module by the
way. Have a look here :
http://cpan.uwinnipeg.ca/search?query=apache2&mode=dist
or, there is probably an example in the mod_perl Cookbook.
As per your suggestion I looked at PerlAccessHandler. How would this
approach perform compared to "deny from IP"? Is it still going to be
really bad?
<Location /URL>
PerlAccessHandler Example::AccessHandler
</Location>
I will also try running some tests.
Well again, it all depends on your circumstances, what you want to
achieve, how many accesses you expect, why exactly you want to block
or allow some IPs, how many different IPs or IP ranges you would
want to allow/block, how often they change, what makes them change,
whether it is a big problem or not for you to do an Apache
restart, how loaded your system is expected to be, etc..
Even if one solution looks like it is 200 times slower than another,
but your server is only loaded at 10% (happens more frequently than
you would think), and it really makes your life easier for the next
3 years, it's worth looking at.
And even if one solution is 200 times slower than another, that can
still mean 0.1 millisecond, so is it important for you ?
A simple tip :
in the Apache configuration file, you can use an "Include"
directive, I believe just about anywhere, to insert at that point
another bit of configuration file.
You could have a simple text file containing all your
Deny from 1.2.3.4
Deny from 2.3.4.5
...
lines, and include it wherever you want.
Then a simple Apache restart would re-read it.
And this file could be written and re-written by some external script
which decides which IPs are allowed or not. Or edited with vi
manually, if that is how often changes happen.
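Such an external script could look roughly like the Python sketch
below. The function names are invented for the example; only the
"Deny from" line format comes from the text above. The file is written
under a temporary name and then renamed, on the assumption that you do
not want a restart to pick up a half-written file.

```python
import ipaddress
import os
import tempfile
from typing import Iterable


def render_deny_lines(addresses: Iterable[str]) -> str:
    """Turn addresses or CIDR ranges into 'Deny from ...' lines."""
    lines = []
    for addr in addresses:
        # Raises ValueError on anything that is not an IP or a range,
        # so a typo never ends up in the Apache configuration.
        net = ipaddress.ip_network(addr, strict=False)
        if net.num_addresses == 1:
            text = str(net.network_address)  # single host, no prefix
        else:
            text = str(net)                  # range in CIDR notation
        lines.append(f"Deny from {text}\n")
    return "".join(lines)


def write_deny_file(path: str, addresses: Iterable[str]) -> None:
    """Write the include file, replacing the old one in a single step."""
    content = render_deny_lines(addresses)
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(content)
    os.replace(tmp_path, path)
```

The script does not restart Apache itself; the new list only takes
effect once the configuration is re-read, as described above.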
If you have a PerlAccessHandler under mod_perl :
- perl itself is part of the server, so it does not have to be
reloaded each time
- the handler gets compiled once the first time it is run, and the
compiled code is re-used afterward
- it can be smart, and only re-read the IP address list, and rebuild
its internal table when the file changes
- and in the meantime, it uses the table in memory
So in that case you would not have to restart Apache, and any
changes would take effect immediately.
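The "only re-read when the file changes" idea can be sketched as
follows. This is Python, not Perl, and it is not the mod_perl API: it
only illustrates the caching logic that such a handler could use. The
class name ReloadingBlocklist and the one-address-per-line file format
are assumptions made for the example.

```python
import os
from typing import FrozenSet, Optional


class ReloadingBlocklist:
    """Keeps the IP list in memory; re-reads the file only if it changed."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._mtime: Optional[int] = None        # mtime of the loaded file
        self._blocked: FrozenSet[str] = frozenset()

    def _refresh(self) -> None:
        try:
            mtime = os.stat(self.path).st_mtime_ns
        except FileNotFoundError:
            # No file means nothing is blocked.
            self._mtime, self._blocked = None, frozenset()
            return
        if mtime != self._mtime:
            with open(self.path) as handle:
                self._blocked = frozenset(
                    line.strip()
                    for line in handle
                    if line.strip() and not line.startswith("#")
                )
            self._mtime = mtime

    def is_blocked(self, ip: str) -> bool:
        self._refresh()               # one stat() call per lookup
        return ip in self._blocked    # set lookup in memory
```

Each lookup costs one stat() call plus a lookup in memory; the file is
only opened and parsed again after it has been modified. Because the
object lives in the memory of the server process, it keeps its table
between requests.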
Also, something else :
So far, you have been talking about blocking HTTP accesses at the
Apache level. But maybe you want to block more than port 80 from
those IP addresses, and maybe you should do this outside of Apache,
before it even gets to Apache ?
There are many solutions, but you are the one to decide which one
you implement.
Thanks. You are right, we should not even let these people get to
Apache. We have that process in place, but it often takes time to get
that request approved and processed by the Network team. Meanwhile we
want something that lets us block ASAP. I am not sure how often this
list will change. To begin with, this list is going to be empty. Only
when we experience a DoS will we update the IPs.
We expect to get 1000s of requests per second. Since it's going to be
a highly loaded server, I started to think about something that would
change dynamically. You mentioned the code is compiled when Apache
restarts, which means that if I keep the list of IPs as an array inside
the perl script, it is not going to take effect until the next restart.
The only option I see then is to read the list from a flat file. I just
have one basic question about mod_perl. Does the Apache web server
execute one perl process per request ? The reason I am asking is that
you mentioned I could read the list from memory, and I am not sure how
it would read from memory when this script is executed every time it
processes a request. Because if I read from a file, then every request
will open the file and read from it. It looks stateless.
Thanks for the detailed explanation. It clears up a lot of things and
also gives me different viewpoints. The Include directive was a great
tip that I wasn't aware of.