loss of packets

2005-01-26 Thread marc gmx
I am still trying to use NAT with pf on OpenBSD.

I send 1000 SNMP requests (UDP packets) to 1000 different IPs.
The packets pass from interface bge0 to interface bge1.
I put the NAT on interface bge1.

There is a significant loss of packets.

The counter "Packets In/Blocked" for interface bge0 indicates a value
of 124. Why?

pfctl -s all
TRANSLATION RULES:
nat on bge1 inet from 172.19.40.0/24 to 10.128.0.0/9 -> (bge1) round-robin

FILTER RULES:
block drop in log all
block drop out log all
pass out all keep state
pass quick on lo all
pass quick on bge0 all
No queue in use

STATES:
self udp 172.19.40.169:1024 -> 192.168.13.3:52939 -> 10.128.1.0:161   SINGLE:NO_TRAFFIC
self udp 172.19.40.169:1024 -> 192.168.13.3:54406 -> 10.128.2.0:161   SINGLE:NO_TRAFFIC
self udp 172.19.40.169:1024 -> 192.168.13.3:55997 -> 10.128.0.1:161   SINGLE:NO_TRAFFIC
self udp 172.19.40.169:1024 -> 192.168.13.3:50088 -> 10.128.1.1:161   SINGLE:NO_TRAFFIC
self udp 172.19.40.169:1024 -> 192.168.13.3:59982 -> 10.128.2.1:161   SINGLE:NO_TRAFFIC
self udp 172.19.40.169:1024 -> 192.168.13.3:59460 -> 10.128.0.2:161   SINGLE:NO_TRAFFIC
self udp 172.19.40.169:1024 -> 192.168.13.3:64233 -> 10.128.1.2:161   SINGLE:NO_TRAFFIC
...
self udp 172.19.40.169:1024 -> 192.168.13.3:56339 -> 10.128.0.255:161   SINGLE:NO_TRAFFIC
self udp 172.19.40.169:1024 -> 192.168.13.3:55663 -> 10.128.1.255:161   SINGLE:NO_TRAFFIC

INFO:
Status: Enabled for 0 days 00:00:32 Debug: Misc

Hostid: 0x500b7878

Interface Stats for bge0           IPv4     IPv6
  Bytes In                        77763        0
  Bytes Out                       72860      352
  Packets In
    Passed                         1007        0
    Blocked                         124        0
  Packets Out
    Passed                          101        1
    Blocked                           0        4

State Table                       Total     Rate
  current entries                   872
  searches                         2986    93.3/s
  inserts                           872    27.2/s
  removals                            0     0.0/s
Counters
  match                            1990    62.2/s
  bad-offset                          0     0.0/s
  fragment                            0     0.0/s
  short                               0     0.0/s
  normalize                           0     0.0/s
  memory                              0     0.0/s
  bad-timestamp                       0     0.0/s

TIMEOUTS:
tcp.first  3600s
tcp.opening 900s
tcp.established  432000s
tcp.closing3600s
tcp.finwait 600s
tcp.closed  180s
tcp.tsdiff   60s
udp.first60s
udp.single   30s
udp.multiple 60s
icmp.first   20s
icmp.error   10s
other.first  60s
other.single 30s
other.multiple   60s
frag 30s
interval 10s
adaptive.start0 states
adaptive.end  0 states
src.track 0s

LIMITS:
states hard limit 20
src-nodes  hard limit  1
frags  hard limit   5000

TABLES:

OS FINGERPRINTS:
345 fingerprints loaded
/root #


Re: loss of packets

2005-01-26 Thread Daniel Hartmeier
On Wed, Jan 26, 2005 at 11:44:21AM +0100, marc gmx wrote:

> The counter "Packets In/Blocked" for interface bge0 indicates a value
> of 124. Why?

One explanation would be that those 124 packets had invalid IP or UDP
checksums. Before you assume that's impossible, check the output of

  $ netstat -sp udp
  $ netstat -sp ip

before and after the test. If any non-obvious counter is increasing,
that would be a lead.

It might also help to capture all test packets on bge0 and bge1 with

  $ tcpdump -i bge0 -w bge0.pcap
  $ tcpdump -i bge1 -w bge1.pcap

then run the test.

You'll see all packets received on bge0 with

  $ tcpdump -nvvvXr bge0.pcap

including those blocked by pf (as tcpdump using bpf gets them earlier).

Now compare that with those packets that came out on bge1 (using
bge1.pcap). You should be able to associate each outgoing packet with an
incoming one, based on the destination address (which is unique in your
test, if I understand correctly). If you find packets in bge0.pcap that
have no corresponding entry in bge1.pcap, show us the full output of
tcpdump -nvvvX for those packets only (one or two examples should be
enough). If there are invalid checksums or IP options or other packet
corruptions, we'll see it in the hexdump.
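The comparison Daniel describes can be scripted once the destination addresses have been extracted from each capture, e.g. with something like `tcpdump -nr bge0.pcap | awk '{print $4}' > bge0-dst.txt` (the field position varies by tcpdump version, so treat that awk as a sketch). The core logic is just a set difference; a minimal example in Python with made-up sample data:

```python
def missing_destinations(seen_in, seen_out):
    """Return destinations captured on the inside interface (bge0) that
    never appeared on the outside interface (bge1)."""
    return sorted(set(seen_in) - set(seen_out))

if __name__ == "__main__":
    # Hypothetical stand-ins for the two extracted address lists.
    bge0 = ["10.128.0.0", "10.128.0.1", "10.128.0.2", "10.128.0.3"]
    bge1 = ["10.128.0.0", "10.128.0.2", "10.128.0.3"]
    # Any address printed here is a packet that came in but never went out,
    # i.e. a candidate to inspect with tcpdump -nvvvX.
    print(missing_destinations(bge0, bge1))  # prints ['10.128.0.1']
```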

I don't think this is related to the NATing at all. If there was a
problem in the NAT code (like failure to allocate a proxy port, or
similar), you wouldn't see packets counted as 'blocked in on bge0'.

More likely, the perl script is not correctly setting the checksums in
all cases, or there's corruption due to collisions or such.

Daniel


Re: transparent squid and load balancing outgoing traffic

2005-01-26 Thread Emilio Lucena
Kevin,

First of all, thanks for your help.


On Tue, 25 Jan 2005, Kevin wrote:

> Can you provide more information on your load-balancing configuration,
> specifically on what the two external interfaces are connected through?
> Are you doing any NAT?

Yes .. we are doing NAT.

lan_net=$int_if:network

nat on $ext_if1 from $lan_net to any -> ($ext_if1)
nat on $ext_if2 from $lan_net to any -> ($ext_if2)

Also, we are redirecting web traffic to the firewall where squid runs.

rdr on $int_if inet proto tcp from $lan_net to any port www -> 127.0.0.1 port 3128

The external interfaces are connected to a cable Internet connection and 
an ADSL Internet connection.

> I think this line should read:
> pass in quick on $int_if inet proto tcp from $int_net to ($int_if)
> port=3128 keep state

Daniel's setup for transparent squid tells us to use the rule I mentioned.

> If you set tcp_outgoing_address to an alias IP on $int_if, you could try this:
> pass out route-to \
> { ($ext_if1 ) , ($ext_if2 ) }  round-robin \
> inet proto tcp from $squid_ip to any flags S/SA modulate state
> 

Thanks ... I will try this and see if it works.

> Depending on how your inbound traffic is load-balanced, you might not need to
> do any tricks, as 99.99% of the squid-related traffic is going to be 
> downloads,
> limiting the need to load-balance outbound -- the exception being if you are
> using NAT to rewrite outbound sessions to be sourced with a different ext_if
> interface address to force reply traffic to come back the same path it went 
> out?

This is how I am doing inbound load balancing:

# Internal services on the LAN

pass in log on $ext_if1 from any to $intweb tag from_ef1 keep state
pass in log on $ext_if2 from any to $intweb tag from_ef2 keep state

# packets for the internal webserver

pass out log on $int_if reply-to ($ext_if1 $gw_if1) \
 from any to $intweb tagged from_ef1 keep state
pass out log on $int_if reply-to ($ext_if2 $gw_if2) \
 from any to $intweb tagged from_ef2 keep state

Regards,

ebl


Re: altq and rate limiting (packets/sec)

2005-01-26 Thread mikem170
AFAIK pf rate-limits based on bits per second, not packets per second. 
qlimit controls the depth of a queue, not how fast it is emptied.

You could have two queues, one for SYN packets and one for other traffic. 
The SYN packet queue can be rate-limited to X bits/second, derived from 
the known small SYN packet size.
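A rough pf.conf sketch of that two-queue idea (interface name and bandwidths are made up; at roughly 64 bytes per SYN, 56Kb/s works out to on the order of 100 SYNs per second):

```
# hypothetical altq setup: a small, rate-limited queue for outgoing SYNs
altq on fxp0 cbq bandwidth 10Mb queue { q_syn, q_def }
queue q_syn bandwidth 56Kb cbq
queue q_def bandwidth 9.9Mb cbq(default)

# new outbound TCP connections go through the rate-limited SYN queue
pass out on fxp0 proto tcp flags S/SA keep state queue q_syn
```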

Mike
On Tue, 25 Jan 2005, Christopher Linn wrote:
i am interested in using altq to limit the outflow from an rfc1918
NAT'd network to alleviate the possibility of e.g. DDoS attacks
originating from within the NAT.
one of our security guys (who is not familiar with pf) mentioned to
me that i should look for something to rate-limit (packets/sec)
outgoing, since for example a DDoS SYN flood pointed at a webserver
port 80/443 just spews little packets at a high rate.  but the
closest thing i see to this is the "qlimit" parameter for max
packets queued.. doesn't really seem like it would be the same thing.
am i missing something?  has this issue been discussed?
i suspect i am missing something..
cheers,
chris linn
--
Christopher Linn, (celinn at mtu.edu) | By no means shall either the CEC
Staff System Administrator| or MTU be held in any way liable
 Center for Experimental Computation | for any opinions or conjecture I
   Michigan Technological University | hold to or imply to hold herein.


Re: altq and rate limiting (packets/sec)

2005-01-26 Thread Christopher Linn
On Wed, Jan 26, 2005 at 09:48:06AM -0500, [EMAIL PROTECTED] wrote:
> On Tue, 25 Jan 2005, Christopher Linn wrote:
> 
> >i am interested in using altq to limit the outflow from an rfc1918
> >NAT'd network to alleviate the possibility of e.g. DDoS attacks
> >originating from within the NAT.
> >
> >one of our security guys (who is not familiar with pf) mentioned to
> >me that i should look for something to rate-limit (packets/sec)
> >outgoing, since for example a DDoS SYN flood pointed at a webserver
> >port 80/443 just spews little packets at a high rate.  but the
> >closest thing i see to this is the "qlimit" parameter for max
> >packets queued.. doesn't really seem like it would be the same thing.
> >
> >am i missing something?  has this issue been discussed?
> >
> >i suspect i am missing something..
> >
> >cheers,
> >
> >chris linn
> >
> 
> AFAIK pf rate-limits based on bits per second, not packets per second. 
> qlimit controls depth of queues, not how fast they are emptied.
> 
> You could have two queues, one for syn packets and one for other traffic. 
> The syn packet queue can be rate limited to X bits/second which can be 
> based on known small syn packet size.
> 
> Mike


as i read this i said to myself, D'OH!  we're _queueing_ here!..  ;*)

daniel also gave me this advice, which seems like it might even be more 
appropriate:

"... but i wouldn't limit that with altq at all, use max-src-state or
similar to limit the number of concurrent _states_ a client can create
which limits syn floods (any connection flood, actually) nicely.  if
you limit a client to 100 concurrent connections, he can start nmap
or his favorite syn flood tool, but pf will only create the first 100
states, passing 100 syns, then block until old connections are
closed/time out."
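Daniel's suggestion translates to a stateful tracking option on the pass rule; a minimal sketch (the interface and network macros here are hypothetical):

```
# limit each internal host to 100 concurrent states; once the limit is
# reached, further connection attempts are dropped until old states expire
pass in on $int_if proto tcp from $int_net to any flags S/SA \
keep state (max-src-states 100)
```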

also to clarify myself, i wasn't specifically worried about SYN floods,
but rather any possible flooding that anyone might think up, therefore 
i was thinking in terms of a more general solution.  seems like the
use of max-src-states (see pf.conf(5) under STATEFUL TRACKING OPTIONS)
fits this nicely.  thanks daniel  ;*>

cheers,

chris

-- 
Christopher Linn, (celinn at mtu.edu) | By no means shall either the CEC
Staff System Administrator| or MTU be held in any way liable
  Center for Experimental Computation | for any opinions or conjecture I
Michigan Technological University | hold to or imply to hold herein.


Re: Tagging didn't work as expected

2005-01-26 Thread Peter Fraser

Daniel Hartmeier [EMAIL PROTECTED] wrote that my use of tagging 
should work, so I moved the tagging rules to the very top of my rule set
and did a traceroute from a different machine. This is the result:
# pfctl -vvvsr
@0 scrub in all fragment reassemble
  [ Evaluations: 121941  Packets: 63360  Bytes: 0     States: 0 ]
@0 block drop log all
  [ Evaluations: 4171    Packets: 10     Bytes: 468   States: 0 ]
@1 block drop log quick inet6 all
  [ Evaluations: 4171    Packets: 0      Bytes: 0     States: 0 ]
@2 block drop in inet proto icmp all icmp-type echoreq tag icmp
  [ Evaluations: 4171    Packets: 0      Bytes: 0     States: 0 ]
@3 pass in quick from  to any keep state tagged icmp
  [ Evaluations: 3533    Packets: 0      Bytes: 0     States: 0 ]
@4 pass in quick on ste0 from any to  keep state tagged icmp
  [ Evaluations: 3533    Packets: 0      Bytes: 0     States: 0 ]
@5 pass in quick on fxp0 all keep state tagged icmp
  [ Evaluations: 3533    Packets: 71     Bytes: 6016  States: 1 ]
@6 pass in quick on ste1 all keep state tagged icmp
  [ Evaluations: 3532    Packets: 0      Bytes: 0     States: 0 ]
@7 pass in quick on ste2 all keep state tagged icmp
  [ Evaluations: 3532    Packets: 0      Bytes: 0     States: 0 ]
@8 pass out quick all keep state tagged icmp
  [ Evaluations: 4171    Packets: 0      Bytes: 0     States: 0 ]
@9 pass out quick inet proto icmp all icmp-type echoreq keep state
  [ Evaluations: 639     Packets: 65     Bytes: 5572  States: 1 ]


I don't understand why rule 8 was not used. (There are other rules after
9 which I didn't include, but I do not believe they could affect this
example.)


Re: Tagging didn't work as expected

2005-01-26 Thread Daniel Hartmeier
On Wed, Jan 26, 2005 at 12:49:13PM -0500, Peter Fraser wrote:

> Daniel Hartmeier [EMAIL PROTECTED] wrote that my use of tagging 
> should work. So I moved the tagging rules to the very top of my rule set
> and did a traceroute from a different machine . This is the result

I think you didn't mention traceroute before. If you retry with ping(8)
instead, you'll see that it works. Here's why.

When you use traceroute, the sender will start with an ICMP echo request
with TTL 1, which elicits an ICMP time exceeded error from the nearest
router (an IP forwarder, which decrements the TTL). It repeats this with
increasing TTL values. The chronological order of the ICMP time exceeded
errors produces the list of hops, from nearest to farthest.

The first ICMP echo request pf will see coming in will have TTL 1 when
it reaches the pf box. Your rule @2 will match, tag the packet with tag
icmp, create state, and pass the packet.

The IP forwarding code in the kernel will, however, not forward that
first packet, since after decrementing, the TTL will be 0. Instead, it
produces the ICMP time exceeded error.

The traceroute machine will print the pf box as a hop on the path, and
send the next ICMP echo request with a TTL one higher than the last one.
This second packet arrives in on the pf box, matches the state previously
created, and gets passed due to the state entry, _without_ any ruleset
evaluations.

The important detail is that this second packet will not get tagged. In
stateful filtering, only the first packet creating the state entry will
cause tagging. Packets matching this state later on will not get tagged.
This is the reason why the man page mentions:

  pass rules that use the tag keyword must also use keep state, modulate
  state or synproxy state.

The reason for this limitation is merely performance. You generally
don't want to tag further packets of a connection, once it has created
state entries. In most cases, adding tags to packets matching state
entries would just be a waste of CPU cycles.

This second ICMP echo request has a higher TTL, so it doesn't become 0
after decrement, and gets forwarded. This is the first packet that
actually gets sent out on the second interface on the pf box.

There is no state entry on that interface yet, so the ruleset is
evaluated. The 'tagged icmp' rule does not match because this second
packet was never tagged!

If you use ping(8) instead of traceroute, the first packet that creates
state on the first interface will also be the one that gets filtered
first on the second interface. It will have the tag and match the
'tagged' rule. Once states exist on both interfaces, packets related to
this ping(8) invocation will pass matching these states, without
evaluating any more rules.

If you'd use a bridge instead of an IP forwarder, this issue wouldn't
come up, as bridges do not decrement TTL when forwarding frames.

One thing you can do to address the issue on an IP forwarder is to use
scrub's 'min-ttl' option. For instance, you can have all TTL values of
incoming packets increased from 1 to 2. This ensures that no packet will
be dropped by the IP forwarding code due to TTL reaching 0. It has the
side-effect that the pf machine will no longer produce ICMP time
exceeded errors, and therefore won't show up in traceroute output as a
hop. Some people use the 'min-ttl' option precisely because of that
effect, to hide the hop. In your case, it might be just an unwelcome
side-effect.
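The min-ttl workaround is a single scrub option; a sketch using a hypothetical interface name:

```
# raise the TTL of incoming packets to at least 2, so the first
# traceroute probe survives forwarding and gets tagged normally
scrub in on fxp0 min-ttl 2
```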

Nice puzzle, took me half an hour to find the explanation :)

Daniel


Re: Tagging didn't work as expected

2005-01-26 Thread Lester

On Jan 26, 2005, at 2:44 PM, Daniel Hartmeier wrote:

[...]


A. Lester Burke
Network Analyst
Arlington Public School Arlington VA
E [EMAIL PROTECTED]
V 703-228-6057

The wise man can learn from the fool but the fool cannot learn from the wise man.

Dad


Re: transparent squid and load balancing outgoing traffic

2005-01-26 Thread Emilio Lucena
From what I could understand, the tcp_outgoing_address is only really used
if you are not doing NAT on the external connections, right?

If that is the case, the proposed rule will never be matched, and the web
traffic will only go through the default outbound interface, bypassing the 
load-balance policy.

Is my understanding correct? 

Regards,

ebl

On Wed, 26 Jan 2005, Emilio Lucena wrote:

> [...]


Re: transparent squid and load balancing outgoing traffic

2005-01-26 Thread Daniel Hartmeier
On Tue, Jan 25, 2005 at 06:19:36PM -0300, Emilio Lucena wrote:

> Then the traffic is delivered to squid to be dealt with. But, then this 
> means squid will use the default route to open the http connection to the 
> Internet server and bypass the load balance rule, right? 

Yes, the connections from squid to the external web servers are not
passing through $int_if at all, and are unrelated (for pf) to the client
connections causing them.

> So, is this setup incompatible or there is some trick I can do to make it 
> work?

Instead of using route-to on $int_if, you can let connections go out
through the one interface to the default gateway, and use route-to on a
'pass out on $ext_if1' rule to re-route the outgoing connection to
another interface. Packets will 'try' to get out on the default
interface, but re-routing occurs before they are actually sent out
through the interface.

  pass out on $ext_if1 route-to { ($ext_if1 $gwy_if1), \
($ext_if2 $gwy_if2) } round-robin ... keep state

Where $ext_if1 is the interface to your default gateway, through which
all connections go out by default. Half of them will be re-routed out on
$ext_if2, and half will go out through $ext_if1.

You'd use the same construct if you wanted to load-balance outgoing
connections opened by the firewall itself (say, a DNS server there),
which don't arrive in on any interface at all.

Daniel