Re: Stick tables, good guys, bad guys, and NATs

2015-01-31 Thread Yuan
Hi Willy,

Thank you very much; this helps a lot.

Regards,

Long Wu Yuan 龙 武 缘 
Sr. Linux Engineer 高级工程师
ChinaNetCloud 云络网络科技(上海)有限公司 | www.ChinaNetCloud.com
1238 Xietu Lu, X2 Space 1-601, Shanghai, China




Re: Stick tables, good guys, bad guys, and NATs

2015-01-31 Thread Willy Tarreau
Hi guys,

On Tue, Jan 27, 2015 at 06:01:13AM +0800, Yuan Long wrote:
> I am in the same fix.
> No matter what we try, what we really need to measure is the actual number of
> laptops/desktops/cellphones/servers. That count is skewed as soon as
> there are a hundred laptops/desktops behind a router.
>
> The best suggestion I have heard is from Willy himself: use base32+src. It
> costs us the plain-text key and leaves a binary value to use in ACLs, but it
> works for now. Grateful to have HAProxy in the first place.

There's no universal rule. Everything depends on how the site is made,
and how the bad guys are acting. For example, some sites may work very
well with a rate-limit on base32+src. That could be the case when you
want to prevent a client from mirroring a whole web site. But for sites
with very few URLs, it could be another story. Conversely, some sites
will provide lots of different links to various objects. Think for
example about a merchant's site where each photo of an object for sale is
a different URL. You wouldn't want to block users who simply click on
"next" and get 50 new photos each time.

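As a purely illustrative sketch (assuming HAProxy 1.5, a frontend called
"fe_web" and made-up size/threshold values), such a rate-limit on base32+src
could look like this:

    frontend fe_web
        bind :80
        # one entry per (URL, source address) pair; base32+src is a binary
        # sample, so the table key must be binary (20 bytes also covers IPv6)
        stick-table type binary len 20 size 1m expire 10m store http_req_rate(10s)
        # count every request under its base32+src key
        http-request track-sc0 base32+src
        # refuse clients hammering the same URL (here: more than 100 hits per 10s)
        http-request deny if { sc0_http_req_rate gt 100 }
        default_backend be_app
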
So the first thing to do is to define how the site is supposed to work.
Next, you define what bad behaviour is, and how to distinguish between
intentional bad behaviour and accidental bad behaviour (e.g. people who
have to hit reload several times because of a poor connection). For most
sites, you have to keep in mind that it's better to let some bad users
pass through than to block legitimate users. So you want to set the cursor
toward the business side rather than the policy-enforcement side.

Proxies, firewalls, etc. make the problem worse, but not too much in general.
You'll easily see some addresses sending 3-10 times more requests than other
ones because they're proxying many users. But if you realize that a valid
user may also reach that level of traffic on regular use of the site, it's
a threshold you have to accept anyway. What would be unlikely, however, is
that all users behind a proxy suddenly browse on steroids. So setting
blocking levels 10 times higher than the average pace you normally observe
might already give very good results.
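
For instance, if a normal client on a given site peaks at around 100 requests
per 10-second window (an assumed figure), a per-address limit placed roughly
ten times higher could be sketched like this:

    frontend fe_public
        bind :80
        # one counter per source address; a NAT or proxy shares a single entry
        stick-table type ip size 200k expire 30m store http_req_rate(10s)
        http-request track-sc0 src
        # ~10x the observed per-client peak, to leave room for shared addresses
        http-request deny if { sc0_http_req_rate gt 1000 }
        default_backend be_app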

If your site is very special and needs to enforce strict rules against
content scraping or spamming (e.g. forums), then you may need to identify the
client and observe cookies. But then there are even fewer generic rules; it
totally depends on the application and the sequence used to access the site. To be
transparent on this subject, we've been involved in helping a significant
number of sites under abuse or attack at HAProxy Technologies, and it
turns out that whatever new magic tricks you find for one site are often
irrelevant to the next one. Each time you have to go back to pencil and
paper and write down the complete browsing sequence and find a few subtle
elements there.
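
As a very rough sketch of the cookie-based variant (assuming the application
issues a session cookie named "SESSIONID"; the real rules depend entirely on
the application):

    frontend fe_forum
        bind :80
        # track activity per application session instead of per address
        stick-table type string len 64 size 100k expire 30m store http_req_rate(60s)
        http-request track-sc0 req.cook(SESSIONID) if { req.cook(SESSIONID) -m found }
        # refuse sessions acting far faster than a human (threshold is only an example)
        http-request deny if { sc0_http_req_rate gt 120 }
        default_backend be_forum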

Regards,
Willy




Re: Stick tables, good guys, bad guys, and NATs

2015-01-26 Thread Yuan Long
I am in the same fix.
No matter what we try, what we really need to measure is the actual number of
laptops/desktops/cellphones/servers. That count is skewed as soon as
there are a hundred laptops/desktops behind a router.

The best suggestion I have heard is from Willy himself: use base32+src. It
costs us the plain-text key and leaves a binary value to use in ACLs, but it
works for now. Grateful to have HAProxy in the first place.
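
For example, assuming a stats socket is configured at /var/run/haproxy.sock
and the table lives in a frontend named "fe_web", the entries can still be
inspected; the base32+src keys just show up as hex rather than as readable
addresses:

    echo "show table fe_web" | socat stdio /var/run/haproxy.sock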

Regards,

Long Wu Yuan 龙 武 缘
Sr. Linux Engineer 高级工程师
ChinaNetCloud 云络网络科技(上海)有限公司 | www.ChinaNetCloud.com
1238 Xietu Lu, X2 Space 1-601, Shanghai, China



Stick tables, good guys, bad guys, and NATs

2015-01-26 Thread CJ Ess
I am upgrading my environment from haproxy 1.3/1.4 to haproxy 1.5, but am not
yet using any of the newer features.

I'm intrigued by the stick table facilities in haproxy 1.5 as a way to help
mitigate the impact of malicious users, and that seems to be a common goal -
however, I haven't seen any discussion about large groups of users behind
NATs and firewalls (businesses, universities, mobile, etc.). Has anyone
found a happy medium between these two concerns? Aside from whitelisting
and the blocks aging out over time.
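
For reference, the whitelist-plus-aging approach I mean is roughly this
(names, the list file and the thresholds are only examples):

    frontend fe_web
        bind :80
        # gpc0 acts as an "abuser" flag; the entry - and thus the block -
        # ages out "expire" after the offender was last seen
        stick-table type ip size 200k expire 10m store gpc0,http_req_rate(10s)
        # known NAT/proxy ranges, one address or CIDR per line
        http-request allow if { src -f /etc/haproxy/whitelist.lst }
        http-request track-sc0 src
        acl abuse      sc0_http_req_rate gt 300
        acl flagged    sc0_get_gpc0 gt 0
        acl mark_abuse sc0_inc_gpc0 gt 0
        # flag and refuse once the rate is exceeded, keep refusing while flagged
        http-request deny if abuse mark_abuse
        http-request deny if flagged
        default_backend be_app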

One thought I had, in a virtual hosting environment, was to use a stick
table to track the number of requests by Host header, and direct requests
to a different backend (with dedicated resources) once the request rate for a
particular vhost crosses a threshold - then rejoin the common pool once the
traffic dies down. Has anyone been successful with a similar setup?
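
As a sketch of what I have in mind (backend names and the threshold are only
placeholders), the vhost would rejoin the shared pool automatically once its
request rate decays back below the limit:

    frontend fe_vhosts
        bind :80
        # one counter per Host header value
        stick-table type string len 64 size 1k expire 10m store http_req_rate(60s)
        http-request track-sc2 req.hdr(host),lower
        # send a busy vhost to dedicated servers while it exceeds ~1000 req/min
        use_backend be_isolated if { sc2_http_req_rate gt 1000 }
        default_backend be_shared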