Re: TCP traffic multiplexing as balance algorithm?

2009-05-13 Thread Michael Miller
Hi,

In your specific case of testing SMTP servers there is a sendmail milter
to do what you want:
http://www.snertsoft.com/sendmail/roundhouse/

I do not believe that what you are trying to achieve is possible at the
TCP level. haproxy has no knowledge of the application protocol
(e.g. SMTP) running over the transport (TCP). You really need some form
of application-layer proxy to handle the duplication of your requests to
the two servers.

Regards,
Mike




Re: TCP traffic multiplexing as balance algorithm?

2009-05-13 Thread Maik Broemme
Hi,

Benoit  wrote:
> Maik Broemme a écrit :
> > Hi,
> >   
> >
> > Multiplex means traffic duplication. If you have multiple server
> > configuration options in one listen group, the incoming traffic is
> > sent to all servers.
> >   
> 
> Hmm, I'm sorry but no, multiplexing is not duplication. In fact it's
> more like the opposite:
> it's the process of combining multiple data streams into one (long
> description here: http://en.wikipedia.org/wiki/Multiplexing)

Sorry, multiplexing was the wrong word for it; I really mean
duplication.

> >
> > tcpdump is not perfect in that case, because it has to run the whole time
> > you want to duplicate the traffic and send it to server1 and server2.
> >   
> So let's say you patch haproxy and it duplicates traffic to two servers
> and is able to discard
> data from one (the dev/test one) and keep the other (the prod one).
> 
> How will haproxy be able to react to the different responses/timing
> from each server?
> Let's say you duplicate MX traffic, with the test server being used to
> validate a new configuration
> to keep away spammers (simple example).
> Let's say both servers are able to answer at exactly the same time, or
> aren't too fussy about having the answer
> before the question. When the prod server accepts the mail message
> and starts listening for its main
> part, the other could have rejected it; however, it will still receive the
> main part while not expecting it,
> which, depending on the implementation, could lead to trouble, for example if
> the originating mail server tries to
> send another mail over the same connection.
> 

I have been thinking about the timing issues. My idea is to add an option to
the duplicate balance algorithm, let's say 'async' or 'sync'. In 'async'
mode haproxy will send traffic to dev/test and only care about the
response from dev, regardless of whether test responds or not. A late answer
from test will be dropped by haproxy. In 'sync' mode haproxy will wait
until dev/test have both answered and then send the answer from dev to the
client.

In short:

  - async will drop everything from test, regardless of answer
    time, and send everything to test regardless of whether it is
    expected or not.

  - sync will drop everything from test, but wait until it has answered.

There will certainly not be many scenarios where you need such a
feature.
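
The 'async' behaviour described above could be sketched outside haproxy roughly as follows. This is only a minimal illustration in Python's asyncio, not a proposed haproxy patch: the names `pump`, `discard`, `make_handler` and `serve` are made up for this sketch, the backends are called "primary" (the one whose answers reach the client) and "mirror" (the one whose answers are dropped), and the listening port 2525 is an arbitrary assumption. The backend addresses are the ones from the example configuration in this thread. The 'sync' variant is not shown.

```python
import asyncio


async def pump(reader, writer, mirror=None):
    """Copy bytes from reader to writer; optionally copy them to mirror too."""
    try:
        while True:
            data = await reader.read(4096)
            if not data:  # EOF from the sending side
                break
            writer.write(data)
            if mirror is not None:
                # Fire-and-forget: the mirror is never awaited, so a slow
                # mirror cannot stall the primary path. The price is that
                # the mirror's send buffer is unbounded in this sketch.
                mirror.write(data)
            await writer.drain()
    finally:
        writer.close()
        if mirror is not None:
            mirror.close()


async def discard(reader):
    """Read and drop everything the mirror backend sends back."""
    while await reader.read(4096):
        pass


def make_handler(primary_addr, mirror_addr):
    """Build a per-connection handler for asyncio.start_server()."""
    async def handle(client_r, client_w):
        prim_r, prim_w = await asyncio.open_connection(*primary_addr)
        mirr_r, mirr_w = await asyncio.open_connection(*mirror_addr)
        await asyncio.gather(
            pump(client_r, prim_w, mirror=mirr_w),  # client -> both backends
            pump(prim_r, client_w),                 # primary -> client
            discard(mirr_r),                        # mirror -> nowhere
            return_exceptions=True,                 # a mirror error is ignored
        )
    return handle


async def serve():
    """Listen on 127.0.0.1:2525 (assumed port) and duplicate to two backends."""
    handler = make_handler(("10.0.0.5", 25), ("10.0.0.6", 25))
    server = await asyncio.start_server(handler, "127.0.0.1", 2525)
    async with server:
        await server.serve_forever()
```

Running `asyncio.run(serve())` would start the listener. Note that this duplicates raw bytes only: it has no idea whether the mirror has accepted or rejected anything at the protocol level, which is exactly the timing problem raised earlier in the thread.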

> 
> This is a very simple example and most MX implementations could react
> correctly, but that's not true of
> everything.
> 
> 

--Maik



Re: TCP traffic multiplexing as balance algorithm?

2009-05-13 Thread Benoit
Maik Broemme a écrit :
> Hi,
>   
>
> Multiplex means traffic duplication. If you have multiple server
> configuration options in one listen group, the incoming traffic is
> sent to all servers.
>   

Hmm, I'm sorry but no, multiplexing is not duplication. In fact it's
more like the opposite:
it's the process of combining multiple data streams into one (long
description here: http://en.wikipedia.org/wiki/Multiplexing)
>
> tcpdump is not perfect in that case, because it has to run the whole time
> you want to duplicate the traffic and send it to server1 and server2.
>   
So let's say you patch haproxy and it duplicates traffic to two servers
and is able to discard
data from one (the dev/test one) and keep the other (the prod one).

How will haproxy be able to react to the different responses/timing
from each server?
Let's say you duplicate MX traffic, with the test server being used to
validate a new configuration
to keep away spammers (simple example).
Let's say both servers are able to answer at exactly the same time, or
aren't too fussy about having the answer
before the question. When the prod server accepts the mail message
and starts listening for its main
part, the other could have rejected it; however, it will still receive the
main part while not expecting it,
which, depending on the implementation, could lead to trouble, for example if
the originating mail server tries to
send another mail over the same connection.


This is a very simple example and most MX implementations could react
correctly, but that's not true of
everything.
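
The divergence described above can be made concrete with a toy model. The sketch below is purely illustrative: `ToySmtp` is a made-up, heavily simplified SMTP-like state machine (not a real MTA and not RFC-complete), and the addresses are placeholders. The same client lines are fed blindly to a "prod" instance that accepts the recipient and a "test" instance that rejects it, as a byte-level duplicator would do.

```python
class ToySmtp:
    """Tiny SMTP-like state machine (hypothetical, for illustration only)."""

    def __init__(self, reject_rcpt=False):
        self.reject_rcpt = reject_rcpt  # simulate a stricter anti-spam config
        self.have_rcpt = False
        self.in_data = False

    def feed(self, line):
        """Consume one client line and return the reply (None for body lines)."""
        if self.in_data:
            if line == ".":  # end-of-message marker
                self.in_data = False
                self.have_rcpt = False
                return "250 queued"
            return None  # message body: no reply expected
        verb = line.split(" ", 1)[0].upper()
        if verb == "MAIL":
            return "250 ok"
        if verb == "RCPT":
            if self.reject_rcpt:
                return "550 rejected"
            self.have_rcpt = True
            return "250 ok"
        if verb == "DATA":
            if not self.have_rcpt:
                return "503 need RCPT first"
            self.in_data = True
            return "354 go ahead"
        return "500 unknown command"


# The client only ever sees prod's replies, so it sends the whole message.
session = [
    "MAIL FROM:<a@example.org>",
    "RCPT TO:<b@example.org>",
    "DATA",
    "Subject: hi",
    "",
    "body text",
    ".",
]

prod = ToySmtp()
test = ToySmtp(reject_rcpt=True)
prod_replies = [prod.feed(line) for line in session]
test_replies = [test.feed(line) for line in session]

for line, p, t in zip(session, prod_replies, test_replies):
    print(f"{line!r:32} prod={p!r:16} test={t!r}")
```

In this model the prod instance ends the session by queueing the message, while the test instance rejects the recipient, refuses DATA, and then interprets every line of the message body as an unknown command. How a real server behaves at that point depends on its implementation.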




Re: TCP traffic multiplexing as balance algorithm?

2009-05-13 Thread Maik Broemme
Hi,

Willy Tarreau  wrote:
> Hi Maik,
> 
> On Tue, May 12, 2009 at 01:57:47AM +0200, Maik Broemme wrote:
> > Hi,
> > 
> > I have a small question. Does anyone know if it is possible to do simple
> > traffic multiplexing with HAProxy? Maybe I am missing it somehow, but I
> > want to ask on the list before creating a patch for it.
> 
> what do you call "traffic multiplexing" ? From your description below, I
> failed to understand what it consists in.
> 

Multiplex means traffic duplication. If you have multiple server
configuration options in one listen group, the incoming traffic is
sent to all servers.

> > Just to answer the real-world scenario question: TCP multiplexing can be
> > very useful for debugging backend servers or for simple logging and
> > passive traffic dumping.
> > 
> > There are two main ways to implement it:
> > 
> >   - 1:N (Active / Passive)
> >   - 1:N (Active / Active)
> > 
> > Active means that the request goes to the destination and the response
> > comes back to the client; passive means that only the request goes to the
> > destination. In the configuration it could look like:
> > 
> > listen  smtp-filter 127.0.0.1:25
> >   mode    tcp
> >   balance multiplex
> >   server  smtp1 10.0.0.5:25
> >   server  smtp2 10.0.0.6:25
> > 
> > Active / active would be very hard to implement: TCP stream
> > synchronisation would be a pain, and I think no one will really need
> > it. Active / passive, however, is a very useful feature.
> > 
> > In my environment it is often the case that developers need access to real
> > traffic data to debug (in the example above) the SMTP software they are
> > developing. Is anyone else missing such functionality? :)
> 
> Access to real data is solved with tcpdump or logs, I don't see what
> your load-balancing method will bring here.
> 

tcpdump is not perfect in that case, because it has to run the whole time
you want to duplicate the traffic and send it to server1 and server2.

> > --Maik
> 
> Regards,
> Willy
> 

--Maik



Re: TCP traffic multiplexing as balance algorithm?

2009-05-12 Thread Willy Tarreau
Hi Maik,

On Tue, May 12, 2009 at 01:57:47AM +0200, Maik Broemme wrote:
> Hi,
> 
> I have a small question. Does anyone know if it is possible to do simple
> traffic multiplexing with HAProxy? Maybe I am missing it somehow, but I
> want to ask on the list before creating a patch for it.

what do you call "traffic multiplexing" ? From your description below, I
failed to understand what it consists in.

> Just to answer the real-world scenario question: TCP multiplexing can be
> very useful for debugging backend servers or for simple logging and
> passive traffic dumping.
> 
> There are two main ways to implement it:
> 
>   - 1:N (Active / Passive)
>   - 1:N (Active / Active)
> 
> Active means that the request goes to the destination and the response
> comes back to the client; passive means that only the request goes to the
> destination. In the configuration it could look like:
> 
>   listen  smtp-filter 127.0.0.1:25
>       mode    tcp
>       balance multiplex
>       server  smtp1 10.0.0.5:25
>       server  smtp2 10.0.0.6:25
> 
> Active / active would be very hard to implement: TCP stream
> synchronisation would be a pain, and I think no one will really need
> it. Active / passive, however, is a very useful feature.
> 
> In my environment it is often the case that developers need access to real
> traffic data to debug (in the example above) the SMTP software they are
> developing. Is anyone else missing such functionality? :)

Access to real data is solved with tcpdump or logs, I don't see what
your load-balancing method will bring here.

> --Maik

Regards,
Willy