Re: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c

2016-02-01 Thread Ruediger Pluem


On 02/01/2016 03:54 PM, Yann Ylavic wrote:
> On Mon, Feb 1, 2016 at 3:32 PM, Ruediger Pluem  wrote:
>>>
>>> I am fine with flushing at the end of the loop. If no one opposes I can do 
>>> this.
>>>
>>
>> Looking into this in more detail, the following thoughts came up:
>>
>> How many iterations do we think we typically do in this loop,
>> i.e. how often do we call ap_get_brigade with a non-blocking read?
>> My gut feeling tells me that we only do this once, at least in the
>> websocket case.
> 
> I guess it depends on the size of the outgoing traffic; if the backend
> sometimes sends large data at once, we may end up needing more than one
> non-blocking read (AP_IOBUFSIZE/8K each) to forward the fragments.

True, and I guess this is often the case for mod_proxy_connect. For
mod_proxy_wstunnel I would expect that we see stuff below 8k, e.g. chat
messages and the like, in most use cases.

> 
>> If this is the case, the discussion whether to flush after the loop or
>> in the loop wouldn't be relevant any longer, but more importantly:
>> if it is OK to flush in the loop, I think it would be better to add the
>> flush bucket directly to bb_out instead of calling ap_fflush afterwards.
>> The reason for this is that it gives downstream filters better
>> information on what they should do.
> 
> +1
> 
>> With the current approach they might set the buckets aside, store them
>> for later processing, and only process them in the second round when
>> they see the flush. If the flush were in the first brigade, at least
>> some of them would process the buckets directly without setting them
>> aside.
>> Comments?
>> Comments?
> 
> That's the "issue" with transient buckets: the core output filter will
> almost always set them aside if not flushed (including in the
> mod_proxy_http case).
> It wouldn't have to (or this would be a noop) if the buckets were
> allocated on its bucket_alloc instead of the origin's.
> But I agree that for now your approach makes sense.

I guess I will hack something up and we can discuss further once I have a proposal.

Regards

Rüdiger



Re: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c

2016-02-01 Thread Ruediger Pluem


On 01/27/2016 08:50 PM, Ruediger Pluem wrote:
> 
> 
> On 01/27/2016 12:46 PM, Yann Ylavic wrote:
>> On Wed, Jan 27, 2016 at 10:02 AM, Plüm, Rüdiger, Vodafone Group
>>  wrote:
>>>
 -Original Message-
 From: Yann Ylavic [mailto:ylavic@gmail.com]
 Sent: Wednesday, January 27, 2016 09:15
 To: httpd-dev
 Subject: Re: svn commit: r1726787 -

> 
>>
>>>
 (btw that could be optimized to flush outside the loop).
>>>
>>> Could be. The question is whether this is OK from the websocket
>>> responsiveness point of view.
>>> OTOH we do a non-blocking read, so this should be fine.
>>
>> Yes, while successive non-blocking reads give us data I don't think we
>> should "fragment" on the other side; let the core/socket buffers
>> handle it.
> 
> I am fine with flushing at the end of the loop. If no one opposes I can do 
> this.
> 

Looking into this in more detail, the following thoughts came up:

How many iterations do we think we typically do in this loop, i.e. how
often do we call ap_get_brigade with a non-blocking read?
My gut feeling tells me that we only do this once, at least in the
websocket case.
If this is the case, the discussion whether to flush after the loop or in
the loop wouldn't be relevant any longer, but more importantly:
if it is OK to flush in the loop, I think it would be better to add the
flush bucket directly to bb_out instead of calling ap_fflush afterwards.
The reason for this is that it gives downstream filters better information
on what they should do. With the current approach they might set the
buckets aside, store them for later processing, and only process them in
the second round when they see the flush. If the flush were in the first
brigade, at least some of them would process the buckets directly without
setting them aside.
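
For illustration, a rough sketch of what I mean (untested; bb_out as in
the current code, c_out standing in for whatever holds the client-side
output filter chain):

    apr_status_t rv;
    /* Append the flush to the data brigade so downstream filters see
     * data and flush together in a single pass... */
    APR_BRIGADE_INSERT_TAIL(bb_out,
                            apr_bucket_flush_create(bb_out->bucket_alloc));
    /* ...instead of passing the data and calling ap_fflush() after. */
    rv = ap_pass_brigade(c_out->output_filters, bb_out);
    apr_brigade_cleanup(bb_out);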
Comments?

Regards

Rüdiger


Re: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c

2016-02-01 Thread Yann Ylavic
On Mon, Feb 1, 2016 at 3:32 PM, Ruediger Pluem  wrote:
>>
>> I am fine with flushing at the end of the loop. If no one opposes I can do 
>> this.
>>
>
> Looking into this in more detail, the following thoughts came up:
>
> How many iterations do we think we typically do in this loop, i.e. how
> often do we call ap_get_brigade with a non-blocking read?
> My gut feeling tells me that we only do this once, at least in the
> websocket case.

I guess it depends on the size of the outgoing traffic; if the backend
sometimes sends large data at once, we may end up needing more than one
non-blocking read (AP_IOBUFSIZE/8K each) to forward the fragments.
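
For clarity, the read pattern in question as a minimal sketch (names
illustrative, not the exact wstunnel code):

    apr_status_t rv = APR_SUCCESS;
    while (rv == APR_SUCCESS) {
        /* Read up to AP_IOBUFSIZE (8K) bytes without blocking; a backend
         * with more than that pending makes us iterate more than once,
         * otherwise APR_EAGAIN ends the loop. */
        rv = ap_get_brigade(c_in->input_filters, bb, AP_MODE_READ_BYTES,
                            APR_NONBLOCK_READ, AP_IOBUFSIZE);
        if (rv != APR_SUCCESS || APR_BRIGADE_EMPTY(bb)) {
            break;
        }
        rv = ap_pass_brigade(c_out->output_filters, bb);
        apr_brigade_cleanup(bb);
    }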

> If this is the case, the discussion whether to flush after the loop or
> in the loop wouldn't be relevant any longer, but more importantly:
> if it is OK to flush in the loop, I think it would be better to add the
> flush bucket directly to bb_out instead of calling ap_fflush afterwards.
> The reason for this is that it gives downstream filters better
> information on what they should do.

+1

> With the current approach they might set the buckets aside, store them
> for later processing, and only process them in the second round when
> they see the flush. If the flush were in the first brigade, at least
> some of them would process the buckets directly without setting them
> aside.
> Comments?

That's the "issue" with transient buckets: the core output filter will
almost always set them aside if not flushed (including in the
mod_proxy_http case).
It wouldn't have to (or this would be a noop) if the buckets were
allocated on its bucket_alloc instead of the origin's.
But I agree that for now your approach makes sense.
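
I.e., a sketch of what I mean by that (c being the client connection, so
the bucket structures already live on the allocator the core output
filter works with; data/len/bb_out illustrative):

    /* Bucket allocated on the downstream connection's allocator
     * instead of the backend/origin one. */
    apr_bucket *e = apr_bucket_transient_create(data, len,
                                                c->bucket_alloc);
    APR_BRIGADE_INSERT_TAIL(bb_out, e);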

Regards,
Yann.


Re: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c

2016-01-29 Thread Ruediger Pluem


On 01/28/2016 11:49 AM, Stefan Eissing wrote:
> Interesting discussion. mod_http2 had similar issues to overcome and there 
> should be plenty of room for improvements...
> 
>> On 27.01.2016 at 20:50, Ruediger Pluem <rpl...@apache.org> wrote:
>> On 01/27/2016 12:46 PM, Yann Ylavic wrote:
>>> On Wed, Jan 27, 2016 at 10:02 AM, Plüm, Rüdiger, Vodafone Group
>>> <ruediger.pl...@vodafone.com> wrote:
>>>>
>>>>> -Original Message-
>>>>> From: Yann Ylavic [mailto:ylavic....@gmail.com]
>>>>> Sent: Wednesday, January 27, 2016 09:15
>>>>> To: httpd-dev
>>>>> Subject: Re: svn commit: r1726787 -
>>>>> /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
>>>>>
>>>>> I'm not sure the buckets can be destroyed while in flight in
>>>>> mod_proxy_wstunnel, unlike with mod_proxy_http where the backend
>>>>> connection/request may be cleared before all data are out.
>>>>>
>>>>> ISTM that this can't happen here because we flush out on every write
>>>>
>>>> In theory this should be the case, but I think having brigade and
>>>> buckets allocated from the same bucket allocator is a "better safe
>>>> than sorry" approach that catches possible edge cases now or in the
>>>> future (think of Graham's ideas to have suspending filters).
>>>> The issue with these possible edge cases is that it will be hard to
>>>> find the root cause, because we likely segfault somewhere else.
>>
>> Just saw that the code for mod_proxy_connect is very similar to the one
>> from mod_proxy_wstunnel and thus could have the same issues. Maybe it
>> should be fixed as well.
> 
> Just for my understanding of ap_buckets_lifetime_transform(): 
> - duplicates buckets from one brigade+allocator to another
> - optimizes the case where source buckets live at least as 
>   long as the copied ones
> - can therefore use transient buckets and avoids buffer copies

Yes and no. Just to clarify: it is assumed that the source buckets live as
long as we are in the same "processing cycle". If someone in the chain (be
it in the filter chain, or the proxy handler that frees up the backend
connection early) cannot process them while we know the sources still
exist, they must set the buckets aside, which triggers the copying. The
beauty of the solution is that if the above does not happen, no copy is
needed, and that we do not need to decide in
ap_buckets_lifetime_transform() whether a copy is needed or not, but can
defer that decision to the consumers in the later processing cycle, who
should know much better than we do. And they then just need to do what
they are used to doing in such situations: setting the buckets aside.
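
Sketched from the consumer's point of view (illustrative helper, not
existing code): a filter that cannot process the buckets within the cycle
just does the usual thing, and only then does the copy happen:

    static apr_status_t save_for_later(ap_filter_t *f,
                                       apr_bucket_brigade *bb)
    {
        apr_bucket *e;
        for (e = APR_BRIGADE_FIRST(bb);
             e != APR_BRIGADE_SENTINEL(bb);
             e = APR_BUCKET_NEXT(e)) {
            /* Setting a transient bucket aside morphs it into a heap
             * bucket, copying the data; for heap buckets it is (nearly)
             * a noop. */
            apr_status_t rv = apr_bucket_setaside(e, f->c->pool);
            if (rv != APR_SUCCESS && rv != APR_ENOTIMPL) {
                return rv;
            }
        }
        return APR_SUCCESS;
    }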

> 
> mod_http2:
> - does not make that lifetime assumption. needs to copy.

Maybe it could, if it waited to destroy the sources until the pass down
the filter chain returns. At that point the buckets should have been
either consumed or set aside.

> - has optimization for transferring file buckets by setaside'ing 
>   the apr_file to the target pool
> 
> Maybe both approaches could benefit from each other.
> 
> What are the tasks to synchronize:
>   A. reading from somewhere into a source brigade
>   B. transferring from source to another target brigade
>   C. writing the target brigade to somewhere else
> where A and C keep the current brigade API. 
> 
> Maybe we could introduce something like a "Transfer Window" that
> keeps track of target buckets and their destruction which then
> triggers - maybe at another point in time / another thread - the
> destruction of the corresponding source buckets.

Hm. This might be difficult with the current pool and allocator lifetimes,
as you would probably block other data from being freed if just one source
bucket is still needed.

> 
> There should be a maximum memory footprint for the "transfer window" 
> which suspends further transfers until target buckets have been 
> destroyed. The transfer window could be given mutex and cond for
> working multi-threaded.
> 
> Ideally, this would be chainable, so that we have more than two 
> brigades involved.
> 

Regards

Rüdiger



Re: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c

2016-01-28 Thread Stefan Eissing
Interesting discussion. mod_http2 had similar issues to overcome and there 
should be plenty of room for improvements...

> On 27.01.2016 at 20:50, Ruediger Pluem <rpl...@apache.org> wrote:
> On 01/27/2016 12:46 PM, Yann Ylavic wrote:
>> On Wed, Jan 27, 2016 at 10:02 AM, Plüm, Rüdiger, Vodafone Group
>> <ruediger.pl...@vodafone.com> wrote:
>>> 
>>>> -Original Message-
>>>> From: Yann Ylavic [mailto:ylavic@gmail.com]
>>>> Sent: Wednesday, January 27, 2016 09:15
>>>> To: httpd-dev
>>>> Subject: Re: svn commit: r1726787 -
>>>> /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
>>>> 
>>>> I'm not sure the buckets can be destroyed while in flight in
>>>> mod_proxy_wstunnel, unlike with mod_proxy_http where the backend
>>>> connection/request may be cleared before all data are out.
>>>> 
>>>> ISTM that this can't happen here because we flush out on every write
>>> 
>>> In theory this should be the case, but I think having brigade and
>>> buckets allocated from the same bucket allocator is a "better safe
>>> than sorry" approach that catches possible edge cases now or in the
>>> future (think of Graham's ideas to have suspending filters).
>>> The issue with these possible edge cases is that it will be hard to
>>> find the root cause, because we likely segfault somewhere else.
> 
> Just saw that the code for mod_proxy_connect is very similar to the one
> from mod_proxy_wstunnel and thus could have the same issues. Maybe it
> should be fixed as well.

Just for my understanding of ap_buckets_lifetime_transform(): 
- duplicates buckets from one brigade+allocator to another
- optimizes the case where source buckets live at least as 
  long as the copied ones
- can therefore use transient buckets and avoids buffer copies

mod_http2:
- does not make that lifetime assumption. needs to copy.
- has optimization for transferring file buckets by setaside'ing 
  the apr_file to the target pool

Maybe both approaches could benefit from each other.

What are the tasks to synchronize:
  A. reading from somewhere into a source brigade
  B. transferring from source to another target brigade
  C. writing the target brigade to somewhere else
where A and C keep the current brigade API. 

Maybe we could introduce something like a "Transfer Window" that
keeps track of target buckets and their destruction which then
triggers - maybe at another point in time / another thread - the
destruction of the corresponding source buckets.

There should be a maximum memory footprint for the "transfer window" 
which suspends further transfers until target buckets have been 
destroyed. The transfer window could be given mutex and cond for
working multi-threaded.

Ideally, this would be chainable, so that we have more than two 
brigades involved.
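
To make that concrete, a purely hypothetical sketch (nothing like this
exists yet; all names invented):

    /* "Transfer window": tracks the footprint of target buckets still
     * alive; destroying a target bucket releases its source and wakes
     * transfers suspended at the limit. */
    typedef struct transfer_window {
        apr_thread_mutex_t *lock;
        apr_thread_cond_t *cond;     /* signaled on target destruction */
        apr_size_t in_flight;        /* bytes of live target buckets */
        apr_size_t max_in_flight;    /* suspend transfers above this */
    } transfer_window;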

-Stefan



Re: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c

2016-01-27 Thread Yann Ylavic
On Tue, Jan 26, 2016 at 1:57 PM,   wrote:
> Author: rpluem
> Date: Tue Jan 26 12:57:18 2016
> New Revision: 1726787
>
> URL: http://svn.apache.org/viewvc?rev=1726787&view=rev
> Log:
> * Transform the buckets to the correct lifetime of the brigade, connection 
> and filter stack that processes it.

I'm not sure the buckets can be destroyed while in flight in
mod_proxy_wstunnel, unlike with mod_proxy_http where the backend
connection/request may be cleared before all data are out.

ISTM that this can't happen here because we flush out on every write
(btw that could be optimized to flush outside the loop).
So provided we clear the working brigade before leaving, isn't that
enough already?
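
I.e. something like this at the end of the transfer (sketch, bb being the
working brigade):

    /* With everything flushed on write, nothing downstream should still
     * reference these buckets, so just drop them before leaving. */
    apr_brigade_cleanup(bb);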

Regards,
Yann.


Re: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c

2016-01-27 Thread Ruediger Pluem


On 01/27/2016 12:46 PM, Yann Ylavic wrote:
> On Wed, Jan 27, 2016 at 10:02 AM, Plüm, Rüdiger, Vodafone Group
> <ruediger.pl...@vodafone.com> wrote:
>>
>>> -Original Message-
>>> From: Yann Ylavic [mailto:ylavic@gmail.com]
>>> Sent: Wednesday, January 27, 2016 09:15
>>> To: httpd-dev
>>> Subject: Re: svn commit: r1726787 -
>>> /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
>>>
>>> I'm not sure the buckets can be destroyed while in flight in
>>> mod_proxy_wstunnel, unlike with mod_proxy_http where the backend
>>> connection/request may be cleared before all data are out.
>>>
>>> ISTM that this can't happen here because we flush out on every write
>>
>> In theory this should be the case, but I think having brigade and
>> buckets allocated from the same bucket allocator is a "better safe
>> than sorry" approach that catches possible edge cases now or in the
>> future (think of Graham's ideas to have suspending filters).
>> The issue with these possible edge cases is that it will be hard to
>> find the root cause, because we likely segfault somewhere else.

Just saw that the code for mod_proxy_connect is very similar to the one
from mod_proxy_wstunnel and thus could have the same issues. Maybe it
should be fixed as well.

> 
> I agree that a safe approach is needed for buckets' lifetime from
> input to output filters.
> I'd prefer this to be handled at the source though (by the input
> filter and callees that fill in the brigade), possibly in
> ap_core_input_filter() and ap_filter_reinstate_brigade() only.

Why ap_core_input_filter?

Not all filters are using ap_filter_reinstate_brigade() so far. Some just
tuck away parts of the brigade in brigades stored in their filter context.
As far as I can tell, the only thing we say about that, API-wise, is that
the filter MUST set aside the buckets then. Nothing more. So IMHO we would
need to tighten the API here for the consumers if we have other
expectations.

> These would ap_buckets_lifetime_transform() buckets from their set-aside
> brigade to the caller's one (which could be a simple move if the
> lifetime is already fine, i.e. the non-proxy case).
> Maybe that would require some apr_bucket_copy_ex() (or setaside_ex()?)
> which would allow heap buffers (and file/mmap/..., via refcounting) to
> move from one bucket_alloc to another (given as _ex()'s arg).
> Possibly I'm minimizing this task :p

Haven't looked at this so far, hence I do not know how difficult it is to
do, but the current approach, especially for heap buckets, looks efficient
to me. A read on a heap bucket is nearly free, as is the creation of a
transient bucket from the heap bucket's buffer. Only in case of a setaside
do we copy the memory, and I see no other way to change the lifetime of
the data in the buffer. And this all happens with the current code :-)
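
For reference, the cheap path described above, sketched (names
illustrative):

    const char *data;
    apr_size_t len;
    /* Reading a heap bucket just returns a pointer into its buffer... */
    apr_status_t rv = apr_bucket_read(b, &data, &len, APR_BLOCK_READ);
    if (rv == APR_SUCCESS) {
        /* ...and wrapping that buffer in a transient bucket copies
         * nothing. Memory is only duplicated if a consumer later calls
         * apr_bucket_setaside() on the transient bucket. */
        apr_bucket *t = apr_bucket_transient_create(data, len,
                                                    bb_out->bucket_alloc);
        APR_BRIGADE_INSERT_TAIL(bb_out, t);
    }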

> Overall I think that could avoid unnecessary allocations/copies (which
> may still happen currently), putting this logic at the right place
> IMHO (the caller decides where the buffers are allocated according to
> its needs by creating the brigade, the input filters use
> bb->bucket_alloc (and bb->r) to create/move their buckets).

Not really getting the idea so far. Can you please elaborate?

> 
>> And in fact it should not cost us much here, as the buckets are likely
>> memory buckets already and nothing is really done with the content if
>> no one sets it aside.
> 
> Walking the brigades (and creating transient buckets) may be a
> noticeable overhead in the websocket case; I wish we could rely on the
> flush...

I agree that it is some overhead, but IMHO the improved safety/stability
outweighs it. I admit, though, that this is somewhat a matter of personal
opinion.

> 
>>
>>> (btw that could be optimized to flush outside the loop).
>>
>> Could be. The question is whether this is OK from the websocket
>> responsiveness point of view.
>> OTOH we do a non-blocking read, so this should be fine.
> 
> Yes, while successive non-blocking reads give us data I don't think we
> should "fragment" on the other side; let the core/socket buffers
> handle it.

I am fine with flushing at the end of the loop. If no one opposes I can do this.


Regards

Rüdiger


Re: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c

2016-01-27 Thread Jim Jagielski

> On Jan 27, 2016, at 4:02 AM, Plüm, Rüdiger, Vodafone Group 
> <ruediger.pl...@vodafone.com> wrote:
> 
> 
> 
>> -Original Message-
>> From: Yann Ylavic [mailto:ylavic@gmail.com]
>> Sent: Wednesday, January 27, 2016 09:15
>> To: httpd-dev
>> Subject: Re: svn commit: r1726787 -
>> /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
>> 
>> On Tue, Jan 26, 2016 at 1:57 PM,  <rpl...@apache.org> wrote:
>>> Author: rpluem
>>> Date: Tue Jan 26 12:57:18 2016
>>> New Revision: 1726787
>>> 
>>> URL: http://svn.apache.org/viewvc?rev=1726787&view=rev
>>> Log:
>>> * Transform the buckets to the correct lifetime of the brigade,
>> connection and filter stack that processes it.
>> 
>> I'm not sure the buckets can be destroyed while in flight in
>> mod_proxy_wstunnel, unlike with mod_proxy_http where the backend
>> connection/request may be cleared before all data are out.
>> 
>> ISTM that this can't happen here because we flush out on every write
> 
> In theory this should be the case, but I think having brigade and
> buckets allocated from the same bucket allocator is a "better safe than
> sorry" approach that catches possible edge cases now or in the future
> (think of Graham's ideas to have suspending filters).
> The issue with these possible edge cases is that it will be hard to
> find the root cause, because we likely segfault somewhere else.
> And in fact it should not cost us much here, as the buckets are likely
> memory buckets already and nothing is really done with the content if
> no one sets it aside.
> 

Makes sense to me... +1



Re: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c

2016-01-27 Thread Yann Ylavic
On Wed, Jan 27, 2016 at 10:02 AM, Plüm, Rüdiger, Vodafone Group
<ruediger.pl...@vodafone.com> wrote:
>
>> -Original Message-
>> From: Yann Ylavic [mailto:ylavic@gmail.com]
>> Sent: Wednesday, January 27, 2016 09:15
>> To: httpd-dev
>> Subject: Re: svn commit: r1726787 -
>> /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
>>
>> I'm not sure the buckets can be destroyed while in flight in
>> mod_proxy_wstunnel, unlike with mod_proxy_http where the backend
>> connection/request may be cleared before all data are out.
>>
>> ISTM that this can't happen here because we flush out on every write
>
> In theory this should be the case, but I think having brigade and
> buckets allocated from the same bucket allocator is a "better safe than
> sorry" approach that catches possible edge cases now or in the future
> (think of Graham's ideas to have suspending filters).
> The issue with these possible edge cases is that it will be hard to
> find the root cause, because we likely segfault somewhere else.

I agree that a safe approach is needed for buckets' lifetime from
input to output filters.
I'd prefer this to be handled at the source though (by the input
filter and callees that fill in the brigade), possibly in
ap_core_input_filter() and ap_filter_reinstate_brigade() only.
These would ap_buckets_lifetime_transform() buckets from their set-aside
brigade to the caller's one (which could be a simple move if the
lifetime is already fine, i.e. the non-proxy case).
Maybe that would require some apr_bucket_copy_ex() (or setaside_ex()?)
which would allow heap buffers (and file/mmap/..., via refcounting) to
move from one bucket_alloc to another (given as _ex()'s arg).
Possibly I'm minimizing this task :p
Overall I think that could avoid unnecessary allocations/copies (which
may still happen currently), putting this logic at the right place
IMHO (the caller decides where the buffers are allocated according to
its needs by creating the brigade, the input filters use
bb->bucket_alloc (and bb->r) to create/move their buckets).
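
For illustration, a possible shape for the latter (purely hypothetical,
no such API exists in APR today):

    /* Like apr_bucket_setaside(), but additionally migrates the bucket
     * (and its data, by copy or refcount depending on the bucket type)
     * to the given bucket allocator. */
    apr_status_t apr_bucket_setaside_ex(apr_bucket *e, apr_pool_t *p,
                                        apr_bucket_alloc_t *list);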

> And in fact it should not cost us much here, as the buckets are likely
> memory buckets already and nothing is really done with the content if
> no one sets it aside.

Walking the brigades (and creating transient buckets) may be a
noticeable overhead in the websocket case; I wish we could rely on the
flush...

>
>> (btw that could be optimized to flush outside the loop).
>
> Could be. The question is whether this is OK from the websocket
> responsiveness point of view.
> OTOH we do a non-blocking read, so this should be fine.

Yes, while successive non-blocking reads give us data I don't think we
should "fragment" on the other side; let the core/socket buffers
handle it.

Regards,
Yann.


RE: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c

2016-01-27 Thread Plüm, Rüdiger, Vodafone Group


> -Original Message-
> From: Yann Ylavic [mailto:ylavic@gmail.com]
> Sent: Wednesday, January 27, 2016 09:15
> To: httpd-dev
> Subject: Re: svn commit: r1726787 -
> /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
> 
> On Tue, Jan 26, 2016 at 1:57 PM,  <rpl...@apache.org> wrote:
> > Author: rpluem
> > Date: Tue Jan 26 12:57:18 2016
> > New Revision: 1726787
> >
> > URL: http://svn.apache.org/viewvc?rev=1726787&view=rev
> > Log:
> > * Transform the buckets to the correct lifetime of the brigade,
> connection and filter stack that processes it.
> 
> I'm not sure the buckets can be destroyed while in flight in
> mod_proxy_wstunnel, unlike with mod_proxy_http where the backend
> connection/request may be cleared before all data are out.
> 
> ISTM that this can't happen here because we flush out on every write

In theory this should be the case, but I think having brigade and buckets
allocated from the same bucket allocator is a "better safe than sorry"
approach that catches possible edge cases now or in the future (think of
Graham's ideas to have suspending filters).
The issue with these possible edge cases is that it will be hard to find
the root cause, because we likely segfault somewhere else.
And in fact it should not cost us much here, as the buckets are likely
memory buckets already and nothing is really done with the content if no
one sets it aside.

> (btw that could be optimized to flush outside the loop).

Could be. The question is whether this is OK from the websocket
responsiveness point of view.
OTOH we do a non-blocking read, so this should be fine.

> So provided we clear the working brigade before leaving, isn't that
> enough already?

As said above, I feel a little bit uneasy about stuff that could happen in
the filter chain while these buckets are still alive in different
brigades.

Regards

Rüdiger