Re: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
On 02/01/2016 03:54 PM, Yann Ylavic wrote:
> On Mon, Feb 1, 2016 at 3:32 PM, Ruediger Pluem wrote:
>>>
>>> I am fine with flushing at the end of the loop. If no one opposes I can do
>>> this.
>>>
>>
>> Looking into this in more detail the following thoughts came up:
>>
>> What do we think, how many iterations do we do in this loop in the typical
>> case, a.k.a. how often do we call ap_get_brigade with a non-blocking read?
>> My gut feeling tells me that we only do this once, at least in the websocket
>> case.
>
> I guess it depends on the size of the outgoing traffic; if the backend
> sometimes sends large data at once we may end up needing more than one
> non-blocking read (AP_IOBUFSIZE/8K each) to forward the fragments.

True, and I guess this is often the case for mod_proxy_connect. For
mod_proxy_wstunnel I would expect that we see stuff below 8k, e.g. chat
messages and the like, in most use cases.

>
>> If this is the case the discussion whether to flush after the loop or in the
>> loop wouldn't be relevant any longer, but more importantly:
>> If it is ok to flush in the loop I think it would be better to add the flush
>> bucket directly to bb_out instead of calling ap_fflush afterwards. The reason
>> for this is that it gives downstream filters better information on what they
>> should do.
>
> +1
>
>> With the current approach they might set the buckets aside and store them
>> for later processing, and only process them in the second round when they
>> see the flush. If the flush were on the first brigade, at least some of them
>> would process the buckets directly without setting them aside.
>> Comments?
>
> That's the "issue" with transient buckets, the core output will almost
> always set them aside if not flushed (including in the mod_proxy_http
> case).
> It wouldn't have to (or this would be a noop) if the buckets were
> allocated on its bucket_alloc instead of the origin's.
> But I agree that for now your approach makes sense.
I guess I will hack something up and we can discuss further once I have a
proposal.

Regards

Rüdiger
Re: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
On 01/27/2016 08:50 PM, Ruediger Pluem wrote:
>
> On 01/27/2016 12:46 PM, Yann Ylavic wrote:
>> On Wed, Jan 27, 2016 at 10:02 AM, Plüm, Rüdiger, Vodafone Group wrote:
>>>
>>>> -----Original Message-----
>>>> From: Yann Ylavic [mailto:ylavic@gmail.com]
>>>> Sent: Mittwoch, 27. Januar 2016 09:15
>>>> To: httpd-dev
>>>> Subject: Re: svn commit: r1726787 -
>>>> /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
>>>>
>>>> (btw that could be optimized to flush outside the loop).
>>>
>>> Could be. The question is whether this is ok from the websockets
>>> responsiveness point of view.
>>> OTOH we do a non-blocking read. So this should be fine.
>>
>> Yes, while successive non-blocking reads give us data I don't think we
>> should "fragment" on the other side, let the core/socket buffers
>> handle it.
>
> I am fine with flushing at the end of the loop. If no one opposes I can do
> this.

Looking into this in more detail the following thoughts came up:

What do we think, how many iterations do we do in this loop in the typical
case, a.k.a. how often do we call ap_get_brigade with a non-blocking read?
My gut feeling tells me that we only do this once, at least in the websocket
case.

If this is the case the discussion whether to flush after the loop or in the
loop wouldn't be relevant any longer, but more importantly:
If it is ok to flush in the loop I think it would be better to add the flush
bucket directly to bb_out instead of calling ap_fflush afterwards. The reason
for this is that it gives downstream filters better information on what they
should do. With the current approach they might set the buckets aside and
store them for later processing, and only process them in the second round
when they see the flush. If the flush were on the first brigade, at least some
of them would process the buckets directly without setting them aside.

Comments?

Regards

Rüdiger
Re: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
On Mon, Feb 1, 2016 at 3:32 PM, Ruediger Pluem wrote:
>>
>> I am fine with flushing at the end of the loop. If no one opposes I can do
>> this.
>>
>
> Looking into this in more detail the following thoughts came up:
>
> What do we think, how many iterations do we do in this loop in the typical
> case, a.k.a. how often do we call ap_get_brigade with a non-blocking read?
> My gut feeling tells me that we only do this once, at least in the websocket
> case.

I guess it depends on the size of the outgoing traffic; if the backend
sometimes sends large data at once we may end up needing more than one
non-blocking read (AP_IOBUFSIZE/8K each) to forward the fragments.

> If this is the case the discussion whether to flush after the loop or in the
> loop wouldn't be relevant any longer, but more importantly:
> If it is ok to flush in the loop I think it would be better to add the flush
> bucket directly to bb_out instead of calling ap_fflush afterwards. The reason
> for this is that it gives downstream filters better information on what they
> should do.

+1

> With the current approach they might set the buckets aside and store them
> for later processing, and only process them in the second round when they
> see the flush. If the flush were on the first brigade, at least some of them
> would process the buckets directly without setting them aside.
> Comments?

That's the "issue" with transient buckets, the core output will almost
always set them aside if not flushed (including in the mod_proxy_http case).
It wouldn't have to (or this would be a noop) if the buckets were
allocated on its bucket_alloc instead of the origin's.
But I agree that for now your approach makes sense.

Regards,
Yann.
Re: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
On 01/28/2016 11:49 AM, Stefan Eissing wrote:
> Interesting discussion. mod_http2 had similar issues to overcome and there
> should be plenty of room for improvements...
>
>> Am 27.01.2016 um 20:50 schrieb Ruediger Pluem <rpl...@apache.org>:
>> On 01/27/2016 12:46 PM, Yann Ylavic wrote:
>>> On Wed, Jan 27, 2016 at 10:02 AM, Plüm, Rüdiger, Vodafone Group
>>> <ruediger.pl...@vodafone.com> wrote:
>>>>
>>>>> -----Original Message-----
>>>>> From: Yann Ylavic [mailto:ylavic....@gmail.com]
>>>>> Sent: Mittwoch, 27. Januar 2016 09:15
>>>>> To: httpd-dev
>>>>> Subject: Re: svn commit: r1726787 -
>>>>> /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
>>>>>
>>>>> I'm not sure the buckets can be destroyed while in flight in
>>>>> mod_proxy_wstunnel, unlike with mod_proxy_http where the backend
>>>>> connection/request may be cleared before all data are out.
>>>>>
>>>>> ISTM that this can't happen here because we flush out on every write
>>>>
>>>> In theory this should be the case, but I think having brigade and buckets
>>>> allocated from the same bucket allocator is a "better safe than sorry"
>>>> approach that catches possible edge cases now or in the future (think of
>>>> Graham's ideas to have suspending filters).
>>>> The issue with these possible edge cases is that it will be hard to find
>>>> the root cause, because we likely segfault somewhere else.
>>
>> Just saw that the code for mod_proxy_connect is very similar to the one
>> from mod_proxy_wstunnel and thus could pose the same issues. Maybe it
>> should be fixed as well.
>
> Just for my understanding of ap_buckets_lifetime_transform():
> - duplicates buckets from one brigade+allocator to another
> - optimizes the case where source buckets live at least as
>   long as the copied ones
> - can therefore use transient buckets and avoids buffer copies

Yes and no. Just to clarify: it is assumed that the source buckets live as
long as we are in the same "processing cycle". If someone in the chain (be it
in the filter chain, or be it the proxy handler that frees up the backend
connection early) cannot process them while we know that the sources exist,
they must set the buckets aside, which triggers the copying. The beauty of the
solution is that if the above does not happen, no copy is needed, and that we
do not need to decide in ap_buckets_lifetime_transform whether a copy is
needed or not, but defer that decision to the consumers in the later
processing cycle, who know much better than we do. And they then just need to
do what they are used to doing in such situations: setting the buckets aside.

>
> mod_http2:
> - does not make that lifetime assumption. needs to copy.

Maybe it could if it waits to destroy the sources until passing to the filter
chain returns. In this case the buckets should be either consumed or set
aside.

> - has optimization for transferring file buckets by setaside'ing
>   the apr_file to the target pool
>
> Maybe both approaches could benefit from each other.
>
> What are the tasks to synchronize:
> A. reading from somewhere into a source brigade
> B. transferring from source to another target brigade
> C. writing the target brigade to somewhere else
> where A and C keep the current brigade API.
>
> Maybe we could introduce something like a "Transfer Window" that
> keeps track of target buckets and their destruction which then
> triggers - maybe at another point in time / another thread - the
> destruction of the corresponding source buckets.

Hm. This might be difficult with the current pool and allocator lifetimes, as
you would probably block other data from being freed if just one source bucket
is still needed.

>
> There should be a maximum memory footprint for the "transfer window"
> which suspends further transfers until target buckets have been
> destroyed. The transfer window could be given mutex and cond for
> working multi-threaded.
>
> Ideally, this would be chainable, so that we have more than two
> brigades involved.
>

Regards

Rüdiger
Re: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
Interesting discussion. mod_http2 had similar issues to overcome and there
should be plenty of room for improvements...

> Am 27.01.2016 um 20:50 schrieb Ruediger Pluem <rpl...@apache.org>:
> On 01/27/2016 12:46 PM, Yann Ylavic wrote:
>> On Wed, Jan 27, 2016 at 10:02 AM, Plüm, Rüdiger, Vodafone Group
>> <ruediger.pl...@vodafone.com> wrote:
>>>
>>>> -----Original Message-----
>>>> From: Yann Ylavic [mailto:ylavic@gmail.com]
>>>> Sent: Mittwoch, 27. Januar 2016 09:15
>>>> To: httpd-dev
>>>> Subject: Re: svn commit: r1726787 -
>>>> /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
>>>>
>>>> I'm not sure the buckets can be destroyed while in flight in
>>>> mod_proxy_wstunnel, unlike with mod_proxy_http where the backend
>>>> connection/request may be cleared before all data are out.
>>>>
>>>> ISTM that this can't happen here because we flush out on every write
>>>
>>> In theory this should be the case, but I think having brigade and buckets
>>> allocated from the same bucket allocator is a "better safe than sorry"
>>> approach that catches possible edge cases now or in the future (think of
>>> Graham's ideas to have suspending filters).
>>> The issue with these possible edge cases is that it will be hard to find
>>> the root cause, because we likely segfault somewhere else.
>
> Just saw that the code for mod_proxy_connect is very similar to the one from
> mod_proxy_wstunnel and thus could pose the same issues. Maybe it should be
> fixed as well.

Just for my understanding of ap_buckets_lifetime_transform():
- duplicates buckets from one brigade+allocator to another
- optimizes the case where source buckets live at least as
  long as the copied ones
- can therefore use transient buckets and avoids buffer copies

mod_http2:
- does not make that lifetime assumption. needs to copy.
- has optimization for transferring file buckets by setaside'ing
  the apr_file to the target pool

Maybe both approaches could benefit from each other.

What are the tasks to synchronize:
A. reading from somewhere into a source brigade
B. transferring from source to another target brigade
C. writing the target brigade to somewhere else
where A and C keep the current brigade API.

Maybe we could introduce something like a "Transfer Window" that
keeps track of target buckets and their destruction which then
triggers - maybe at another point in time / another thread - the
destruction of the corresponding source buckets.

There should be a maximum memory footprint for the "transfer window"
which suspends further transfers until target buckets have been
destroyed. The transfer window could be given mutex and cond for
working multi-threaded.

Ideally, this would be chainable, so that we have more than two
brigades involved.

-Stefan
Re: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
On Tue, Jan 26, 2016 at 1:57 PM, <rpl...@apache.org> wrote:
> Author: rpluem
> Date: Tue Jan 26 12:57:18 2016
> New Revision: 1726787
>
> URL: http://svn.apache.org/viewvc?rev=1726787&view=rev
> Log:
> * Transform the buckets to the correct lifetime of the brigade, connection
>   and filter stack that processes it.

I'm not sure the buckets can be destroyed while in flight in
mod_proxy_wstunnel, unlike with mod_proxy_http where the backend
connection/request may be cleared before all data are out.

ISTM that this can't happen here because we flush out on every write
(btw that could be optimized to flush outside the loop).

So provided we clear the working brigade before leaving, isn't that
enough already?

Regards,
Yann.
Re: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
On 01/27/2016 12:46 PM, Yann Ylavic wrote:
> On Wed, Jan 27, 2016 at 10:02 AM, Plüm, Rüdiger, Vodafone Group
> <ruediger.pl...@vodafone.com> wrote:
>>
>>> -----Original Message-----
>>> From: Yann Ylavic [mailto:ylavic@gmail.com]
>>> Sent: Mittwoch, 27. Januar 2016 09:15
>>> To: httpd-dev
>>> Subject: Re: svn commit: r1726787 -
>>> /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
>>>
>>> I'm not sure the buckets can be destroyed while in flight in
>>> mod_proxy_wstunnel, unlike with mod_proxy_http where the backend
>>> connection/request may be cleared before all data are out.
>>>
>>> ISTM that this can't happen here because we flush out on every write
>>
>> In theory this should be the case, but I think having brigade and buckets
>> allocated from the same bucket allocator is a "better safe than sorry"
>> approach that catches possible edge cases now or in the future (think of
>> Graham's ideas to have suspending filters).
>> The issue with these possible edge cases is that it will be hard to find
>> the root cause, because we likely segfault somewhere else.

Just saw that the code for mod_proxy_connect is very similar to the one from
mod_proxy_wstunnel and thus could pose the same issues. Maybe it should be
fixed as well.

> I agree that a safe approach is needed for buckets' lifetime from
> input to output filters.
> I'd prefer this to be handled at the source though (by the input
> filter and callees that fill in the brigade), possibly in
> ap_core_input_filter() and ap_filter_reinstate_brigade() only.

Why ap_core_input_filter? Not all filters are using
ap_filter_reinstate_brigade() so far. Some just tuck away parts of the brigade
in brigades stored in their filter context. As far as I can tell, the only
thing we say about that API-wise is that the filter MUST set aside the buckets
then. Nothing more. So IMHO we would need to make the API tighter here for the
consumers if we have other expectations.

> These would ap_buckets_lifetime_transform() buckets from their aside
> brigade to the caller's one (which could be a simple move if the
> lifetime is already fine, ie. non proxy case).
> Maybe that would require some apr_bucket_copy_ex() (or setaside_ex()?)
> which would allow heap buffers (and file/mmap/..., via refcounting)
> moves from one bucket_alloc to another (given as _ex()'s arg).
> Possibly I'm minimizing this task :p

Haven't looked at this so far, hence I do not know how difficult it would be,
but the current approach, especially for heap buckets, looks efficient to me.
A read on the heap bucket is nearly for free, as is the creation of a
transient bucket with the buffer of the heap bucket. Only in the case of a
setaside do we copy the memory, and I see no other way to change the lifetime
of the data in the buffer. And this all happens with the current code :-)

> Overall I think that could avoid unnecessary allocations/copies (which
> may still happen currently), putting this logic at the right place
> IMHO (the caller decides where the buffers are allocated according to
> its needs by creating the brigade, the input filters use
> bb->bucket_alloc (and bb->r) to create/move their buckets).

Not really getting the idea so far. Can you please elaborate?

>
>> And in fact it should cost us not much here as the buckets are likely
>> memory buckets already and nothing is really done with the content if no
>> one sets it aside.
>
> Walking the brigades (and creating transient buckets) may be a
> noticeable overhead in the websocket case, I wish we could rely on the
> flush...

I agree that it is some overhead, but IMHO the improved safety / stability
outweighs it. But I agree that this is somewhat subject to personal opinion.

>
>>
>>> (btw that could be optimized to flush outside the loop).
>>
>> Could be. The question is whether this is ok from the websockets
>> responsiveness point of view.
>> OTOH we do a non-blocking read. So this should be fine.
>
> Yes, while successive non-blocking reads give us data I don't think we
> should "fragment" on the other side, let the core/socket buffers
> handle it.

I am fine with flushing at the end of the loop. If no one opposes I can do
this.

Regards

Rüdiger
Re: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
> On Jan 27, 2016, at 4:02 AM, Plüm, Rüdiger, Vodafone Group
> <ruediger.pl...@vodafone.com> wrote:
>
>> -----Original Message-----
>> From: Yann Ylavic [mailto:ylavic@gmail.com]
>> Sent: Mittwoch, 27. Januar 2016 09:15
>> To: httpd-dev
>> Subject: Re: svn commit: r1726787 -
>> /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
>>
>> On Tue, Jan 26, 2016 at 1:57 PM, <rpl...@apache.org> wrote:
>>> Author: rpluem
>>> Date: Tue Jan 26 12:57:18 2016
>>> New Revision: 1726787
>>>
>>> URL: http://svn.apache.org/viewvc?rev=1726787&view=rev
>>> Log:
>>> * Transform the buckets to the correct lifetime of the brigade,
>>>   connection and filter stack that processes it.
>>
>> I'm not sure the buckets can be destroyed while in flight in
>> mod_proxy_wstunnel, unlike with mod_proxy_http where the backend
>> connection/request may be cleared before all data are out.
>>
>> ISTM that this can't happen here because we flush out on every write
>
> In theory this should be the case, but I think having brigade and buckets
> allocated from the same bucket allocator is a "better safe than sorry"
> approach that catches possible edge cases now or in the future (think of
> Graham's ideas to have suspending filters).
> The issue with these possible edge cases is that it will be hard to find
> the root cause, because we likely segfault somewhere else.
> And in fact it should cost us not much here as the buckets are likely
> memory buckets already and nothing is really done with the content if no
> one sets it aside.

Makes sense to me... +1
Re: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
On Wed, Jan 27, 2016 at 10:02 AM, Plüm, Rüdiger, Vodafone Group
<ruediger.pl...@vodafone.com> wrote:
>
>> -----Original Message-----
>> From: Yann Ylavic [mailto:ylavic@gmail.com]
>> Sent: Mittwoch, 27. Januar 2016 09:15
>> To: httpd-dev
>> Subject: Re: svn commit: r1726787 -
>> /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
>>
>> I'm not sure the buckets can be destroyed while in flight in
>> mod_proxy_wstunnel, unlike with mod_proxy_http where the backend
>> connection/request may be cleared before all data are out.
>>
>> ISTM that this can't happen here because we flush out on every write
>
> In theory this should be the case, but I think having brigade and buckets
> allocated from the same bucket allocator is a "better safe than sorry"
> approach that catches possible edge cases now or in the future (think of
> Graham's ideas to have suspending filters).
> The issue with these possible edge cases is that it will be hard to find
> the root cause, because we likely segfault somewhere else.

I agree that a safe approach is needed for buckets' lifetime from
input to output filters.
I'd prefer this to be handled at the source though (by the input
filter and callees that fill in the brigade), possibly in
ap_core_input_filter() and ap_filter_reinstate_brigade() only.
These would ap_buckets_lifetime_transform() buckets from their aside
brigade to the caller's one (which could be a simple move if the
lifetime is already fine, ie. non proxy case).
Maybe that would require some apr_bucket_copy_ex() (or setaside_ex()?)
which would allow heap buffers (and file/mmap/..., via refcounting)
moves from one bucket_alloc to another (given as _ex()'s arg).
Possibly I'm minimizing this task :p

Overall I think that could avoid unnecessary allocations/copies (which
may still happen currently), putting this logic at the right place
IMHO (the caller decides where the buffers are allocated according to
its needs by creating the brigade, the input filters use
bb->bucket_alloc (and bb->r) to create/move their buckets).

> And in fact it should cost us not much here as the buckets are likely
> memory buckets already and nothing is really done with the content if no
> one sets it aside.

Walking the brigades (and creating transient buckets) may be a
noticeable overhead in the websocket case, I wish we could rely on the
flush...

>
>> (btw that could be optimized to flush outside the loop).
>
> Could be. The question is whether this is ok from the websockets
> responsiveness point of view.
> OTOH we do a non-blocking read. So this should be fine.

Yes, while successive non-blocking reads give us data I don't think we
should "fragment" on the other side, let the core/socket buffers
handle it.

Regards,
Yann.
RE: svn commit: r1726787 - /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
> -----Original Message-----
> From: Yann Ylavic [mailto:ylavic@gmail.com]
> Sent: Mittwoch, 27. Januar 2016 09:15
> To: httpd-dev
> Subject: Re: svn commit: r1726787 -
> /httpd/httpd/trunk/modules/proxy/mod_proxy_wstunnel.c
>
> On Tue, Jan 26, 2016 at 1:57 PM, <rpl...@apache.org> wrote:
>> Author: rpluem
>> Date: Tue Jan 26 12:57:18 2016
>> New Revision: 1726787
>>
>> URL: http://svn.apache.org/viewvc?rev=1726787&view=rev
>> Log:
>> * Transform the buckets to the correct lifetime of the brigade,
>>   connection and filter stack that processes it.
>
> I'm not sure the buckets can be destroyed while in flight in
> mod_proxy_wstunnel, unlike with mod_proxy_http where the backend
> connection/request may be cleared before all data are out.
>
> ISTM that this can't happen here because we flush out on every write

In theory this should be the case, but I think having brigade and buckets
allocated from the same bucket allocator is a "better safe than sorry"
approach that catches possible edge cases now or in the future (think of
Graham's ideas to have suspending filters).
The issue with these possible edge cases is that it will be hard to find the
root cause, because we likely segfault somewhere else.
And in fact it should cost us not much here as the buckets are likely memory
buckets already and nothing is really done with the content if no one sets it
aside.

> (btw that could be optimized to flush outside the loop).

Could be. The question is whether this is ok from the websockets
responsiveness point of view.
OTOH we do a non-blocking read. So this should be fine.

> So provided we clear the working brigade before leaving, isn't that
> enough already?

As said above, I feel a little bit uneasy about stuff that could happen in
the filter chain with these buckets still alive in different brigades.

Regards

Rüdiger