Hi Chris,

my example implementation doesn't assume that the string is cut off at a particular place. If your search string is 7 bytes long, the "worst case" is that one buffer contains the first 6 bytes and the next buffer the last one. If the string is cut at another place, you just carry over a few bytes more than necessary (which doesn't hurt as long as you make sure that the replacement takes place only once).

I don't think you can force the bucket to be a certain length. I don't know exactly how it is handled, but I would assume that you get passed whatever content is flushed by the previous handler/filter. The only thing you could possibly control is the threshold at which Apache automatically flushes the output buffer.
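To make the carry-over idea concrete outside of Apache, here is a minimal standalone sketch. It is only an illustration under assumptions: `inject_once` is a hypothetical helper name (not part of Apache2::Filter), the chunks are passed in as a plain array, and the returned string stands in for what would be sent with `$filter->print`.

```perl
use strict;
use warnings;

# Hypothetical helper: insert $injection before the first occurrence of
# $needle in a stream that arrives as arbitrarily sized chunks.
sub inject_once {
    my ($chunks, $needle, $injection) = @_;
    my $keep  = length($needle) - 1;  # worst case: all but the last byte seen
    my $carry = '';                   # tail held back from the previous chunk
    my $done  = 0;                    # ensures the replacement happens once
    my $out   = '';                   # stands in for $filter->print

    for my $chunk (@$chunks) {
        my $buffer = $carry . $chunk;
        $carry = '';
        if (!$done) {
            my $pos = index($buffer, $needle);
            if ($pos >= 0) {
                substr($buffer, $pos, 0, $injection);
                $done = 1;
            }
            else {
                # Hold back the tail; it may be the start of a split needle.
                my $n = length($buffer) < $keep ? length($buffer) : $keep;
                $carry  = $n ? substr($buffer, -$n) : '';
                $buffer = substr($buffer, 0, length($buffer) - $n);
            }
        }
        $out .= $buffer;
    }
    return $out . $carry;             # flush the held-back tail at end of stream
}

# The search string is split as "</bo" + "dy>" across two chunks.
print inject_once(['<html><body>hi</bo', 'dy></html>'], '</body>', '<!--x-->'), "\n";
# prints: <html><body>hi<!--x--></body></html>
```

Holding back length - 1 bytes covers every possible split point, so the chunk size never has to be known or controlled.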
Hendrik

On Thu, 31.03.2011, 12:08, Chris Datfung wrote:
> Hi Hendrik,
>
> That seems like a good workaround, assuming the string gets cut off at the
> same place each time. Thanks for that; in my case, I'm not certain that it
> does. I thought the BUFF_LEN constant defines how many bytes should be
> read.
> My string is always within the first 5000 bytes, but setting BUFF_LEN to
> 8000 did not help as the buffer still sometimes gets cut after ~2500 bytes
> or so. Do you know of any way to force the bucket to be a certain length?
>
> Thanks
> Chris
>
> On Thu, Mar 31, 2011 at 10:07 AM, Hendrik Schumacher
> <[email protected]> wrote:
>
>> On Thu, 31.03.2011, 06:30, Chris Datfung wrote:
>> > On Wed, Mar 30, 2011 at 12:36 PM, Hendrik Schumacher
>> > <[email protected]> wrote:
>> >
>> >> On Wed, 30.03.2011, 12:17, Chris Datfung wrote:
>> >>
>> >> I had a similar problem with an HTTP proxy that injected a string into
>> >> the HTML body. If the response is passed to the filter in multiple
>> >> parts, there is a certain probability that the response is split at
>> >> the string position you are looking for (for example, part 2 ends with
>> >> "</bo" and part 3 starts with "dy>"). I had to buffer the last bytes
>> >> of each response part and take them into account
>> >
>> >
>> > Hi Hendrik,
>> >
>> > That is exactly the problem. How did you buffer the last bytes of each
>> > response? Don't you just set the BUFF_LEN and that's the number of
>> > characters you get?
>> >
>> > Chris
>> >
>>
>> You have to handle the "last bytes buffer" yourself. If you use the
>> f->read approach of Apache2::Filter, you could use the following
>> (untested and probably not very efficient):
>>
>> my $lastbytes = undef;
>> my $done = undef;
>> while ($filter->read(my $buffer, $wanted))
>> {
>>     # prepend the bytes held back from the previous read
>>     if ($lastbytes)
>>     {
>>         $buffer = $lastbytes.$buffer;
>>         $lastbytes = undef;
>>     }
>>     if (not $done)
>>     {
>>         if ($buffer =~ s/<\/body>/$injection<\/body>/)
>>         {
>>             $done = 1;
>>         }
>>         else
>>         {
>>             $lastbytes = substr ($buffer, -6); # length of string to search - 1
>>             $buffer = substr ($buffer, 0, -6);
>>         }
>>     }
>>     $filter->print($buffer);
>> }
>> if ($filter->seen_eos && $lastbytes) {
>>     $filter->print($lastbytes);
>> }
>>
>> If you are using the callback approach, you would have to store
>> $lastbytes somewhere (e.g. in $filter->ctx) and make sure to flush
>> $lastbytes on eos.
>>
>> Hendrik
>>
