Content encoding when filtering proxied pages

2003-09-06 Thread Esteban Fernandez Stafford


Hello all,

I have a machine acting as a proxy using mod_perl-1.99_09 with apache
2.0.46. This proxy is supposed to filter all HTML content. So far I
have achieved most of my project's goals, but there is one issue I
can't get straight: what to do when the proxy gets a page that is
encoded (as with www.google.com). My first attempt was to DECLINE
filtering such pages, but $filter->r()->content_encoding() always
gives me 'undef'. Is this something that is not yet implemented, or am
I doing something wrong? (See code below.) Then I tried looking at
$filter->r()->headers_out()->{'Content-Encoding'} and everything went
just fine!
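
The decision I am after boils down to something like the following.
This is only a sketch in plain Perl so that it runs outside Apache:
wants_filtering() is a made-up name, and the hash reference merely
stands in for what headers_out() gives me; it is not the mod_perl API.
Treating 'identity' as "not encoded" is my own assumption about what
the filter should accept.

```perl
use strict;
use warnings;

# Plain-Perl stand-in (hypothetical helper, not part of mod_perl):
#   $type    plays the role of content_type()
#   $headers plays the role of headers_out(), as a plain hash ref
# Returns 1 if the response looks like unencoded HTML, 0 otherwise.
sub wants_filtering {
    my ($type, $headers) = @_;

    # Only HTML is of interest
    return 0 unless defined $type && $type =~ m{^text/html\b};

    # Header names are case-insensitive, so compare them lowercased
    for my $name (keys %$headers) {
        next unless lc($name) eq 'content-encoding';
        my $enc = $headers->{$name};
        # Anything other than an empty value or 'identity' is left alone
        return 0 if defined $enc && length $enc && lc($enc) ne 'identity';
    }
    return 1;
}

print wants_filtering('text/html', { 'Content-Encoding' => 'gzip' })
    ? "filter\n" : "decline\n";
```

With a gzip-encoded page the sketch declines, which is the behaviour I
want until the content can be decompressed before it reaches my filter.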

On the other hand, is it possible that I could put mod_deflate before
my filter to get the content already decompressed for my filter to
parse?

   Thanks a lot in advance

I would like to thank the mod_perl community for mod_perl, it has made
the development of this project fun! And it has kept me from having to
go back to C programming. It was a long time since I last did that.


package WTG::HtmlFilter;

use strict;
use warnings; # FATAL => 'all';

use Apache::RequestRec ();
use Apache::RequestIO ();

use APR::Brigade ();
use APR::Bucket ();

use base qw(Apache::Filter);

# DECLINED is returned below, so it has to be compiled as well
use Apache::Const -compile => qw(OK DECLINED M_POST);
use APR::Const -compile => ':common';

use constant READ_SIZE => 1024;

use HTML::Parser ();

sub handler : FilterRequestHandler
{
   my $filter = shift;
   my $parser;

   # Initialize parser if not already done
   unless ($parser = $filter->ctx)
   {
      # This is the first call of the filter for a particular request
      # Can we filter this request?
      my $type = $filter->r()->content_type();
      if(! defined $type || $type !~ /^text\/html\b/)
      {
         $filter->remove();
         return Apache::DECLINED;
      }
      # This line gives me undefined
      print STDERR $filter->r()->content_type(), "\n";

blah... blah... blah...


   E s  t  eb  a n!


:wq



-- 
Reporting bugs: http://perl.apache.org/bugs/
Mail list info: http://perl.apache.org/maillist/modperl.html



Re: Unregister streamed output filters

2003-01-12 Thread Esteban Fernandez Stafford

On Sat, 11 Jan 2003, Stas Bekman wrote:

> Esteban Fernandez Stafford wrote:
> >
> > Hello all,
> >
> > is there a way to unregister a streamed filter? I have seen this in
> > many (all?) Apache filters (programmed in C); they are able to decline
> > the filtering of a certain stream, for example, when they do not know
> > how to handle a certain content type. In the Apache API this is done
> > with ap_remove_output_filter(f). Is there something similar in mp2?
>
> Not at this moment, but hopefully it'll be supported soon.
>
> Since you need this feature, telling us in what situation you'd like to
> remove a filter will help us to build a better test case and provide a
> good real-world example for documentation.

The easiest example that comes to mind is a filter for text/html that
performs some sort of transformation. This filter should unregister for
any content type that is not text/html. Browsing through some Apache
code I have found two ways of doing this. One involves the
ap_remove_output_filter function (modules/filters/mod_deflate.c) and
the other returns DECLINED at a certain point
(modules/filters/mod_include.c). I am not sure about the internals of
each approach, but I thought it might help.

It occurs to me just now that it may also be possible to do this
statically in httpd.conf. Something like:

PerlOutputFilterHandler  MyApache::MyHtmlFilter text/html




E s  t  eb  a n!

:wq




Unregister streamed output filters

2003-01-10 Thread Esteban Fernandez Stafford


Hello all,

is there a way to unregister a streamed filter? I have seen this in
many (all?) Apache filters (programmed in C); they are able to decline
the filtering of a certain stream, for example, when they do not know
how to handle a certain content type. In the Apache API this is done
with ap_remove_output_filter(f). Is there something similar in mp2?

Thanks in advance!!


   E s  t  eb  a n!


:wq




Trouble with make source_scan

2002-11-26 Thread Esteban Fernandez Stafford


Hello,

I wanted to add some functionality to mod_perl. I am following the
'mod_perl 2.0 Source Code Explained' guide. Before making any change at
all I tried to run 'make source_scan' and I got some errors.  Going
through build/source_scan.pl I was able to tell which errors were
generated by each instruction (see below), but this has not brought
me any further. Can anybody give me a hint about what I could do?

Thanks!

mod_perl: 2.0 (checked out from CVS, Nov 26 2002)
perl: 5.8.0

$p->write_structs_pm complains as follows:
--
failed on 










enum {

APR_BUCKET_DATA = 0,

APR_BUCKET_METADATA = 1
} is_metadata; with type=, id=enum, post= at pos=25
--


... and $p->write_functions_pm complains as follows:
--
In file included from .apache_includes:4,
 from :1:
xs/modperl_xs_sv_convert.h:149: warning: `mp_xs_sv2_APR__Table' redefined
xs/modperl_xs_util.h:10: warning: this is the location of the previous definition
xs/modperl_xs_sv_convert.h:321: warning: `mp_xs_sv2_r' redefined
xs/modperl_xs_util.h:6: warning: this is the location of the previous definition
In file included from .apache_includes:4,
 from :1:
xs/modperl_xs_sv_convert.h:149: warning: `mp_xs_sv2_APR__Table' redefined
xs/modperl_xs_util.h:10: warning: this is the location of the previous definition
xs/modperl_xs_sv_convert.h:321: warning: `mp_xs_sv2_r' redefined
xs/modperl_xs_util.h:6: warning: this is the location of the previous definition
panic: multiple types without intervening comma in
 regexp*( *regcomp_t  ) (register PerlInterpreter *my_perl,  char* exp, 
char* xend, PMOP* pm)
whited-out as
 regexp*( *regcomp_t  ) (  
 )
Expecting parenth after identifier in `regcomp_t  * Perl_Tregcompp_ptr(register 
PerlInterpreter *my_perl)'
after `regcomp_t  ' at /usr/lib/perl5/site_perl/5.8.0/C/Scan.pm line 783.
C::Scan::do_declaration('extern\x{9}regcomp_t  * 
Perl_Tregcompp_ptr(register PerlInter...','HASH(0x8782610)','HASH(0x878d788)') called 
at /usr/lib/perl5/site_perl/5.8.0/C/Scan.pm line 738

C::Scan::do_declarations('ARRAY(0x878261c)','HASH(0x8782610)','HASH(0x878d788)') 
called at /usr/lib/perl5/site_perl/5.8.0/Data/Flow.pm line 86

Data::Flow::request('Apache::ParseSource::Scan=ARRAY(0x8d3b7dc)','parsed_fdecls') 
called at /usr/lib/perl5/site_perl/5.8.0/Data/Flow.pm line 39
Data::Flow::get('Apache::ParseSource::Scan=ARRAY(0x8d3b7dc)','parsed_fdecls') 
called at lib/Apache/ParseSource.pm line 49

Apache::ParseSource::Scan::get('Apache::ParseSource::Scan=ARRAY(0x8d3b7dc)','parsed_fdecls')
 called at lib/Apache/ParseSource.pm line 311
Apache::ParseSource::get_functions('ModPerl::ParseSource=HASH(0x8ce0714)') 
called at lib/Apache/ParseSource.pm line 407

Apache::ParseSource::write_functions_pm('ModPerl::ParseSource=HASH(0x8ce0714)','FunctionTable.pm','ModPerl::FunctionTable')
 called at lib/ModPerl/ParseSource.pm line 40

ModPerl::ParseSource::write_functions_pm('ModPerl::ParseSource=HASH(0x8ce0714)') 
called at build/source_scan.pl line 34
make: *** [source_scan] Error 255
--

   E s  t  eb  a n!


:wq





Re: Problem with Stream-oriented Output Filter

2002-11-22 Thread Esteban Fernandez Stafford
On Thu, 21 Nov 2002, Stas Bekman wrote:

> Esteban Fernandez Stafford wrote:
> >
> > Hello,
> >
> > I am currently developing a mod_perl filter that uses the streaming
> > approach. I started off with the example in
> >
> > http://perl.apache.org/docs/2.0/user/handlers/filters.html#Stream_oriented_Output_Filter
> >
> >   sub handler {
> >       my $filter = shift;
> >
> >       my $left_over = '';
> >       while ($filter->read(my $buffer, BUFF_LEN)) {
> >           $buffer = $left_over . $buffer;
> >           $left_over = '';
> >           while ($buffer =~ /([^\r\n]*)([\r\n]*)/g) {
> >               $left_over = $1, last unless $2;
> >               $filter->print(scalar(reverse $1), $2);
> >           }
> >       }
> >       $filter->print(scalar reverse $left_over) if length $left_over;
> >
> >       Apache::OK;
> >   }
> >
> > This seems to work OK when the file is small (smaller than 8192). When
> > the file is larger, then there is a line that gets cut. This is
> > because the handler gets called more than once for big requests. Is
> > there a nice way to overcome this problem?  I was thinking of using the
> > filter context to store variable $left_over, but then I don't know how
> > to detect the real end of the stream.
>
> The problem is that I've written a patch that makes the $filter->print()
> outside the while() loop work, but it was never committed. So the doc
> is out of sync with the core code. For now please apply this patch:
> http://marc.theaimsgroup.com/?l=apache-modperl-dev&m=102828686110352&w=2
>
> I'll see that it gets into the core asap.


Thanks Stas,

the patch you sent me solved some of my trouble. The bad news is that I
still have problems when I filter large files. I have written an example
to expose this behaviour. It consists of modified versions of the beloved
SendAlphaNum.pm and FilterReverse1.pm.



package MyApache::SendAlphaNum;

use strict;
use warnings;

use Apache::RequestRec ();
use Apache::RequestIO ();

use Apache::Const -compile => qw(OK);

sub handler {
    my $r = shift;

    $r->content_type('text/plain');

    # 300 lines of 37 bytes each, enough to exceed 8192 bytes
    foreach (1..300)
    {
        $r->print(1..9, 0, 'a'..'z', "\n");
    }
    Apache::OK;
}
1;



package MyApache::FilterReverse1;

use strict;
use warnings;

use base qw(Apache::Filter);

use Apache::Const -compile => qw(OK);

use constant BUFF_LEN => 1024;

# Bytes read so far, kept across invocations of the handler
my $count = 0;

sub handler
{
    my $filter = shift;
    $filter->print("\nI'm Starting: $count bytes so far...\n");
    my $left_over = '';
    while ($filter->read(my $buffer, BUFF_LEN)) {
        $count += length($buffer);
        $buffer = $left_over . $buffer;
        $left_over = '';
        while ($buffer =~ /([^\r\n]*)([\r\n]*)/g) {
            $left_over = $1, last unless $2;
            $filter->print(scalar(reverse $1), $2);
        }
    }
    $filter->print(scalar reverse $left_over) if length $left_over;

    Apache::OK;
}
1;



PerlModule MyApache::FilterReverse1
PerlModule MyApache::SendAlphaNum
<Location /reverse1>
   SetHandler modperl
   PerlResponseHandler  MyApache::SendAlphaNum
   PerlOutputFilterHandler  MyApache::FilterReverse1
</Location>



As an output I get the following:

$ wget -q -O - http://fangorn:3000/reverse1

I'm Starting: 0 bytes so far...
zyxwvutsrqponmlkjihgfedcba0987654321
zyxwvutsrqponmlkjihgfedcba0987654321
# A lot of these...
zyxwvutsrqponmlkjihgfedcba0987654321
zyxwvutsrqponmlkjihgfedcba0987654321
edcba0987654321
I'm Starting: 8192 bytes so far...
zyxwvutsrqponmlkjihgf
zyxwvutsrqponmlkjihgfedcba0987654321
zyxwvutsrqponmlkjihgfedcba0987654321
# ... and some more...
zyxwvutsrqponmlkjihgfedcba0987654321
zyxwvutsrqponmlkjihgfedcba0987654321


As you can see, the filter is called twice and therefore one line
gets broken in two. I could always store $left_over in the context
of the filter and prepend it to what I read when the filter starts again.
But the problem I find is that I have no way to know when the real eos
is reached in order to flush $left_over to the output. I thought there may
be some sort of $filter->eos() call or something.
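
To make clear what I mean, here is a minimal sketch in plain Perl, so
that it runs outside Apache. It is not the mod_perl API: reverse_lines()
is a made-up name, the %ctx hash stands in for the filter context, and
the $eos flag is hypothetical, standing in for whatever end-of-stream
signal the filter might be given.

```perl
use strict;
use warnings;

# Plain-Perl stand-in for one filter invocation (hypothetical helper).
#   $ctx   : hash ref that survives between invocations
#   $chunk : the data handed to this invocation
#   $eos   : hypothetical flag, true only on the last invocation
# Returns the text this invocation would print.
sub reverse_lines {
    my ($ctx, $chunk, $eos) = @_;
    my $out = '';

    # Prepend whatever the previous invocation could not finish
    my $buffer = (defined $ctx->{left_over} ? $ctx->{left_over} : '') . $chunk;
    $ctx->{left_over} = '';

    while ($buffer =~ /([^\r\n]*)([\r\n]*)/g) {
        if (!length $2) {
            # No line terminator yet: keep the fragment for the next call
            $ctx->{left_over} = $1;
            last;
        }
        $out .= scalar(reverse $1) . $2;
    }

    # Only at the real end of the stream is the last fragment flushed
    if ($eos && length $ctx->{left_over}) {
        $out .= scalar reverse $ctx->{left_over};
        $ctx->{left_over} = '';
    }
    return $out;
}

# Two invocations; the line "def" is split across them
my %ctx;
my $out = reverse_lines(\%ctx, "abc\nde", 0)
        . reverse_lines(\%ctx, "f\nxyz", 1);
print $out, "\n";
```

The split line comes out whole ("fed") because the fragment "de" waits
in the context, and "xyz" is only flushed because the second call is
told that it is the last one. That last piece of information is exactly
what I am missing in the real filter.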

   E s  t  eb  a n!


:wq






Re: Problems with Apache::Gallery

2002-11-20 Thread Esteban Fernandez Stafford
Hi Juan,

I had a similar problem. To solve it I added these lines to startup.pl.
I guess you will have to change the paths to match your own settings.


# For the Apache::* modules
use lib qw(/usr/lib/perl5/site_perl/5.6.1/i586-linux/Apache2);

# For my own
use lib qw(/home/esteban/apache_mod_perl/perl);

# What the heck!!!

   E s  t  eb  a n!



:wq


On 20 Nov 2002, Juan Julian Merelo Guervos wrote:

> Hi,
>   I'm having problems with Apache::Gallery, mod_perl 1.99 and apache 2.0
> (and who hasn't?, you might ask).
>
> This is the setup I'm using in perl.conf
> PerlSetVar GalleryTemplateDir /home/jmerelo/public_html/gallery/templates
> PerlSetVar GalleryAllowOriginal 1
> <Location /gallery>
>   SetHandler            perl-script
>   PerlResponseHandler   Apache::Gallery
> </Location>
>
> And this is the error I get:
> [client 10.10.10.77] Can't locate Apache.pm in @INC (@INC contains:
> /usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0
> /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
> /usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl
> /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
> /usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl/5.6.1
> /usr/lib/perl5/vendor_perl .) at
> /usr/lib/perl5/site_perl/5.8.0/Apache/Gallery.pm line 12.
> BEGIN failed--compilation aborted at
> /usr/lib/perl5/site_perl/5.8.0/Apache/Gallery.pm line 12.
> Compilation failed in require at (eval 1) line 3.
>
> I have tried to use this startup.pl file:
> --
> #!/usr/bin/perl
>
> #use Apache2 (); # if you have 1.0 and 2.0 installed
> use Apache::compat ();
> use Apache::Request();
> use Apache::Gallery ();
> What the heck!;
> --
> But I still get a different error when I try to start up httpd.
>
> Is there a solution? Do I have to install both mod_perl 1 and 2? What's
> the answer to the ultimate question about life, the universe and
> everything? Should I wait for a definitive (redhat) release of mod_perl?
>
> JJ
>
> --
> Juan Julian Merelo Guervos [EMAIL PROTECTED]
> GeNeura team





Re: Problems with Apache::Gallery

2002-11-20 Thread Esteban Fernandez Stafford


No, find the location of your apache modules. For a hint type

rpm -ql perl



   E s  t  eb  a n!


:wq


On 20 Nov 2002, Juan Julian Merelo Guervos wrote:

> Hi,
>
>
> > # For the Apache::* modules
> > use lib qw(/usr/lib/perl5/site_perl/5.6.1/i586-linux/Apache2);
> >
>
> Does that mean I have to downgrade to 5.6.1?
>
> J
> --
> Juan Julian Merelo Guervos [EMAIL PROTECTED]
> GeNeura team





Re: Problems with Apache::Gallery

2002-11-20 Thread Esteban Fernandez Stafford


Look for an Apache2 folder in your perl folder.

find /usr/lib/perl -name Apache2

If you don't find it, mod_perl is not properly installed.
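
If you prefer to ask perl itself, something along these lines should
list every directory in @INC that has an Apache2 subdirectory. It is
just a sketch: dirs_containing() is a name I made up, and it only looks
one level below each @INC entry.

```perl
use strict;
use warnings;
use File::Spec ();

# Hypothetical helper: return every directory in @dirs that has a
# subdirectory called $name.
sub dirs_containing {
    my ($name, @dirs) = @_;
    return grep { -d File::Spec->catdir($_, $name) } @dirs;
}

# @INC may hold code references as well as paths, so skip those
my @hits = dirs_containing('Apache2', grep { !ref } @INC);

print File::Spec->catdir($_, 'Apache2'), "\n" for @hits;
print "No Apache2 directory found under \@INC\n" unless @hits;
```

Whatever it prints is the path to hand to 'use lib' in startup.pl,
provided the directory actually contains the modules.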



   E s  t  eb  a n!


:wq


On 20 Nov 2002, Juan Julian Merelo Guervos wrote:

> On Wed, 20-11-2002 at 10:25, Esteban Fernandez Stafford wrote:
> > Hi Juan,
> >
> > I had a similar problem. To solve it I added these lines to startup.pl
> > I guess you will have to change the paths to your own settings.
> >
> >
> > # For the Apache::* modules
> > use lib qw(/usr/lib/perl5/site_perl/5.6.1/i586-linux/Apache2);
> >
> > # For my own
> > use lib qw(/home/esteban/apache_mod_perl/perl);
> >
>
> The truth is that Apache.pm does not exist; it is probably included in
> the mod_perl 1.xx distribution. Should I install it? Won't that zap the
> mod_perl 1.99 installation?
>
> JJ
> --
> Juan Julian Merelo Guervos [EMAIL PROTECTED]
> GeNeura team





Re: Problems with Apache::Gallery

2002-11-20 Thread Esteban Fernandez Stafford

This is what I have... Your installation is not looking good.

$ ls -F /usr/lib/perl5/site_perl/5.6.1/i586-linux/Apache2/
APR/  APR.pm  Apache/  ModPerl/  auto/  mod_perl.pm  typemap


   E s  t  eb  a n!


:wq


On 20 Nov 2002, Juan Julian Merelo Guervos wrote:

> On Wed, 20-11-2002 at 10:40, Esteban Fernandez Stafford wrote:
> > Look for a Apache2 folder in your perl folder.
> >
> > find /usr/lib/perl -name Apache2
>
> It's at:
> /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/Apache2
> but there's only a typemap file in it. Should there be something else?
>
> >
> > If you dont find it, modperl is not properly installed.
>
> That might be the case, but it's the way it comes with redhat 8.0!
>
> JJ
>
> --
> Juan Julian Merelo Guervos [EMAIL PROTECTED]
> GeNeura team





Problem with Stream-oriented Output Filter

2002-11-20 Thread Esteban Fernandez Stafford


Hello,

I am currently developing a mod_perl filter that uses the streaming
approach. I started off with the example in

http://perl.apache.org/docs/2.0/user/handlers/filters.html#Stream_oriented_Output_Filter

  sub handler {
      my $filter = shift;

      my $left_over = '';
      while ($filter->read(my $buffer, BUFF_LEN)) {
          $buffer = $left_over . $buffer;
          $left_over = '';
          while ($buffer =~ /([^\r\n]*)([\r\n]*)/g) {
              $left_over = $1, last unless $2;
              $filter->print(scalar(reverse $1), $2);
          }
      }
      $filter->print(scalar reverse $left_over) if length $left_over;

      Apache::OK;
  }

This seems to work OK when the file is small (smaller than 8192). When
the file is larger, then there is a line that gets cut. This is
because the handler gets called more than once for big requests. Is
there a nice way to overcome this problem?  I was thinking of using the
filter context to store variable $left_over, but then I don't know how
to detect the real end of the stream.

Thanks in advance

   E s  t  eb  a n!


:wq