[ANNOUNCE] Apache::Clean
The URL

    http://www.modperlcookbook.org/download/Apache-Clean-0.03.tar.gz

has entered CPAN as

    file: $CPAN/authors/id/G/GE/GEOFF/Apache-Clean-0.03.tar.gz
    size: 3575 bytes
    md5:  bb3c4e6132ac461e22f510e1974f81b9

This release gives access to more of the features of HTML::Clean, like the ability to specify HTML::Clean options by name (such as 'shortertags').

It is also (I think) the only Apache:: module currently on CPAN that uses the new Apache::Test harness from mod_perl 2.0, so it should be a good example for that as well.

I bunged up the README and internal docs (I forgot to mention the new CleanOption variable), so if there are any other bugs that jump out at people I'll get them fixed asap...

--Geoff

README:

NAME
    Apache::Clean - mod_perl interface into HTML::Clean

SYNOPSIS
    httpd.conf:

      <Location /clean>
         SetHandler perl-script
         PerlHandler Apache::Clean

         PerlSetVar CleanLevel 3

         PerlSetVar CleanOption shortertags
         PerlAddVar CleanOption whitespace
      </Location>

    Apache::Clean is Filter aware, meaning that it can be used within the Apache::Filter framework without modification. Just include the directives

      PerlModule Apache::Filter
      PerlSetVar Filter On

    and modify the PerlHandler directive accordingly...

DESCRIPTION
    Apache::Clean uses HTML::Clean to tidy up large, messy HTML, saving bandwidth. It is particularly useful with Apache::Compress for ultimate savings.

    The only current configuration directives are CleanLevel, which defaults to 3, and CleanOption, which has no default.

    Apache::Clean will only tidy up whitespace (via $h->strip) and will not perform other options of HTML::Clean (such as browser compatibility). See the HTML::Clean manpage for details.

    Only documents with a content type of text/html are affected - all others are passed through unaltered.

NOTES
    Verbose debugging is enabled by setting $Apache::Clean::DEBUG=1 or greater. To turn off all debug information, set your apache LogLevel directive above info level.

    This is alpha software, and as such has not been tested on multiple platforms or environments. It requires PERL_LOG_API=1, PERL_FILE_API=1, and maybe other hooks to function properly.

FEATURES/BUGS
    No known bugs or features at this time...

SEE ALSO
    perl(1), mod_perl(3), Apache(3), HTML::Clean(3), Apache::Compress(3), Apache::Filter(3)

AUTHORS
    Geoffrey Young [EMAIL PROTECTED]
    Paul Lindner [EMAIL PROTECTED]
    Randy Kobes [EMAIL PROTECTED]

COPYRIGHT
    Copyright (c) 2002, Geoffrey Young, Paul Lindner, Randy Kobes. All rights reserved.

    This module is free software. It may be used, redistributed and/or modified under the same terms as Perl itself.

HISTORY
    This code is derived from the Cookbook::Clean and Cookbook::TestMe modules available as part of The mod_perl Developer's Cookbook.

    For more information, visit http://www.modperlcookbook.org/
PerlRun Gotchas?
Hi,

A site I run uses a fair variety of different programs, the most common of which are run through Apache::Registry. To cut the memory overhead, however, less commonly used programs are run through Apache::PerlRun.

Both the Registry and PerlRun programs use a common module which defines a few subroutines and a selection of exported variables. These variables are in the module as globals (i.e. no my declaration), but with a use vars to get them through strict.

With seeming 50/50 frequency, the PerlRun programs either work as intended or return:

    200 OK
    The server encountered an internal error or misconfiguration...
    ...More information about this error may be available in the server error log.

Yes, it's not HTTP 500, it is 200.

The error log indicates every time that this is due to a global set in my module that remains undef for the program that tries to use it (an open that dies on failure requires the global). Hitting refresh normally does the trick.

The moment I move from PerlRun to ordinary CGI, the problem vanishes. Equally, AFAICT, it doesn't happen for Registry.

I've searched the guide, but couldn't find anything of help. I have a use Apache::PerlRun (); in my startup file; the common module is also preloaded there, and is also used in the PerlRun programs themselves.

I'm running mod_perl 1.23 on Apache 1.3.19 (Red Hat).

Any suggestions gratefully appreciated.

Cheers,
Andrew.

--
Wow, that's almost as fun as meowing.
-- http://www.exit109.com/%7Ejeremy/news/providers/4groups.html
Adding information to Virtual Hosts in a startup file...
Hi,

I'd like to add Location directives dynamically at startup to a VirtualHost using a startup script. I've been trying:

    $Apache::ReadConfig::VirtualHost{'127.0.0.1:80'}->{Location}->{'/'} = {
        SetHandler  => 'perl-script',
        PerlHandler => 'Apache::Hello',
    };

which doesn't work ... while

    $Apache::ReadConfig::Location{'/'} = {
        SetHandler  => 'perl-script',
        PerlHandler => 'Apache::Hello',
    };

works but clobbers other information already stored in other VirtualHosts. I specifically need to patch into one virtual host out of possibly many.

The configuration of the various machines is out of my control, which is why I was going with the Apache::ReadConfig approach.

Thanks
-Chris
Re: PerlRun Gotchas?
> A site I run uses a fair variety of different programs, the most common
> of which are run through Apache::Registry. To cut the memory overhead,
> however, less commonly used programs are run through Apache::PerlRun.

I would not expect PerlRun to use less memory than Registry.

> Both the Registry and PerlRun programs use a common module which defines
> a few subroutines and a selection of exported variables. These variables
> are in the module as globals (ie: no my declaration), but with a use vars
> to get them through strict.

Does the module have a package name? Are you exporting the variables from it? Seeing some code would help.

> 200 OK
> The server encountered an internal error or misconfiguration...
> ...More information about this error may be available in the server error log.

That just means the error happened after the initial header was sent.

> The error log indicates every time that this is due to a global set in my
> module that remains undef for the program that tries to call it (and an
> open that dies on failure requires the global).

Again, some code would help. I suspect you are getting bitten by namespace collisions:
http://perl.apache.org/guide/porting.html#Name_collisions_with_Modules_and

- Perrin
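For readers following the thread, the layout that Perrin's two questions point towards (a real package name plus explicit exports) can be sketched roughly as below. This is only an illustration: the names My::Common, $DATA_DIR and data_file are hypothetical and do not come from Andrew's code, and the thread does not establish whether this layout cures the intermittent undef. The module is inlined in a BEGIN block so that the sketch runs as a single file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical shared module, inlined so the sketch runs as one file.
# In a real setup it would live in its own file (e.g. My/Common.pm) and
# end with "1;".
BEGIN {
    package My::Common;

    use Exporter ();
    use vars qw(@ISA @EXPORT_OK $DATA_DIR);

    @ISA       = ('Exporter');
    @EXPORT_OK = qw($DATA_DIR data_file);

    # Package global (no "my"), declared with "use vars" to satisfy strict.
    $DATA_DIR = '/tmp/example-data';

    sub data_file {
        my ($name) = @_;
        return "$DATA_DIR/$name";
    }
}

# Stand-in for a script run under Apache::Registry or Apache::PerlRun.
# For a module in its own file this line would simply be:
#     use My::Common qw($DATA_DIR data_file);
BEGIN { My::Common->import(qw($DATA_DIR data_file)) }

print "data dir:  $DATA_DIR\n";
print "data file: ", data_file('seq.txt'), "\n";
```

Because the variable is exported by name from a named package, the script and the module share one $DATA_DIR rather than each seeing its own copy in whatever package the script happens to be compiled into.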
MS+HTML - Unix
Say I have a webpage where I want to offer people the ability to upload either a .txt or a .html file.

Now these people are basically computer illiterate, and don't even know that UNIX is different from Microsh$t.

At any rate, they will use Save as (HTML) from MSWord 97/2000, Save as (txt), or worse yet, Save as RTF, and then upload that.

Big surprise: it gets it really wrong, meaning it doesn't format correctly in any browser, before or after they use the site. One file, tidy told me, had over 300 errors, and that was just with HTML 4.01, not XHTML 1.0.

Is there any way I can, on the fly, take the messed-up HTML file I get and convert it to what they meant to give me?

Important cases:
  Parallel columns not in a table
  Bullets, DIR tags
  Actually closing <u> tags so the whole page isn't underlined.

I've seen the demoronizer port, but don't know that much about it, and I don't think it's quite what I want.

Basically I have to take the HTML given to me and make the HTML they mean.

Any great ideas?

END

--
Philip M. Gollucci (p6m7g8)
[EMAIL PROTECTED]
301.314.3118

Science, Discovery, the Universe (UMCP)
Webmaster, Webship Teacher
URL: http://www.sdu.umd.edu

EJPress.com
Database/PERL Programmer, System Admin
URL: http://www.ejournalpress.com

Resume: http://www.p6m7g8.com/resume.txt
Re: slow regex [BENCHMARK]
Paul Mineiro wrote:

i've cleaned up the example to tighten the case:

the mod perl code snippet is:

---
    my @cg;

    open DIL, '>', "/tmp/seqdata";
    print DIL $seq;
    close DIL;

    warn "length seq = @{[length ($seq)]}";

    my $t = timeit (1, sub {
        while ($seq =~ /CG/g) {
            push @cg, pos ($seq);
        }
    });

    print STDERR timestr ($t), "\n";
---

which yields

    length seq = 21 at /home/aerives/genegrokker-interface/mod_perl/genomic_img.pm line 634, <GEN1> line 102
    16 wallclock secs (15.56 usr + 0.01 sys = 15.57 CPU) @ 0.06/s (n=1)

and the perl script (command line) version is:

---
    #!/usr/bin/perl

    use Benchmark;
    use strict;

    open DIL, '<', "/tmp/seqdata";
    my $seq = <DIL>;
    close DIL;

    warn "length seq is @{[length $seq]}";

    my @cg;

    my $t = timeit (1, sub {
        while ($seq =~ /CG/g) {
            push @cg, pos ($seq);
        }
    });

    print STDERR timestr ($t), "\n";
---

which yields:

    length seq is 21 at ./t.pl line 10.
    0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)

the data is pretty big, so i didn't attach it, but feel free to contact me directly for it.

-- p

hi. i'm running mod_perl 1.26 + apache 1.3.14 + perl 5.6.1

i have a loop in a mod_perl handler like so:

    my $stime = time ();

    while ($seq =~ /CG/og) {
        push @cg, pos ($seq);
    }

    my $etime = time ();

    warn "time was: ", scalar localtime ($stime), " ", scalar localtime ($etime), " ", $etime - $stime;

under mod_perl this takes 23 seconds. running the perl by hand (via extracting this piece into a separate perl script) on the same data takes less than 1 second.

has anyone seen this kind of extreme slowdown before?
-- p

info:

apache build options:

    CFLAGS="-g -g -O3 -funroll-loops" \
    LDFLAGS="-L/home/aerives/lib -L/home/aerives/lib/mysql" \
    LIBS="-L/home/aerives/genegrokker-interface/lib -L/home/aerives/genegrokker-interface/ext/lib -L/home/aerives/lib -L/home/aerives/lib/mysql" \
    ./configure \
        --prefix=/home/aerives/genegrokker-interface/ext \
        --enable-rule=EAPI \
        --enable-module=most \
        --enable-shared=max \
        --with-layout=GNU \
        --disable-rule=EXPAT \
        "$@"

mod_perl build options:

    configure_options=PERL_USELARGEFILES=0 USE_APXS=1 WITH_APXS=$PLAYPEN_ROOT/ext/sbin/apxs EVERYTHING=1
    INC=$PLAYPEN_ROOT/ext/include -DEAPI

perl -V:

Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration:
  Platform:
    osname=linux, osvers=2.4.13, archname=i386-linux
    uname='linux duende 2.4.13 #1 wed oct 31 19:18:07 est 2001 i686 unknown '
    config_args='-Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i386-linux -Dprefix=/usr -Dprivlib=/usr/share/perl/5.6.1 -Darchlib=/usr/lib/perl/5.6.1 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.6.1 -Dsitearch=/usr/local/lib/perl/5.6.1 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Duseshrplib -Dlibperl=libperl.so.5.6.1 -Dd_dosuid -des'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=undef d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
  Compiler:
    cc='cc', ccflags ='-DDEBIAN -fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-DDEBIAN -fno-strict-aliasing -I/usr/local/include'
    ccversion='', gccversion='2.95.4 (Debian prerelease)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, usemymalloc=n, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lgdbm -ldb -ldl -lm -lc -lcrypt
    perllibs=-ldl -lm -lc -lcrypt
    libc=/lib/libc-2.2.4.so, so=so, useshrplib=true, libperl=libperl.so.5.6.1
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Characteristics of this binary (from libperl):
  Compile-time options: USE_LARGE_FILES
  Built under linux
  Compiled at Jan 11 2002 04:09:18
  %ENV:
    PERL5LIB=/home/aerives/genegrokker-interface/lib/perl5:/home/aerives/genegrokker-interface/ext/lib/perl5:/home/aerives/lib/perl5
  @INC:
    /home/aerives/genegrokker-interface/lib/perl5
    /home/aerives/genegrokker-interface/ext/lib/perl5
    /home/aerives/lib/perl5
    /usr/local/lib/perl/5.6.1
    /usr/local/share/perl/5.6.1
    /usr/lib/perl5
Re: MS+HTML - Unix
Hi

( 02.01.23 18:23 + ) Philip M. Gollucci:

> Is there any way I can on the fly take the messed up HTML file I get and
> convert it to what they meant to give me.

Probably not. You *could* strip out all HTML [and other formatting cruft] and display as text, but I'd guess your 'constituents' would not like that ...

--
\js

You have to make it happen. -Joe Greene
RE: MS+HTML - Unix
You could have your users upload MSWord documents and do the HTML conversion for them on the server using something like wvware.

-----Original Message-----
From: Philip M. Gollucci [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, January 23, 2002 10:23 AM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: MS+HTML - Unix

Say I have a webpage where I want to offer people the ability to upload either a .txt or a .html file.

Now these people are basically computer illiterate, and don't even know that UNIX is different from Microsh$t.

At any rate, they will use Save as (HTML) from MSWord 97/2000, Save as (txt), or worse yet, Save as RTF, and then upload that.

Big surprise: it gets it really wrong, meaning it doesn't format correctly in any browser, before or after they use the site. One file, tidy told me, had over 300 errors, and that was just with HTML 4.01, not XHTML 1.0.

Is there any way I can, on the fly, take the messed-up HTML file I get and convert it to what they meant to give me?

Important cases:
  Parallel columns not in a table
  Bullets, DIR tags
  Actually closing <u> tags so the whole page isn't underlined.

I've seen the demoronizer port, but don't know that much about it, and I don't think it's quite what I want.

Basically I have to take the HTML given to me and make the HTML they mean.

Any great ideas?

END

--
Philip M. Gollucci (p6m7g8)
[EMAIL PROTECTED]
301.314.3118

Science, Discovery, the Universe (UMCP)
Webmaster, Webship Teacher
URL: http://www.sdu.umd.edu

EJPress.com
Database/PERL Programmer, System Admin
URL: http://www.ejournalpress.com

Resume: http://www.p6m7g8.com/resume.txt
Re: slow regex [BENCHMARK]
On Wed, 23 Jan 2002, Paul Mineiro wrote:

> i've cleaned up the example to tighten the case:
>
> the mod perl code snippet is:

Fascinating. The only thing I don't see is where $seq gets assigned to in the CGI case. Where is the data coming from? Is it perhaps a tied variable or otherwise unlike the $seq in the command-line version?

If that's not it then I think you might have to build a debugging version of Apache and Perl and break out GDB to get to the bottom of things.

-sam
Re: mod_perl installation
On Mon, Jan 21, 2002 at 11:35:44AM -0800, Rasoul Hajikhani wrote:

> I use SGI IRIX 6.5
> Short of reinstalling perl, is there anything else that could be done?
> Where would I find libgdbm.so?
> Thanks in advance.

Sorry for the late reply.

For precompiled IRIX software, you should check the 'freeware' site:

    http://freeware.sgi.com/index-by-alpha.html

They've got GDBM, I notice.

Cheers,

--
AS | [EMAIL PROTECTED] | http://www.calliope.demon.co.uk | PGP Key : A9DE69F8
Re: slow regex [BENCHMARK]
At 4:01 PM -0800 1/23/02, Paul Mineiro wrote:

> Paul Mineiro wrote:
>
> i've cleaned up the example to tighten the case:
>
> the mod perl code snippet is:
>
> ---
> my @cg;
>
> open DIL, '>', "/tmp/seqdata";
> print DIL $seq;
> close DIL;
>
> warn "length seq = @{[length ($seq)]}";
>
> my $t = timeit (1, sub {
>     while ($seq =~ /CG/g) {
>         push @cg, pos ($seq);
>     }
> });
>
> print STDERR timestr ($t), "\n";

I just ran this on my system here... It's completely unloaded (load average: 0.11, 0.08, 0.02).

Result:

    0 wallclock secs ( 0.06 usr + 0.00 sys = 0.06 CPU) @ 16.67/s (n=1)

I ran it on a file that I created with

    perl -e "print 'ABCGEFSK' x 25000" > /tmp/seqdata

which created 25000 entries in @cg.

Your system has to be swapping horribly. I bet that the ulimit for whoever apache is running as has the memory segment set super low.

Double check everything and if that doesn't work, recompile.

Rob

--
When I used a Mac, they laughed because I had no command prompt. When I used Linux, they laughed because I had no GUI.
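For anyone wanting to repeat Rob's check without Paul's data file, the test can be sketched as one self-contained script. The synthetic string is an assumption standing in for the real sequence data, which was never posted; it uses the same generator as Rob's one-liner. Running the same script from the command line and from inside a handler would give a like-for-like comparison.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timeit timestr);

# Synthetic stand-in for the real sequence data (an assumption; the
# original data set was not posted). Same pattern as Rob's one-liner.
my $seq = 'ABCGEFSK' x 25000;

my @cg;

# Time a single pass of the global match, as in the original snippet.
my $t = timeit(1, sub {
    # Record the offset just past each CG match.
    while ($seq =~ /CG/g) {
        push @cg, pos($seq);
    }
});

print STDERR "length seq = ", length($seq), "\n";
print STDERR "matches    = ", scalar(@cg), "\n";
print STDERR timestr($t), "\n";
```

The timing line will vary by machine, so none is shown here; the point of the sketch is only that the input and the loop are identical in both environments.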
Re: slow regex [BENCHMARK]
> Your system has to be swapping horribly. I bet that the ulimit for
> whoever apache is running as has the memory segment set super low.

That's a possibility. I was also thinking that maybe mod_perl was built against a different version of Perl, possibly one that has a problem with this particular regex which was fixed in a later version.

- Perrin
Re: disable mod_perl for certain virtual hosts/folders
Hi peter,

If I understand correctly, your problem is about disabling mod_perl. You can do that with something like this:

    <VirtualHost ...>
      ...
      <Location />
        SetHandler default-handler
      </Location>
    </VirtualHost>

Merlin, The Mage

It is said that Grand Master [EMAIL PROTECTED] once said:

:: On Tue, Jan 22, 2002 at 08:31:02AM -0500, Geoffrey Young wrote:
:: > [EMAIL PROTECTED] wrote:
:: > > On my Apache mod_perl is generally enabled with the following
:: > > statement:
:: > >
:: > > <Directory /data/apache>
:: > >   <Files ~ "\.pl$">
:: > >     SetHandler perl-script
:: > >     PerlHandler Apache::Registry
:: > >     Options +ExecCGI
:: > >   </Files>
:: > > </Directory>
:: >
:: > you might have better luck with something like
:: >
:: > <Directory /data/apache>
:: >   AddHandler perl-script .pl
:: >   PerlHandler Apache::Registry
:: >   Options +ExecCGI
:: > </Directory>
::
:: thnx, but: This part doesn't cause the problem. mod_perl works like a
:: charm. The problem is how to deactivate it for a certain location?
::
:: thnx,
:: peter

--
The passion in children's eyes is all the magic the world needs!
Someone said, perhaps Merlin: Camelot will be reborn... Soon... Online!!
RE: Help...
In most cases Apache basic auth passwords are set by the htpasswd command that should be available in the Apache source. In order to use this from a perl script you might have to set the SUID bit of htpasswd and make it owned by the Apache user.

By writing a small script to take password information and make the appropriate call to htpasswd, you should achieve your goal. As always, it's recommended to use SSL when doing something like this.

Hope that helps

> Hello All,
>
> I am a programmer currently working in the UK on an Apache project. I
> have a question. One of my project members has developed some HTML pages
> for project authentication, just like Apache Basic Authentication. My
> question is: can I change the password (which is in the header of the
> HTTP protocol and is cached by the browser throughout the session) when I
> send a request to the Apache server using a Perl script? If so, how can
> we do that? I would be thankful to anyone who can give the solution.
> Thanks in advance.
>
> JK
>
> __
> Do You Yahoo!?
> Send FREE video emails in Yahoo! Mail!
> http://promo.yahoo.com/videomail/
When to cache
I'm interested to know what the opinions are of those on this list with regards to caching objects during database write operations. I've encountered different views and I'm not really sure what the best approach is.

Take a typical caching scenario: data/objects are stored locally upon loading from a database to improve performance for subsequent requests. But when those objects change, what's the best method for refreshing the cache? There are two possible approaches (maybe more?):

1) The old cache entry is overwritten with the new.
2) The old cache entry is expired, thus forcing a database hit (and subsequent cache load) on the next request.

The first approach would tend to yield better performance. However, there's no guarantee the data will ever be read. The cache could end up with a large amount of data that's never referenced. The second approach would probably allow for a smaller cache by ensuring that data is only cached on reads.

In the end, this probably boils down to application requirements. RAM and disk storage are so cheap these days that the first method is probably fine for most purposes. However, I'm sure there are situations where resources are limited and the second is more effective. What does everyone think?

--
Milo Hyson
CyberLife Labs, LLC
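The two approaches above can be sketched side by side. This is a toy illustration only: plain hashes stand in for both the database and the cache, and every name in it (fetch, store_write_through, store_invalidate) is hypothetical rather than taken from any caching module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# In-memory stand-ins for a database table and a cache.
my %db = (42 => 'old value');
my %cache;
my $db_reads = 0;

# Simulated database read that counts how often it is hit.
sub db_read {
    my ($id) = @_;
    $db_reads++;
    return $db{$id};
}

# Read path shared by both strategies: serve from cache, else load and cache.
sub fetch {
    my ($id) = @_;
    $cache{$id} = db_read($id) unless exists $cache{$id};
    return $cache{$id};
}

# Approach 1: the old cache entry is overwritten with the new value.
sub store_write_through {
    my ($id, $value) = @_;
    $db{$id}    = $value;
    $cache{$id} = $value;
}

# Approach 2: the old cache entry is expired and reloaded on the next read.
sub store_invalidate {
    my ($id, $value) = @_;
    $db{$id} = $value;
    delete $cache{$id};
}

fetch(42);                              # first read: one database hit
store_write_through(42, 'new value');
fetch(42);                              # served from cache, no database hit
store_invalidate(42, 'newer value');
fetch(42);                              # cache miss: one more database hit

print "database reads: $db_reads\n";    # prints "database reads: 2"
```

The trade-off Milo describes shows up directly: the first approach saves the read after a write, while the second only spends cache space on entries that are actually read again.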
Re: Cross-site Scripting prevention with Apache::TaintRequest
Does anybody have an example (or examples) of how this kind of abuse actually works? I guess I have just been lucky all this time.

Arnold van Kampen

On Tue, 22 Jan 2002, Perrin Harkins wrote:

> > Yes and no. XSS attacks are possible on old browsers, when the charset
> > is not set (something which is often the case with modperl apps) and
> > when the HTML-escaping bit does not match what certain browsers accept
> > as markup.
>
> Of course I set the charset, but I didn't know that might not be enough.
> Does anyone know if Apache::Util::escape_html() and
> HTML::Entities::encode() are safe?
>
> - Perrin
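As a rough answer to Arnold's request for an example, the basic shape of the problem is untrusted input echoed into a page without escaping. The sketch below uses a hand-rolled escaper purely so that it runs with core Perl; it is not the implementation of Apache::Util::escape_html() or HTML::Entities, and it says nothing about whether those are safe. The parameter value and the sub name escape_html_sketch are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Minimal hand-rolled escaper for five HTML metacharacters (illustrative
# only; not a substitute for a maintained escaping module).
my %ESCAPE = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
    "'" => '&#39;',
);

sub escape_html_sketch {
    my ($text) = @_;
    $text =~ s/([&<>"'])/$ESCAPE{$1}/g;
    return $text;
}

# Hypothetical untrusted request parameter carrying a script payload.
my $name = q{<script>alert('xss')</script>};

# Unsafe: a browser would treat the payload as markup and run it.
my $unsafe = "<p>Hello, $name</p>";

# Safer: the payload is rendered as inert text.
my $safe = "<p>Hello, " . escape_html_sketch($name) . "</p>";

# State the charset explicitly, as discussed in the thread.
print "Content-Type: text/html; charset=ISO-8859-1\n\n";
print "$safe\n";
```

As the quoted message notes, escaping alone may not be enough on browsers that guess the charset, which is why the header in the sketch sets it explicitly.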