[ANNOUNCE] Apache::Clean

2002-01-23 Thread Geoffrey Young

The URL

   http://www.modperlcookbook.org/download/Apache-Clean-0.03.tar.gz

has entered CPAN as

   file: $CPAN/authors/id/G/GE/GEOFF/Apache-Clean-0.03.tar.gz
   size: 3575 bytes
   md5: bb3c4e6132ac461e22f510e1974f81b9

This release gives access to more of the features of HTML::Clean, like
the ability to specify HTML::Clean options by name (such as
'shortertags').  It also is (I think) the only Apache:: module
currently on CPAN that uses the new Apache::Test harness from mod_perl
2.0, so it should be a good example for that as well.

I bunged up the README and internal docs (I forgot to mention the new
CleanOption variable), so if there are any other bugs that jump out at
people I'll get them fixed asap...

--Geoff

README:

NAME 

Apache::Clean - mod_perl interface into HTML::Clean

SYNOPSIS

httpd.conf:

  <Location /clean>
     SetHandler perl-script
     PerlHandler Apache::Clean

     PerlSetVar  CleanLevel 3

     PerlSetVar  CleanOption shortertags
     PerlAddVar  CleanOption whitespace
  </Location>

Apache::Clean is Filter aware, meaning that it can be used within
the Apache::Filter framework without modification.  Just include the
directives

  PerlModule Apache::Filter
  PerlSetVar Filter On

and modify the PerlHandler directive accordingly...

DESCRIPTION

Apache::Clean uses HTML::Clean to tidy up large, messy HTML, saving
bandwidth.  It is particularly useful with Apache::Compress for 
ultimate savings.

The only current configuration directives are CleanLevel, which
defaults to 3, and CleanOption, which has no default.  Apache::Clean
will only tidy up whitespace (via $h->strip) and will not perform
other options of HTML::Clean (such as browser compatibility).  See
the HTML::Clean manpage for details.

Only documents with a content type of text/html are affected - all
others are passed through unaltered.

NOTES

Verbose debugging is enabled by setting $Apache::Clean::DEBUG=1
or greater.  To turn off all debug information, set your apache
LogLevel directive above info level.

This is alpha software, and as such has not been tested on multiple
platforms or environments.  It requires PERL_LOG_API=1, 
PERL_FILE_API=1, and maybe other hooks to function properly.

FEATURES/BUGS

  No known bugs or features at this time...

SEE ALSO

perl(1), mod_perl(3), Apache(3), HTML::Clean(3), Apache::Compress(3),
Apache::Filter(3)

AUTHORS

Geoffrey Young [EMAIL PROTECTED]
Paul Lindner [EMAIL PROTECTED]
Randy Kobes [EMAIL PROTECTED]

COPYRIGHT

Copyright (c) 2002, Geoffrey Young, Paul Lindner, Randy Kobes.  
All rights reserved.

This module is free software.  It may be used, redistributed
and/or modified under the same terms as Perl itself.

HISTORY

This code is derived from the Cookbook::Clean and
Cookbook::TestMe modules available as part of
The mod_perl Developer's Cookbook.

For more information, visit http://www.modperlcookbook.org/



PerlRun Gotchas?

2002-01-23 Thread Andrew Green

Hi,

A site I run uses a fair variety of different programs, the most common
of which are run through Apache::Registry.  To cut the memory overhead,
however, less commonly used programs are run through Apache::PerlRun.

Both the Registry and PerlRun programs use a common module which defines
a few subroutines and a selection of exported variables.  These variables
are in the module as globals (ie: no my declaration), but with a use
vars to get them through strict.

With seeming 50/50 frequency, the PerlRun programs work as intended, or
alternatively return:

 200 OK
 The server encountered an internal error or misconfiguration...
 ...More information about this error may be available in the
 server error log.

Yes, it's not HTTP 500, it is 200.  The error log indicates every time
that this is due to a global set in my module that remains undef for the
program that tries to call it (and an open that dies on failure requires
the global).  Hitting refresh normally does the trick.

The moment I move from PerlRun to ordinary CGI, the problem vanishes. 
Equally, AFAICT, it doesn't happen for Registry.

I've searched the guide, but couldn't find anything of help.  I have a
use Apache::PerlRun (); in my startup file, the common module is also
preloaded therein, and it is also used in the PerlRun programs
themselves.  I'm running mod_perl 1.23 on Apache 1.3.19 (Red Hat).

Any suggestions gratefully appreciated.

Cheers,
Andrew.

-- 
Wow, that's almost as fun as meowing.
   -- http://www.exit109.com/%7Ejeremy/news/providers/4groups.html



Adding information to Virtual Hosts in a startup file...

2002-01-23 Thread Marceusz

Hi,

I'd like to add Location directives dynamically at startup to a VirtualHost
using a startup script.

I've been trying:

$Apache::ReadConfig::VirtualHost{'127.0.0.1:80'}->{Location}->{'/'} = {
 SetHandler => 'perl-script',
 PerlHandler => 'Apache::Hello',
};

which doesn't work ... while 

$Apache::ReadConfig::Location{'/'} = {
 SetHandler => 'perl-script',
 PerlHandler => 'Apache::Hello',
};

works but clobbers other information already stored in other VirtualHosts. I 
specifically need to patch into one virtual host out of possibly many. 

The configuration of the various machines is out of my control which is why I was 
going with the Apache::ReadConfig approach.

Thanks
-Chris



Re: PerlRun Gotchas?

2002-01-23 Thread Perrin Harkins

> A site I run uses a fair variety of different programs, the most common
> of which are run through Apache::Registry.  To cut the memory overhead,
> however, less commonly used programs are run through Apache::PerlRun.

I would not expect PerlRun to use less memory than Registry.

> Both the Registry and PerlRun programs use a common module which defines
> a few subroutines and a selection of exported variables.  These variables
> are in the module as globals (ie: no my declaration), but with a use
> vars to get them through strict.

Does the module have a package name?  Are you exporting the variables from
it?  Seeing some code would help.

>  200 OK
>  The server encountered an internal error or misconfiguration...
>  ...More information about this error may be available in the
>  server error log.

That just means the error happened after the initial header was sent.

> The error log indicates every time
> that this is due to a global set in my module that remains undef for the
> program that tries to call it (and an open that dies on failure requires
> the global).

Again, some code would help.

I suspect you are getting bitten by namespace collisions:
http://perl.apache.org/guide/porting.html#Name_collisions_with_Modules_and

- Perrin
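
A minimal sketch of what "a module with a package name that exports its
variables" can look like, for readers following the questions above.  It
is not Andrew's code (none was posted) and it does not diagnose his
problem; the module name My::Common, the variable $DataDir and the
function data_path are all hypothetical.  The module is defined inline
only so the example runs as a single file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical shared module.  In a real setup this would live in its own
# file (My/Common.pm) and be preloaded from the startup file.
BEGIN {
    package My::Common;
    use strict;
    use Exporter ();
    use vars qw(@ISA @EXPORT_OK $DataDir);

    @ISA       = qw(Exporter);
    @EXPORT_OK = qw($DataDir data_path);

    # Package global, set once when the module is compiled.
    $DataDir = '/tmp/example-data';

    sub data_path {
        my ($name) = @_;
        return "$DataDir/$name";
    }

    # Mark the module as loaded so "use My::Common" below does not look
    # for a file on disk (only needed for this one-file demonstration).
    $INC{'My/Common.pm'} = __FILE__;
}

# In each script: import by name rather than relying on whatever happens
# to be defined in the current package.
use My::Common qw($DataDir data_path);

print "DataDir: $DataDir\n";
print "Path:    ", data_path('report.txt'), "\n";

# The fully qualified name always refers to the module's own variable.
print "Full:    $My::Common::DataDir\n";
```

Because the variable lives in a named package, a script can always fall
back to the fully qualified $My::Common::DataDir if an imported name
turns out to be unreliable in a given environment.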




MS+HTML - Unix

2002-01-23 Thread Philip M. Gollucci

Say I have a webpage where I want to offer people the ability to upload
either a .txt or a .html file.  Now these people basically are computer
illiterate, and don't even know that UNIX is different from Microsh$t.

At any rate, they will use Save as (HTML) from MSWord 97/2000, Save as
(txt), or worse yet, Save as RTF.
Then upload that.

Big surprise, it gets it really wrong, basically meaning it doesn't format
correctly before or after they use the site in any browser.
One file, tidy told me, had over 300 errors and that was just with HTML 4.01,
not XHTML 1.0.

Is there any way I can, on the fly, take the messed up HTML file I get and
convert it to what they meant to give me?

Important cases:
  Parallel columns not in a table
  Bullets
  DIR tags
  Actually closing <u> tags so the whole page isn't underlined.

I've seen the demoronizer port, but don't know that much about it, and I
don't think it's quite what I want.

Basically I have to take the HTML given to me and make the HTML they mean.


Any Great Ideas


END
--
Philip M. Gollucci (p6m7g8) [EMAIL PROTECTED] 301.314.3118

Science, Discovery,  the Universe (UMCP)
Webmaster  Webship Teacher
URL: http://www.sdu.umd.edu

EJPress.com
Database/PERL Programmer  System Admin
URL : http://www.ejournalpress.com

Resume  : http://www.p6m7g8.com/resume.txt





Re: slow regex [BENCHMARK]

2002-01-23 Thread Paul Mineiro

Paul Mineiro wrote:

i've cleaned up the example to tighten the case:

the mod perl code  snippet is:

---

  my @cg;

  open DIL, '>', "/tmp/seqdata";
  print DIL $seq;
  close DIL;

  warn "length seq = @{[length ($seq)]}";

  my $t = timeit (1, sub {
    while ($seq =~ /CG/g)
      {
        push @cg, pos ($seq);
      }
  });

  print STDERR timestr ($t), "\n";

---
 
which yields
length seq = 21 at 
/home/aerives/genegrokker-interface/mod_perl/genomic_img.pm line 634, 
GEN1 line 102
16 wallclock secs (15.56 usr +  0.01 sys = 15.57 CPU) @  0.06/s (n=1)

and the perl script (command line) version is:

---

#!/usr/bin/perl

use Benchmark;
use strict;

open DIL, '<', "/tmp/seqdata";
my $seq = <DIL>;
close DIL;

warn "length seq is @{[length $seq]}";

my @cg;

my $t = timeit (1, sub {
  while ($seq =~ /CG/g)
    {
      push @cg, pos ($seq);
    }
});

print STDERR timestr ($t), "\n";

---
which yields:

length seq is 21 at ./t.pl line 10.
 0 wallclock secs ( 0.00 usr +  0.00 sys =  0.00 CPU)

the data is pretty big, so i didn't attach it, but feel free to contact 
me directly for it.

-- p

hi.  i'm running mod_perl 1.26 + apache 1.3.14 + perl 5.6.1

i have a loop in a mod_perl handler like so:

  my $stime = time ();

  while ($seq =~ /CG/og)
    {
      push @cg, pos ($seq);
    }

  my $etime = time ();

  warn "time was: ", scalar localtime ($stime), " ",
    scalar localtime ($etime), " ", $etime - $stime;


under mod_perl this takes 23 seconds.  running the perl by hand (via
extracting this piece into a separate perl script) on the same data takes
less than 1 second.

has anyone seen this kind of extreme slowdown before?

-- p

info:

apache build options:

CFLAGS=-g -g -O3 -funroll-loops \
LDFLAGS=-L/home/aerives/lib -L/home/aerives/lib/mysql \
LIBS=-L/home/aerives/genegrokker-interface/lib 
-L/home/aerives/genegrokker-interface/ext/lib -L/home/aerives/lib 
-L/home/aerives/lib/mysql \
./configure \
--prefix=/home/aerives/genegrokker-interface/ext \
--enable-rule=EAPI \
--enable-module=most \
--enable-shared=max \
--with-layout=GNU \
--disable-rule=EXPAT \
$@

mod_perl build options:

configure_options=PERL_USELARGEFILES=0 USE_APXS=1 
WITH_APXS=$PLAYPEN_ROOT/ext/sbin/apxs EVERYTHING=1 
INC=$PLAYPEN_ROOT/ext/include -DEAPI

perl -V:
Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration:
  Platform:
osname=linux, osvers=2.4.13, archname=i386-linux
uname='linux duende 2.4.13 #1 wed oct 31 19:18:07 est 2001 i686 unknown '
config_args='-Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i386-linux 
-Dprefix=/usr -Dprivlib=/usr/share/perl/5.6.1 -Darchlib=/usr/lib/perl/5.6.1 
-Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 
-Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.6.1 
-Dsitearch=/usr/local/lib/perl/5.6.1 -Dman1dir=/usr/share/man/man1 
-Dman3dir=/usr/share/man/man3 -Dman1ext=1 -Dman3ext=3perl 
-Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Duseshrplib 
-Dlibperl=libperl.so.5.6.1 -Dd_dosuid -des'
hint=recommended, useposix=true, d_sigaction=define
usethreads=undef use5005threads=undef useithreads=undef 
usemultiplicity=undef
useperlio=undef d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
  Compiler:
cc='cc', ccflags ='-DDEBIAN -fno-strict-aliasing -I/usr/local/include 
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
optimize='-O2',
cppflags='-DDEBIAN -fno-strict-aliasing -I/usr/local/include'
ccversion='', gccversion='2.95.4  (Debian prerelease)', gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', 
lseeksize=8
alignbytes=4, usemymalloc=n, prototype=define
  Linker and Libraries:
ld='cc', ldflags =' -L/usr/local/lib'
libpth=/usr/local/lib /lib /usr/lib
libs=-lgdbm -ldb -ldl -lm -lc -lcrypt
perllibs=-ldl -lm -lc -lcrypt
libc=/lib/libc-2.2.4.so, so=so, useshrplib=true, libperl=libperl.so.5.6.1
  Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'


Characteristics of this binary (from libperl): 
  Compile-time options: USE_LARGE_FILES
  Built under linux
  Compiled at Jan 11 2002 04:09:18
  %ENV:

PERL5LIB=/home/aerives/genegrokker-interface/lib/perl5:/home/aerives/genegrokker-interface/ext/lib/perl5:/home/aerives/lib/perl5
  @INC:
/home/aerives/genegrokker-interface/lib/perl5
/home/aerives/genegrokker-interface/ext/lib/perl5
/home/aerives/lib/perl5
/usr/local/lib/perl/5.6.1
/usr/local/share/perl/5.6.1
/usr/lib/perl5

Re: MS+HTML - Unix

2002-01-23 Thread John Saylor

Hi

( 02.01.23 18:23 + ) Philip M. Gollucci:
> Is there any way I can, on the fly, take the messed up HTML file I get and
> convert it to what they meant to give me?

Probably not. You *could* strip out all HTML [and other formatting
cruft] and display as text, but I'd guess your 'constituents' would not
like that ...

-- 
\js

You have to make it happen. 
-Joe Greene 



RE: MS+HTML - Unix

2002-01-23 Thread John Greene

You could have your users upload  MSWord documents and do the html
conversion for them on the server using something like wvware.

-Original Message-
From: Philip M. Gollucci [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, January 23, 2002 10:23 AM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
[EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: MS+HTML - Unix


Say I have a webpage where I want to offer people the ability to upload
either a .txt or a .html file.  Now these people basically are computer
illiterate, and don't even know that UNIX is different from Microsh$t.

At any rate, they will use Save as (HTML) from MSWord 97/2000, Save as
(txt), or worse yet, Save as RTF.
Then upload that.

Big surprise, it gets it really wrong, basically meaning it doesn't format
correctly before or after they use the site in any browser.
One file, tidy told me, had over 300 errors and that was just with HTML 4.01,
not XHTML 1.0.

Is there any way I can, on the fly, take the messed up HTML file I get and
convert it to what they meant to give me?

Important cases:
  Parallel columns not in a table
  Bullets
  DIR tags
  Actually closing <u> tags so the whole page isn't underlined.

I've seen the demoronizer port, but don't know that much about it, and I
don't think it's quite what I want.

Basically I have to take the HTML given to me and make the HTML they mean.


Any Great Ideas


END

--
Philip M. Gollucci (p6m7g8) [EMAIL PROTECTED] 301.314.3118

Science, Discovery,  the Universe (UMCP)
Webmaster  Webship Teacher
URL: http://www.sdu.umd.edu

EJPress.com
Database/PERL Programmer  System Admin
URL : http://www.ejournalpress.com

Resume  : http://www.p6m7g8.com/resume.txt






Re: slow regex [BENCHMARK]

2002-01-23 Thread Sam Tregar

On Wed, 23 Jan 2002, Paul Mineiro wrote:

> i've cleaned up the example to tighten the case:
>
> the mod perl code  snippet is:

Fascinating.  The only thing I don't see is where $seq gets assigned to in
the CGI case.  Where is the data coming from?  Is it perhaps a tied
variable or otherwise unlike the $seq in the command-line version?  If
that's not it then I think you might have to build a debugging version of
Apache and Perl and break out GDB to get to the bottom of things.
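
One cheap check before reaching for GDB is to ask Perl whether the
variable is tied.  The sketch below uses the built-in tied() function;
the helper name describe_scalar and the Demo::TiedScalar class are
hypothetical and exist only to exercise the check.  It says nothing
about what is actually wrong in Paul's handler.

```perl
use strict;
use warnings;

# Hypothetical helper: report whether a scalar is tied, and to what class.
# tied() returns the underlying object for a tied variable, undef otherwise.
sub describe_scalar {
    my ($ref) = @_;
    my $obj = tied $$ref;
    return $obj ? 'tied to ' . ref($obj) : 'plain scalar';
}

# Minimal tie class, used only to demonstrate the helper.
{
    package Demo::TiedScalar;
    sub TIESCALAR { my ($class, $v) = @_; return bless { v => $v }, $class }
    sub FETCH     { $_[0]{v} }
    sub STORE     { $_[0]{v} = $_[1] }
}

my $plain = 'ACGT';
tie my $magic, 'Demo::TiedScalar', 'ACGT';

print describe_scalar(\$plain), "\n";
print describe_scalar(\$magic), "\n";
```

In a handler, the same call could be made as describe_scalar(\$seq) and
logged with warn just before the loop.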

-sam




Re: mod_perl installation

2002-01-23 Thread Alastair Sherringham

On Mon, Jan 21, 2002 at 11:35:44AM -0800, Rasoul Hajikhani wrote:
> I use SGI IRIX 6.5
>
> Short of reinstalling perl, is there anything else that could be done?
> Where would I find libgdbm.so?
> Thanks in advance.

Sorry for the late reply. For precompiled IRIX software, you should
check the 'freeware' site:

http://freeware.sgi.com/index-by-alpha.html

They've got GDBM I notice.

Cheers,


-- 
AS  | 
[EMAIL PROTECTED]   |
http://www.calliope.demon.co.uk |PGP Key : A9DE69F8
---



Re: slow regex [BENCHMARK]

2002-01-23 Thread Robert Landrum

At 4:01 PM -0800 1/23/02, Paul Mineiro wrote:
>Paul Mineiro wrote:
>
>i've cleaned up the example to tighten the case:
>
>the mod perl code  snippet is:
>
>---
>
>  my @cg;
>  open DIL, '>', "/tmp/seqdata";
>  print DIL $seq;
>  close DIL;
>  warn "length seq = @{[length ($seq)]}";
>  my $t = timeit (1, sub {
>    while ($seq =~ /CG/g)
>      {
>        push @cg, pos ($seq);
>      }
>  });
>  print STDERR timestr ($t), "\n";


I just ran this on my system here... It's completely unloaded (load 
average: 0.11, 0.08, 0.02)

Result:

0 wallclock secs ( 0.06 usr + 0.00 sys = 0.06 CPU) @ 16.67/s (n=1)


I ran it on a file that I created with

perl -e "print 'ABCGEFSK' x 25000" > /tmp/seqdata

Which created 25000 entries in @cg.
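
For anyone who wants to repeat this test without the intermediate file,
the following is a self-contained sketch that builds the same synthetic
string in memory and times the same loop.  The synthetic data is only a
stand-in (an assumption); the real sequence data from the original report
is not available here and may behave differently.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timeit timestr);

# Synthetic stand-in for the real sequence data: 25000 copies of an
# 8-character string, each containing exactly one "CG".
my $seq = 'ABCGEFSK' x 25000;

my @cg;
my $t = timeit(1, sub {
    # Record the offset just past every "CG" match.
    while ($seq =~ /CG/g) {
        push @cg, pos($seq);
    }
});

print STDERR "length seq = ", length($seq), "\n";
print STDERR "matches    = ", scalar(@cg), "\n";
print STDERR timestr($t), "\n";
```

Running the same file from the command line and from inside a handler
gives a like-for-like comparison of the two environments.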

Your system has to be swapping horribly.  I bet that the ulimit for 
whoever apache is running as has the memory segment set super low.

Double check everything and if that doesn't work, recompile.

Rob


--
When I used a Mac, they laughed because I had no command prompt. When 
I used Linux, they laughed because I had no GUI.  



Re: slow regex [BENCHMARK]

2002-01-23 Thread Perrin Harkins

> Your system has to be swapping horribly.  I bet that the ulimit for
> whoever apache is running as has the memory segment set super low.

That's a possibility.  I was also thinking that maybe mod_perl was built
against a different version of Perl, possibly one that has a problem
with this particular regex which was fixed in a later version.

- Perrin




Re: disable mod_perl for certain virtual hosts/folders

2002-01-23 Thread Merlin

Hi peter,

If I understand correctly, your problem is how to disable mod_perl. You can
do that with something like this:

<VirtualHost ...>
...
<Location />
SetHandler default-handler
</Location>
</VirtualHost>

Merlin, The Mage

It is said that Grand Master [EMAIL PROTECTED] once said:
:: On Tue, Jan 22, 2002 at 08:31:02AM -0500, Geoffrey Young wrote:
:: > [EMAIL PROTECTED] wrote:
:: > > On my Apache mod_perl is generally enabled with the following
:: > > statement:
:: > >
:: > > <Directory /data/apache>
:: > > <Files ~ "\.pl$">
:: > > SetHandler perl-script
:: > > PerlHandler Apache::Registry
:: > > Options +ExecCGI
:: > > </Files>
:: > > </Directory>
:: >
:: > you might have better luck with something like
:: >
:: > <Directory /data/apache>
:: >   AddHandler perl-script .pl
:: >   PerlHandler Apache::Registry
:: >   Options +ExecCGI
:: > </Directory>
::
:: thnx, but: This part isn't the problem. mod_perl works like a
:: charm. Problem is how to deactivate it for a certain location ?
::
:: thnx,
:: peter

-- 
The passion in children's eyes is all the magic the world needs!
Someone said, perhaps Merlin:  Camelot will be reborn... Soon... Online!!



RE: Help...

2002-01-23 Thread stevea

In most cases Apache basic auth passwords are set by the htpasswd command
that should be available in the Apache source. In order to use this from a
perl script you might have to set the SUID bit of htpasswd and make it owned
by the Apache user. By writing a small script to take password information
and making the appropriate call to htpasswd you should achieve your goal. As
always, it's recommended to use SSL when doing something like this.
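
A rough sketch of the "small script" idea above, assuming htpasswd is on
the PATH and supports batch mode (-b).  The helper name htpasswd_args,
the file path and the credentials are hypothetical.  Note that -b puts
the password on the command line, where other local users may be able to
see it, and that this changes the stored password only; it does not
touch whatever the browser has cached.

```perl
use strict;
use warnings;

# Hypothetical helper: build the argument list for htpasswd's batch mode.
# The username is checked against a conservative pattern first.
sub htpasswd_args {
    my ($file, $user, $password) = @_;
    die "bad username\n" unless $user =~ /^[\w.-]+$/;
    return ('htpasswd', '-b', $file, $user, $password);
}

my @cmd = htpasswd_args('/tmp/example.htpasswd', 'jk', 'new-secret');
print "would run: @cmd[0 .. 3] ********\n";

# Uncomment to actually run it.  The list form of system() bypasses the
# shell, so the arguments are passed through without shell interpretation.
# system(@cmd) == 0 or die "htpasswd failed: $?\n";
```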

Hope that helps

Hello All,
I am a programmer currently working in the UK on an
Apache project. I have a question. One of my project
members has developed some HTML pages for project
authentication, much like Apache Basic Authentication.
Now my question is:

Can I change the password (which is in the header of
the HTTP protocol and is cached by the browser throughout
the session) when I send a request to the Apache
server using a Perl script? If so, how can we do that?
I will be grateful to anyone who can give the
solution. Thanks in advance.

JK






When to cache

2002-01-23 Thread Milo Hyson

I'm interested to know what the opinions are of those on this list with 
regards to caching objects during database write operations. I've encountered 
different views and I'm not really sure what the best approach is.

Take a typical caching scenario: Data/objects are locally stored upon loading 
from a database to improve performance for subsequent requests. But when 
those objects change, what's the best method for refreshing the cache? There 
are two possible approaches (maybe more?):

1) The old cache entry is overwritten with the new.
2) The old cache entry is expired, thus forcing a database hit (and 
subsequent cache load) on the next request.

The first approach would tend to yield better performance. However there's no 
guarantee the data will ever be read. The cache could end up with a large 
amount of data that's never referenced. The second approach would probably 
allow for a smaller cache by ensuring that data is only cached on reads.
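
The two approaches can be sketched side by side as follows.  This is an
illustration only: plain hashes stand in for the database and the cache,
and the function names (load, save_write_through, save_and_expire) are
made up for the example.

```perl
use strict;
use warnings;

# Stand-ins (assumptions): plain hashes play the roles of the database
# and the local cache; $db_reads counts simulated database hits.
my (%db, %cache);
my $db_reads = 0;

# Read path shared by both approaches: serve from cache, else hit the
# database and cache the result.
sub load {
    my ($id) = @_;
    return $cache{$id} if exists $cache{$id};
    $db_reads++;
    return $cache{$id} = $db{$id};
}

# Approach 1: overwrite the cache entry on every write.
sub save_write_through {
    my ($id, $value) = @_;
    $db{$id}    = $value;
    $cache{$id} = $value;
}

# Approach 2: expire the entry; the next load() goes to the database.
sub save_and_expire {
    my ($id, $value) = @_;
    $db{$id} = $value;
    delete $cache{$id};
}

save_write_through(a => 'v1');
save_and_expire(b => 'v1');
print "cached after writes: ", join(',', sort keys %cache), "\n";

load('a');
load('b');
print "database reads: $db_reads\n";
```

After the two writes only "a" is in the cache, and reading both keys
costs one database hit (for "b"), which is the trade-off described
above: the first approach spends memory up front, the second spends a
database read later.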

In the end, this probably boils down to application requirements. RAM and 
disk storage are so cheap these days that the first method is probably fine 
for most purposes. However I'm sure there are situations where resources are 
limited and the second is more effective. What does everyone think?

-- 
Milo Hyson
CyberLife Labs, LLC



Re: Cross-site Scripting prevention with Apache::TaintRequest

2002-01-23 Thread Arnold van Kampen



Does anybody have an example of how this kind of abuse actually works?

I guess I have just been lucky all this time.

Arnold van Kampen
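
As a basic illustration of the question above: the problem arises when
user-supplied text is echoed into a page unescaped.  In the sketch
below, the escaper escape_html_min is a hypothetical, hand-written
function included so the example is self-contained; it is not the
implementation of Apache::Util::escape_html() or
HTML::Entities::encode(), and it does not address the charset-related
cases mentioned in the quoted text.

```perl
use strict;
use warnings;

# Hypothetical minimal escaper: replaces the four characters that matter
# most inside HTML text and double-quoted attribute values.
sub escape_html_min {
    my ($text) = @_;
    my %ent = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$ent{$1}/g;
    return $text;
}

# Imagine this arrived as a query-string parameter named "name".
my $name = '<script>alert(document.cookie)</script>';

# Echoed as-is, the browser treats the value as markup and runs it.
my $unsafe = "<p>Hello, $name</p>";

# Escaped, the same value is displayed as text.
my $safe = "<p>Hello, " . escape_html_min($name) . "</p>";

print "unsafe: $unsafe\n";
print "safe:   $safe\n";
```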


On Tue, 22 Jan 2002, Perrin Harkins wrote:

> > Yes and no. XSS attacks are possible on old browsers, when the charset is
> > not set (something which is often the case with modperl apps) and when the
> > HTML-escaping bit does not match what certain browsers accept as markup.
>
> Of course I set the charset, but I didn't know that might not be enough.
> Does anyone know if Apache::Util::escape_html() and HTML::Entities::encode()
> are safe?
>
> - Perrin