[OT] Request for Extreme Programming with Perl Experiences

2003-08-29 Thread Rob Nagler
Apologies for the off-topic post...

I'm looking for stories, anecdotes, comparisons, etc. from people who
are practicing Extreme Programming with Perl.  Even if you are using only
some of the practices, such as testing, coding, and refactoring, your
input will be useful.  The good, the bad, the ugly.  Anything that is
about a real XP experience using Perl in real-world situations.  I
would especially like to hear about large application development in a
commercial environment.

The stories are for my book: Extreme Programming with Perl.  If you
want to keep your name and company anonymous, please let me know.  If
I decide to include your story, you'll get to review the text before
I submit it to my editor.

If you want to reply to a list, send it to the extremeperl Yahoo
group.  Or, you can send it directly to me (nagler) at bivio.biz, and
I'll keep it confidential.

Thanks,
Rob




-- 
Reporting bugs: http://perl.apache.org/bugs/
Mail list info: http://perl.apache.org/maillist/modperl.html



Re: submit input truncation

2003-06-25 Thread Rob Nagler
Bill Marrs writes:
 But, I'm getting an intermittent problem with POSTs where the input data is 
 being truncated.  This causes havoc, especially in my forum system.
[snip]
 Has anyone else seen this?  Is there some fix for it?

We have seen this on mp1.  We read $r->header_in('Content-length')
worth of data.  If the data is shorter, we assume it is a double click
and toss the request (more or less).  Here's the exact code:
http://petshop.bivio.biz/src?s=Bivio::Agent::HTTP::Form
Search for Content-length.
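
The length check itself is easy to factor out.  Here's a hedged,
standalone sketch (not the actual Bivio::Agent::HTTP::Form code;
body_complete is a made-up name):

```perl
use strict;
use warnings;

# Compare what actually arrived against the declared Content-Length.
# A short read usually means the client aborted the POST (e.g. a
# double click), so the request should be tossed.
sub body_complete {
    my ($declared_length, $body) = @_;
    return 0 unless defined $body;
    return length($body) == ($declared_length || 0) ? 1 : 0;
}
```

Under mod_perl 1 the body would come from $r->read(...) and the
declared length from $r->header_in('Content-length').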

Rob


Re: OSCON ideas - MVC talk

2003-01-09 Thread Rob Nagler
Andy Wardley writes:
 I like the sound of it, but I should warn you that I have a personal 
 crusade against inappropriate use of the phrase MVC in relation to 
 web development.  

So how about a panel discussion?  I would gladly represent the MVC
camp. :-)  (See http://www.bivio.biz/hm/why-bOP for my position.)

I am thinking about giving a talk about subject matter oriented
programming (SMOP).  SMOP separates the programming concerns to allow
you to concentrate on the subject matter with minimal distractions.
If you are familiar with patterns, it's the interpreter pattern taken
to the extreme.

The example would be to compare Sun's Pet Store with our own
http://petshop.bivio.biz.  The 3 major SMOP languages in bOP's PetShop
allow you to focus on the subject matter in the models, views, and
controllers without getting bogged down in syntax and unnecessary
repetition.

For contrast, this is not a SMOP (from J2EE's Pet Store[1]):

  <tr>
   <td class="petstore_form" align="right">
    <b>First Name</b>
   </td>
   <td align="left" colspan="2">
    <waf:input cssClass="petstore_form"
               name="given_name_a"
               type="text"
               size="30"
               maxlength="30"
               validation="validation">
     <waf:value><c:out value="${customer.account.contactInfo.givenName}"/></waf:value>
    </waf:input>
   </td>
  </tr>
  <tr>
   <td class="petstore_form" align="right">
    <b>Last Name</b>
   </td>
   <td align="left" colspan="2">
    <waf:input cssClass="petstore_form"
               type="text"
               name="family_name_a"
               size="30"
               maxlength="30">
     <waf:value><c:out value="${customer.account.contactInfo.familyName}"/></waf:value>
    </waf:input>
   </td>
  </tr>
  
And, this is a SMOP in bOP[2]:

[
vs_form_field('UserAccountForm.User.first_name'),
], [
vs_form_field('UserAccountForm.User.last_name'),
],

The intent is to demonstrate the power of Perl to distill the essence
of the subject matter.

Interest?

Rob

[1] http://java.sun.com/blueprints/code/index.html#java_pet_store_demo
[2] http://petshop.bivio.biz/src?s=View.account





Re: development techniques

2003-01-09 Thread Rob Nagler
mpm writes:
 Debugging of the applications now looks like:
 $ced->log('warn', "No price for this product")

Here's an alternative that we've evolved from Modula-2 to C to Java
to Perl. :-)  First, I try to distinguish between stuff I always
want to see and debugging messages.  The former we call logging, and
wrap it in a class Bivio::IO::Alert, which also outputs the source line
of the caller, the time, etc., configurably.  This is very handy for
figuring out what's complaining.

The latter we call trace messages; they are also output via
Bivio::IO::Alert, but are written as follows:

_trace('No price for this product') if $_TRACE;

The "if $_TRACE" is an optimization: it can be left out, but it avoids
the overhead of evaluating the arguments when tracing is off.

The _trace() subroutine and $_TRACE variable are dynamically generated
by our Trace module, which any package can register with as follows:

use vars ('$_TRACE');
Bivio::IO::Trace->register;

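A minimal sketch of how such a register() might install the
per-package _trace() and $_TRACE via symbol-table manipulation.  This
is an assumption about the mechanism, not the real Bivio::IO::Trace;
My::Trace and set_filter are made-up names:

```perl
package My::Trace;
use strict;
use warnings;

my %registered;

# Install $_TRACE and _trace() into the calling package.
sub register {
    my ($pkg) = caller();
    no strict 'refs';
    ${"${pkg}::_TRACE"} = 0;    # off until a filter turns it on
    *{"${pkg}::_trace"} = sub {
        print STDERR "$pkg: @_\n" if ${"${pkg}::_TRACE"};
    };
    $registered{$pkg} = 1;
}

# Turn tracing on for registered packages whose name matches $re.
sub set_filter {
    my (undef, $re) = @_;
    no strict 'refs';
    ${"${_}::_TRACE"} = ($_ =~ $re) ? 1 : 0 for keys %registered;
}

1;
```

A package then says "use vars ('$_TRACE'); My::Trace->register;" and a
later My::Trace->set_filter(qr/.../) flips tracing on or off for it.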
You can then configure tracing with two configuration values, which
also can be passed on the command line.  Here's an example:

'Bivio::IO::Trace' => {
    package_filter => '/Bivio/ && !/PDF/',
    call_filter => '$sub ne "Bivio::Die::_new"',
},

Here I want to see tracing from all packages with the word Bivio in
their names but not PDF, and I want to ignore individual calls from
the subroutine Bivio::Die::_new.  In practice, we rarely use the
call_filter, so from any bOP command line utility, you can say, e.g.,

b-release install my-package --TRACE=/Release/

which translates to:

'Bivio::IO::Trace' => {
    package_filter => '/Release/',
},

You can set the call filter or any other configuration value from the
command line with --Bivio::IO::Trace.call_filter='$sub ne "foo"'

 We use LWP for testing.  For things like cookies and argument parsing, LWP 
 is great for regression testing.  For content, it is much harder to come 
 up with a pass/fail situation since the content can change, but still 
 possible.

You might want to check out Bivio::Test::Language::HTTP.  It parses
the incoming HTML, and allows you to write scripts like:

test_setup('PetShop');
home_page();
follow_link('Dogs');
follow_link('Corgi');
follow_link('Female Puppy Corgi');
add_to_cart();
checkout_as_demo();

This particular code does a number of things including validating that
animals are getting in the cart.  Additional script language is defined in
Bivio::PetShop::Test::PetShop, which subclasses
Bivio::Test::Language::HTTP, which provides follow_link and home_page
generically.

 I haven't found a better way to do web development testing durring 
 development.  Possibly writing the test first would provide some 
 improvement since you know when you have completed the change(see XP 
 docs).

I agree.  A very important practice is unit testing, especially with
large applications.  For an alternative to Test::More and xUnit, have
a look at Bivio::Test, which allows you to write tests that look like:

Bivio::Test->unit([
    'Bivio::Type::DateTime' => [
        from_literal => [
            [undef] => [undef],
            ['2378497 9'] => ['2378497 9'],
            ['-9'] => [undef, Bivio::TypeError->DATE_TIME],
            ['Feb 29 0:0:0 MST 1972'] => ['2441377 0'],
            ['Feb 29 13:13:13 XXX 2000'] => ['2451604 47593'],
            ['1972/2/29 0:0:0'] => ['2441377 0'],
            ['2000/2/29 13:13:13'] => ['2451604 47593'],
            ['Sun Dec 16 13:47:35 GMT 2001'] => ['2452260 49655'],
        ],
        from_local_literal => [
            [undef] => [undef, undef],
            ['2378497 9'] => ['2378497 7209'],
            ['-9'] => [undef, Bivio::TypeError->DATE_TIME],
            ['Feb 29 0:0:0 MST 1972'] => ['2441377 7200'],
            ['Feb 29 13:13:13 XXX 2000'] => ['2451604 54793'],
            ['1972/2/29 0:0:0'] => ['2441377 7200'],
            ['2000/2/29 13:13:13'] => ['2451604 54793'],
        ],
    ],
]);

We can write a lot of tests very quickly with this module.  We don't
always do this, but every time we don't, we regret it and end up
writing a test anyway after figuring out that we still aren't
perfect coders. :-)
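
The flavor of that table-driven style is easy to sketch in plain Perl.
This is a toy stand-in for Bivio::Test with made-up names, not the
real module:

```perl
use strict;
use warnings;

# Toy declarative runner: pairs of [input args] => [expected result],
# all driven against a single code ref.  Returns the number of failures.
sub run_cases {
    my ($fn, @cases) = @_;
    my $failures = 0;
    while (@cases) {
        my ($in, $want) = splice(@cases, 0, 2);
        my $got  = $fn->(@$in);
        my $same = !defined($got) && !defined($want->[0])
            || defined($got) && defined($want->[0]) && $got eq $want->[0];
        $failures++ unless $same;
    }
    return $failures;
}

# e.g. a trivial normalizer, tested the same declarative way
my $failures = run_cases(
    sub { defined $_[0] ? lc $_[0] : undef },
    ['ABC']   => ['abc'],
    [undef]   => [undef],
    ['MiXeD'] => ['mixed'],
);
```

The fat comma is just a comma, so the case table reads declaratively
while remaining an ordinary argument list.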

Yet another trick we use is executing a task from within emacs or on
the command line.  A task in bOP is what the controller executes
when a URI is requested.  For example,

b-test task login

There are two advantages to this: 1) you don't have to restart Apache
and go to another program (browser or crawler) and 2) you get the
stack trace when something goes wrong and you can type C-c C-e (in
emacs) to go right to the error.  We added this facility recently,
because we got tired of the internal server error restart loops.
They slow things down tremendously, and anyway, you often want to look
at the HTML to see if something has changed.  The output of 'b-test
task' is the resultant HTML and any mail messages that would be sent,
which you can then search immediately in emacs without
first having to say Tools -> View Source and get 

Re: General interest question: PDF contents handling in PostgreSQL.

2002-11-26 Thread Rob Nagler
Fabián R. Breschi writes:
 I wonder if using ModPerl and PostgreSQL there's any possibility to 
 resemble what in Oracle is called 'Intermedia', in this particular case 
 parsing/indexing content of PDF files inside PostgreSQL as  a LOB or 
 alternatively as a flat OS file with metadata parsed/indexed from it 
 into the RDBMS.

We use Intermedia and Postgres on separate projects.  Oracle's PDF
parsing can be emulated with pdftotext.  You'll need a search engine.
Frankly, I'm not totally pleased with Intermedia.  Its indexer is
slow, and you have to re-optimize often.  This affects a bunch of
stuff related to the database, e.g., redo logs, which makes db
management more difficult.  If I had the time, I'd probably drop it.

Rob





Re: How can I tell if a request was proxy-passed form an SSLserver?

2002-11-14 Thread Rob Nagler
John Siracusa writes:
 and that does the trick.  The full code for the module is at the end of this
 message.  But I still think this is an ugly hack, and I'd like to be able to
 do this using standard apache modules or config parameters...

Our hack is to forward 443 to port 81 on the middle tier:

<VirtualHost 1.2.3.4:443>
...
ProxyVia on
...
RewriteRule ^(.*) http://middle.tier.host:81$1 [proxy]
</VirtualHost>

We set a value (is_secure = 1) on our internal request object when it
is initialized if the incoming port is 81.  We also set remote_ip with:

$r->connection->remote_ip($1)
    if ($r->header_in('x-forwarded-for') || '') =~ /((?:\d+\.){3}\d+)/;

This makes the log entries useful.  There might be an easier way to do
this.
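
That header parse can also be factored into a standalone helper using
the same regex; forwarded_ip is a made-up name for illustration:

```perl
use strict;
use warnings;

# Pull the first IPv4 address out of an X-Forwarded-For header value,
# or return undef when there isn't one.
sub forwarded_ip {
    my ($header) = @_;
    return (($header || '') =~ /((?:\d+\.){3}\d+)/) ? $1 : undef;
}
```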

Rob





Re: [O] Re: Yahoo is moving to PHP ??

2002-11-04 Thread Rob Nagler
Perrin Harkins writes:
 Correct Perl style is probably not something that any two people will 
 ever agree on.

If you use Extreme Programming, the whole team has to agree.
Collective ownership, pair programming, and refactoring all suffer if
you don't have a common coding style.  The use of map, unless,
closures, eval, etc. needs to be discussed and agreed on.  It's a
sign of a weak team if you can't agree on these details.

 I've seen some hideous Java code at this place that really takes the
 wind out of the Java is more maintainable argument.

I thought all Java code is hideous. ;-)

Rob





Re: [OTish] Version Control?

2002-11-03 Thread Rob Nagler
Michael Schout writes:
 example, one time we upgraded Apache::Filter between releases. 
 Unfortunately, the old version was not compatible with the new version, 
 so a single machine could run either the current release branch, or the 
 development branch, but not both simultaneously (because Apache::Filter 
 was incomptaible and was installed under /usr/lib/perl5).

We are transitioning (slowly) between perl 5.005 and 5.6.1.  Our trick
is to have separate 5.005 and 5.6.1 build/test (and sometimes dev)
machines.  I'm not sure this solves your problem.

 1) some Makefile.PL's refuse to generate a Makefile if PREREQ_PM's are 
 not satisfied (if we haven't built them yet)

If we have to bootstrap, we do a regular CPAN install on the build
machine and then install over it with the RPM build.  Also, we use
Red Hat which has many CPAN modules already installed (see uninstall
instructions below), so bootstrapping is rarely an issue.

 2) some Makefile.PL's are INTERACTIVE, and you cant turn it off (e.g.: 
 Apache::Filter requires you to hit Return a number of times at a MINIMUM.

perl Makefile.PL < /dev/null works for us.  We encapsulate it in a
macro (see below).

 So we resorted to a set of overly-complicated GNUmakefiles that would 
 generate Makefile's from Makefile.PL's, and these would set PERL5LIB to 
 find the dependencies (e.g.: DBD-Pg would put ../DBI/blib into
 PERL5LIB).

Here's our spec file:

Name: perl.modules
Summary: Perl Modules not in stock RH7
Group: Perl/Modules
Provides: perl.modules perl-libwww-perl
Requires: perl
Version: 5.6
BuildRoot: install
%define modules BSD-Resource IO-stringy Digest-MD5 Digest-HMAC Digest-SHA1 MD5 
Crypt-IDEA Crypt-DES Crypt-Blowfish Crypt-CBC DBI DBD-Pg DBD-Oracle DBD-Sybase 
DBD-mysql TimeDate MailTools MIME-tools Devel-Symdump Image-Size Compress-Zlib 
Archive-Zip File-MMagic TermReadKey Crypt-SSLeay libwww-perl Parse-RecDescent 
Mail-Field-Received POP3Client Mail-2IMAPClient Test-Simple Time-HiRes Digest-Nilsimsa 
razor-agents Mail-Audit Mail-SpamAssassin XML-XPath httpmail

%description
Perl Modules not in stock RH7 or newer CPAN versions.

To remove RedHat standard installs, do:
rpm -e --nodeps $(rpm -qa | egrep 'perl-(DBD|DBI|libwww)')

If you want to use Sybase (SQL Server), you need:
b-release install freetds-0.53-1.i386.rpm

And to compile this, you need:
b-release install freetds-devel-0.53-1.i386.rpm

%prep
%{cvs} external/perl-modules-5.6

%build

cd external/perl-modules-5.6
unset PERL_MM_OPT
for f in %{modules}; do
(
if test $f = 'Crypt-IDEA'; then
export PERL_MM_OPT='POLLUTE=1'
elif test $f = 'DBD-Sybase'; then
export SYBASE=/usr
elif test $f = 'DBD-Pg'; then
export POSTGRES_LIB=/usr/lib  POSTGRES_INCLUDE=/usr/include/pgsql
fi
cd $f
%{perl_make}
)
done

%install
cd external/perl-modules-5.6
for f in %{modules}; do
(set -e; cd $f; %{perl_make_install})
done

cd $RPM_BUILD_ROOT
%{allfiles} ../files

%files -f files

%clean
[ $RPM_BUILD_ROOT != / ] && rm -rf $RPM_BUILD_ROOT

%pre
# Perl must be setup properly
perl -e 'require "syscall.ph"' 2> /dev/null || (
umask 022
cd /usr/include
h2ph -r -l . > /dev/null
)

The macros perl_make_install and perl_make are defined below.  We run
a program (Bivio::Util::Release mentioned in another post) which
generates the actual spec file and calls rpm. (%{allfiles} and %{cvs}
are trivial and defined there, too.)  This program also builds a
separate directory and defines topdir, etc. correctly so you can build
everything as any user.

sub _perl_make {
    return
        '%define perl_make umask 022 && perl Makefile.PL < /dev/null && '
        . " make POD2MAN=true\n"
        . '%define perl_make_install umask 022; make '
        . join(' ', map {
              uc($_) . '=$RPM_BUILD_ROOT' . $Config::Config{$_};
          } grep($_ =~ /^install(?!style)/
                  && $Config::Config{$_} && $Config::Config{$_} =~ m!^/!,
              sort(keys(%Config::Config))))
        . ' POD2MAN=true pure_install && '
        . ' find $RPM_BUILD_ROOT%{_libdir}/perl? -name *.bs '
        . " -o -name .packlist -o -name perllocal.pod | xargs rm -f\n";
}

[Uh oh, there's that nasty map function. ;-]

Note that we don't install man pages.  This slows down the
build/install, and perldoc is just as easy to type as man. :-)

We use this same function for all our perl apps.  Indeed, to build a
new app, our specfile looks like:

Copyright: Logistics R Us, Inc.
Requires: Bivio-bOP apache ImageMagick-perl
%define perl_root LogisticalNightmare
%define perl_exe_prefix ln

_b_release_include('perl-app.include');

perl-app.include knows how to read our tree structure, which is
consistent across projects, and it installs all perl, programs,
images, views, etc.

 How does everyone else cope with this (managing trees of CPAN modules / 
   CPAN module tree build environments)? Maybe we are sort of unique in 
 that we use so many 3rd 

Re: [OTish] Version Control?

2002-11-01 Thread Rob Nagler
Dominic Mitchell writes:
 How do you cope with the problem that perl has of running different 
 versions of modules?

Actually, we have two problems.  One problem is developing with
multiple versions and the other is what you mention, running
production systems.

Sometimes I might be in the middle of some big refactoring, and a
customer calls with a problem.  I then do:

cd
mkdir src_bla
cd src_bla
cvs checkout perl/Bla perl/Bivio

where Bivio is our shared code.  Then I set PERLLIB to
~/src_bla. We've got a bash command that allows me to switch the
configuration and PERLLIB as well.  It's very easy to do.

Oh, and we *never* (almost :-) put code in programs.  The programs
invoke a *.pm file's main so we can say bla-some-command and always
get the right version.
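
A hedged sketch of that stub pattern; My::Bla::Main is a hypothetical
name, defined inline so the example runs standalone:

```perl
use strict;
use warnings;

# All real logic lives in the module's main(); the installed "bla"
# script would contain nothing but:
#
#   use My::Bla::Main;
#   exit(My::Bla::Main->main(@ARGV));
#
# so whichever copy of the module PERLLIB points at supplies the
# behavior -- no code to version in the program itself.
package My::Bla::Main;

sub main {
    my (undef, @argv) = @_;    # class method: first argument is the class
    print "bla-some-command: @argv\n";
    return 0;                  # becomes the process exit status
}

package main;
My::Bla::Main->main('demo');
```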

We solve the second problem by buying cheap machines which run Linux
just fine.  (I just bought 4 x Dell 2300, 2 x Dell 1300, and 2 x white
box for $1800. $-)  It just isn't worth my time trying to make two
sites work on the same machine, although we do this in a couple of
cases (e.g. www.bivio.biz and petshop.bivio.biz).

When two or more sites do share the same machine, we always run the
same version of the infrastructure.  This avoids many problems,
e.g. running into defects twice and managing multiple versions.  We
don't tag our CVS.  We can backout changes with RPM.  We do several
releases a week on active applications, and one release a week on
applications in maintenance mode.

One final reason to avoid multiple versions is schema changes.
The more database versions you have, the more confusing things get.  On
bivio.com we upgraded the schema about 250 times in about two years.
It would have been impossible to keep the development, test, and
production systems consistent if the three had diverged too much.

Rob






Re: [OTish] Version Control?

2002-10-31 Thread Rob Nagler
Another approach which allows easy sharing between projects is:

  ~/src/perl/
+ Project1/
+ Project2/
+ Project3/

where Project[123] are root package names.  Set PERLLIB=~/src/perl and
you can get access to any *.pm in the system, each has a globally
unique name.  This makes it easy to implement cross-project
refactorings.

We use CVS for source management, and we use RPMs for deployment.
RPM allows you to ask what release of ProjectN is on the system.
RPM also allows you to manage permissions and ownership easily.
Our RPM spec builder/installer can be found at:
http://petshop.bivio.biz/src?s=Bivio::Util::Release

Rob





Re: Yahoo is moving to PHP ??

2002-10-30 Thread Rob Nagler
Perrin Harkins writes:
 The real application stuff is built in other languages.  (At least
 this is the impression I get from the paper and from talking to
 people there.)

I think Yahoo Stores is written in Lisp.  I also believe it handles
the front and back end.  Would be interesting to know why this was
left out of the discussion.

Rob





Re: Yahoo is moving to PHP ??

2002-10-30 Thread Rob Nagler
Tagore Smith writes:
 I think it would be harder to hire people to work on his system (of course
 you'd probably also get more experienced people, so that might not be such a
 bad thing).

This raises the $64 question: If you could hire 10 PHP programmers at
$50/hour or 4 Perl programmers at $125/hour, which team would deliver
more business value over the life of the site?

 Graham's system uses macros extensively, and from other code of his
 that I've read (Graham wrote a couple of books about Lisp), I'd bet
 that he uses recursion and mapping functions a lot as well.

His On Lisp book is a classic on macros--which are similar to closures
in Perl.  You can download it for free: http://www.paulgraham.com/onlisp.html
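
The closure side of that comparison, for anyone unfamiliar: a Perl
closure packages up code together with captured state, which is the
sense in which it stands in for the little custom operators Lisp
macros build.

```perl
use strict;
use warnings;

# make_counter() returns an anonymous sub that captures $count; each
# returned sub carries its own private counter.
sub make_counter {
    my ($count) = @_;
    return sub { return $count++ };
}

my $c = make_counter(10);
print $c->(), "\n" for 1 .. 3;    # prints 10, 11, 12
```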

My guess is that Graham's answer to the above question would be:
Hire two Lisp programmers at $250/hour. :-)

Rob





Re: cobranding strategies?

2002-10-08 Thread Rob Nagler

Kirk Rogers writes:
 I'm looking to build cobranding capabilities into a mod_perl site and am
 looking for some documentation or guidelines to follow.  Anyone know of
 documentation that I can find.

We've had some pretty stringent requirements that led us to
indirecting all fonts, tagged text, colors, icons, and URIs.  It also
turned out to be handy for building up two independent sites in one
mod_perl server and handling skins for the same site.  This may be
overkill if you just want the custom-logo-in-the-corner cobrand, but
I suspect you are looking for more.

Have a look at http://petshop.bivio.biz/src?s=Bivio::UI::Facade
for the interface, and for an example facade
http://petshop.bivio.biz/src?s=Bivio::PetShop::Facade::PetShop

If you have any questions, mail me directly.

Rob





Re: [OT] - Mailing List Servers/mods .. etc

2002-09-26 Thread Rob Nagler

Jim Morrison [Mailinglists] writes:
 I'm wondering if there is any point in looking for a piece of third
 party software/module etc, that will handle the sending of the mail or
 should I work directly with sendmail? (Is sendmail the best mailserver
 for this kind of thing?)

sendmail has its problems, but I can send about 10K msgs/hour on a
low-end server (500 MHz).  It's good enough for most low-end mailing
list problems.

 I'd be happy to write something along the line of formail.pl on my own,
 so I kinda know what I'm doing, but I'm gonna have to take things like
 Return to sender errors and such into account..

Tough problem in general, which companies like experian, doubleclick
and returnpath.net spend lots of money on.  You need to know how to
parse this information without false positives.

 My question I guess is:
  - Is it ok to send 100's or 1000's of mails to sendmail in one go, or
 is there a better way of doing bulk mail?

I don't think you should worry about it right now.  sendmail can
handle the load.  You can always use an internal relay if you need to
distribute the load.  Hardware is cheap.

  - Are there any mods to help with dealing with returned mail etc..?

bOP has a C program called b-sendmail-http.[1]  It's a gateway from
sendmail to HTTP.  We handle all mail through mod_perl.  You can use
b-sendmail-http with any HTTP implementation, because it simply wraps
the e-mail, client IP, and envelope recipient into multipart/form-data.[2]

  - Is there a good list of people doing this sort of thing? (Or do you
 mind the thread being a little off-topic!)

I like it, since my current project is in this space. :-)

 I don't think I'm trying to reinvent the wheel.. Just that I think there
 is so much of my own coding involved, I'm not sure if I'm going to be
 able to get away with anything less than writing it from scratch..

The code isn't complicated, but the detailed knowledge is.  There are
a number of mailinglist packages out there including ultimate bbs,
which is used by quite a number of sites.  We rolled our own, because
email is integrated with other apps (e.g. search, file sharing, and
group join).

Rob

[1] http://www.bivio.biz/f/bOP/bin/b-sendmail-http.c
[2] http://petshop.bivio.biz/src?s=Bivio::Biz::Model::MailReceiveBaseForm





RE: Linux + Apache Worm exploiting pre 0.9.6g OpenSSL vulnerabilities on the loose

2002-09-17 Thread Rob Nagler

Christian Gilmore writes:
 I believe the virus only affects systems pre-0.9.6e:
 http://www.openssl.org/news/secadv_20020730.

Also note that vendors may have retrofitted older versions with the
patch.  For example, Red Hat is still at 0.9.5a and 0.9.6b
(see http://rhn.redhat.com/errata/RHSA-2002-160.html for more info).

Rob





Re: bivio and mod_perl

2002-08-29 Thread Rob Nagler

zt.zamosc.tpsa.pl writes:
 Do many mod_perl programmers use bOP by bivio.biz  in their large projects?

At least 3. :-) We have a few downloads, but I doubt anybody is using
it for anything serious besides us.  (Others, please correct me
if I'm wrong.)

 Could you  share with your experience at working with it?

What is unique to bOP, which also is its weakness, is that we exploit
Perl to the max.  We avoid special syntaxes, such as XML, except for
input and output.  This means we get all the power of Perl in view
languages, acceptance tests, unit tests, etc.  This makes it hard for
anybody who is not a Perl application developer to build applications
in bOP, i.e. we function as designers and programmers--sometimes we
get help from graphic artists or writers.

 The documentation looks very very... promissing.

It works.  There is no design documentation for a variety of
reasons, so you have to be prepared to look at code and examples to
figure out how it works and how to use it.

bOP has been commercially deployed for years and evolves on demand,
e.g. the View language itself was only added relatively recently and
we just released our e-commerce component.

What we like is that we don't have to program very much to get a lot
done, but when we need to write ordinary Perl code, bOP helps us
instead of hindering us.

To me, there are two ways to use bOP: as an example or as a platform.
I think many people have looked at it, and rolled their own.
Infrastructure in mod_perl is *easy*.  It's the applications that are
the hard part (in any platform).  Any infrastructure has to match your
style or you have to be willing to adapt.  If you like learning or
already understand declarative programming, you may find bOP suits
your needs out of the box.

Rob






Re: [ANNOUNCE] Petal 0.1

2002-07-17 Thread Rob Nagler

Jean-Michel Hiver writes:
 My only problem deals with template caching. Currently Petal does the
 following:
 
 * Generate events to build a 'canonical' template file
 * Convert that template file to Perl code
 ** Cache the Perl code onto disk
 * Compiles the Perl code as a subroutine
 ** Caches the subroutine in memory

I wonder how much code you would save if you wrote the templates in
Perl and let the Perl interpreter do the above.

Sorry, I know this doesn't help you answer your question, but by
eliminating XML from the design, the debate about SAX vs XML::Parser
would be irrelevant.  Your code would run faster, and you would need
fewer 3rd party APIs.

Rob





Re: [ANNOUNCE] Petal 0.1

2002-07-17 Thread Rob Nagler

Jean-Michel Hiver writes:
  I wonder how much code you would save if you wrote the templates in
  Perl and let the Perl interpreter do the above.
 
 I recommend that you read this Page:
 http://www.perl.com/pub/a/2001/08/21/templating.html?page=2

Please read the Application Servers section of:

http://www.bivio.biz/hm/why-bOP

 I'm an OO-advocate, I believe in proper separation of logic, content and
 presentation

Moi aussi.  What does this have to do with using Perl for business
logic and presentation logic?

 and on top of that I want people to be able to edit
 templates easily in dreamweaver, frontpage, etc
 and send templates thru
 HTML tidy to be able to always output valid XHTML.

If you are an OO-advocate, you would hide the presentation format in
objects, e.g. Table, String, and Link.  This ensures the output is
valid through the (re)use of independently tested objects.  Objects
also provide a mechanism for overriding behavior.
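
A hedged illustration of that point (My::Widget::String is a made-up
class, not bOP's actual String widget): the object owns its escaping,
so every render is well-formed by construction.

```perl
use strict;
use warnings;

package My::Widget::String;

sub new {
    my ($class, $text) = @_;
    return bless({text => $text}, $class);
}

# Escaping lives in exactly one tested place, so no caller can emit
# invalid markup by forgetting it.
sub render {
    my ($self) = @_;
    my $t = $self->{text};
    $t =~ s/&/&amp;/g;     # must run first
    $t =~ s/</&lt;/g;
    $t =~ s/>/&gt;/g;
    return $t;
}

package main;
my $html = My::Widget::String->new('AT&T <i>rocks</i>')->render();
```

Overriding render() in a subclass is then the mechanism for changing
presentation without touching callers.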

 Petal lets me do that. If that's not of any use to you, fine. The world
 is full of excellent 'inline style' modules such as HTML::Mason,
 HTML::Embperl and other Apache::ASP.

These all work on the assumption that the template is written in HTML.
If you start with OO Perl, you do not inline anything, not even
the HTML.  Here is an example page:

http://petshop.bivio.biz/items?p=RP-LI-02

And here is the HTML-less source:

http://petshop.bivio.biz/src?s=View.items

Apologies to those who are tired of the *ML vs. Perl debate.

Rob





Re: [OT] Better Linux server platform: Redhat or SuSe?

2002-07-03 Thread Rob Nagler

David Dyer-Bennet writes:
 Obviously hardware RAID will save CPU cycles somewhat, and SCSI disks
 of the right type will increase IO bandwidth somewhat, but if you're
 not short of those things and still want the added security of
 mirroring, I think the software RAID is a viable option.

Hardware RAID is usually hot-swappable, which is quite nice.

Rob





Re: Is mod_perl the right solution for my GUI dev?

2002-06-25 Thread Rob Nagler

Fran Fabrizio writes:
 - Real-time data updates.  HTTP is stateless: it serves up the page then 
 closes the connection.   Any updating involves a round-trip back to the 
 server.  In traditional GUI, you just hold a db connection and repaint 
 the areas that are updated.

Solved with refresh?  JavaScript and Java can also help here.
For interactivity, check out:

http://www.cs.brown.edu/people/dla/polytope/tetra.html

 - State maintenance.  Since it is stateless, you have to jump through a 
 lot of hoops to realize that two requests are coming from the same 
 person, since they could be handled by two different child processes or 
 even two different servers.  This has all sorts of ramifications on user 
 login, user preferences, where the user was in the application, etc... 
 you have to do a lot of work on the server side to realize that it's the 
 same client that keeps talking to you.

Cookies work fine.

 - Fancy interface widgets/layouts.  HTML/CSS/JavaScript/DHTML can only 
 get you so far.  If you need fancy menu types, forms, layouts, etc... it 
 quickly becomes tedious/impossible on the web.

Tedious is questionable.  Impossible, I seriously doubt.  Remember,
you can always delegate part of your screen to a Java applet, although
I strongly recommend you avoid this.

 This is just the tip of the iceberg.  

Let's talk about the positives:

+ You update the server and instantly all clients are up-to-date.

+ You can detect incorrect usage, bugs, etc. by parsing a single log
  file, in real-time

+ The system is immune to operating system upgrades, and to DLL hell
  on Windows boxes.

+ You access the system from anywhere reliably and securely.  You
  don't have to open up a database connection to anybody but the Web
  server(s).

+ There is only one version of the software.

+ Support people can view the output sent to the client exactly as
  the client received it.  Including following a series of actions.

+ The use of a Web browser is familiar to most users.

+ The user can keep multiple views of the pages she wants, not what the
  application decides to offer.

+ Bookmarks allow users to structure their view of the application.
  Advanced users can create new organizations (short cut pages) for
  themselves and their co-users.

+ Users can share information easily (send page by email, mail
  bookmarks, print page, save to disk, save picture, etc.)

I'm sure others will add to the list.

Rob
  





RE: mod_perl/passing session information (MVC related, maybe...)

2002-06-13 Thread Rob Nagler

Vuillemot, Ward W writes:
 I log into your web-site as memberA.  You kindly leave me a delicious cookie
 with my username stored in it.  Maybe even my password (I hope not!).  Now,
 I know that another member, memberB, has special rights to your site.  What
 is stopping me from editting the cookie to memberB's username and hijacking
 their account?

If you can crack Blowfish, IDEA, etc., you are in.  Then again you can
probably just sniff the network for memberB's username and everybody
else's passwords for that matter, even via SSL.

Part of bOP is multi-tiered security architecture including something
I call data gateways to help protect against programmer mistakes.
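
As a hedged sketch of the tamper-proofing idea -- a MAC here rather
than bOP's actual encryption scheme; seal/unseal are made-up names --
using the core Digest::SHA module:

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# The signing key stays on the server; a username edited inside the
# cookie no longer matches its MAC, so the forgery is rejected.
my $KEY = 'server-side-secret';    # assumption: loaded from config, never sent

sub seal {
    my ($value) = @_;
    return $value . ':' . hmac_sha256_hex($value, $KEY);
}

sub unseal {
    my ($cookie) = @_;
    my ($value, $mac) = ($cookie || '') =~ /^(.*):([0-9a-f]+)\z/s
        or return undef;
    return hmac_sha256_hex($value, $KEY) eq $mac ? $value : undef;
}
```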

 And if you do store the password information in the
 cookie...you are letting each user be compromised either as the cookie is
 flung through the Internet ether, or minimally on their own computer where
 someone else can easily access the cookies.

If you have access to someone's cookie file, you probably can log
their keystrokes.  Contact your local spy agency for more information
on how to do this.

 With sessionID, you have an ID and information that is checksum'd.

Sessions and user IDs are equivalent.  They are called credentials
which allow access to a system.  There's no fundamental difference
between hijacking a session or stealing a user id/password.

 If I wanted to delete a user and ensure they immediately lost all access, it
 is rather trivial to go through all active sessions in the db, see if the
 user I am deleting matches the username in the session information, and if
 so delete the session record.

Denormalization is the root of all evil.  The extra step involves more
code, more bugs, and more system resources.  Other than that, you're
right.  You can do this, but the question I ask: Do you need to?

Rob






Re: separating C from V in MVC

2002-06-13 Thread Rob Nagler

Dave Rolsky writes:
 Trying to jam a thick layer of OO-goodness over relational data is asking
 for a mess.

Most OLTP applications share a lot in common.  The user inputs data in
forms.  The fields they edit often correspond one-to-one with database
fields, and certainly their types.  The user wants reports which are
usually closely mapped to a table/view/join, i.e. an ordered list of
tuples.

A reasonable O/R mapping can solve this problem easily.  Like Perl, it
makes the easy things easy and the hard things possible.  The bOP Pet
Shop demonstrates how you can build a simple application with only a
couple of custom SQL queries.  The rest are simple joins and CRUD.  If
you need more complex queries, there are escapes.  You still probably
end up with a list of tuples for your reports.  The key we have found
is avoiding indirection by naming fields and models the same in SQL
and Perl objects.  This allows you to seamlessly switch between the
two.

We've found the O/R mapping to be an indispensable part of the
system.  Since all data is contained in objects, the views/widgets
don't need to know how the data is populated.  They access all data through
a single interface.
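A minimal sketch of that single accessor interface (the names here are
hypothetical illustrations, not bOP's actual API): models and the
request object both expose get(), so a view can read database fields
and control state the same way.

```perl
# Hypothetical sketch of a uniform get() interface (not bOP's real API).
package Sketch::Attributes;

sub new {
    my($proto, $values) = @_;
    return bless({%$values}, ref($proto) || $proto);
}

sub get {
    my($self, $name) = @_;
    die("$name: no such attribute")
        unless exists($self->{$name});
    return $self->{$name};
}

package main;
# A "model" row and a "request" share the interface; a view widget
# doesn't care which one it was handed.
my($user) = Sketch::Attributes->new({'User.first_name' => 'Ann'});
my($req) = Sketch::Attributes->new({task_id => 'USER_ACCOUNT_CREATE'});
print $user->get('User.first_name'), ' ', $req->get('task_id'), "\n";
```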

Rob





Re: mod_perl/passing session information (MVC related, maybe...)

2002-06-13 Thread Rob Nagler

Perrin Harkins writes:
 My preferred design for this is to set one cookie that lasts forever and 
 serves as a browser ID.

I like this.  It's clean and simple.  In this sense, a browser is not
really a session.  The only thing I don't like is garbage collection.

 unique browser ID (or session ID, if you prefer to give out a new one 
 each time someone comes to the site) lets you track this for 
 unregistered users.

We call this a visitor id.  In the PetShop we have a cart id, but
we're not too happy with the abstraction.

 I don't see that as a big deal.  You'd have to delete lots of other data 
 associated with a user too.  Actually deleting a user is something I've 
 never seen happen anywhere.

We do.  Especially when we went from free to fee. :-(  The big issue I
have with session data is that it is often a BLOB which you can't
query.

 Well, eToys handled more than 2.5 million pages per hour, but caching 
 can be important for much smaller sites in some situations.

I'd like numbers on "smaller" and "some". :)

 Here's a situation where a small site could need caching:

We cache, too.  An interesting query is the club count on
bivio.com's home page.  The count of clubs is a fast query, but the
count of the members is not (about 4 seconds).  We compute a ratio
when the server starts of the members to clubs.  We then run the club
count query and use the ratio to compute the member count.  We restart
the servers nightly, so the ratio is computed once a day.
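That ratio trick can be sketched in a few lines of Perl.  The count
subs below are stand-ins for the real SQL queries, and the numbers are
made up; the point is that the expensive count runs once at startup
and the cheap one runs per request.

```perl
# Sketch of the startup-time ratio cache (count subs stand in for SQL).
my($_RATIO);

sub _count_clubs { return 2_000 }      # fast query in the real system
sub _count_members { return 30_000 }   # slow (~4 second) query

sub estimated_member_count {
    # compute the ratio once, at server start
    $_RATIO ||= _count_members() / _count_clubs();
    # cheap club count on every request, scaled by the cached ratio
    return int(_count_clubs() * $_RATIO);
}

print estimated_member_count(), "\n";  # 30000
```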

 Maybe I just have bad luck, but I always seem to end up at companies 
 where they give me requirements like these.

It's the real world.  Denormalization is necessary, but only after you
test the normal case.  One of the reasons I got involved in this
discussion is that I saw a lot of messages about solutions and very
few with numbers identifying the problem.

Rob





Re: separating C from V in MVC

2002-06-13 Thread Rob Nagler

Dave Rolsky writes:
 The Pet Shop has a grand total of 13 tables.
 
 How well does this approach work with 90 tables?

Works fine with bivio.com, which has 50 tables.

 How does it handle arbitrary queries that may join 1-6 tables,
 with conditionals and sorting of arbitrary complexity?

The ListModel can override or augment its query.  You can load a
ListModel from an arbitrary data source as a result.  After the load,
it can fix up rows, e.g. computing percent portfolio is not done in
SQL but in Perl in internal_post_load_row().

The automatic sorting is handy for simple joins.  For complex
queries, there's no fully automatic solution for sorting.

Here's a simple query: http://petshop.bivio.biz/pub/products?p=DOGS
The ListModel declares which columns are sortable:

order_by => [
'Product.name',
'Product.product_id',
],

The view doesn't need to say anything, because the Table widget
queries the ListModel meta-data.  The SQL query is dynamically
constructed from the "o" HTTP query value.

For complex queries, you may be able to take advantage of the sort
infrastructure. There are no guarantees, but you have the rope.

The software is designed for the 80% solution.  As we see patterns
develop in our code, we add general cases to the infrastructure.

 I'm not a big fan of O/R.  I prefer R/O.  But to each their own.

I guess we do R/O in the sense that we design the database
relationally and then map PropertyModels one-to-one with the tables.
Is that what you mean by R/O?

Rob





RE: mod_perl/passing session information (MVC related, maybe...)

2002-06-12 Thread Rob Nagler

Jeff AA writes:
 An advantage of the session/id is that you end up with stateful query
 instances,

Stateful instances are also problematic.  You have essentially two
paths through the code: first time and subsequent time.  If you write
the code statelessly, there is only one path.  Fewer bugs, smaller
code, less development.

Sessions are caches.  Add them only when you know you need them.

 and can remember [at least for a short period!] the total
 number of items, so that you can say 'Results 1 to 10 of 34,566' without
 having to count all results every time.

Maybe this is just because we are using Oracle, but if you do a query:

SELECT count(*) FROM bla, bla...

followed up by:

SELECT field1, field2, ... FROM bla, bla...

Oracle will cache the query compilation and results so it is very fast
(basically a round-trip to database server) for the second query.
We execute these two queries on every paged list on every request.
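Given the count from the first query, the "Results 1 to 10 of 34,566"
label is just arithmetic.  A plain-Perl sketch of the page math
(helper name is made up):

```perl
# Page math for a "Results X to Y of N" label; in the real code the
# total comes from the SELECT count(*) and the rows from the second
# SELECT, which the database can serve from its statement cache.
sub page_label {
    my($page, $per_page, $total) = @_;   # $page is 1-based
    my($first) = ($page - 1) * $per_page + 1;
    my($last) = $first + $per_page - 1;
    $last = $total if $last > $total;    # clamp the final partial page
    return "Results $first to $last of $total";
}

print page_label(1, 10, 34_566), "\n";   # Results 1 to 10 of 34566
```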

One of the advantages of a declarative O/R mapping is that you can do
things like sort the select fields and order queries consistently.
Oracle takes advantage of this.  I don't know if MySQL or Postgres do,
too, but they probably will someday.

It's a bit slow (seconds) with Oracle's Context engine, which we've
been considering replacing.  Most of our queries are not text
searches, in which case Oracle queries take less than 20ms per query.

We're not a large site (peak 50K views/day), and we have enough
hardware (two front ends, two middle tier, one db).  Our smaller sites
(e.g. bivio.biz) run on minimal hardware and use Postgres.  They use
the same code, and it seems to work fine.

Rob





Re: mod_perl/passing session information (MVC related, maybe...)

2002-06-12 Thread Rob Nagler

Perrin Harkins writes:
 I find you can tie this cache stuff up inside of your data access 
 objects and make it all transparent to the other code.

Absolutely.

 A session is useful for very limited things, like remembering if this 
 user is logged in and linking him to a user_id.

We store this information in the cookie.  I don't see how it could be
otherwise.  It's the browser that maintains the login state.

Consider the following scenario:

* User logs in.
* Site Admin decides to delete the user.
* In our stateless servers, the user_id is invalidated immediately.
* Next request from User, he's implicitly logged out, because the user_id
  is verified on every request.

In the case of a session-based server, you have to delete the user and
invalidate any sessions which the user owns.

 Although Oracle can be fast, some data models and application 
 requirements make it hard to do live queries every time and still have 
 decent performance.  This is especially true as traffic starts to
 climb.

I've tried to put numbers on some of this.  I've never worked on a
1M/day site, so I don't know if this is the point where you need
sessions.  What sites other than eToys need this type of session
caching?

Rob





Re: separating C from V in MVC

2002-06-11 Thread Rob Nagler

Matt Sergeant writes:
 There's quite a few things that are a lot harder to do with XML in
 plain perl (especially in SAX) than they are in XSLT.

This assumes you need XML in the first place.

It's trivial to manipulate Perl data structures in Perl.  It's
also easy to manipulate XML in Perl.  However, it's impossible(?) to
manipulate Perl data structures in XSLT.

Rob





Re: [OT] MVC soup (was: separating C from V in MVC)

2002-06-06 Thread Rob Nagler

Bill Moseley writes:
 Anyone have links to examples of MVC Perl code (mostly controller code)
 that does a good job of M and C separation, and good ways to propagate
 errors back to the C?  

I humbly (do believe that ;-) submit http://petshop.bivio.biz
Every page contains the control logic which is dynamically parsed from
the Task configuration.  Here's an example:

http://petshop.bivio.biz/pub/products?p=BIRDS

The configuration for this task is:

[qw(
PRODUCTS
500
GENERAL
ANYBODY
Model.ProductList->execute_load_all_with_query
View.products
)],

The name of the task which is used for all internal linkages is
PRODUCTS.  The number is a convenience for FormContext, i.e. our
closure mechanism for holding state between HTTP forms.

The realm is GENERAL, i.e. there is no particular owner.  You might
have a USER realm or CLUB (group) realm, which have owners.

Permission bit is ANYBODY.  You can have multiple permission bits,
e.g. DATA_WRITE&DATA_READ.

The rest of the list are items which are executed serially.  The
syntax is ClassMap.Class.  A class map allows you to configure
where your models are loaded from.

Here's another example:

[qw(
LOGIN
517
GENERAL
ANYBODY
Action.UserLogout
Model.UserLoginForm
View.login
next=CART
MISSING_COOKIES=MISSING_COOKIES
)],

The '=' elements (which is not strictly perl, but hey, we all have
our inconsistencies ;-) map events to other tasks.  For example, if
you get a MISSING_COOKIES exception you go to the MISSING_COOKIES
task.  next=CART says that the next task on an OK on the form is the
CART task.

All tasks can be found in
http://petshop.bivio.biz/src?s=Bivio::PetShop::Agent::TaskId

This is all you need to know about the controller if you use bOP.  You
list your tasks and bOP's Agent does the rest.  BTW, the tasks might
be executed via e-mail or HTTP or the command line.  The controller
abstracts this away, too.  (We actually removed our Bivio::Agent::Mail
implementation, because it made more sense to implement everything via
Apache instead of custom servers.)

The interface for Views, Actions, and Models is called execute.
You'll be passed a Bivio::Agent::Request object which holds the
context for the transaction.

Rob







Re: separating C from V in MVC

2002-06-05 Thread Rob Nagler

Andy Wardley writes:
 Because Perl is a general purpose programming language.  TT implements
 a general purpose presentation language.  A different kettle of fish
 altogether.

These are the reserve words of TT:

GET CALL SET DEFAULT INSERT INCLUDE PROCESS WRAPPER 
IF UNLESS ELSE ELSIF FOR FOREACH WHILE SWITCH CASE
USE PLUGIN FILTER MACRO PERL RAWPERL BLOCK META
TRY THROW CATCH FINAL NEXT LAST BREAK RETURN STOP 
CLEAR TO STEP AND OR NOT MOD DIV END

Looks an awful lot like the same keywords in any general-purpose
programming language.

 It's like asking why XML has different syntax and semantics from
 Perl.

Well, if you read the XSLT spec and then look at an XSLT program,
you'll see a lot of verbosity and a lot of general purpose constructs
like variables, conditionals, and loops.  I haven't done much with
XSLT, but I do know you can get it in an infinite loop.  That seems
pretty general purpose to me.

I think the rule is: if you can solve Towers of Hanoi in the language,
it's general purpose enough.  True formatting languages, such as
Scribe, do not contain general-purpose constructs, so you couldn't
solve the Towers of Hanoi.  HTML is another good example (ignoring
script).

 I find it easier to have a little language which is tailored to the task
 at hand.

Let's separate syntax from semantics.  You can use Perl syntax very
easily without adopting the semantics for the little language
constructs.  For example, here's a bOP configuration file:

{
    'Bivio::Ext::DBI' => {
        database => 'petdb',
        user => 'petuser',
        password => 'petpass',
        connection => 'Bivio::SQL::Connection::Postgres',
    },
    'Bivio::IO::ClassLoader' => {
        delegates => {
            'Bivio::Agent::TaskId' => 'Bivio::PetShop::Agent::TaskId',
            'Bivio::Agent::HTTP::Cookie' =>
                'Bivio::Delegate::PersistentCookie',
            'Bivio::UI::HTML::FormErrors' =>
                'Bivio::PetShop::UI::FormErrors',
            'Bivio::TypeError' => 'Bivio::PetShop::TypeError',
            'Bivio::Auth::Support' => 'Bivio::Delegate::SimpleAuthSupport',
        },
        maps => {
            Model => ['Bivio::PetShop::Model', 'Bivio::Biz::Model'],
            Type => ['Bivio::PetShop::Type', 'Bivio::Type'],
            HTMLWidget => ['Bivio::PetShop::Widget',
                'Bivio::UI::HTML::Widget', 'Bivio::UI::Widget'],
            Facade => ['Bivio::PetShop::Facade'],
            Action => ['Bivio::PetShop::Action', 'Bivio::Biz::Action'],
            TestLanguage => ['Bivio::PetShop::Test'],
        },
    },
    'Bivio::UI::Facade' => {
        default => 'PetShop',
    },
    'Bivio::UI::Text' => {
        http_host => 'petshop.bivio.biz',
        mail_host => 'bivio.biz',
    },
};

You could use XML, Lisp, or some other syntax for this.  Since the
implementation of the configuration parser is in Perl, we use eval as
the config parser.  When I program in Lisp, I use Lisp syntax for
config and eval for the parser again.  The syntax is different, but
the semantics probably are the same.

Perrin Harkins writes:
 The thing that worries me about a widget approach is that I would have 
 the same problem I had with CGI.pm's HTML widgets way back: the 
 designers can't change the HTML easilly.  Getting perl developers out of 
 the HTML business is my main reason for using templating.

I think this is where our experience diverges.  I have hired designers
before and every time we had to recode the HTML and the JavaScript
anyway.  My approach is to apply the Once And Only Once principle,
which simplifies design changes (discussed more below).

Andy Wardley writes:
 This is abstraction.  Not to be confused with MVC which is one particular
 architecture well suited to GUI applications.  Blindly applying MVC without
 understanding the real issues (abstraction of front/back ends, separation of 
 concerns, don't repeat yourself, etc.) is likely to build a system which is 
 highly fragmented.  Maintenance becomes harder because everything is split 
 up into many different pieces and it becomes difficult to see the wood for 
 the trees.

If you apply Once And Only Once extremely, you'll find that MVC is a
nice fit for just about any information system.

 Despite our best intentions, this web site doesn't neatly fall into 
 clearly defined chunks of model, application and view.  Well, actually,
 those parts do split down quite nicely.  But then you look at localisation,
 for example, and we find there is localisation required in the data backend, 
 localisation required in the applications and localisation required in the 
 templates.  Thus, localisation is an aspect which cuts across the system.
 By building a strict MVC we've fragmented localisation and have to trawl
 through hundreds of different files to localise the site.

To solve this problem, we added a letter.  bOP is MVCF, where F stands
for Facade.  A Facade allows you to control icons, files, colors,
fonts, text, and tasks.  You can 

Re: separating C from V in MVC

2002-06-03 Thread Rob Nagler

Perrin Harkins writes:
 You can actually do that pretty comfortably with Template Toolkit.  You
 could use a filter for example, which might look like this:
 
 [% FILTER font('my_first_name_font') %]
 ... some text, possibly with other template directives in it...
 [% END %]

One of the reasons Perl is popular is its idioms.  Having to say
something in three lines is not as idiomatic as one line.  It takes a
lot of discipline to use it everywhere.  In other words, I don't think
the above is more comfortable than:

String(['User.first_name'], 'my_first_name_font');

Note also the accessor for User.first_name in Template Toolkit is
probably nontrivial.

Rob





Re: separating C from V in MVC

2002-06-03 Thread Rob Nagler

Perrin Harkins writes:
 The advantage is that my example can contain other templating code:
 
 [% FILTER font('basic_info_font') %]
    Hello [% User.first_name %]!<BR>
    [% IF User.accounts %]
       You have these accounts:<BR>
       [% FOREACH User.accounts %]
     [% name %]: [% balance %]<BR>
       [% END %]
    [% END %]
 [% END %]
 
 Unless I'm missing something about your example, the FILTER concept 
 seems more powerful.

[Skirting on the edge of YATW. :-]

I think they are equivalent as far as power.  I go back to why
people use Perl, because it makes the easy jobs easy and the hard jobs
possible.  All programming languages are Turing Complete, but we don't
like programming Turing Machines.

Here's your expanded example in widgets:

String(Prose(<<'EOF'), 'basic_info_font');
Hello String(['Model.User', 'first_name']);!<br>
If(['Model.AccountList', '->get_result_set_size'],
    Join([
        'You have these accounts:<br>',
        Table('Model.AccountList', [
            'name',
            'balance',
        ]),
    ]),
);
EOF

The Table widget will print a table with headings defined by the
Facade (our term for skin).  The widgets for name and balance are
looked up dynamically.  balance will be right adjusted.  Unless I'm
missing something, the template example won't align properly in HTML.
This is a significant semantic difference between FOREACH and
Table.

Would you expand on the example so that name and balance are columnar?

Rob





RE: separating C from V in MVC

2002-05-31 Thread Rob Nagler

Jeff AA writes:
 space and that column 5 which contains a possibly long name should 
 use the remaining available space, whilst column 1 which contains
 a name should not be wrapped?

We call this a Grid widget in our framework (bOP).  There are many
options: http://petshop.bivio.biz/src?s=Bivio::UI::HTML::Widget::Grid
and here's an example use: http://petshop.bivio.biz/src?s=View.menu

Rob





Re: separating C from V in MVC

2002-05-31 Thread Rob Nagler

Barry Hoggard writes:
 Do you have a favorite approach for writing the Model objects?

One solution is to create an interface for accessors, i.e. get,
which the views call on objects they need to access.  Our controller
and model objects share this same accessor interface, which allows
the views to access control and database values the same way.

For example,

  vs_form_field('UserAccountForm.RealmOwner.name', {},
    [['->get_request'], 'task_id', '->equals_by_name', 'USER_ACCOUNT_CREATE'])

The first parameter to vs_form_field identifies the RealmOwner.name
field of FormModel UserAccountForm.  The second parameter contains
optional attributes.  The third param defines a conditional which
says: only display this row if the view is being rendered in the
USER_ACCOUNT_CREATE task.  We use the same view in the user account
register and edit tasks.

The view doesn't allow you to change your User ID in edit mode.  The
business logic doesn't allow edits either, but you still have to
control the visible state.  You could do that with a model, but that's
denormalization.  Rather than copying state, we go directly to the
source, the request object.  The Request is not a model, but an
ordinary Perl object, which implements the WidgetValueSource
interface.

Originally, we didn't have this clear separation of WidgetValueSource
and Model.  That change really helped us in the view code.  There are
other WidgetValueSource objects (formatters, icons, etc.) and the
views access the data in the same way.

Perhaps you can accomplish this with hash references, but I find that
involves a lot of copying.  Having a method call to an object allows
the object to control the behavior, e.g. dynamically computing
values.  Not coupling it to a heavier Model interface gives you a lot
of flexibility.  For the most part, all the views want is the values.

Rob

P.S. Nice to meet you, Barry.





Re: separating C from V in MVC

2002-05-31 Thread Rob Nagler

Perrin Harkins writes:
 That's exactly what I'm saying, except that I don't see what your 
 layout manager is for.  You should just pass some data to a template 
 (or maybe to something fancier for certain types of output like Excel) 
 and then the template makes all the decisions about how to show that data.

The layout manager is an important tool, which doesn't fit in with the
template model.  It comes from widget-based GUI toolkits like Java,
Tk, etc.

Layout managers accept a declaration ('this cell is northwest', 'this
other one expands across the bottom', etc.).  They interpret the decl
at run-time.  It's like HTML, but more declarative.  Some attributes
of our Grid manager are:

  cell_nowrap - don't wrap the text in the cell
  cell_align - generates valign and align from a single compass direction
  cell_expand - this cell eats up the rest of the columns in the row
  row_control - a conditional value to control whether row is displayed
  cell_width - contains the HTML width= value (may be dynamic)

With Java's GridBag and other layout managers, you relate the cells in
some way and the layout manager does the right thing.  Since this
particular layout manager is HTML, we relate the cells in a row-major
matrix.  Since it's Perl, it's compact.  Here's a simple example:

Grid({
    string_font => 'page_text',
    pad => 5,
    values => [
        [
            String(Join([
                'Please confirm that the following data is correct '
                .'and press the <b>Continue</b> button to ship the '
                .'order',
            ])),
        ], [
            String('Billing Address', 'page_heading'),
        ], [
            $address_widget('bill_to', 1),
        ], [
            String('Shipping Address', 'page_heading'),
        ], [
            $address_widget('ship_to', 2),
        ], [
            ImageFormButton({
                image => 'continue',
                field => 'ok_button',
                alt => 'Continue',
            }),
        ],
    ],
}),

Rob





Re: separating C from V in MVC

2002-05-31 Thread Rob Nagler

Perrin Harkins writes:
 The same template?  How does the layout manager help with that?  Does it 
 modify the template?  It would make more sense to me if this were a sort 
 of abstraction to factor out common layout ideas from multiple
 templates.

I think we're miscommunicating.  I'm talking widgets, and you're
talking templates.

A layout manager is a bit of a red herring in mod_perl.  I was simply
trying to explain how they came to be and why they make sense.  In
GUIs, the layout manager is responsible for placement when the window
is resized.  In mod_perl, it plays a lesser role, because the browser
does most of the work (thank goodness).

Templates and widgets are pretty much the same thing (see discussion
at end).  It's how you use them that makes a difference.  We have a
String widget.  You could just as well make a string template.  It's
not natural in template languages to wrap every piece of text in a
string template, however.

A String widget/template allows you to control the rendering of all
fonts dynamically.  If the String widget/template sees the incoming
request is from IE5+, it doesn't render the font if the font is the
same as the default font.  The Style widget/template renders the
default font in a style if the browser is IE5+.  This avoids the
stylesheet bugs in all other browsers and gives 90% of your users who
are running IE5+ a much lighter weight page.

It's cumbersome to wrap all text in string templates, because the
calling mechanism is verbose.  Most template languages I've looked at
only support named parameters.

Widgets can have named parameters, e.g.

String({
    value => ['User.first_name'],
    string_font => 'my_first_name_font',
});

but it is much more convenient to use positional notation:

String(['User.first_name'], 'my_first_name_font');

The way I like to think of this is that HTML corresponds to machine
language.  Templates correspond to assembly language.  Widgets
correspond to structured programming.  You can program everything in
assembly language, it's just more cumbersome.  This is why people
invented macro assemblers, but there is still a significant difference
between building a system in C or Macro-11.

This is why a layout manager is a natural concept to me.  It's a
widget which does something with the results of other widgets.  What's
cool about HTML is that you can do this post draw, i.e., after the
widget renders a child, it can look at the result to determine its
next action.  For example, the String widget can escape the output of
its child.  I haven't seen this in template languages and rarely in
GUI toolkits.

Rob





Re: schedule server possible?

2002-04-29 Thread Rob Nagler

 But I will need a thread that processes the backend stuff, such as
 maintaining the database and message queue (more like a cron). Is
 this configuration possible?

You can do this now.  We rely on cron to kick off the job, but all
the business logic is in Apache/mod_perl.  The advantage of using cron
is that it has rich support for scheduling.

Rob






@DB::args not working on 5.6.1 and 1.26

2002-04-19 Thread Rob Nagler

It seems that @DB::args is empty on mod_perl 1.26 and perl 5.6.1.
This is stock Red Hat 7.2 (apache 1.3.22).  The code which references
@DB::args works in perl 5.6.1.  It also appears that the failure only
occurs after the perl restarts.  The first time Apache loads mod_perl,
@DB::args is being set correctly.

I assume that @DB::args isn't empty running under PERLDB, but I
haven't tried this.  The use of @DB::args is not for debugging, so I
can't use Apache::DB.

Anybody else seeing this?

Thanks,
Rob





Re: [OT] Encrypting Embedded URLs

2002-04-18 Thread Rob Nagler

Nigel Hamilton writes:
 http://www.foo.com?params=aJHKJHKJHKJHHGHFTDTDGDFDFGDGHDHG879879
 
   A built-in checksum would be a bonus ... any ideas?

You can use any of the Crypt::CBC ciphers.  We then use a modified
MIME::Base64 encoding which is more compact than encrypt_hex
and doesn't require a subsequent escaping for URI specials.  See
http://petshop.bivio.biz/src?s=Bivio::MIME::Base64 for the simple
algorithm (the error checking hack on MIME::Base64::decode may no
longer be necessary with newer versions of MIME::Base64).
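As an illustration of the general idea (this is not Bivio::MIME::Base64
itself, just a hedged sketch of a URI-safe Base64 variant): swap the
two alphabet characters that need URI escaping ('+' and '/') for '-'
and '_', and drop the '=' padding, which is recoverable from the
length.

```perl
# Sketch of a URI-safe Base64 variant (illustration only; see the
# real implementation at the URL above for the production algorithm).
use MIME::Base64 ();

sub encode_uri_safe {
    my($data) = @_;
    my($b64) = MIME::Base64::encode_base64($data, '');  # '' = no newlines
    $b64 =~ tr{+/}{-_};    # '-' and '_' need no escaping in URIs
    $b64 =~ s/=+$//;       # padding is recoverable from the length
    return $b64;
}

sub decode_uri_safe {
    my($b64) = @_;
    $b64 =~ tr{-_}{+/};
    # restore padding to a multiple of four characters
    $b64 .= '=' x ((4 - length($b64) % 4) % 4);
    return MIME::Base64::decode_base64($b64);
}

print encode_uri_safe('hello'), "\n";  # aGVsbG8
```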

Rob





Re: Apache::File correction

2002-04-12 Thread Rob Nagler

 undef $/;   # enable slurp mode

I think the local is pretty important, especially in mod_perl:

local $/;

This has the same effect (the undef is unnecessary).  It's also a
good idea to enclose the code in a subroutine with error checking:

sub read_file {
    my($file) = @_;
    open(FH, "< $file") || die("error opening $file: $!");
    local($/);
    my($content) = <FH>;
    close(FH) && defined($content) || die("error reading $file: $!");
    return \$content;
}

Rob





Re: [OT-ish] Session refresh philosophy

2002-02-20 Thread Rob Nagler

Hans Juergen von Lengerke writes:
 Why not put everything in one field? Are there restrictions? Does it
 make a difference when using POST?

That's what we do.  There doesn't appear to be a restriction with
POST.

For while, we were encoding entire forms in URLs, but the limits got
to us for really large forms.

Rob



Re: [OT-ish] Session refresh philosophy

2002-02-20 Thread Rob Nagler

[EMAIL PROTECTED] writes:
 Looking at CGI::EncryptForm that Perrin mentioned, it appears that that
 module would address this concern by storing client-side in a single
 encrypted string that gets put in one hidden form variable. That also
 avoids having to verify more than once.

It is always good to validate the data even if it was encrypted.  It
is also generally a good idea not to give the user any secrets, even
if they are encrypted.  In other words, avoid trusting the user.

[EMAIL PROTECTED] writes:
 No, this just means that input must be validated once again when the
 last «really, really sure ?» button is depressed. Conceptually, this
 divides the pages of your site into two categories (not unlike the
 view vs. controller distinction in Model-View-Controller paradigm for
 GUIs): those that just interact with the user and do the navigation,
 and those that actually have side effects such as writing data into your
 database, sending e-mails, placing orders etc.

It is MVC.  However, instead of thinking of pages, I like to think in
terms of tasks.  The same task that renders the form also validates
and executes it. In the case of execution, the result is a redirect
described by the site's state machine.  A form in our world has four
states: execute_empty (fill in defaults), execute_ok, execute_other
(e.g., cancel or sub form), and execute_unwind (coming back from a sub
form).  All of these paths go through the same task.

Rob



Re: [OT] MVC and web design

2002-02-20 Thread Rob Nagler

___cliff rayman___ writes:
 please take this as interested and not critical.  i was viewing the source:
 http://petshop.bivio.biz/src?s=View.items

Criticism welcome.  I hope you don't mind the rant below.

 and i noticed these lines:
 
 - snip -
 ])->put(
 cellpadding => 2,
 cellspacing => 2,
 ),
 - snip -
 
 this looks like the presentation layer peeking through.

The view components are all presentation.  I didn't mention that the
framework is actually MVCF, where the F stands for Facade.  The server
that runs http://petshop.bivio.biz also runs http://www.bivio.biz
The pet shop facade is:

http://petshop.bivio.biz/src?s=Bivio::PetShop::Facade::PetShop

and the www facade is something different, and not visible from the
petshop facade.  A facade in bOP controls the entire look and feel.
In the case you pointed out, it might be a good idea to put the
cellspacing and cellpadding in the facade, too.  It was just laziness.

 the petshop site is obviously a demo, and therefore does not have
 the polished look of a professional site, which is very
 understandable.  what i wonder is, could a professional web design
 team make a polished website without involving the programmers?

Well, I guess it depends on what you mean by WebSite and programmers.
I think of the pet shop as an application, not a WebSite.  The same
argument would apply for GUI desktop applications.  Are you a
programmer if you use JBuilder or PowerBuilder? I think so.  Are you a
programmer if you build a WebSite with ColdFusion or PHP?  Again, I
think so.

If you are a programmer, then you need to know how to program.  I
don't see anything hard about programming Perl in a constrained
environment if you are a website designer/programmer.  Structure is
important in most WebSites, and all web-delivered applications imho.

If you just want to do layout, there are many tools which are much
better than an HTML editor, e.g., Photoshop.  Once the layout is
complete, you give it to coders who encode it in whatever language
is best for the application delivery mechanism.

 what happens when a cell padding of 3 is more desirable for the
 design?

The designer modifies the source in CVS, tests it, and checks it in.

 it seems to me, that in all of the technologies i have
 looked at thus far, that attempt to separate the presentation layer
 from the model/view, the precision and flexibility needed to
 graphically communicate to the user is more difficult that the
 standard pagedesign approaches (dreamweaver and a little embperl or
 other embedded language thrown into the mix) .  phrased another way,
 how does bivio or other mvc technology, let web artists design sites
 as beautiful as http://www.marthastewart.com or the even more
 beautiful http://www.genwax.com (cheap plug)?

<rant>
Ah, that is the question.  The answer is beauty is in the eye of the
user.  I work with a lot of sites at the technical level, and I'm
continually amazed at the low quality of the sites from a user
perspective.  Let's take Martha Stewart (please ;-) and visit your
account.  For your info, the link is:

http://www.marthastewart.com/page.jhtml;jsessionid=4HVBOQCWGUVEHWCKUUXCIIWYJKSS0JO0?type=page-catid=cat688

This is a good example of the business logic creeping in to the UI.
What do I care if Martha programs in Java.  What happens to her users'
bookmarks if she switches to C#, or heaven forbid Perl? In bOP, you
can have any link you want associated with a task on a per Facade
basis.  In fact you can have multiple links pointing to the same task.
Look at the links in http://www.bivio.com/demo and see if they make
sense to you.  We have some pretty advanced users, who take our links
and embed them in custom home pages in their files area (which is
browsable unlike most groupware sites).  We can maintain backward
compatibility forever.

Now when I come to the page on Martha's site which asks me to login
it's very pretty and weighs in at 45KB without counting the rose
(32KB).  I can't login here, because there are no form fields.  The
rose is very pretty though (did I say that already?).

Many of our users still connect to us with AOL at 26kbps with
60mhz/32MB boxes.  They definitely appreciate the fact that most of
our pages are under 20KB.  That's only because our pages are
programmed that way.

In summary, I buy into the minimalist approach of Nielsen.  Visit
http://useit.com for more info.  Usability is designed, and it takes a
lot of time to design and test it.  The actual coding part is
minuscule in comparison.
</rant>

Rob



Re: Session refresh philosophy

2002-02-19 Thread Rob Nagler

Milo Hyson writes:
 shopping-cart-style application (whereby someone selects/configures multiple 
 items before they're ultimately committed to a database) how else would you 
 do it? There has to be some semi-persistent (i.e. inter-request) data where 
 selections are stored before they're confirmed.

As I understand it, the session data is state which is committed to
the database on each request (possibly).  It would seem to me that
instead of denormalizing the state into a separate session table, you
should just store it in a normal table.  If the data needs to be
expired, then it can be time stamped when it is written.

The point is that it's always simpler to use the existing tables
directly rather than making a copy and storing it in the database
somewhere else.  This usually reduces the code by half or more,
because you don't have to worry about making the copy in the first
place.  Simpler code is more reliable and usually runs faster.

To me, sessions are negativist.  My expectation is that users will end
up clicking OK (making the purchase).  If that is the case, you are
much better off putting the data where it belongs right from the
start.  You
may bind it to an ephemeral entity, such as a shopping cart, but when
the order is complete the only thing you have to do is free the cart
and replace it with an order.  The items, amounts, and special
considerations have already been stored.
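A sketch of the idea (table and column names are invented here, not taken from bOP): cart rows live in an ordinary normalized table from the first click, time stamped so abandoned carts can be expired, and checkout merely re-points the same rows at a real order instead of copying them.

```sql
-- Hypothetical schema illustrating the approach.
CREATE TABLE cart_item_t (
    cart_id            NUMERIC(18) NOT NULL,
    item_id            VARCHAR(10) NOT NULL,
    quantity           NUMERIC(9)  NOT NULL,
    order_id           NUMERIC(18),          -- NULL until the order completes
    creation_date_time DATE        NOT NULL, -- expire abandoned carts by age
    PRIMARY KEY (cart_id, item_id)
);

-- On purchase: no copying, just attach the rows to the new order.
-- UPDATE cart_item_t SET order_id = :order_id WHERE cart_id = :cart_id;
```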

If most of your users are filling shopping baskets and walking away
from them, it may be a problem with the software.  Checkout
http://www.useit.com for some ideas on how to improve the ratio.

Often you can avoid any server side persistence by using hidden fields
in the forms.  We use this technique extensively, and we have
encapsulated it so that it is easy to use.  For example, you might
have a sub form which asks the user to fill in an address.  When the
user clicks on the fill in address button, the server squirrels away
the context of the current form in the hidden fields of the address
form.  When the user clicks OK on the address form, the fields are
stuffed back into the original form including the new address.
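A minimal, self-contained sketch of the hidden-field context trick (the bOP encapsulation is richer; the task and field names here are invented):

```perl
use strict;
use Storable qw(freeze thaw);
use MIME::Base64 qw(encode_base64 decode_base64);

# Squirrel the calling form's context into one opaque hidden-field value.
sub form_context_to_hidden {
    my ($context) = @_;
    return encode_base64(freeze($context), '');   # '' => no line breaks
}

# Restore it when the sub-form (e.g. the address form) submits OK.
sub hidden_to_form_context {
    my ($value) = @_;
    return thaw(decode_base64($value));
}

my $context = {task => 'order_form', fields => {item => 'FI-SW-01', qty => 2}};
my $hidden = form_context_to_hidden($context);
# $hidden is what goes into: <input type="hidden" name="fc" value="...">
my $restored = hidden_to_form_context($hidden);
```

In practice the value should also be signed so clients can't tamper with it.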

If you have a performance problem, solve it when you can measure it.
Sessions can mitigate performance problems, but so can intelligent
caching, which avoids statefulness in the client-server protocol.

Rob

P.S. For sample sessionless sites, visit http://www.bivio.com and
 http://petshop.bivio.biz (which runs on a 3 year old 300mhz box
 running Apache and Postgres).



Re: Session refresh philosophy

2002-02-19 Thread Rob Nagler

Perrin Harkins writes:
 Actually, even this stuff could be put into a normalized sessions table
 rather than serialized to a blob with Storable.  It just means more work if
 you ever change what's stored in the session.

This is a tough question.  If you store it in a blob, you can't query
it with an ad hoc SQL query.  If you store it in a table, you have to
deal with data evolution.  On the whole, I vote for tables over blobs.
My reasoning is that you have to deal with data evolution anyway.  We
have had about 200 schema changes in the last two years, and very few
of them have had anything to do with user/visitor state.

Rob



Re: Session refresh philosophy

2002-02-18 Thread Rob Nagler

Milo Hyson writes:
 1) A fix-up handler is called to extract the session ID from a cookie. 
[snip]
 1a) If for some reason no session was found (e.g. no cookie) a new one is 
[snip]
 2) During content-generation, the application obtains the session reference 
[snip]
 3) A clean-up handler is called to re-serialize the session and stick it back

I may be asking the wrong question: is there a need for sessions?
This seems like a lot of work when, for most applications, sessions
are unnecessary.

Rob



Re: Mistaken identity problem with cookie

2002-02-15 Thread Rob Nagler

 small operations.  I'm pretty convinced that the problem is on their
 end.  My theory is that these proxies may have cached the cookie
 with an IP address which they provide their clients.

Have you tried capturing all ethernet packets and seeing if the raw
data supports this conclusion.  Checkout:

http://www.ethereal.com/

We have found that it is the bigger ISPs which have faulty caches.
Usually it is a DNS problem, not an HTTP caching problem.

Another trick is throwing a time stamp in every cookie.  This is
useful for other reasons, e.g. cookie expiration and validation.
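Here's one way to do it, as a self-contained sketch (the secret, field layout, and expiry are invented for illustration): the cookie value carries its own creation time plus an HMAC, so the server can validate and expire it without storing any state.

```perl
use strict;
use Digest::SHA qw(hmac_sha1_hex);

my $SECRET  = 'server-side-secret';    # invented signing key
my $MAX_AGE = 8 * 60 * 60;             # invented expiry: 8 hours

# The cookie value carries its own creation time plus a signature.
sub make_cookie {
    my ($user_id, $now) = @_;          # pass time() in real code
    my $payload = join('|', $user_id, $now);
    return $payload . '|' . hmac_sha1_hex($payload, $SECRET);
}

# Validation needs no server-side state: check signature, then freshness.
sub check_cookie {
    my ($value, $now) = @_;
    my ($user_id, $t, $sig) = split(/\|/, $value);
    return undef unless defined($sig)
        && $sig eq hmac_sha1_hex(join('|', $user_id, $t), $SECRET);
    return undef if $now - $t > $MAX_AGE;
    return $user_id;
}
```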

Cheers,
Rob



extremeperl@yahoogroups.com

2002-01-28 Thread Rob Nagler

It seems there are a number of people interested in Extreme
Programming in Perl, so there's yaml (yet another mailing list) at:

http://groups.yahoo.com/group/extremeperl/

Cheers,
Rob



Re: UI Regression Testing

2002-01-26 Thread Rob Nagler

Hi Craig,

 Have you ever heard of the hw verification tool Specman Elite by Verisity 
 (www.verisity.com)?

No, but it looks interesting.  It would be good to have something like
this for unit tests.  I haven't had very good experience with
automated acceptance testing, however.  The software should be robust
against garbage in, but the main problem we have is making sure the
numbers add up, and that we generate the correct tax forms!  It's
pretty tricky stuff.

FWIW, we are very happy with our unit test structure.  It has evolved
over many years, and many different languages.  I've appended a simple
example, because it is quite different than most of the unit testing
frameworks out there.  It uses the XP philosophy of once and only once
as well as test what is likely to break.

Rob
--

#!perl -w
# $Id: Integer.t,v 1.7 2001/11/24 04:30:19 nagler Exp $
#
use strict;
use Bivio::Test;
use Bivio::Type::Integer;
use Bivio::TypeError;
Bivio::Test->unit([
    'Bivio::Type::Integer' => [
        get_min => -9,
        get_max => 9,
        get_precision => 9,
        get_width => 10,
        get_decimals => 0,
        can_be_zero => 1,
        can_be_positive => 1,
        can_be_negative => 1,
        from_literal => [
            ['9'] => [9],
            ['+9'] => [9],
            ['-9'] => [-9],
            ['x'] => [undef, Bivio::TypeError->INTEGER],
            [undef] => [undef],
            [''] => [undef],
            [' '] => [undef],
            ['-99'] => [undef, Bivio::TypeError->NUMBER_RANGE],
            ['-09'] => [-9],
            ['+09'] => [9],
            ['-9'] => [-9],
            ['+9'] => [9],
            ['+10'] => [undef, Bivio::TypeError->NUMBER_RANGE],
            ['-10'] => [undef, Bivio::TypeError->NUMBER_RANGE],
        ],
    ],
    Bivio::Type::Integer->new(1, 10) => [
        get_min => 1,
        get_max => 10,
        get_precision => 2,
        get_width => 2,
        get_decimals => 0,
        can_be_zero => 0,
        can_be_positive => 1,
        can_be_negative => 0,
        from_literal => [
            ['1'] => [1],
            ['+1'] => [1],
            ['0'] => [undef, Bivio::TypeError->NUMBER_RANGE],
            ['11'] => [undef, Bivio::TypeError->NUMBER_RANGE],
            ['-1'] => [undef, Bivio::TypeError->NUMBER_RANGE],
            [undef] => [undef],
            ['-09'] => [undef, Bivio::TypeError->NUMBER_RANGE],
            ['+09'] => [9],
        ],
    ],
]);



Re: UI Regression Testing

2002-01-26 Thread Rob Nagler

Perrin Harkins writes:
 But what about the actual data?  In order to test my $product-name()
 method, I need to know what the product name is in the database.  That's
 the hard part: writing the big test data script to run every time you
 want to run a test (and probably losing whatever data you had in that
 database at the time).

There are several issues here.  I have answers for some but not all.

We don't do complex unit tests.  We save those for the acceptance test
suite.  The unit tests do simple things.  I've attached a basic unit
test for our DBI abstraction layer.  It runs on Oracle and Postgres.

Acceptance tests take over an hour to run.  We have a program which
sets up some basic users and clubs.  This is run once.  It could be
run before each test suite run, but we don't.  We have tests which
test creating users, entering subscription payments, twiddling files
and email.  By far the biggest piece is testing our accounting.  As I
said, we used student labor to write the tests.  They aren't perfect,
but they catch lots of errors that we miss.

Have a look at:

http://petshop.bivio.biz/src?s=Bivio::PetShop::Util

This program populates the database for our petshop demo.  It builds
the entire schema, too.  The test suite for the petshop will assume
this data.

The amount of data need not be large.  This isn't the point of
acceptance testing imo.  What you want is enough data to exercise
features such as paging, form submits, etc.  Our production database
is multi-GB.

We do have a particularly nasty problem with our quote database.  We
update all of our quote databases nightly using the same software
which talks to our quote provider.  This tests the software in
real-time on all systems.  We run our acceptance test suite in the
morning after all the nightly stuff is done.  It takes hours to
re-import our quote database.

You need a test system distinct from your production and development
systems.  It should be as close in configuration to the production
system as possible.  It can be very cheap. Our test system consists of
a refurb Compaq Presario and a Dell 1300 with 4 disks.  We use
hardware RAID on production and software RAID on test.  Differences
like these don't matter.

The database source needs to be configurable.  Disk is cheap.
You can have multiple users (schemata) using the same database host.
Our database abstraction allows us to specify the target database
vendor, instance, user, and password.  Our command line utility
software allows us to switch instances easily, and the config module
does, too.

I often test against my development database at the same time as I
compare the same results against the test database.  I can do this,
e.g.

 b-petshop -db test create_db

All utilities have a '-db' argument.  Alternatively, I can specify the
user in long hand for the Connection test below:

 perl -w Connection.t --Bivio::Ext::DBI.database=test

All config parameters can be specified this way, or in a dynamically
selectable file.

 This has been by far the biggest obstacle for me in testing, and from
 Gunther's post it sounds like I'm not alone.  If you have any ideas
 about how to make this less painful, I'd be eager to hear them.

It isn't easy.  We don't write a unit test per class.   Indeed we're
far from this.  OTOH, we reuse heavily.  For example, we don't need
to test our product list:
http://petshop.bivio.biz/src?s=Bivio::PetShop::Model::ProductList
It contains no code, only declarations.  All the SQL is generated
by the object-relational mapping layer which handles paging,
column sorting, and so on.  The view is just as simple:
http://petshop.bivio.biz/src?s=View.products
Neither of these modules is likely to break, so we feel confident
about not writing unit tests for them.

Rob
--
#!/usr/bin/perl -w
use strict;
use Bivio::Test;
use Bivio::SQL::Connection;

my($_TABLE) = 't_connection_t';
Bivio::Test->unit([
    Bivio::SQL::Connection->create => [
        execute => [
            # Drop the table first, we don't care about the result
            ["drop table $_TABLE"] => undef,
        ],
        commit => undef,
        {
            method => 'execute',
            result_ok => \&_expect_statement,
        } => [
            # We expect to get a statement back.
            [<<"EOF"] => [],
create table $_TABLE (
    f1 numeric(8),
    f2 numeric(8),
    unique(f1, f2)
)
EOF
            ["insert into $_TABLE (f1, f2) values (1, 1)"] => [],
        ],
        commit => undef,
        execute => [
            ["insert into $_TABLE (f1, f2) values (1, 1)"]
                => Bivio::DieCode->DB_CONSTRAINT,
        ],
        {
            method => 'execute',
            result_ok => \&_expect_one_row,
        } => [
            ["update $_TABLE set f2 = 13 where f2 = 1"] => [],
        ],
        execute_one_row => [
            ["select f2 from $_TABLE where f2 = 13"] =>

Re: UI Regression Testing

2002-01-26 Thread Rob Nagler

Gunther Birznieks writes:
  From the description of your scenario, it sounds like you have a long 
 product life cycle etc.

We release weekly.  We release to test multiple times a day.  We code
freeze the test system over the weekend.

We run all weekly jobs on test during the day on Sat, and then release
to production Sat Night.  The job testing change was introduced
recently.  On production, we have a large job which runs on Tues.  It
also ran on Tues on test.  We changed something later in the week one
release which broke the job, but it wasn't tested.  Now, we get that
extra assurance of having the weeklies run just before the release.

Having an efficient release mechanism is critical.  Also, we get paged
when something goes wrong on production.  With Perl, we can and do
patch individual files midweek in critical emergencies.  For example,
our ec code broke soon after our site went to a pure subscription
model.  It was fun, because it broke from too many paying
customers. $-)   Needless to say, we patched the system asap!

 I think your testing, especially regression testing and the amount of 
 effort you put into it makes a lot of sense because your software is a 
 long-term investment possibly even a product.

Yes, that's an important point.  We run the accounting for over 8,000
investment clubs.  We have a responsibility to make sure the software
is reliable.  We released our Petshop demo without any tests. :)
 
 To each his own I guess.

Agreed.

Rob



Re: UI Regression Testing

2002-01-26 Thread Rob Nagler

 Have you considered talking about Testing at OSC this summer? Michael
 Schwern's talk was a great success last summer.

Thanks for the suggestion.  I'll think about it, and see what I can
do. 

 Also writing things down as a doc explaining how things work, with some 
 light examples, to add to our knowledge base would be really cool!

Absolutely.  If there other Extreme Perl programmers out there, send
me a private email.

Rob



Re: performance coding project? (was: Re: When to cache)

2002-01-25 Thread Rob Nagler

 This project's idea is to give stright numbers for some definitely bad 
 coding practices (e.g. map() in the void context), and things which vary 
 a lot depending on the context, but are interesting to think about (e.g. 
 the last example of caching the result of ref() or a method call)

I think this would be handy.  I spend a fair bit of time
wondering/testing myself.  Would be nice to have a repository of the
tradeoffs.

OTOH, I spend too much time mulling over unimportant performance
optimizations.  The foreach/map comparison is a good example of this.
It only starts to matter (read milliseconds) at the +100KB and up
range, I find.  If a site is returning 100KB pages for typical
responses, it has a problem at a completely different level than map
vs foreach.

Rob

Pre-optimization is the root of all evil -- C.A.R. Hoare



Re: UI Regression Testing

2002-01-25 Thread Rob Nagler

 Is anyone familiar with how to go about setting up a test suite for a
 web UI -- without spending an arm and a leg? (Remember, Bricolage is an
 OSS effort!).

Yes, it's very easy.  We did this using student labor, because it is
an excellent project for students and it's probably cheaper.  It's
very important.  We run our test suite nightly.

I'm an extreme programming (XP) advocate.  Testing is one of the most
important practices in XP.

I'm working on packaging what we did so it is fit for public
consumption.  Expect something in a month or so.  It'll come with a
rudimentary test suite for our demo petshop app.

There are many web testers out there.  To put it bluntly, they don't
let you write maintainable test suites.  The key to maintainability is
being able to define your own domain specific language.  Just like
writing maintainable code, you have to encapsulate commonality and
behavior.  The scripts should be short and only contain the details
pertinent to the particular test.  Perl is ideal for this, because you
can easily create domain specific languages.

Rob



Re: UI Regression Testing

2002-01-25 Thread Rob Nagler

 Have you tried webchat?  You can find webchatpp on CPAN.

Just had a look.  It appears to be a rehash of chat (expect) for the
web.  Great stuff, which is really needed and demonstrates the power
of Perl for test scripting.

But...

This is a bit hard to explain.  There are two types of XP testing:
unit and acceptance.  Unit testing is pretty clear in Perl circles
(ok, I have a thing or two to say about it, but not now :-).

Acceptance testing (aka functional testing) is traditionally handled
by a third party testing organization.  The test group writes scripts.
If they are testing GUIs, they click in scripts via a session
recorder.  They don't program anything.  There's almost no reuse,
and very little abstraction.

XP flips testing on its head.  It says that the programmers are
responsible for testing, not some 3rd party org.  The problem I have
found is that instead of programming the test suite, XPers script it,
using the same technology that a testing organization would use.  With
the advent of the web, this is a real shame.

HTTP and HTML are middleware.  You have full programmatic control to
test your application.  You can't control the web browser, so you
still need to do some ad hoc how does it look testing, but this
isn't the hard part.

The acceptance test suite is testing the system from the user's point
of view.  In XP, the user is the customer, and the customer writes
tests.  In my opinion, this means the customer writes tests in a pair
with a programmer.  The programmer's job is to create a language which
the user understands.

Here's an example from our test suite:

Accounting->setup_investment('AAPL');

The user knows what an investment is.  She also knows that AAPL is a
stock ticker.  This statement sets up the environment (using LWP to
the app) to execute tests such as entering dividends, buys, sells,
etc.

The test infrastructure must support the ability to create new
language elements with the ability to build elements using the other
elements.  This requires modularization, and today this means classes
and instances.  There's also a need for state management, just like
the request object in your web application.

Part of the packaging process we're going through is making it even
easier to create domain specific languages.  You actually want to
create lots of dialects, e.g. in our case this means investments, cash
accounts, member accounts, and message boards.  These dialects use
building blocks such as logging in, creating a club, and so on.  At
the bottom you use LWP or webchat.  However, the user doesn't care if
the interface is HTTP or Windows.  Your job as a test suite
programmer is meeting her domain knowledge, and abstracting away
details like webchat's CLICK and EXPECT OK.
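A toy sketch of that layering (all package, verb, and URL names invented; the real thing sits on LWP): domain verbs are built from lower-level building blocks, and the transport is hidden at the bottom.  Here the agent is a pluggable coderef so the sketch is self-contained.

```perl
package Test::Language;
use strict;

# new() takes an "agent" coderef standing in for LWP in this sketch.
sub new {
    my ($class, $agent) = @_;
    return bless({agent => $agent, log => []}, $class);
}

# Lowest-level building block: every higher-level verb funnels through here.
sub visit {
    my ($self, $uri, $form) = @_;
    push(@{$self->{log}}, $uri);
    return $self->{agent}->($uri, $form);
}

# Building block the customer understands, reused by many dialects.
sub login {
    my ($self, $user) = @_;
    return $self->visit('/pub/login', {login => $user, password => 'password'});
}

# Domain verb: the user-tester writes tests in terms of investments,
# not in terms of HTTP requests.
sub setup_investment {
    my ($self, $ticker) = @_;
    $self->login('test_user');
    return $self->visit('/accounting/investments/add', {ticker => $ticker});
}

package main;
my $accounting = Test::Language->new(sub { return 'OK' });
$accounting->setup_investment('AAPL');
```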

In the end, your test suite is a domain knowledge repository.  It
contains hundreds of concise scenarios comprised of statements, or
facts, in knowledge base parlance.  The execution of the test suite
asserts all the facts are true about your application.  The more
concise the test language, the more easily the user-tester can verify
that she has encoded her expertise correctly.

Rob



Re: UI Regression Testing

2002-01-25 Thread Rob Nagler

Gunther Birznieks writes:
 the database to perform a test suite, this can get time consuming and 
 entails a lot of infrastructural overhead.

We haven't found this to be the case.  All our database operations are
programmed.  We install the database software with an RPM, run a
program to build the database, and program all schema upgrades.  We've
had 194 schema upgrades in about two years.

 unit testing being done on the basis of writing a test class for every 
 class you write. Ugh! That means that any time you refactor you throw away 
 the 2x the coding you did.

By definition, refactoring doesn't change observable behavior.  You
validate refactorings with unit tests.  See http://www.refactoring.com

 To some degree, there should be intelligent rules of thumb as to which 
 interfaces tests should be written to because the extreme of writing tests 
 for everything is quite bad.

Again, we haven't seen this.  Every time I don't have unit tests, I
get nervous.  How do I know if I broke something with my change?
 
 Finally, unit tests do not guarantee an understanding of the specs because 
 the business people generally do not read test code. So all the time spent 
 writing the test AND then writing the program AND ONLY THEN showing it to 
 the users, then you discover it wasn't what the user actually wanted. So 2x 
 the coding time has been invalidated when if the user was shown a prototype 
 BEFORE the testing coding commenced, then the user could have confirmed or 
 denied the basic logic.

Unit tests aren't about specs.  They are about APIs.  Acceptance tests
need to be written by the user or written so the user can understand
them.  You need both kinds of testing.
See http://www.xprogramming.com/xpmag/Reliability.htm

Rob



Re: When to cache

2002-01-24 Thread Rob Nagler

 1) The old cache entry is overwritten with the new.
 2) The old cache entry is expired, thus forcing a database hit (and 
 subsequent cache load) on the next request.

3) Cache only stuff which doesn't expire (except on server restarts).

We don't cache any mutable data, and there are no sessions. We let the
database do the caching.  We use Oracle, which has a pretty good
cache.  We do cache some stuff that doesn't change, e.g. default
permissions, and we release weekly, which involves a server restart
and a refresh of the cache.
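A sketch of the pattern for immutable data (names invented): load it once at server start, then serve lookups from memory; the weekly restart is the only "invalidation" needed.

```perl
package PermissionCache;
use strict;

my %CACHE;    # per-process; filled once at server start, read-only after

# Called once at startup (e.g. from a PerlRequire'd startup.pl).
sub initialize {
    my ($loader) = @_;    # coderef standing in for the real DB query
    %CACHE = %{$loader->()};
    return;
}

# Request-time lookups never touch the database.
sub get {
    my ($realm_type) = @_;
    die("unknown realm type: $realm_type") unless exists($CACHE{$realm_type});
    return $CACHE{$realm_type};
}

package main;
# At startup, hit the database once:
PermissionCache::initialize(sub {
    return {club => 'READ|WRITE', user => 'READ'};   # stand-in for a SELECT
});
```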

If you hit http://www.bivio.com , you'll get a page back in under
300ms. There are probably 10 database queries involved if you are
logged in.  This page is complex, but far from our most complex.
For example, this page
http://www.bivio.com/demo_club/accounting/investments
sums up all the holdings of a portfolio from the individual
transactions (buys, sells, splits, etc.).  It also comes back in under
300ms.

Sorry if this wasn't the answer you were looking for. :)

Rob




Re: When to cache

2002-01-24 Thread Rob Nagler

Perrin Harkins writes:
 To fix this, we moved to not generating anything until it was requested.  We
 would fetch the data the first time it was asked for, and then cache it for
 future requests.  (I think this corresponds to your option 2.)  Of course
 then you have to decide on a cache consistency approach for keeping that
 data fresh.  We used a simple TTL approach because it was fast and easy to
 implement (good enough).

I'd be curious to know the cache hit stats.  BTW, this case seems to
be an example of immutable data, which is definitely worth caching if
performance dictates.

 However, for many of us caching is a necessity for decent
 performance.

I agree with latter clause, but take issue with the former.  Typical
sites get a few hits a second at peak times.  If a site isn't
returning typical pages in under a second using mod_perl, it
probably has some type of basic problem imo.

A common problem is a missing database index.  Another is too much
memory allocation, e.g. passing around a large scalar instead of a
reference or overuse of objects (classical Java problem).  It isn't
always the case that you can fix the problem, but caching doesn't fix
it either.  At least understand the performance problem(s) thoroughly
before adding the cache.

Here's a fun example of a design flaw.  It is a performance test sent
to another list.  The author happened to work for one of our
competitors.  :-)


  That may well be the problem. Building giant strings using .= can be
  incredibly slow; Perl has to reallocate and copy the string for each
  append operation. Performance would likely improve in most
  situations if an array were used as a buffer, instead. Push new
  strings onto the array instead of appending them to a string.

#!/usr/bin/perl -w
### Append.bench ###

use Benchmark;

sub R () { 50 }
sub Q () { 100 }
@array = (" " x R) x Q;

sub Append {
    my $str = "";
    map { $str .= $_ } @array;
}

sub Push {
    my @temp;
    map { push @temp, $_ } @array;
    my $str = join "", @temp;
}

timethese($ARGV[0],
    { append => \&Append,
      push   => \&Push });


Such a simple piece of code, yet the conclusion is incorrect.  The
problem is in the use of map instead of foreach for the performance
test iterations.  The result of Append is an array whose length is
Q and whose elements grow from R to R * Q.  Change the map to a
foreach and you'll see that push/join is much slower than .=.

Return a string reference from Append.  It saves a copy.
If this is the page, you'll see a significant improvement in
performance.
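Applying both fixes suggested above -- foreach in place of map, and returning a string reference -- the benchmark becomes (a sketch, not the original author's code):

```perl
#!/usr/bin/perl -w
use strict;
use Benchmark;

sub R () { 50 }
sub Q () { 100 }
my @array = (" " x R) x Q;

# foreach in void context builds no result list, unlike map above.
sub Append {
    my $str = "";
    $str .= $_ foreach @array;
    return \$str;    # a reference saves copying the final string
}

sub Push {
    my @temp;
    push(@temp, $_) foreach @array;
    my $str = join("", @temp);
    return \$str;
}

timethese($ARGV[0] || 1000,
    { append => \&Append,
      push   => \&Push });
```

With the bogus result-list construction gone, the timings now measure what they claim to measure.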

Interestingly, this couldn't be the problem, because the hypothesis
is incorrect.  The incorrect test just validated something that was
faulty to begin with.  This brings up you can't talk about it unless
you can measure it.  Use a profiler on the actual code.  Add
performance stats in your code.  For example, we encapsulate all DBI
accesses and accumulate the time spent in DBI on any request.  We also
track the time we spend processing the entire request.
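A minimal sketch of that kind of instrumentation (the wrapper shape is invented; in practice it sits inside the DBI encapsulation layer): accumulate wall-clock time around every database call, then report the total per request.

```perl
use strict;
use Time::HiRes qw(gettimeofday tv_interval sleep);

my $DB_TIME = 0;    # accumulated per request; reset when the request starts

# Wrap every database call so its wall-clock time is accumulated.
sub timed_execute {
    my ($op) = @_;    # coderef standing in for $sth->execute and friends
    my $start = [gettimeofday()];
    my @result = $op->();
    $DB_TIME += tv_interval($start);
    return @result;
}

# Logged at end of request next to total request time.
sub db_time { return $DB_TIME }

# Stand-in for a 50ms query:
timed_execute(sub { sleep(0.05); return 1 });
```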

Adding a cache is piling more code onto a solution.  It sometimes is
like adding lots of salt to bad cooking.  You do it when you have to,
but you end up paying for it later.

Sorry if my post seems pedantic or obvious.  I haven't seen this type
of stuff discussed much in this particular context.  Besides I'm a
contrarian. ;-)

Rob



Re: When to cache

2002-01-24 Thread Rob Nagler

 When you dig into it, most sites have a lot of data that can be out of sync
 for some period.

Agreed. We run an accounting application which just happens to be
delivered via the web.  This definitely colors (distorts?) my view.

 heavy SQL.  Some people would say to denormalize the database at that point,
 but that's really just another form of caching.

Absolutely.  Denormalization is the root of all evil. ;-)

 No need to do that yourself.  Just use DBIx::Profile to find the hairy
 queries.

History.  Also, another good trick is to make sure your select
statements are as similar as possible.  It is often better to bundle a
couple of similar queries into a single one.  The query compiler
caches queries.

 Ironically, I am quoted in Philip Greenspun's book on web publishing saying
 just what you are saying: that databases should be fast enough without
 middle-tier caching.  Sadly, sometimes they just aren't.

Every system design decision often has an equally valid converse.
The art is knowing when to buy and when to sell.  And Greenspun's book
is a great resource btw.

Rob



RE: Forking another process in Apache?

2002-01-22 Thread Rob Nagler

Chris Hutchinson writes:
 Avoids much work in httpd, and allows user to hang up web connection and
 return later to continue viewing status.

We used to do this, but found it more complex (more protocols and
server types) than simply letting Apache/mod_perl handle the job.  I
guess this depends on the frequency of long requests, but in our case
the mix is handle nicely with a single common server using http as the
only protocol.

The idea is that all the work is handled by the middle tier.  This
includes processing incoming mail messages, long running jobs, and
credit card processing.  There's a lot of common code between all
these tasks, so memory is shared efficiently.

One trick for long running jobs started by an http request is to reply
to the user as normal and do the long part in a PerlCleanupHandler.
This avoids a fork of a large process, which keeps the memory usage
relatively constant.  This simplifies resource allocation.
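In mod_perl 1 terms, the trick looks roughly like this (handler and helper names invented; a sketch, not bivio's code):

```perl
package MyApp::LongJob;
use strict;
use Apache::Constants qw(OK);

sub handler {
    my ($r) = @_;
    # Answer the browser immediately...
    $r->send_http_header('text/html');
    $r->print('<p>Your report is being generated; check back shortly.</p>');
    # ...then run the slow part after the response is sent and the
    # connection is closed, in the same warm, already-sized child
    # process -- no fork, so memory use stays constant.
    $r->register_cleanup(sub { _generate_report($r) });
    return OK;
}
```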

Just another way to do it.

Rob




Re: RFC: Exception::Handler

2002-01-14 Thread Rob Nagler

   I'm afraid I don't get it - isn't it what the finally functionality
 in Error.pm (CPAN) does ?
 
   try {
 stuffThatMayThrow();
   } finally {
 releaseResources();
   };

One reason for exceptions is to separate error handling code from the
normal control flow.  This makes the normal control flow easier to
read.  If releaseResources() is to be called whenever an exception
occurs, then it is advantageous to eliminate the extra syntax in the
class's methods and just have releaseResources() called whenever an
exception occurs and the object is on the stack.

Our exception handling class searches down the stack looking for
objects which implement handle_die().  It then calls
$object->handle_die($die), where $die is the exception instance.  This
increases the cost and complexity of exception handling, while
decreasing the cost and complexity of normal control flow.  It also
ensures that whenever the object is involved in an exception,
handle_die() is called giving it an opportunity to examine the
exception and clean up global state if necessary.

   This eliminates a lot of explicit
  try/catches.
 
   Well, destructors are of some help too in that issue.

Not if the object is a class or if the object is still live, e.g. the
request context.  We don't do a lot of instance creation/destruction
in our code.  For example, our Task instances are created at start up.
They are executed repeatedly.  Tasks decide whether to commit/rollback
on every execution, independent of the path through the Task class.

I agree with the need for try/catch.  That's often the best way to
handle exceptions.  There are cases where a global view is needed,
however.  Like Aspects, it ensures that you don't forget or have to
put in code where it is absolutely needed.

Rob
 



Re: RFC: Exception::Handler

2002-01-12 Thread Rob Nagler

Matt Sergeant writes:
 I don't like this for the same reason I don't like $SIG{__DIE__} - it
 promotes action at a distance. In a 1000 line .pm file I *want* to have my
 exception catching mechanism next to my eval{} block.

You need this flexibility, but Perl allows you to do more, for good
reasons. 

One of the things I don't like about traditional try/catch handling is
that it doesn't allow for class level programming.  You need to allow
any subroutine to try/catch exceptions (die).  It's also nice to
notify any object in the stack that there is an unhandled exception
passing through its code.  This eliminates a lot of explicit
try/catches.  This allows reuse without clutter.  If you're familiar
with Aspects, it's basically the same concept.

Rob



Re: Tips tricks needed :)

2001-12-20 Thread Rob Nagler

 By the way, is there a perl module to do calculations with money?

We use Math::BigInt to do fixed point.  We couldn't get the other math
modules to work a few years back.  Our wrapper (Bivio::Type::Number)
normalizes the rounding and allows subclasses to specify precision,
decimals, min, max, etc.  It's not fast, but fast enough. :-)
It's part of bOP, which is available under the Artistic license from
http://www.bivio.biz/hm/download-bOP
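A rough, self-contained sketch of the fixed-point idea (Bivio::Type::Number itself does much more -- rounding, min/max, per-subclass precision): amounts are held as integer cents in Math::BigInt, so 0.10 + 0.20 is exact.

```perl
package Amount;
use strict;
use Math::BigInt;

# Amounts are integer cents (2 decimals); Math::BigInt avoids float drift.
sub from_literal {
    my ($literal) = @_;
    my ($sign, $int, $frac) = $literal =~ /^([-+]?)(\d+)(?:\.(\d+))?$/
        or die("not a number: $literal");
    # Pad or truncate the fraction to exactly two digits of cents.
    $frac = substr((defined($frac) ? $frac : '') . '00', 0, 2);
    my $cents = Math::BigInt->new($int)->bmul(100)->badd($frac);
    $cents->bneg() if $sign eq '-';
    return $cents;
}

sub to_literal {
    my ($cents) = @_;
    # bdiv in list context returns (quotient, remainder).
    my ($dollars, $rem) = $cents->copy()->babs()->bdiv(100);
    return ($cents->is_neg() ? '-' : '')
        . $dollars . '.' . sprintf('%02d', $rem);
}

package main;
my $sum = Amount::from_literal('0.10') + Amount::from_literal('0.20');
```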

We don't do too much math in the database, i.e. with PL/SQL and such.
One thing we have done which has really helped is to define the sign
of all amounts/quantities so that we can use SQL's SUM() function.
Our database is normalized, which speeds development and reduces bugs.
Using SUM() keeps queries fast (<100ms) even processing ~1K rows to
produce a portfolio.

Cheers,
Rob




Re: Tips tricks needed :)

2001-12-19 Thread Rob Nagler

Perrin Harkins writes:
 Okay, wishful thinking.  I don't use Class::Singleton, but I have written my
 own versions of Object::Registrar a few times to accomplish the same goal.

Ditto.  We use a registry mechanism, too.  One thing I don't quite
understand is the need to clear out a singleton.  Why would a
singleton need to hold transient state?

Rob



RE: mod_perl vs. C for high performance Apache modules

2001-12-14 Thread Rob Nagler

 I spoke to the technical lead at Yahoo who said mod_perl will not scale as
 well as c++ when you get to their level of traffic, but for a large
 ecommerce site mod_perl is fine.

Scalability has less to do with language/execution environment than
which database you are using.  Path length is affected by language,
but that's usually not the major factor in scalability.  You want
short path lengths to get more efficiency out of your machines.

Rob



Re: form upload limit

2001-12-13 Thread Rob Nagler

 There is no such a limit in Apache and probably most browsers.

By default, LimitRequestBody is 0 (unlimited) in Apache.  We limit
incoming requests with this directive, so server resources aren't
consumed by excessive.  I think POST_MAX happens after the request is
already read into memory.

LimitXMLRequestBody has a default limit of 1000000 bytes.  There are other
LimitRequest* directives which limit various aspects of the header.
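For example, a sketch of the directive in httpd.conf (the 1 MB cap and the /upload location are arbitrary; tune them to your largest legitimate request):

```apache
# Reject request bodies over 1 MB before a mod_perl child
# spends memory reading them.
<Location /upload>
    LimitRequestBody 1048576
</Location>
```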

Rob



Re: ASP.NET Linux equivalent?

2001-12-05 Thread Rob Nagler

Dave Hodgkinson writes:
 I did an auto-form generator-from-schema thing once.
 
 Too many exceptions and meta-data involved to actually make it really
 worthwhile.

Check out the mapping for, e.g. http://petshop.bivio.biz/pub/products?p=FISH
and click on Model.ProductList and View.products to see how we handle
an automated mapping.  We find it extremely convenient.

Rob



Re: Persistent HTTP Connections ala Apache::DBI

2001-11-22 Thread Rob Nagler

 Has anyone done such a thing before?

No doubt.

 Can someone point me to docs or
 modules which could help doing this?

Perhaps raw sockets might be a consideration.  However, Apache is
great middleware, so I tend to use it in cases like this.  You might
want to use a session-based approach between the db-Apache and the
app-Apache.  The db-Apache would cache the connections to the legacy
DB, returning sessions to the app-Apache, which would cache them as well.
You'd get the performance of cached DB connections without having to
ensure the HTTP connections remain alive across app-Apache queries.

When a session times out on the db-Apache tier, just rollback
(assuming your DB is transactional) and put it in the free pool for
new sessions.

 Or is this whole idea maybe just
 plain stupid?

I don't think so.  I assume the DB connection cost is high (on the
order of seconds), in which case you need some way to cache connections.

 Are there obvious caveats I haven't thought of?

Garbage collection is an issue.  How do you know when to timeout
(rollback) queries on db-Apache?  Are the queries atomic to
app-Apache, i.e. within a single end-user HTTP request or do they span
multiple end-user requests?  (The latter is a good idea, imo.)

Regards,
Rob



RE: Cookie authentication

2001-11-16 Thread Rob Nagler

 If you happen to type in a URL, they can revive your
 session from the cookie.  Pretty nifty trick.

This would seem to be a security hole to me.  URLs appear in the logs
of the server as well as any proxy servers along the way.  If the URL
contains reusable auth info, anybody accessing any of the logs could
gain access to customer accounts.

 to prevent proxy caches from caching personalized pages
 and serving them to the wrong end-user.

If you want to ensure privacy, use:

$r->header_out('Cache-Control' => 'private');

If you want to turn off caching altogether, use:

$r->header_out(Pragma => 'no-cache');

Rob



Re: [Maybe OT] Modular design - calling pages like a subroutine with a twist.

2001-11-15 Thread Rob Nagler

 When PageA calls PageB, as soon as PageB finishes presenting 
 the form it doesn't stop but drops out the bottom and returns 
 immediately to PageA.

In bOP (http://www.bivio.net/hm/download-bOP) we use FormContext to
solve this problem.  PageB requires context and bOP knows how to
return to PageA through the saved context.  We call this unwinding.
You can nest the stack as deep as you like.  The context is saved in
the URL if PageA isn't a form, or in the called form's hidden fields,
if it is.  The entire form state is saved in the latter case.

PageB and PageA are FormModels in bOP.  If you visit our Pet Shop demo
(http://petshop.bivio.net), you'll see form context used in the
LoginForm, OrderConfirmationForm, and ShippingAddressForm.  Here's all
the business logic in our ShippingAddressForm:

    sub execute_ok {
        my($self) = @_;
        # copy the current values into the OrderForm context
        $self->put_context_fields(%{$self->internal_get});
        return;
    }

    sub internal_initialize {
        my($self) = @_;
        my($info) = {
            require_context => 1,
            version => 1,
            visible => [
                'Order.ship_to_first_name',
                'Order.ship_to_last_name',
                'EntityAddress_2.addr1',
                'EntityAddress_2.addr2',
                'EntityAddress_2.city',
                'EntityAddress_2.state',
                'EntityAddress_2.zip',
                'EntityAddress_2.country',
                'EntityPhone_2.phone',
            ],
        };
        return $self->merge_initialize_info(
            $self->SUPER::internal_initialize, $info);
    }

In this case, we get the shipping address from the user, execute_ok is
called which stuffs the forms values into the calling form's context.
The infrastructure automatically unwinds to the OrderForm with the
newly filled in values.

The OrderForm doesn't know about the ShippingAddressForm.
Technically, the ShippingAddressForm doesn't know about the OrderForm.
It only requires the calling form to have fields with the same name.

The relationship between the pages (tasks in bOP) is not specified
by the forms.  That's handled by the control logic.  If a task has a
form, it can specify the next and cancel tasks.  This way you can
reuse the business logic quite easily.  Tasks can control the use of
context.  FormModels specify whether they can accept it or not.

Hope this helps.

Rob



Re: [Maybe OT] Modular design - calling pages like a subroutine with a twist.

2001-11-15 Thread Rob Nagler

 In my opinion, trying to abstract that stuff away in a web application
 causes to more problems than it solves, especially where back buttons and
 bookmarks are concerned.

We haven't found this to be the case.  Our servers are sessionless,
so bookmarks work fine.   Back buttons aren't any more or less of a
problem.  I actually haven't heard of any problems with our sub-forms
and back buttons.  People do bookmark URLs with form context, but
that's a good thing.  It usually is the login page and they login and
it automatically restores the page which they thought they
bookmarked (which redirected to login in the first place).

 I think it's easier to take a state machine
 approach, the way CGI::MxScreen or Apache::PageKit do.

I don't think this works.  The state machine can manage states going
forward, but not backward.  Consider the problem of a Symbol Lookup on
our site (www.bivio.com).  We come into it from just about any
accounting page having to do with a stock transaction.  It's a single
task, which looks up the ticker and fills it in on the calling form.
You need to stack the state or you have to introduce N new states
(for entry from forms A, B, C, D, ...).

It did take about two years to come up with a decent implementation of
FormContext.  It's a non-trivial problem, but it can be generalized
and it solves the problem we had.

Rob



Re: [Maybe OT] Modular design - calling pages like a subroutine with a twist.

2001-11-15 Thread Rob Nagler

Perrin Harkins writes:
 breaks caused by the request model of HTTP, and that's what I was commenting
 on.  You're talking about a way to preserve data across multiple page
 requests.

FormContext maintains an HTTP call stack, which holds the parameters
(form, query, path_info) and return address (calling Task).  Tasks are
work units (server subroutines).  URIs are UI elements, which is why
we don't store them in the FormContext.

 If I understand your FormContext approach correctly, you are storing the
 state of the current application in URLs or hidden fields.  This is what we
 used at eToys as well, and I think it's a pretty common solution.

FormContext is a formal stack architecture.  The callee can reach into
the stack to get or to modify caller's form data as in the
ShippingAddressForm case.  It also handles the case of a call from a
non-form Task, e.g. if you bookmark your private home page on a site,
the LoginForm requires context so it knows where to return to after
successful authentication.  The Login task needs no knowledge of who
called it; it just returns to the Task specified in its FormContext.
If there is no FormContext, it returns to its next task specified by
the state machine.

The reason I brought up sessions is that the above mechanism wouldn't
work if there were sessions.  Sessions might time out or go away for
bookmarked pages.  FormContext survives server restarts and renaming
of the calling page's URI.

Rob



Re: http or https in URL?

2001-11-06 Thread Rob Nagler

 But how do I get the protocol, http or https.

You can check the port on $c->local_addr.  443 is https.
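A sketch under the mod_perl 1 API (scheme_of is an invented helper; it assumes $r is the Apache request object and uses Socket to unpack the local sockaddr):

```perl
use strict;
use Socket ();

# Decide http vs https from the port the connection arrived on.
sub scheme_of {
    my($r) = @_;
    my($port) = Socket::sockaddr_in($r->connection->local_addr);
    return $port == 443 ? 'https' : 'http';
}
```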

Rob







Re: Neo-Classical Transaction Processing

2001-10-29 Thread Rob Nagler

Perrin Harkins writes:
 The trouble here should be obvious: sooner or later it becomes hard to scale
 the database. You can cache the read-only data, but the read/write data
 isn't so simple.

Good point.  Fortunately, the problem isn't new.  

 Theoretically, the big players like Oracle and DB2 offer clustering
 solutions to deal with this, but they don't seem to get used very
 often.

Oracle was built on an SMP assumption.  They added clustering later.
It doesn't scale well, which is probably why you haven't heard of
people using their parallel server solutions.  I don't know much about
DB2, but I'm pretty sure it assumes shared memory.  Tandem's Non-Stop
SQL is a shared nothing architecture.  It scales well, but isn't cheap
to walk in the door.

 Other sites find ways to divide their traffic up (users 1 - n go to
 this database, n - m go to that one, etc.)

Partitioning is a great way to get scalability, if you can do it.

 However, you can usually scale up enough just by getting a bigger
 box to run your database on until you reach the reach the realm of
 Yahoo and Amazon, so this doesn't become an issue for most sites.

I agree.  This is why I think Apache/mod_perl is a great solution for
the majority of web apps.  The scaling issues supposedly being solved
by J2EE don't exist.

On another note, one of the ways to make sure your database scales
better is to keep the database as simple as possible.  I've seen a lot
of solutions which rely on stored procedures to get performance.
All this does is make the database slower and more of a bottleneck.

 But how can you actually make a shared nothing system for a commerce web
 site?  They may not be sharing local memory, but you'll need read/write
 access to the same data, which means shared locking and waiting somewhere
 along the line.

I meant "shared nothing" in the sense of multiprocessor architectures.
SMP (symmetric multiprocessing) relies on shared memory.  This is the
J2EE/E10K model.  "Shared nothing" is the Neo-Classical model.  Really
these are NUMAs (non-uniform memory architectures), because most
servers are SMPs.  Here's a classic from Stonebraker on the subject:

http://db.cs.berkeley.edu/papers/hpts85-nothing.pdf

DeWitt has a lot of papers on parallelism and distributed db design:
http://www.cs.wisc.edu/~dewitt/

Cheers,
Rob



Neo-Classical Transaction Processing (was Re: Excellent article...)

2001-10-28 Thread Rob Nagler

Joe Schaefer writes:
  experience, the only way to build large scale systems is with
  stateless, single-threaded servers.
   ^^

 Could you say some more about what you mean by this?  Do you mean
 something like

   use a functional language (like Haskell or Scheme), rather
than an imperative language (like C, Java, Perl ...),

Not exactly, but this is an interesting topic.

 or are you talking more about the application's platform and design
 (e.g. http://www.kegel.com/c10k.html )?

This article addresses path length, which is the single-threaded
part.  Scalability is not addressed.  Both parts are important to
understand when you build enterprise systems.

I changed the subject to Neo-Classical Transaction Processing which is
the way I look at web applications.  If you'll bear with me, I can
explain the Neo and Classical parts with a picture.  Here's a
classical transaction processing system:

T                     +----------+
e  +---------------+  |  Custom  |        ____
r  |               |--|  Server  |--+    /    \
m -|  Transaction  |  +----------+  |   (      )
i -|    Monitor    |  +----------+  +---|  DB  |
n -|               |--|  Custom  |--+   (      )
a  +---------------+  |  Server  |       \____/
l                     +----------+
s


Now here's a typical (large) Apache/mod_perl setup (Neo-Classical):

B                     +----------+
r  +---------------+  |  Apache  |        ____
o  |               |--| mod_perl |--+    /    \
w -|    Apache     |  +----------+  |   (      )
s -|   mod_proxy   |  +----------+  +---|  DB  |
e -|               |--|  Apache  |--+   (      )
r  +---------------+  | mod_perl |       \____/
s                     +----------+

The browsers are connected to a fast IP router(s), equivalent to
yesteryear's I/O processor.  The mod_proxy servers are simple switches,
just like the TM.  (Unlike the TM, front-ends don't manage the
transactions.)  It's usually a given that the front-ends are
stateless.  Their job is dynamically routing for load-balancing and
reliability.  They also serve icons and other stateless files.  If a
front-end crashes, the IP router ignores it and goes to another
front-end.  No harm done.  The IP router also balances the load,
something that isn't provided by classical I/O processors.

The mod_perl servers are the work horses, just like the custom
servers.  In a classical OLTP system, the custom servers are
stateless, that is, if a server goes down, the TM/mod_proxy server
routes around it.  (The TM rolls back any transactions and restarts the
entire request, which is interesting but irrelevant for this
discussion.)  If the work servers are fully loaded, you simply add
more hardware.  If all the servers are stateless, the system scales
linearly, i.e. the number of servers is directly proportional to the
number of users that can be served.

That's the stateless part.  Threading is the other issue.  Should the
servers (mod_proxy or mod_perl) be threaded?  In classical OLTPs, the
work servers are single threaded (as in one request at a time) and the
TM handles multiple simultaneous requests, but isn't multi-threaded
(in the Java sense).

The work server can be thought of as a resource unit.  Usually it
represents a fair bit of code and takes up a chunk of memory.  It can
only process so many requests per unit time.  If the work server is
multi-threaded, it is harder to manage resources and configure for
peak load.  In the single threaded model, each work server (process)
is a reservation for the resources it needs for one request.  In a
multi-threaded model, the resource reservations are less clear.  It
might have a shared database connection pool or it might have two
simultaneous requests which need more memory.  The meaning of capacity
becomes fuzzy.  If the whole multi-threaded server ever has to wait on
a single shared 

Re: Excellent article on Apache/mod_perl at eToys

2001-10-23 Thread Rob Nagler

 is easier and more standardized, and well documented. But I feel like
 coding front-end web applications is much easier in Perl where the workflow
 bits change all the time. For this, I like using SOAP on the backend Java
 server and SOAP on the front-end Perl.

I don't quite understand the difference between workflow in the front-end and
workflow in the back-end.  They both change.  The danger of making one part
of the system easier to change is that people tend to cheat.  They won't
put the business logic in the back-end if it takes twice as long.

To me, the major issue in Perl vs Java is dynamic vs static typing.  Building
large scale systems in Perl is much like building them in Smalltalk or Lisp.
It takes a certain mindset.  The lack of compiled interfaces means you need
much more discipline (e.g. unit testing).  The payoff is big with Perl, because
you can refactor more easily and quickly than in Java.

The libraries aren't much of an issue.  A good example is SOAP.  SOAP is
middleware.  It is standardized, documented, and the rest of it.  You like
it for connecting Perl to Java, but why can't it be the other way around?
If it can be the other way around, why aren't Perl and Java equally adapted
to building middleware applications?

Rob



Re: Excellent article on Apache/mod_perl at eToys

2001-10-23 Thread Rob Nagler

Gunther wrote:
 If you do not have a strongly typed system, then when you break apart and
 rebuild another part of the system, Perl may very well not complain when a
 subtle bug comes up because of the fact that it is not strongly typed.
 Whereas Java will complain quite often and usually early with compile time
 checking.

I don't think there's an objective view about this.  I also think
the "it compiles, so it works" attitude is dangerous.  You don't know
it works until your unit and acceptance tests pass.  I've been in too
many shops where the nightly build was the extent of the quality
assurance program.

 Compile time checking can definitely be a friend of yours especially when
 dealing with large systems. But it's also a friend that's judgemental
 (strongly typed) so he's a pain to drag along to a party

To me, strongly vs weakly typed is less descriptive than statically vs
dynamically typed.  For example, Java is missing undef.  It has NULL
for pointers, but not undef for ints, chars, booleans, etc.  Large
systems often have unexpected initialization order problems which are
not handled well by Java due to this missing feature.

 Java's support for multi-threading makes writing servers feel fairly
 trivial with no jumping through IPC::Shared memory stuff hoops to get
 shared memory caches and the like.. you just synchronize on global data
 structures.

It's important to define the problem space for this discussion.  I
think Perl is really good for information systems, be they enterprise
or not.  I probably wouldn't program a real-time system in Perl.  I
might program it in Java.

Here's a strong statement: Threads have no place in information
systems.  The NYSE is run on Tandem boxes.  Tandem's OS does not have
threads.  The NYSE can process over a billion stock transactions a
day.  The EJB spec says you can't fire off threads in a bean.  I think
there's a reason for the way these systems have been architected.

Threads are a false economy for systems which have to scale.  As some
people have joked, Java is Sun's way of moving E10K servers.  SMP
doesn't scale.  As soon as you outgrow your box, you are hosed.  A
shared memory cache doesn't work well over the wire.  In my
experience, the only way to build large scale systems is with
stateless, single-threaded servers.

Rob



Re: Selectively writing to the access log

2001-10-19 Thread Rob Nagler

 I only see methods for writing to the error log. 

I don't think you can change the access log format, but you can 
modify the values.  For example, you can set $c->user and $c->remote_ip.

Rob



Re: Selectively writing to the access log

2001-10-19 Thread Rob Nagler

   Usage: Apache::the_request(r)

This means the sub Apache::the_request takes a single parameter,
i.e. you can't modify the_request.

You can modify the method and uri.  You can't modify the protocol 
(HTTP/1.0).  If you change method or uri, it doesn't change the_request.
You can change your LogFormat to get these values--see
http://httpd.apache.org/docs/mod/mod_log_config.html

Rob



Re: apache::dbi vs mysql relay

2001-10-17 Thread Rob Nagler

 What I don't understand is why they separate the listener and database
 connection daemons if you always need one of each to do anything.

Probably for scalability.   The database engines are doing the work and
the sooner they can free themselves up (due to a slow client, for example),
the better.

Rob



Re: Mod_perl component based architecture

2001-10-16 Thread Rob Nagler

   As for the remaining of the question, I've been wondering for myself if
 there is a MVC (model-view-controller) framework for WWW publishing in
 Perl ? I gather there exist quite a few for Java, but I couldn't find
 anything significant under Perl.

Check out http://www.bivio.net/hm/why-bOP and http://petshop.bivio.net
The former motivates the MVC architecture.  The latter URL is a demo
of Sun's J2EE blueprint demo of a Pet Store implemented using bOP, a
perl application framework.  It's freeware and we use it to run a
large commercial website.

When you visit petshop.bivio.net, at the bottom of the page, you'll
see "Control Logic for This Page."  This is what the bOP agent
(controller) uses to determine if the incoming user can access the
page, what the page actually does (models and views), and any state
transitions (form next or cancel).  The links at the bottom of the
page go to the source of this application.

bOP also allows you to change the look-and-feel quite easily.  Compare
these two pages:

http://www.bivio.com/club_cafe/mail-msg?t=1934163
http://ic.bivio.com/club_cafe/mail-msg?t=1934163

They render the same content, but in two entirely different contexts.
Each look-and-feel is described in a single file, which contains
color, font, URL, text, and view mappings.

bOP is about 250 classes including the Pet Shop demo.  It uses Oracle
or Postgres, but it should be easy to port to other databases. You can
also build a static site, e.g. http://www.bivio.net which doesn't 
require a database.

<SOAPBOX>
The J2EE architecture implements MV, not MVC, imho.  Here's one of my
favorite quotes from Sun's site:

  It is important to understand that Model, View, and Controller are 
  usually not represented by individual classes; instead, they are 
  conceptual subdivisions  of the application. 

This is true for J2EE, but not true for MVC frameworks.  J2EE's
control flow is not a distinct element.  JSPs are usually full of
business logic.  The whole MVC concept passed J2EE by.

Even when you look at "Model 2 Methodology" (promoted by Apache
Jakarta Turbine), the code is a mess.  Here's a snippet from the
reference article on Model 2:

  public void doPost (HttpServletRequest req, HttpServletResponse res)
  throws ServletException, IOException {
      HttpSession session = req.getSession(false);
      if (session == null) {
          res.sendRedirect("http://localhost:8080/error.html");
      }
      Vector buylist = (Vector)session.getValue("shopping.shoppingcart");
      [...]
      if (!action.equals("CHECKOUT")) {
          if (action.equals("DELETE")) {
      [...]
      String url = "/jsp/shopping/EShop.jsp";
      [...]
      String url = "/jsp/shopping/Checkout.jsp";

The excerpt is from a single method in which they mix sessions, port
numbers, hosts, error pages, URLs, button values, etc.

The JSP is no better and contains lines like:

  <option>Yuan | The Guo Brothers | China | $14.95</option>
  <b>Quantity: </b><input type="text" name="qty" SIZE="3" value="1">
  <input type="submit" value="Delete">
  <input type="hidden" name="action" value="DELETE">

Note that "DELETE" in the JSP must be the same as "DELETE" in the
Java.  Nothing is checking that.  You only know that the code doesn't
work when someone hits the page.  In this particular example, if you
misspell "DELETE" in either place, the code does something, and
doesn't issue an error.

So much for Model 2.  I wonder what Model 3 will be like. ;-)

Sorry, had to get that off my chest...
</SOAPBOX>

Cheers,
Rob