Re: Trying not to re-invent the wheel
On Nov 14, 5:00pm, Leslie Mikesell wrote: This means that you can't easily make nested sub-pages without knowing ahead of time how they will be used, and worse, if you get an error in step 3 of generating a page you can't undo the fact that steps 1 and 2 are probably already on the user's screen. If the template language offers some flow control and logic and the ability for one 'page' to return a status plus a string containing its html to another page that includes it then you wouldn't need a different template system to separate logic from layout, you would just put them in different pages, letting the 'code' page include the layout elements it wants. You get this in Template Toolkit. Basic flow control (i.e. RETURN, STOP) and the ability to FILTER template output here, there, or wherever. You can have a Perl program which calls on the toolkit to render its output, or you can have a template which, on being processed by the toolkit, loads and calls any Perl code that it needs (via a plugin interface). It's a slippery slope that leads both up and down. :-)= A -- Andy Wardley [EMAIL PROTECTED] Signature regenerating. Please remain seated. [EMAIL PROTECTED] For a good time: http://www.kfs.org/~abw/
Re: Trying not to re-invent the wheel
On Fri, 12 Nov 1999, Jim Winstead wrote: On Nov 10, Mark Cogan wrote: At 10:10 AM 11/10/99 -0800, Ian Mahuron wrote: I may implement IF/LOOPS/etc.. but not until I see the need. Those introduce more complex problems. And they are, of course, inevitable with almost any templating system. You know, PHP was once just a templating system. I had special tags that I replaced with the output from the business logic I wrote in C in order to avoid needing to recompile my code just to tweak the HTML. Then I figured it would be a good idea to add stuff like IF/LOOPS/etc so I could manipulate my tags a little bit. Now, 5 years later, people are writing template systems that sit on top of PHP because they are writing business logic in PHP which means yet another template system is needed to separate code from layout. I wonder how many layers of templates we will have 5 years from now. It's a slippery slope... -Rasmus
Re: Trying not to re-invent the wheel
According to Rasmus Lerdorf: Those introduce more complex problems. And they are, of course, inevitable with almost any templating system. You know, PHP was once just a templating system. [...] Then I figured it would be a good idea to add stuff like IF/LOOPS/etc so I could manipulate my tags a little bit. Now, 5 years later, people are writing template systems that sit on top of PHP because they are writing business logic in PHP which means yet another template system is needed to separate code from layout. I wonder how many layers of templates we will have 5 years from now. I think a lot of unnecessary complexity comes from the fact that most of the template systems (and Apache modules in general) want to output the html as a side effect instead of accumulating the page in a buffer or just returning a string containing the html plus a status value to the caller. This means that you can't easily make nested sub-pages without knowing ahead of time how they will be used, and worse, if you get an error in step 3 of generating a page you can't undo the fact that steps 1 and 2 are probably already on the user's screen. If the template language offers some flow control and logic and the ability for one 'page' to return a status plus a string containing its html to another page that includes it then you wouldn't need a different template system to separate logic from layout, you would just put them in different pages, letting the 'code' page include the layout elements it wants. Les Mikesell [EMAIL PROTECTED]
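The idea of a 'page' returning a status plus its html, rather than printing as a side effect, can be sketched in plain Perl. This is only an illustration under stated assumptions: the status constants and the sub names (header_page, body_page, render_page, respond) are invented for the sketch and do not come from any template system mentioned in this thread.

```perl
use strict;
use warnings;

# Hypothetical status codes; the names are made up for this sketch.
use constant { OK => 0, FAILED => 1 };

# Each "page" returns a status and its html instead of printing anything.
sub header_page { return (OK, "<h1>News</h1>\n") }

sub body_page {
    my (%args) = @_;
    return (FAILED, '') unless defined $args{story};
    return (OK, "<p>$args{story}</p>\n");
}

# A page that includes other pages: collect their output in a buffer
# and give up as soon as any of them reports a failure.
sub render_page {
    my (@steps) = @_;
    my $buffer = '';
    for my $step (@steps) {
        my ($status, $html) = $step->();
        return ($status, '') if $status != OK;
        $buffer .= $html;
    }
    return (OK, $buffer);
}

# Only the outermost caller decides what is sent to the client, so a
# failure in a late step never leaves half a page on the user's screen.
sub respond {
    my ($status, $html) = render_page(@_);
    return $status == OK ? $html : "<p>Sorry, something went wrong.</p>\n";
}
```

Calling respond(\&header_page, sub { body_page(story => 'Hello') }) yields the whole page, while a failing step yields only the error text. The cost of this design is that nothing is sent until the last step has finished.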
Re: Trying not to re-invent the wheel
According to Rasmus Lerdorf: Those introduce more complex problems. And they are, of course, inevitable with almost any templating system. You know, PHP was once just a templating system. [...] Then I figured it would be a good idea to add stuff like IF/LOOPS/etc so I could manipulate my tags a little bit. Now, 5 years later, people are writing template systems that sit on top of PHP because they are writing business logic in PHP which means yet another template system is needed to separate code from layout. I wonder how many layers of templates we will have 5 years from now. I think a lot of unnecessary complexity comes from the fact that most of the template systems (and Apache modules in general) want to output the html as a side effect instead of accumulating the page in a buffer or just returning a string containing the html plus a status value to the caller. This means that you can't easily make nested sub-pages without knowing ahead of time how they will be used, and worse, if you get an error in step 3 of generating a page you can't undo the fact that steps 1 and 2 are probably already on the user's screen. If the template language offers some flow control and logic and the ability for one 'page' to return a status plus a string containing its html to another page that includes it then you wouldn't need a different template system to separate logic from layout, you would just put them in different pages, letting the 'code' page include the layout elements it wants. The IO Layering in Apache2 would be a good place to do this. It really shouldn't be a module's job to buffer its output. A module should do its thing and feed its output back to the web server. The web server should then determine if any other modules need to massage the data or whether it is time to feed it to the socket. -Rasmus
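The division of labour described here (a module produces output, the server decides which other modules massage it and when it goes to the socket) can be sketched as a simple filter chain. This is not the Apache 2 API; every name below (content_module, uppercase_filter, footer_filter, serve) is hypothetical, and the "socket" is just any filehandle.

```perl
use strict;
use warnings;

# A content "module": it produces output and hands it back, with no
# buffering or output logic of its own.
sub content_module { return "hello, world\n" }

# Two hypothetical filter modules that massage data passed through them.
sub uppercase_filter { my ($data) = @_; return uc $data }
sub footer_filter    { my ($data) = @_; return $data . "-- served by sketch --\n" }

# The "server": runs the generator, pushes its output through each
# registered filter in order, and only then writes to the socket.
sub serve {
    my ($socket, $generator, @filters) = @_;
    my $data = $generator->();
    $data = $_->($data) for @filters;
    print {$socket} $data;
    return length $data;
}
```

A call such as serve(\*STDOUT, \&content_module, \&uppercase_filter, \&footer_filter) writes the filtered text to standard output. A real server would stream data in chunks rather than hold the whole response in one string.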
RE: Trying not to re-invent the wheel
I've written up a few test benches for HTML::Parser.. it works ok, but it's not as fast as I would like it to be. Is there some reason you don't just use HTML::Mason? Patient: My tooth aches. Doctor: Is there some reason you haven't replaced your teeth with dentures? -sam
RE: Trying not to re-invent the wheel
I've decided to go with Mason.
Re: Trying not to re-invent the wheel
On Nov 10, Mark Cogan wrote: At 10:10 AM 11/10/99 -0800, Ian Mahuron wrote: I may implement IF/LOOPS/etc.. but not until I see the need. Those introduce more complex problems. And they are, of course, inevitable with almost any templating system. Jim
Re: Trying not to re-invent the wheel
"Joshua" == Joshua Chamas [EMAIL PROTECTED] writes: Joshua What Matt brought up is right. Eewww. Whenever I read "matt" and "w(right)" in the same sentence, all sorts of alarms go off. Sorry. :-) -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 [EMAIL PROTECTED] URL:http://www.stonehenge.com/merlyn/ Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
Re: Trying not to re-invent the wheel
On Wed, 10 Nov 1999, Ian Mahuron wrote: The code in HTML::Template may work.. though it seems that it would be very slow. Actually, I like to think it's some pretty fast code... Of course, that's because it's only looking for TMPL_* tags, and it's allowing them to break all kinds of HTML laws. Thus, it's not really usable for "HTML parsing" at all! I think you need to give us a little more information about what you're trying to do with the HTML. My knee-jerk reaction is to say "You need HTML parsing? Use HTML::Parser!" but perhaps that's not what you really need? -sam
RE: Trying not to re-invent the wheel
As per someone's suggestion I'll elaborate on what's in the HTML... Insert code for advertisement (there are 1,000s of different ads on the site): <ADVERTISMENT id=252> Insert news scroller: <NEWS_ITEM id=92834 bgcolor="#0066FF"> There will be at least 50 similar tags.. so I'm not parsing for just a couple of tags like HTML::Template.. I may implement IF/LOOPS/etc.. but not until I see the need. I've written up a few test benches for HTML::Parser.. it works ok, but it's not as fast as I would like it to be. Ian
RE: Trying not to re-invent the wheel
I found that writing my own parser to fit my specific need was far and away the fastest thing I could do. It really depends upon your specific application. HTML::Parser is nice if you want to see the structure of the document you're parsing but is just too slow to use for wresting particular tags from a document... If you're interested, I could forward you the code snippet I wrote, as it is part of a package called absent for which we've obtained a software release from ATT (see http://www.research.att.com/projects/absent/ for more). Regards, Christian -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Ian Mahuron Sent: Wednesday, November 10, 1999 1:10 PM To: Ian Mahuron; ModPerl Subject: RE: Trying not to re-invent the wheel As per someone's suggestion I'll elaborate on what's in the HTML... Insert code for advertisement (there are 1,000s of different ads on the site): <ADVERTISMENT id=252> Insert news scroller: <NEWS_ITEM id=92834 bgcolor="#0066FF"> There will be at least 50 similar tags.. so I'm not parsing for just a couple of tags like HTML::Template.. I may implement IF/LOOPS/etc.. but not until I see the need. I've written up a few test benches for HTML::Parser.. it works ok, but it's not as fast as I would like it to be. Ian
Re: Trying not to re-invent the wheel
There will be at least 50 similar tags.. so I'm not parsing for just a couple of tags like HTML::Template.. I may implement IF/LOOPS/etc.. but not until I see the need. It might be too late to do this, but what if you convert everything to one tag? I can better explain by example: Instead of <ADVERTISMENT id=252> use <INSERT type=ad id=252>. Instead of <NEWS_ITEM id=92834 bgcolor="#0066FF"> use <INSERT type=news id=92834 bgcolor="#0066FF">. That way, you can use a logic very similar to what I assume is in HTML::Template -- I've never used it -- and you get the IF/loops/etc logic for free. ELB -- Eric L. Brine | Chicken: The egg's way of making more eggs. [EMAIL PROTECTED] | Do you always hit the nail on the thumb? ICQ# 4629314 | An optimist thinks thorn bushes have roses.
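The single-tag suggestion can be sketched with a regular expression and a dispatch table keyed on the type attribute. This is a rough illustration, not how HTML::Template works internally. It assumes the tags are written with angle brackets like ordinary HTML tags and that attribute values never contain ">". The handler table and the subs parse_attrs, expand_tag and expand_inserts are hypothetical names, and the handlers return placeholder text instead of real ad or news markup.

```perl
use strict;
use warnings;

# Hypothetical handlers, keyed by the "type" attribute of the INSERT tag.
my %handler = (
    ad   => sub { my (%a) = @_; return "[ad $a{id}]" },
    news => sub { my (%a) = @_; return "[news $a{id} on $a{bgcolor}]" },
);

# Parse attributes of the form name=value or name="value".
sub parse_attrs {
    my ($text) = @_;
    my %attr;
    while ($text =~ /(\w+)\s*=\s*(?:"([^"]*)"|([^\s>]+))/g) {
        $attr{lc $1} = defined $2 ? $2 : $3;
    }
    return %attr;
}

# Expand one tag: look up the handler for its type, or return the tag
# unchanged when the type is missing or unknown.
sub expand_tag {
    my ($tag, $attr_text) = @_;
    my %attr = parse_attrs($attr_text);
    my $h = $handler{ defined $attr{type} ? $attr{type} : '' };
    return $h ? $h->(%attr) : $tag;
}

# Replace every <INSERT ...> tag in a document with its handler's output.
sub expand_inserts {
    my ($html) = @_;
    $html =~ s{(<INSERT\b([^>]*)>)}{ expand_tag($1, $2) }gei;
    return $html;
}
```

Adding a 51st kind of insert then means adding one entry to the table; the scanning code does not change.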
RE: Trying not to re-invent the wheel
I don't know if you have to stick to the tags as described below, but if you don't, you may want to take a look at a custom Apache::SSI subclass, which can do all this stuff for you with no Perl-based HTML parsing involved: <!--#ADVERTISMENT id=252 --> Tobias At 10:10 AM 11/10/99 -0800, Ian Mahuron wrote: As per someone's suggestion I'll elaborate on what's in the HTML... Insert code for advertisement (there are 1,000s of different ads on the site): <ADVERTISMENT id=252> Insert news scroller: <NEWS_ITEM id=92834 bgcolor="#0066FF"> There will be at least 50 similar tags.. so I'm not parsing for just a couple of tags like HTML::Template.. I may implement IF/LOOPS/etc.. but not until I see the need. I've written up a few test benches for HTML::Parser.. it works ok, but it's not as fast as I would like it to be. Ian
RE: Trying not to re-invent the wheel
I believe there are a couple of HTML parsers out there. In the Perl News email sent out a couple of days ago there was one that caught my eye. I believe it was an XS implementation so it should be very fast. I'm not an XML expert but you might want to try the XML parser. It's also a Perl front end for a C parser. On 10-Nov-99 Ian Mahuron wrote: As per someone's suggestion I'll elaborate on what's in the HTML... Insert code for advertisement (there are 1,000s of different ads on the site): <ADVERTISMENT id=252> Insert news scroller: <NEWS_ITEM id=92834 bgcolor="#0066FF"> There will be at least 50 similar tags.. so I'm not parsing for just a couple of tags like HTML::Template.. I may implement IF/LOOPS/etc.. but not until I see the need. I've written up a few test benches for HTML::Parser.. it works ok, but it's not as fast as I would like it to be. Ian --- Jason Bodnar + [EMAIL PROTECTED] + Tivoli Systems That boy wouldn't know the difference between the Internet and a hair net. -- Jason Bodnar
Re: Trying not to re-invent the wheel
"Christian Gilmore" [EMAIL PROTECTED] writes: I found that writing my own parser to fit my specific need was far and away the fastest thing I could do. It really depends upon your specific application. HTML::Parser is nice if you want to see the structure of the document you're parsing but is just too slow to use for wresting particular tags from a document... True. This was the main reason I started work on a new XS based HTML::Parser a week ago. It should make much of the performance argument go away. Still, most of the HTML that I have ever needed to parse or manipulate is regular enough to make perl REs good enough. Since HTML::Parser is XS based now I'm also able to offer many more features without suffering performance. I have attached a message I sent to the [EMAIL PROTECTED] mailing list today describing what's new. Regards, Gisle I am now up to version 2.99_08 of the new HTML::Parser and I think it comes along nicely. As you might guess from the version number I am aiming for version 3.00 when I think it is ready for general use. I still encourage people to download it and test it out on various platforms (at least check that 'make test' says everything is ok). You can get it from: $CPAN/authors/id/GAAS/HTML-Parser-XS-2.99_08.tar.gz Compatibility with HTML-Parser-2.2x is now perfect as far as I can tell. The interfaces to all new features I still reserve the right to change until 3.00-time. There is still no documentation on the new things, but the following text attempts to explain most of them: The main new feature is that instead of making a subclass you can just provide callbacks to be invoked when various elements are recognised. When one or more direct callbacks are provided, then no methods will be called. There is a new 'default' callback that is invoked with the text of everything that there is no other callback registered for.
This might for instance be used to implement a simple comment stripper by code like this:

    HTML::Parser->new(comment => sub {},   # ignore
                      default => sub { print $_[0] },
                     )->parse_file(shift);

(I actually thought I was very clever when I realized how handy this would be, but later found out that XML::Parser already had exactly this feature. :-) Text handlers get an extra argument that is true if entities are already expanded in the text string passed. This was needed to handle script, style, xmp, plaintext correctly and in a way that was backwards compatible. There is also a boolean parser attribute called $p->decode_text_entities that can be set to let the parser always internally decode entities (so _you_ can ignore the issue). There is a new boolean parser attribute called $p->keep_case that when set to a true value suppresses downcasing of tag and attribute names. There is a new boolean parser attribute called $p->xml_mode that makes the parser recognise XML's empty tags, makes processing instructions be terminated by "?>" (instead of ">"), and implies $p->keep_case. This should be enough to parse some simple XML documents. There is a new parser attribute called $p->bool_attr_val that can be set to influence the value set for boolean HTML attributes. If you don't set this value they will (as before) take the attribute key as value. There is a new parser attribute called $p->accum. It takes an array reference as its value. If set, then all parsed stuff will be accumulated here in the style of HTML::TokeParser. No callbacks will be invoked. (HTML::TokeParser is in fact implemented based on this now.) HTML::Entities::decode is now implemented by XS code. That makes it a few times faster. Other things I am thinking about supporting (soon?):
- keep track of byte counts and line numbers.
- an attribute that makes the parser never break text, i.e. that you can never get two 'text' callbacks in a row. This will have to delay text callbacks until some other element is recognised.
- attributes that control what will enter the 'accum' array
- report byte positions within the start tag where the attributes and their values live. This should be handy when all you want to do is remove/add or change some values while keeping everything else unchanged.
- parsing of marked sections; e.g. "<![CDATA[ ... ]]>"
- utf8 text (affects what bytes entities are expanded into as well as the range of numeric entities that will be expanded.)
Is there anything else anybody has wished for? Regards, Gisle
Re: Trying not to re-invent the wheel
At 18:53 12/11/1999 -0500, Rasmus Lerdorf wrote: You know, PHP was once just a templating system. I had special tags that I replaced with the output from the business logic I wrote in C in order to avoid needing to recompile my code just to tweak the HTML. Then I figured it would be a good idea to add stuff like IF/LOOPS/etc so I could manipulate my tags a little bit. Now, 5 years later, people are writing template systems that sit on top of PHP because they are writing business logic in PHP which means yet another template system is needed to separate code from layout. Interesting anecdote :-) I wonder how many layers of templates we will have 5 years from now. It's a slippery slope... Well, right now I am writing a templating system (well it's more of an XML/XHTML/HTML + XSL/CSS system, but still...) that is meant to publish static pages in various ways (one of the main goals is to separate layout from structure using simple and mostly CSS rules while still being able to send the HTML to browsers that don't understand CSS), but especially so that "live" components of any kind (SSI, Mason, ::ASP, Embperl, whatever) can be embedded easily in the output. And it has hooks so that the original template doesn't have to be a file but can be pretty much anything, including Perl code returning a template (the said Perl code is of course free to use a templating system, and it is likely that it will). That means templates generating templates, possibly including other templates (at least). Hmmm. .Robin "What I like about deadlines is the lovely whooshing sound they make as they rush past." --Douglas Adams