On Saturday, July 26, 2003, at 08:58 AM, Mark Brownell wrote:


> Yes, I can see that. It will also not work for MTML, because a part of one element tag set can begin inside of another element tag set and end outside of it in MTML. This was a primary issue while fending off XML innovators several years ago when I started experimenting with it. The PNLP handler is not affected by this problem, and at this point it is still the fastest choice. Where I see an advantage is in some of MTML's multimedia handling tag sets that could be easier to script with perlRegEx.

Mark, I think you are right to look for pros and cons in each method; however, don't sell the regex method short. I don't think there is a problem with it, or anything that it's not capable of doing with the right pattern. You are judging regex based on only a few examples of a pattern match. It's certainly possible to handle nested tags and even overlapping tags: it's just a matter of crafting the right regular expression.
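
As a rough illustration of the overlapping-tag case, here is a minimal sketch in Python (not Revolution), assuming simple tags with no attributes. The function name tag_spans and the sample tag names are made up for this example. It uses one regex to find the tags and a little bookkeeping to pair them, which is one possible approach rather than a single do-everything pattern.

```python
import re

# Matches an opening tag like <b> or a closing tag like </b>.
# Attributes and self-closing tags are deliberately ignored in this sketch.
TAG = re.compile(r"<(/?)([A-Za-z][\w-]*)>")

def tag_spans(text):
    """Return (name, content) pairs, allowing tag sets to overlap."""
    open_tags = {}   # tag name -> start offsets in the tag-free text
    plain = []       # pieces of text with the tags removed
    length = 0       # length of the tag-free text collected so far
    spans = []       # (name, start, end) offsets into the tag-free text
    pos = 0
    for m in TAG.finditer(text):
        piece = text[pos:m.start()]
        plain.append(piece)
        length += len(piece)
        pos = m.end()
        closing, name = m.group(1), m.group(2)
        if not closing:
            open_tags.setdefault(name, []).append(length)
        elif open_tags.get(name):
            # Pair this close with the most recent open of the same name,
            # regardless of which other tags are still open.
            start = open_tags[name].pop()
            spans.append((name, start, length))
    plain.append(text[pos:])
    stripped = "".join(plain)
    return [(name, stripped[s:e]) for name, s, e in spans]

if __name__ == "__main__":
    print(tag_spans("one <a>two <b>three</a> four</b> five"))
```

Because each tag name keeps its own list of open positions, an element that starts inside another and ends outside it is paired correctly.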


Tuviah said the patterns are cached, so speed is not going to be an issue. That is, your regex pattern itself may be fast or slow, but calling the regex function in a loop will be fast because RR will cache the compiled regex pattern.
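
The same idea can be sketched in Python, where the compile step is explicit. This is only an analogy for the caching described above, not a description of how RR implements it; the TITLE pattern and the titles function are invented for the example.

```python
import re

# Compile once, outside the loop; the compiled pattern object is reused
# for every record, so the compile cost is paid a single time.
TITLE = re.compile(r"<title>(.*?)</title>", re.DOTALL)

def titles(records):
    """Collect the contents of the first <title> element in each record."""
    found = []
    for record in records:
        m = TITLE.search(record)
        if m:
            found.append(m.group(1))
    return found

if __name__ == "__main__":
    print(titles(["<title>A</title>", "none", "<x><title>B C</title></x>"]))
```

Python's re module also keeps an internal cache of recently compiled patterns, so even calling re.search with a pattern string inside a loop avoids recompiling each time. What remains is the cost of the match itself, which depends on the pattern.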

Lots of people have written XML parsers using regular expressions. I don't know if it's been done with RR, but it certainly has for Perl, Python and other scripting languages with regex features.

There are lots of Perl modules here; maybe you can get some ideas:
http://search.cpan.org/modlist/String_Language_Text_Processing/XML

Some of those listed will be just wrappers around C libraries like Expat or Xalan, and some will be written in pure Perl with regular expressions. In particular, I think XML::Grove and XML::Parser::Lite use regex to do their parsing. You could copy their Perl regular expression syntax for use in your project. You will find some SAX-like, DOM-like and probably some pull-parser-like stuff in that list.
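
To give a feel for what a SAX-like, regex-driven parser can look like, here is a small Python sketch. It is not taken from any of the modules above; the TOKEN pattern and the events function are assumptions made for illustration, and it ignores comments, CDATA, entities and attribute values that contain angle brackets.

```python
import re

# One alternative matches a tag (optionally closing or self-closing),
# the other matches a run of character data between tags.
TOKEN = re.compile(r"<(/?)([A-Za-z][\w.-]*)[^<>]*?(/?)>|([^<]+)")

def events(xml):
    """Yield ("start", name), ("end", name) and ("text", data) events."""
    for m in TOKEN.finditer(xml):
        closing, name, empty, text = m.groups()
        if text is not None:
            if text.strip():              # skip whitespace-only text
                yield ("text", text.strip())
        elif closing:
            yield ("end", name)
        else:
            yield ("start", name)
            if empty:                     # <br/> is a start plus an end
                yield ("end", name)

if __name__ == "__main__":
    for event in events('<doc><p id="1">Hi</p><br/></doc>'):
        print(event)
```

A caller can loop over the events and dispatch on the first item, which is roughly the shape of a SAX handler or pull parser.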

I say this partially because I don't really understand how the offset method would be used to parse XML in a general, reusable way :-)

Hope this helps,

Alex Rice, Software Developer
Architectural Research Consultants, Inc.
http://ARCplanning.com
