On Tue, 18 Jan 2005 01:36:48 +0000
 Alex Tweedly <[EMAIL PROTECTED]> wrote:
> Rev's RE library is based on PCRE, so should be adequately capable.
>
> However, I don't think it's as easy to parse the realistic version of CSV with REs as you might think.

Well, Alex, it's not so difficult with Perl. Suppose the items in the comma-separated list can themselves contain commas, in which case they are enclosed in quotes (the quotes are optional otherwise), as in '"a,b",c,"d"'. For lines with three items each, the Perl script to parse the list then looks like:


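#!/usr/bin/perl
use strict;
use warnings;

# Sample lines: three fields each, quoted or unquoted.
my @s = (
        '"My family, My PowerBook, My Defender 110","1","[EMAIL PROTECTED]"',
        'Scrooge,2,[EMAIL PROTECTED]',
        'RunRev List,"3,4,...","[EMAIL PROTECTED]"');
foreach (@s) {
        # Each field: optional quotes around one or more non-quote characters.
        if (/"*([^"]+)"*,"*([^"]+)"*,"*([^"]+)"*/) {
                print "$_\n\t$1\n\t$2\n\t$3\n";
        }
}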

This example gives the result:

"My family, My PowerBook, My Defender 110","1","[EMAIL PROTECTED]"
        My family, My PowerBook, My Defender 110
        1
        [EMAIL PROTECTED]
Scrooge,2,[EMAIL PROTECTED]
        Scrooge
        2
        [EMAIL PROTECTED]
RunRev List,"3,4,...","[EMAIL PROTECTED]"
        RunRev List
        3,4,...
        [EMAIL PROTECTED]

which is what you would expect.
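
One caveat: the pattern is tied to exactly three fields per line. The following is only a minimal sketch of a variant for any number of fields, under the same simple quoting rule (no quotes inside a field) and assuming no field is empty. The helper name split_fields and the second sample line are made up for illustration; they are not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): return the fields of one
# line. Assumes a field is either "quoted" (no quotes inside) or unquoted
# (no commas or quotes inside), and that no field is empty.
sub split_fields {
    my ($line) = @_;
    # In list context, /g returns both capture groups for every match; the
    # group that did not take part is undef, so it is filtered out.
    return grep { defined $_ } ($line =~ /"([^"]+)"|([^",]+)/g);
}

foreach my $line ('"a,b",c,"d"', 'one,two,three,four') {
    print "$line\n";
    print "\t$_\n" foreach split_fields($line);
}
```

Malformed input (an unbalanced quote, for instance) is not detected by this sketch; it simply returns whatever pieces it can find.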

I don't know if it works in Rev, because every implementation of REs is a bit different, and Perl has the best one I've come across. Anyway: Perl can be installed on practically every machine, and it comes pre-installed on most Unix and Linux systems and on Mac OS X. So just use the power of this language in combination with Rev, RB or whatever development tool you use, instead of trying to do everything with one tool.

I miss this flexibility in how tools are used in the IT world. Nobody in industry would use a Porsche to transport stones (except the ones worn around the neck or wherever ladies have them), and nobody would drive a fork-lift truck on a (German) Autobahn. Most of us use hands and feet for their respective purposes. So why do programmers want to use one tool for everything?

Cheers,

Thomas G.

---

For those of you who find it hard to read regular expressions (they are a good example of a write-only language):

/"*([^"]+)"*,"*([^"]+)"*,"*([^"]+)"*/

is the same group three times, separated by commas: "*([^"]+)"*

This group has a prefix and a postfix, both written "* - which means "zero or more quotes".

In the middle of the group - enclosed in parentheses - is the term to be extracted: [^"]+ - which reads: any character except a quote, at least once. If you replace the "+" with a "*", two commas are allowed to follow each other directly, i.e. a field may be empty.
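
To make the difference between "+" and "*" concrete, here is a small sketch. The sample line 'a,,c' and the variable names $strict and $loose are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = 'a,,c';   # made-up sample with an empty second field

# Original pattern: every field needs at least one character.
my $strict = qr/"*([^"]+)"*,"*([^"]+)"*,"*([^"]+)"*/;
# Variant with "*" inside the groups: a field may be empty.
my $loose  = qr/"*([^"]*)"*,"*([^"]*)"*,"*([^"]*)"*/;

print "strict: ", ($line =~ $strict ? "match" : "no match"), "\n";
if (my @f = $line =~ $loose) {
    print "loose: [", join('][', @f), "]\n";
}
```

The "+" version finds no match for this line, while the "*" version extracts three fields, the second of them empty.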

The regular expression can be shortened even more, but then it becomes completely incomprehensible, and you need more time to comment it than to write it.
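
On the point about commenting: Perl's /x modifier allows whitespace and comments inside the pattern itself, which can make the explanation above part of the code. A sketch of the same three-field pattern written that way follows. The variable names $field and $three_fields and the sample line are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One field: optional quotes around at least one non-quote character.
my $field = qr/
    "*          # prefix: zero or more quotes
    ([^"]+)     # the term to extract: one or more non-quote characters
    "*          # postfix: zero or more quotes
/x;

# The same group three times, separated by commas.
my $three_fields = qr/$field,$field,$field/;

my $line = 'RunRev List,"3,4,...","x"';
if (my @f = $line =~ $three_fields) {
    print "\t$_\n" foreach @f;
}
```

Building the pattern from one named piece keeps the repetition visible, at the price of a few more lines.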