It is not so hard to decide whether using an RE is a good idea or not.

When speed is important and every millisecond counts, an RE should be used only when there is no faster alternative, because an RE is usually slower than the core Perl/Python functions that can do the same matching and replacing.
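
This is easy to measure rather than guess. A minimal Python sketch, assuming a made-up sample string and the standard timeit module; the function names are only for illustration, and the numbers will differ from machine to machine, so no particular result is claimed here:

```python
import re
import timeit

text = "a" * 1000 + "x"  # made-up sample data


def with_in():
    # plain substring test using a core operator
    return "x" in text


def with_re():
    # the same test done through the re module
    return re.search("x", text) is not None


if __name__ == "__main__":
    for func in (with_in, with_re):
        seconds = timeit.timeit(func, number=10000)
        print(func.__name__, round(seconds, 4))
```

Both functions return the same answer, so only the timing differs.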

When speed is not such a big issue, an RE should be used only if it is easier to understand and maintain than the equivalent code using core functions. And of course, an RE should be used when the core functions cannot do what the RE can do.

In Python, the RE syntax is not as short and simple as in Perl, so using an RE even for very simple things requires longer code. The core functions may therefore be the better solution, because the RE version is almost never as easy to read as the version that uses core functions (or, for very simple REs, the two are probably about equally readable).
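
To make the comparison concrete, here is a small sketch of three simple tasks done both ways in Python. The sample string and variable names are made up for illustration; the calls themselves are standard str methods and re functions:

```python
import re

line = "   hello world"  # made-up sample input

# Strip leading white space
stripped_core = line.lstrip()
stripped_re = re.sub(r"^\s+", "", line)

# Test whether the text starts with a given word
starts_core = stripped_core.startswith("hello")
starts_re = re.match(r"hello", stripped_core) is not None

# Replace a fixed substring
replaced_core = stripped_core.replace("world", "there")
replaced_re = re.sub(r"world", "there", stripped_core)

# Both approaches give the same results for this input
print(stripped_core == stripped_re,
      starts_core == starts_re,
      replaced_core == replaced_re)
```

For tasks this simple, the str methods name the operation directly, while the re versions require reading a pattern.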

In Perl, the RE syntax is very short and simple, and in some cases code that uses an RE is easier to understand and maintain than code that uses other core functions.

For example, if somebody wants to check whether the variable $var contains the letter "x", a solution without an RE in Perl is:

if ( index( $var, 'x' ) >= 0 ) {
   print "ok";
}

while the solution with RE is:

if ( $var =~ /x/ ) {
   print "ok";
}

And it is obvious that the solution that uses an RE is shorter and easier to read and maintain, while also being much more flexible.

Of course, sometimes an even better alternative is to use a module from CPAN. Regexp::Common wraps REs in a simpler and more readable interface for matching numbers, profanity, balanced parentheses, programming-language comments, IP and MAC addresses, zip codes and so on. A module like Email::Valid can verify whether an email address is correct, which matters because it can be very hard to write an RE that matches an email address.

So... just like with Python, there is more than one way to do it, but depending on the situation, some ways are better than others. :-)

--Octavian

----- Original Message -----
From: "Chris Torek" <nos...@torek.net>
Newsgroups: comp.lang.python
To: <python-list@python.org>
Sent: Monday, June 06, 2011 10:11 AM
Subject: Re: how to avoid leading white spaces


In article <ef48ad50-da06-47a8-978a-47d6f4271...@d28g2000yqf.googlegroups.com>
ru...@yahoo.com <ru...@yahoo.com> wrote (in part):
[mass snippage]
What I mean is that I see regexes as being an extremely small,
highly restricted, domain specific language targeted specifically
at describing text patterns.  Thus they do that job better than
trying to describe patterns implicitly with Python code.

Indeed.

Kernighan has often used / supported the idea of "little languages";
see:

   http://www.princeton.edu/~hos/frs122/precis/kernighan.htm

In this case, regular expressions form a "little language" that is
quite well suited to some lexical analysis problems.  Since the
language is (modulo various concerns) targeted at the "right level",
as it were, it becomes easy (modulo various concerns :-) ) to
express the desired algorithm precisely yet concisely.

On the whole, this is a good thing.

The trick lies in knowing when it *is* the right level, and how to
use the language of REs.

On 06/03/2011 08:05 PM, Steven D'Aprano wrote:
If regexes were more readable, as proposed by Wall, that would go
a long way to reducing my suspicion of them.

"Suspicion" seems like an odd term here.

Still, it is true that something (whether it be use of re.VERBOSE,
and whitespace-and-comments, or some New and Improved Syntax) could
help.  Dense and complex REs are quite powerful, but may also contain
and hide programming mistakes.  The ability to describe what is
intended -- which may differ from what is written -- is useful.
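
A minimal sketch of the re.VERBOSE approach, where each piece of the pattern carries a comment stating what is intended. The pattern and the group names are made up for illustration, and it only handles simple decimal numbers:

```python
import re

# With re.VERBOSE, white space in the pattern is ignored and
# "#" starts a comment, so the intent can sit next to each piece.
number = re.compile(r"""
    (?P<sign>[-+])?        # optional sign
    (?P<int>\d+)           # integer part
    (?:\.(?P<frac>\d+))?   # optional fractional part
    $                      # nothing may follow
""", re.VERBOSE)

m = number.match("-12.50")
print(m.group("sign"), m.group("int"), m.group("frac"))
```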

As an interesting aside, even without the re.VERBOSE flag, one can
build complex, yet reasonably-understandable, REs in Python, by
breaking them into individual parts and giving them appropriate
names.  (This is also possible in perl, although the perl syntax
makes it less obvious, I think.)
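
A small sketch of that idea, building a dotted-quad matcher out of named pieces. The names octet, dotted_quad and ip_address are made up for illustration, and this is only one possible way to write such a pattern:

```python
import re

# Each part gets a name that says what it is meant to match.
octet = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"   # 0 through 255
dotted_quad = r"\.".join([octet] * 4)            # four octets joined by dots
ip_address = re.compile(dotted_quad)

# fullmatch requires the whole string to match the pattern
print(bool(ip_address.fullmatch("192.168.0.1")))
print(bool(ip_address.fullmatch("256.1.1.1")))
```

The final pattern is just as dense as a hand-written one, but a reader only has to understand one named part at a time.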
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W)  +1 801 277 2603
email: gmail (figure it out)      http://web.torek.net/torek/index.html



--------------------------------------------------------------------------------


--
http://mail.python.org/mailman/listinfo/python-list

