Re: regex, 1 off...

2007-12-17 Thread Todd


This seems related to a more general problem: `Given 2 sequences,
find the longest common subsequence'. Many algorithm books cover it.
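
For what it's worth, here is a minimal sketch of the textbook dynamic
programming approach to that problem in plain Perl. The sub name lcs()
and the sample strings are only illustrative, and this is one common
way to do it, not the only one.

```perl
use strict;
use warnings;

# Return one longest common subsequence of two strings.
sub lcs {
    my ($s, $t) = @_;
    my @a = split //, $s;
    my @b = split //, $t;
    my ($m, $n) = (scalar @a, scalar @b);

    # $len[$i][$j] holds the LCS length of the first $i characters
    # of $s and the first $j characters of $t.
    my @len = map { [ (0) x ($n + 1) ] } 0 .. $m;
    for my $i (1 .. $m) {
        for my $j (1 .. $n) {
            if ($a[$i - 1] eq $b[$j - 1]) {
                $len[$i][$j] = $len[$i - 1][$j - 1] + 1;
            } else {
                my ($up, $left) = ($len[$i - 1][$j], $len[$i][$j - 1]);
                $len[$i][$j] = $up >= $left ? $up : $left;
            }
        }
    }

    # Walk back from the bottom-right corner to recover the characters.
    my ($i, $j) = ($m, $n);
    my @out;
    while ($i > 0 && $j > 0) {
        if ($a[$i - 1] eq $b[$j - 1]) {
            unshift @out, $a[$i - 1];
            $i--;
            $j--;
        } elsif ($len[$i - 1][$j] >= $len[$i][$j - 1]) {
            $i--;
        } else {
            $j--;
        }
    }
    return join '', @out;
}

print lcs('abc123', 'avc123'), "\n";    # prints "ac123"
```

A long common subsequence relative to the length of the search string
suggests a near match: here 5 of the 6 characters survive.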

-Todd


-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/




Re: regex, 1 off...

2007-12-17 Thread Todd

 However much depends on the actual data and the variations that you are
 expecting.

 If you are searching for words like those used in the English language
 then you may want to look at how spell checking software works.

This seems related to an algorithm like `find the longest common
subsequence of 2 given sequences'. Many algorithm books cover it.

-Todd






Re: regex, 1 off...

2007-12-17 Thread Jay Savage
On Dec 16, 2007 2:21 PM, namotco [EMAIL PROTECTED] wrote:
 Let's say I want to search some text for abc123.  However, we know
 people can make typos and so they could have entered avc123 or abc223
 or sbc123 or bc123 many other combinations...
 So I want to search for those possibilities as well.  So how would I
 go about creating the proper regex?

 Thanks!

How do you define a typo? How do you know whether it's a typo, or a
different string? Do you know, for instance, that only 'abc\d\d\d' is
valid, and 'avc\d\d\d' is never valid? If so, you could do something
like:

 if (/^abc\d\d\d$/ or s/^a.c(\d\d\d)$/abc$1/) {
     # match! (a wrong middle letter in $_ has been corrected to 'b')
 } else {
     # no match!
 }

If you can't predict the input, though, you'll need some heavy duty
algorithmic logic. Take a look through CPAN and see if there isn't
something that meets your needs. String::Approx and
String::KeyboardDistance might be places to start. There are also a
number of things in the Text::* tree.

HTH,

-- jay
--
This email and attachment(s): [  ] blogable; [ x ] ask first; [  ]
private and confidential

daggerquill [at] gmail [dot] com
http://www.tuaw.com  http://www.downloadsquad.com  http://www.engatiki.org

values of β will give rise to dom!


Re: regex, 1 off...

2007-12-17 Thread Rob Dixon

namotco wrote:

Let's say I want to search some text for abc123.  However, we know 
people can make typos and so they could have entered avc123 or abc223 or 
sbc123 or bc123 many other combinations...
So I want to search for those possibilities as well.  So how would I go 
about creating the proper regex?


I don't think a regex is appropriate in this case, but if you want to
write something that guesses at what a misspelled string should have
been then search the Web for Damerau-Levenshtein distance, which is very
effective and the algorithm codes up fairly simply.
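
As a rough illustration, the sketch below computes the restricted form
of that distance (often called "optimal string alignment"), which
counts single-character insertions, deletions, substitutions and swaps
of adjacent characters. The sub name osa_distance() and the threshold
of 1 are assumptions made for this example.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Restricted Damerau-Levenshtein (optimal string alignment) distance.
sub osa_distance {
    my ($s, $t) = @_;
    my @a = split //, $s;
    my @b = split //, $t;
    my ($m, $n) = (scalar @a, scalar @b);

    # $d[$i][$j] is the distance between the first $i characters
    # of $s and the first $j characters of $t.
    my @d;
    $d[$_][0] = $_ for 0 .. $m;
    $d[0][$_] = $_ for 0 .. $n;

    for my $i (1 .. $m) {
        for my $j (1 .. $n) {
            my $cost = $a[$i - 1] eq $b[$j - 1] ? 0 : 1;
            $d[$i][$j] = min(
                $d[$i - 1][$j] + 1,             # deletion
                $d[$i][$j - 1] + 1,             # insertion
                $d[$i - 1][$j - 1] + $cost,     # substitution or match
            );
            # Two adjacent characters swapped counts as one edit.
            if (   $i > 1 && $j > 1
                && $a[$i - 1] eq $b[$j - 2]
                && $a[$i - 2] eq $b[$j - 1])
            {
                $d[$i][$j] = min($d[$i][$j], $d[$i - 2][$j - 2] + 1);
            }
        }
    }
    return $d[$m][$n];
}

# Keep only the candidates that are at most one edit away from the target.
my @close = grep { osa_distance('abc123', $_) <= 1 }
            qw(avc123 abc223 sbc123 bc123 xyz789);
print "@close\n";    # prints "avc123 abc223 sbc123 bc123"
```

Raising the threshold accepts sloppier input at the cost of more false
positives, so the right value depends on the data.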

Rob





regex, 1 off...

2007-12-16 Thread namotco
Let's say I want to search some text for abc123.  However, we know
people can make typos and so they could have entered avc123 or abc223
or sbc123 or bc123 or many other combinations...
So I want to search for those possibilities as well.  So how would I
go about creating the proper regex?


Thanks!





Re: regex, 1 off...

2007-12-16 Thread John W . Krahn
On Sunday 16 December 2007 11:21, namotco wrote:

 Let's say I want to search some text for abc123.  However, we know
 people can make typos and so they could have entered avc123 or abc223
 or sbc123 or bc123 many other combinations...
 So I want to search for those possibilities as well.  So how would I
 go about creating the proper regex?

Regular expressions are about matching patterns so you have to define 
what kind of pattern you are searching for.

From your example you may want something like:

/ \b (?: .?bc | a.?c | ab.? ) (?: .23 | 1.3 | 12. ) \b /x
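
A pattern in this spirit can also be generated from the search string
instead of being written by hand. The sketch below is only one way to
do that: the sub name one_off_regex() is made up for the example, it
allows a single typo in the whole string (the pattern above allows one
in each half), and it assumes that a mistyped character is a word
character (\w).

```perl
use strict;
use warnings;

# Build a regex that matches $target exactly, or with one character
# substituted, deleted, or inserted.
sub one_off_regex {
    my ($target) = @_;
    my $n    = length $target;
    my @alts = (quotemeta($target));            # the exact string

    for my $i (0 .. $n) {
        my $head = quotemeta(substr($target, 0, $i));
        my $tail = quotemeta(substr($target, $i));
        push @alts, $head . '\w' . $tail;       # extra character at gap $i

        next if $i == $n;
        my $rest = quotemeta(substr($target, $i + 1));
        push @alts, $head . '\w?' . $rest;      # character $i wrong or missing
    }

    my $alt = join '|', @alts;
    return qr/\b(?:$alt)\b/;
}

my $re = one_off_regex('abc123');
for my $word (qw(abc123 avc123 abc223 sbc123 bc123 avc223)) {
    printf "%-7s %s\n", $word, ($word =~ $re ? 'match' : 'no match');
}
```

Of the sample words only avc223 is rejected, because it contains two
typos. The alternation grows with the length of the search string, so
this is practical for short keys only.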

However much depends on the actual data and the variations that you are 
expecting.

If you are searching for words like those used in the English language 
then you may want to look at how spell checking software works.



John
-- 
use Perl;
program
fulfillment
