Re: [Boston.pm] "cross product" searches (fwd)

Chris Staskewicz Thu, 10 Oct 2002 14:11:02 -0700

I figured out the problem I posted earlier, so I'm following up for
completeness.  If you call the subroutine "gen_substrs" like this:


@strings = ("green", "blue");
foreach $string (@strings) {
        @subs = gen_substrs($string);
        print "@subs\n";
}

The output is:
line 1  g gr gre gree green r re ... and so on ...
line 2

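Nothing on line 2!  If, in the subroutine definition, you change

my @s;

to

@s = ();

all is well.  It seems the (??{...}) block keeps referring to the "my @s"
from the first call, while a package variable @s is the same array on every
call and simply gets emptied each time.  I'm not sure this satisfies the
careful programmer (it won't pass "use strict" as written), but it works as
far as I can tell.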
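
For comparison, here is a minimal sketch of a loop-based variant that avoids
the regex code block altogether, so each call fills a fresh lexical array.
The name gen_substrs_loop is hypothetical and not from the thread; it is
meant to return the same substrings in the same order as gen_substrs.

```perl
use strict;
use warnings;

# Hypothetical loop-based stand-in for gen_substrs (name is an assumption).
# No (??{...}) block is involved, so @s is a new array on every call.
sub gen_substrs_loop {
    my ($str) = @_;
    my @s;
    my $n = length $str;
    for my $start (0 .. $n - 1) {
        # From each starting position, take every length up to the end.
        for my $len (1 .. $n - $start) {
            push @s, substr($str, $start, $len);
        }
    }
    return @s;
}

for my $string ("green", "blue") {
    my @subs = gen_substrs_loop($string);
    print "@subs\n";
}
```

This trades the regex engine for explicit loops, which is more ops but keeps
the sub usable under "use strict".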

Thanks for the help.

Chris.

 --------------------------------------------------------------------
 Chris Staskewicz
 http://www.ZyGob.com/cjs
 http://www.math.utah.edu/~cjs
 --------------------------------------------------------------------

---------- Forwarded message ----------
Date: Thu, 10 Oct 2002 12:34:56 -0600 (MDT)
From: Chris Staskewicz <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: Paul Makepeace <[EMAIL PROTECTED]>
Subject: Re: [Boston.pm] "cross product" searches

Thanks; the subroutine is very helpful for generating these substrings.
However, if I do:

@strings = ("green", "blue");

foreach $str (@strings) {
        @subs = gen_substrs($str);
}

@subs contains all the desired "green" substrings, but is empty on the next
iteration of the loop (for "blue").

Thanks again,

Chris.

On Wed, 9 Oct 2002, Spider Boardman wrote:

> I make no claims as to being optimal, but in the theory of "ops are bad,
> m'kay" (with a nod to Nick Clark), here's what I did:
>
> perl -le 'my @s; $_ = "greensleeves"; /(.+?)(??{push @s,$1})(?!)/; print "@s"'
>
> It's roughly the same algorithm, since it's inherently O(n^2) to generate
> substrings.  [For a string of length n, there are n starting positions, and
> from each position i, running from 0 to n-1, there are n-i substrings to be
> had.]  This solution does minimise the real ops in perl by letting the
> regexp engine handle the actual finding of the substrings.  On the other
> hand, the (??{code}) construct has close to eval/sub-entry overhead.  I
> don't recommend this for readability.  Its real value was in minimising my
> debugging time.
>
> Done as a subroutine (should anyone care):
>
>       sub gen_substrs ($) {
>           my $str = shift;
>           my @s;
>           $str =~ /(.+?)(??{push @s,$1})(?!)/;
>           @s;
>       }
>
> For actual enumeration, for a string of length n, the number of non-null
> contiguous substrings is:
>
>       (n^2 + n) / 2
>
> In particular, for "greensleeves", which is a 12-character string, there are
> 78 such substrings to be found.  Because of the repeated letters in the
> starting string, of course, those 78 values are not unique.
>
> Hmm, I had a point when I started.  Oh, yeah.  As long as your algorithm
> isn't of worse complexity than required by the nature of the problem, don't
> worry about it too much.  Just be sure it gives you the right answers.  If
> finding the substring lists turns out to be a bottleneck, consider using
> Memoize or something similar to avoid re-building the lists for the same
> word each time.
>
>       Hth,
>       --s.
>
> --
> Spider Boardman (at home)                   [EMAIL PROTECTED]
> The management (my cats) made me say this.    http://www.ultranet.com/~spiderb
> PGP public key fingerprint: 96 72 D2 C6 E0 92 32 89  F6 B2 C2 A0 1C AB 1F DC
>
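
The count formula and the Memoize suggestion in the quoted message can be
checked with a short sketch.  The sub substrs_of is a hypothetical loop-based
stand-in for gen_substrs, not the code from the thread; Memoize is the
standard module of that name, used here in its simplest form.

```perl
use strict;
use warnings;
use Memoize;

# Hypothetical stand-in for gen_substrs: all non-empty contiguous substrings.
sub substrs_of {
    my ($str) = @_;
    my $n = length $str;
    my @s;
    for my $i (0 .. $n - 1) {
        push @s, map { substr($str, $i, $_) } 1 .. $n - $i;
    }
    return @s;
}

# Cache results per argument so repeated words are not re-enumerated.
memoize('substrs_of');

my $word = "greensleeves";
my $n    = length $word;
my @all  = substrs_of($word);

# Count distinct substrings; repeated letters make this smaller than @all.
my %seen;
my @unique = grep { !$seen{$_}++ } @all;

printf "%d substrings, formula gives %d, %d distinct\n",
    scalar @all, ($n * $n + $n) / 2, scalar @unique;
```

Whether memoizing pays off depends on how often the same word recurs, so it
is worth measuring before relying on it.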



_______________________________________________
Boston-pm mailing list
[EMAIL PROTECTED]
http://mail.pm.org/mailman/listinfo/boston-pm
