I figured out the problem which I posted earlier, so I'm responding for completeness. Regarding the subroutine "gen_substrs" if you do the following:
@strings = ("green", "blue"); foreach $string (@strings) { @subs = gen_substrings($string); pring "@subs\n"; } The output is: line 1 g gr gre gree green r re ... and so on ... line 2 Nothing on line 2! If in the subroutine definition, you change my @s; to @s = (); All is well. I'm not sure if this satisfies the careful programmer, but it works as far as I can tell. Thanks for the help. Chris. -------------------------------------------------------------------- Chris Staskewicz http://www.ZyGob.com/cjs http://www.math.utah.edu/~cjs -------------------------------------------------------------------- ---------- Forwarded message ---------- Date: Thu, 10 Oct 2002 12:34:56 -0600 (MDT) From: Chris Staskewicz <[EMAIL PROTECTED]> To: [EMAIL PROTECTED] Cc: Paul Makepeace <[EMAIL PROTECTED]> Subject: Re: [Boston.pm] "cross product" searches Thanks, the subroutine is very helpful for generating these substrings. However, if I do: @strings = ("green", "blue"); foreach $str (@strings) { @subs = gen_substrs($str); } @subs contains all the desired "green" substrings, but is NULL on the next iteration of the loop. Thanks again, Chris. On Wed, 9 Oct 2002, Spider Boardman wrote: > I make no claims as to being optimal, but in the theory of "ops are bad, > m'kay" (with a nod to Nick Clark), here's what I did: > > perl -le 'my @s; $_ = "greensleeves"; /(.+?)(??{push @s,$1})(?!)/; print "@s"' > > It's roughly the same algorithm, since it's inherently O(n^2) to generate > substrings. [For a string of length n, there are n starting positions, and > from each position i, running from 0 to n-1, there are n-i substrings to be > had.] This solution does minimise the real ops in perl by letting the > regexp engine handle the actual finding of the substrings. On the other > hand, the (??{code}) construct has close to eval/sub-entry overhead. I > don't recommend this for readability. Its real value was in minimising my > debugging time. > > Done as a subroutine (should anyone care): > > sub gen_substrs ($) { > my $str = shift; > my @s; > $str =~ /(.+?)(??{push @s,$1})(?!)/; > @s; > } > > For actual enumeration, for a string of length n, the number of non-null > contiguous substrings is: > > (n^2 + n) / 2 > > In particular, for "greensleeves", which is a 12-character string, there are > 78 such substrings to be found. Because of the repeated letters in the > starting string, of course, those 78 values are not unique. > > Hmm, I had a point when I started. Oh, yeah. As long as your algorithm > isn't of worse complexity than required by the nature of the problem, don't > worry about it too much. Just be sure it gives you the right answers. If > finding the substring lists turns out to be a bottleneck, consider using > Memoize or something similar to avoid re-building the lists for the same > word each time. > > Hth, > --s. > > -- > Spider Boardman (at home) [EMAIL PROTECTED] > The management (my cats) made me say this. http://www.ultranet.com/~spiderb > PGP public key fingerprint: 96 72 D2 C6 E0 92 32 89 F6 B2 C2 A0 1C AB 1F DC > -------------------------------------------------------------------- Chris Staskewicz http://www.ZyGob.com/cjs http://www.math.utah.edu/~cjs -------------------------------------------------------------------- _______________________________________________ Boston-pm mailing list [EMAIL PROTECTED] http://mail.pm.org/mailman/listinfo/boston-pm