On Thu, Apr 03, 2003 at 07:30:10AM -0700, Luke Palmer wrote:
> > just an aside, and a bit off-topic, but has anybody considered
> > hijacking the regular expression engine in perl6 and turning it into
> > its opposite, namely making *productions* of strings/sounds/whatever
> > that could possibly match the regular expression? ie:
> > 
> > a*
> > 
> > producing
> > 
> > ''
> > a
> > aa
> > aaa
> > aaaa
> > 
> > etc.
> > 
> > I could think of lots of uses for this:
> > 
> >     1) test data for databases and programs.
> >     2) input for genetic algorithms/code generators
> >     3) semantic rules/production of random sentences.
> > 
> > In fact, I foresee a combo of 2,3 and some expert system somewhere
> > producing the first sentient perl program. ;-)
> 
> Yeah, it seems like a neat idea.  It is if you generate it
> right... but, fact is, you probably won't.  For anything that's more
> complex than your /a*/ example, it breaks down (well, mostly):
> 
>     /\w+: \d+/
> 
> Would most likely generate:
> 
>     a: 0
>     a: 00
>     a: 000
>     a: 0000
> 
> Or:
> 
>     a: 0
>     a: 1
>     ...
>     a: 9
>     a: 00
>     a: 01
> 
> ad infinitum, never getting to even aa: .*

But that's the point: I don't want it only to generate every possibility. I want it
to generate a useful subset of the valid possibilities, and to have:

a) a default heuristic for doing so, based on a regex
b) user defined heuristics for doing so
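
A rough sketch of how (a) and (b) might fit together, in Python only because the
Perl 6 syntax is still unsettled. Everything here is an assumption made for
illustration: the tuple-based pattern tree, the names sample, default_repeats and
fixed_three, and the narrowed character classes. It is not a real regex parser.

```python
import random
import re

# Hypothetical mini pattern tree (not a real regex parser):
#   ("lit", "x")          a literal string
#   ("cls", "abc")        one character from the given set
#   ("seq", [node, ...])  concatenation
#   ("plus", node)        one or more repetitions

def default_repeats(rng):
    """Default heuristic: short runs are likely, long runs are rare."""
    n = 1
    while rng.random() < 0.5:
        n += 1
    return n

def sample(node, rng, repeats=default_repeats):
    """Produce one random string matched by the pattern tree."""
    kind, arg = node
    if kind == "lit":
        return arg
    if kind == "cls":
        return rng.choice(arg)
    if kind == "seq":
        return "".join(sample(child, rng, repeats) for child in arg)
    if kind == "plus":
        return "".join(sample(arg, rng, repeats) for _ in range(repeats(rng)))
    raise ValueError(f"unknown node kind: {kind}")

# Roughly /\w+: \d+/, with \w narrowed to [a-f] to keep the output readable.
PATTERN = ("seq", [("plus", ("cls", "abcdef")),
                   ("lit", ": "),
                   ("plus", ("cls", "0123456789"))])

def fixed_three(rng):
    """A user-defined heuristic: every repetition runs exactly three times."""
    return 3

rng = random.Random(2003)
samples = [sample(PATTERN, rng) for _ in range(5)]
custom = [sample(PATTERN, rng, repeats=fixed_three) for _ in range(5)]
```

The heuristic is just a function passed in, so swapping the default for a
user-defined one changes which subset of valid strings comes out without touching
the pattern itself.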

I also disagree with the idea that it has no uses as is, generating all possible
combinations. You could do:

my @list is Regex::Generator(/([1-6])([1-6^\1])([1-6^\1\2])/);

to return a list of every sequence of three distinct numbers between 1 and 6, and:

my @words = qw( word list number one );
my @words2 = qw( word list number two );

my @list is Regex::Generator(/ (@words) (@words2) /);

to generate all possible combinations of words. You could also test hard-to-understand
rexen by simplifying them and generating everything the simplified version matches:

my $_doublestring = q$(?:\"(?>[^\\\"]+|\\\.)*\")$; 

becomes

my $_doublestring = q$(?:\"(?>[notdq]+|\\\")*\")$; 

to generate:

""
"n"
"o"
"t"
...
"\""
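
The three exhaustive cases above can be approximated today without a regex engine.
The sketch below is in Python and hard-codes each pattern's meaning rather than
parsing it; the names digit_triples, word_pairs and quoted_strings, and the
max_tokens bound that keeps the last list finite, are assumptions made for
illustration.

```python
import itertools
import re

# Three distinct digits between 1 and 6, the intent of the [1-6^\1] pattern.
digit_triples = ["".join(p) for p in itertools.permutations("123456", 3)]

# Every pairing of one word from each list.
words = ["word", "list", "number", "one"]
words2 = ["word", "list", "number", "two"]
word_pairs = list(itertools.product(words, words2))

def quoted_strings(max_tokens):
    """Yield every string of the simplified double-quote pattern whose body
    has at most max_tokens tokens, shortest bodies first."""
    tokens = ["n", "o", "t", "d", "q", '\\"']  # last token: backslash + quote
    for n in range(max_tokens + 1):
        for body in itertools.product(tokens, repeat=n):
            yield '"' + "".join(body) + '"'

strings = list(quoted_strings(2))

# Sanity check against the simplified pattern (the atomic group is written as
# a plain non-capturing group so it runs on older Python versions too).
check = re.compile(r'"(?:[notdq]+|\\")*"')
all_valid = all(check.fullmatch(s) for s in strings)
```

Enumerating by body length means the empty string comes first and the escaped
quote shows up early, instead of after an unbounded run of "n"s.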

> 
> But I guess then you'd see a lot more quantifiers and such.
> 
>     /\w+<8>: \d<4>/

or replacing \w with something more manageable like [a-f], and \d with [1-2].

> Is finite (albeit there are 63**8 * 10**4 == 2,481,557,802,675,210,000
> combinations).  References to the heat death of the universe, anyone?
> 
> And then there's Unicode. %-/
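
For scale, here is the same count with the narrower classes [a-f] and [1-2], as a
Python sketch. It assumes exactly 8 word characters and 4 digits, matching the
arithmetic quoted above; WORD, DIGIT and productions are names made up for the
example.

```python
import itertools

WORD = "abcdef"   # stands in for \w, narrowed to [a-f]
DIGIT = "12"      # stands in for \d, narrowed to [1-2]

def productions(word_len=8, digit_len=4):
    """Lazily yield every 'wwwwwwww: dddd' string over the narrowed classes."""
    for w in itertools.product(WORD, repeat=word_len):
        for d in itertools.product(DIGIT, repeat=digit_len):
            yield "".join(w) + ": " + "".join(d)

# Size of the full set: 6**8 * 2**4 == 26,873,856.
total = len(WORD) ** 8 * len(DIGIT) ** 4

# The generator is lazy, so taking a few items never builds the whole list.
first_three = list(itertools.islice(productions(), 3))
```

About 27 million strings is large but enumerable, and because the generator is
lazy a caller can stop whenever it has enough.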

> In reality, I don't think it would be that useful.  Theoretically,
> though, you *can* look inside the regex parse tree and create a
> generator out of it... so, some module, somewhere.

Of course, it would need a little elbow grease to be truly useful. The syntax for
making heuristics in generating useful productions would take some work. But I can
think of a dozen uses for it.

Ex: Right now, I'm writing a generator to generate sample programming problems for a
book I'm writing. It spits out both the problem and the code to answer the problem.
With a production engine like the one above, this problem generator becomes 20
lines of code.

Ed

