On 23 Dec 2006 04:23:09 -0800, Chris Johnson <[EMAIL PROTECTED]> wrote: > > BJörn Lindqvist wrote: > > With regexps you can search for strings matching it. For example, > > given the regexp: "foobar\d\d\d". "foobar123" would match. I want to > > do the reverse, from a regexp generate all strings that could match > > it. > > > > The regexp: "[A-Z]{3}\d{3}" should generate the strings "AAA000", > > "AAA001", "AAA002" ... "AAB000", "AAB001" ... "ZZZ999". > > > > Is this possible to do? Obviously, for some regexps the set of matches > > is unbounded (a list of everything that matches "*" would be very > > unpractical), but how would you do it for simple regexps like the one > > above? > > For a very small number of characters, it would be feasible. For any > finite number of characters, it would be possible (though it wouldn't > take much to take longer than the age of the universe). For reference, > in your simple example, you have 17,576,000 matching strings. > > I'm curious as to why you would wish to do this. I certainly understand > considering hard problems for their own sake, but when I formulate > them, there's always some impetus that makes me say "Huh. Now I > wonder..."
I have a thousand use cases in mind. For example: 1. Generate sentences for a bot: ipermute("(I|You|He|She|It|They) do( not)? (dis)?like (you|him|her|me|they)"): Generates many grammatically incorrect sentences but you get the point. 2. Generate URL:s to web resources: The following should generate URL:s to all c.l.p digests from the mail archive: def download_clp(): year_re = "(199\d|200[0-6])" months = ["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"] month_re = '(' + '|'.join(months) + ')' fmt = "http://mail\.python\.org/pipermail/python-list/%s-%s\.txt" url_re = fmt % (year_re, month_re) for x in ipermute(url_re): print "Downloading", x code to download here The same approach could be used to download all threads in a forum for example, or all articles on Slashdot. 3. "Visualising" regular expressions. I think it would be easier to write regular expressions if you could see what kind of data they would match. 4. Port scanning: ip_tuple = "(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])" for x in ipermute("192\.168\." + "\.".join([ip_tuple] * 2)): scan_ip(x) -- mvh Björn -- http://mail.python.org/mailman/listinfo/python-list