Re: Programming challenge: wildcard exclusion in cartesian products
Chris F Clark wrote: Yes, there is literature on the generating side of the regular expression/FSM model. In fact, the matching problem and the generating problems are exactly equivalent. A slight variation of the definition of how a matcher works, turns it into a generator and vice versa. To directly generate (rather than match) from an FSM, one simply does a walk (e.g. depth first search or breath first search) over the machine writing out (rather than matching) the symbols used as labels. Thus, all the theorems about matching apply to generating and also in reverse. If the language is Sigma* (rather than Sigma^n in the original post) then doing a depth first search over the FSM requires a stack to maintain context, right? I find it interesting that you suggest a PDA (push down automata) is required to enumerate the language accepted by an FSM since PDAs are strongly associated with CFGs (context free grammars). Does it follow then that a Turing machine is required to enumerate the language defined by a CFG? (that was a joke, I think). It is worth noting the regular languages are closed under the difference operator, so that resulting language from substracting one regular expression from another is still a regular language, I was pretty sure this was the case but it has been more than a decade since I studied computational models in school so I wasn't absolutely certain. While I do remember homework that called for creating FSMs that accepted languages defined by a regexp, I don't recall having solved the enumeration problem. Your clear and concise comments did a nice job of jogging my memory. === It seems to me that the purpose of the original challenge was to compare how different languages implement the solution to the problem. For such a well understood problem as this the goal of the challenge (to contrast the languages) is best served when participants have a good understanding of the 'state of the art' abstract solutions (e.g. using a stack to traverse a FSM in DFS fashion). -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Going in a slightly different direction ... There has been lots of published work on how to create efficient FSMs from regexps. Generally these FSMs are used for pattern matching (i.e. does string 's' match regexp 'e'?). Is there any corresponding literature on the topic addressed by the OP's challenge of generating the languaged defined by a regexp (or the complement of that regexp)? --jfc -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Yes, there is literature on the generating side of the regular expression/FSM model. In fact, the matching problem and the generating problems are exactly equivalent. A slight variation of the definition of how a matcher works, turns it into a generator and vice versa. To directly generate (rather than match) from an FSM, one simply does a walk (e.g. depth first search or breath first search) over the machine writing out (rather than matching) the symbols used as labels. Thus, all the theorems about matching apply to generating and also in reverse. You can see this to some extent form the problem posed. If one generates Sigma* and subtracts out the elements from some regular language (say by matching them), that is exactly equivalent (in strings generated) to generating the complement of the regular language. In fact, it is quite easy (with the correct regular expression tool, i.e. one that handles regular expression differences) to take the problems posed and generate provably minimal (i.e. provably maximally efficient) generation (or matching) programs as FSMs. The provably minimal FSM won't go down any paths that don't have some sequence that generates an accept value. It is worth noting the regular languages are closed under the difference operator, so that resulting language from substracting one regular expression from another is still a regular language, which can be used to prove that the machine is minimal. Therefore, while the output can be exponentially larger than the input, one should expect that implementations should be able to generate the output in linear time (relative to the size of the output), since FSMs run in linear time relative to the string they are processing (whether generating or matching). Under a reasonable model of computation, one cannot do better than linear in the size of string processed. I'm sure if you asked under comp.theory, you would get tons of other relevant facts from someone who understands automata theory better than I. Note, if one does ask there, one should correct the notation. The * symbol was used as in globbing, not as commonly used in regular expressions. The notation .* (as someone corrected in one of their replies) is the normal notation for what the original poster wanted by *. Hope this helps, -Chris * Chris ClarkInternet : [EMAIL PROTECTED] Compiler Resources, Inc. Web Site : http://world.std.com/~compres 23 Bailey Rd voice : (508) 435-5016 Berlin, MA 01503 USA fax: (978) 838-0263 (24 hours) -- -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Dinko Tenev [EMAIL PROTECTED] wrote: Dirk Thierbach wrote: [A lot of stuff] Now clearer? Let's leave it there, and take a break. Maybe it would help to just take a concrete example, and work through it. Then you'll see exactly what happens. - Dirk -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Dirk Thierbach wrote: Dinko Tenev [EMAIL PROTECTED] wrote: I don't see immediately how exactly this is going to work. Unless I'm very much mistaken, a FSA in the classical sense will accept or reject only after the whole sequence has been consumed, and this spells exponential times. Exponential in respect to what? With respect to sequence length. But you cannot get rid of this. Consider S = {a, b}, W = {a}. Then there are |S|^n result elements for n 1, and you have to enumerate all of them. Yes, but then, they are in the target set. The point here is whether you can generate S^n - W in Theta( n * |S^n - W| ), which may be dramatically different from Theta( n * |S^n| ). The best thing one can hope for is to just actually process those elements that are really in the result set. With the FSM, you're getting at least close to that. If the FSA has to consume the whole sequence, you're only impoving performace by a constant factor. The automaton is: S: a - A, b - B A: a - A B: b - B The target set is specified as S^n - W, where W is everything matching (.*a.*b|.*b.*a). Following the construction procedure in point, this exclusion set is matched exactly by my DFA with S initial and F final. Then, swapping final and non-final states makes {S, A, B} final, and F non-final. Your DFA above may be equivalent, but to me it is far from clear exactly what algorithm would build it from the given data. Looking at the minimal DFA for the above example, it may be more appropriate to detect being in a state from which there's no path to a final state: This cannot happen, because minimizing removes all those states. (Or, in case you have a different definition of automaton in mind: There'll be just one stuck state, where all transitions that never go to a final state end up. Remove this state, and you'll end up with the other definition). I have to admit that I've never considered this before, but as I'm thinking of it now, a block of such states will never split by minimization, so they're bound to end up together as a single state, which can then be eliminated. I can see your point now. I guess one could even optimize on that: In the minimal automaton, every state has some transition sequence that will end up in a final state. Annotate each state with the minimum number of steps for such a sequence. While walking the automaton, keep the maximum number of letters to produce in a variable. If this number is small then the number in the annotation, stop and backtrace. I don't think this could cut efficiently for patterns like *a*b. The point is not to cut efficiently, the point is to enumerate only those words that are actually in the result set. No, this is not the point. Naive filtering already does that. By cutting efficiently I mean skipping over search sub-trees that don't contain any results from the target set. If you can do this, you can enumerate the target set in time proportional to its size, not to the size of the search space, which is the point. Cheers, Dinko -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
[EMAIL PROTECTED] wrote: Call a wc 'free' if it satisfies the propery that every letter 'a' in it appears only in the form '*a*', and 'anchored' otherwise. What if all wc's are free? How does this affect the DFA? Does it minimize nontrivially? Keep in mind I'm new to DFA theory. There would be no difference for single patterns, but I'm not sure into how large a DFA a set of those would combine. Cheers, Dinko -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Dinko Tenev [EMAIL PROTECTED] wrote: Dirk Thierbach wrote: [One cannot escape exponential behaviour] But you cannot get rid of this. Consider S = {a, b}, W = {a}. Then there are |S|^n result elements for n 1, and you have to enumerate all of them. Yes, but then, they are in the target set. Which is the point. If they are in the target set, you have to enumerate them. If the target set is of exponential size with respect to n, then you'll need exponential time to do that. The point here is whether you can generate S^n - W in Theta( n * |S^n - W| ), which may be dramatically different from Theta( n * |S^n| ). Exactly. Hence, you use a construction that guarantees that the time needed is proportional to n*|S^n - W|: Every step you do will be enecessary to produce at least one word in the output set. Now, if |S^n - W| is still exponential, then you'll still need exponential time. But nevertheless, that's the best you can hope for. The automaton is: S: a - A, b - B A: a - A B: b - B The target set is specified as S^n - W, where W is everything matching (.*a.*b|.*b.*a). Following the construction procedure in point, this exclusion set is matched exactly by my DFA with S initial and F final. Then, swapping final and non-final states makes {S, A, B} final, and F non-final. Your DFA above may be equivalent, but to me it is far from clear exactly what algorithm would build it from the given data. Well, it's just the result from the minimazation algorithm, where my variant of the algorithm just prunes away the stuck state which can never produce any output. The point is not to cut efficiently, the point is to enumerate only those words that are actually in the result set. No, this is not the point. Naive filtering already does that. No, it doesn't. Naive filtering always will look at the complete input set, so, no matter what size |S^n - W| actually is, it will always take time in proportion to |S^n|. By cutting efficiently I mean skipping over search sub-trees that don't contain any results from the target set. Yes. Consider a different example: With the wildcard expressions W = { b*, aa*, ab* }, you'll get S^* - W = { a }. The resulting minimum FSM will just accept 'a' (start state, one final state, and the stuck state if you insist on it), so you skip over every other subtree when enumerating results from that automaton. And for the previous example, you'll need something like 2*n time to enumerate the output set instead of 2^n, because once you're in the a-branch, you're producing only a's, and you're pruning away all the subtrees that start with a b. Similarly in the b-branch. Now clearer? - Dirk -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
[EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Call a wc 'free' if it satisfies the propery that every letter 'a' in it appears only in the form '*a*', and 'anchored' otherwise. Would '*ab*' be free or anchored? What if all wc's are free? How does this affect the DFA? I don't know. The important point here is the interaction of all the wc's. I don't think properties like this do reduce the complexity of interaction in an obvious way. Does it minimize nontrivially? I am not sure what you mean by this. Every DFA minimizes to some other DFA, which is unique up to renaming of states. Now the question is if that minimization reduces the complexity enough to be noticeable (maybe that's what you mean by nontrivially). In general, I don't think one can say much about that without looking at concrete examples. - Dirk -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Dirk Thierbach wrote: [A lot of stuff] Now clearer? - Dirk Actually, it's getting all the messier, and we seem to be running around in circles. I've already lost track of the point you're trying to make, and it seems that you're missing most of my points. Let's leave it there, and take a break. -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Hello, The solution that would have the most utility would be one where the elements are generated one-by-one, loop-like, so that they can be used in the body of a loop, and to avoid the fact that even with exclusion the cardinality of the target set EX^n could be in the millions even with a full list of wc's, that is, a list containing at least one wc of every length in 2..(n-1). I don't know enough Lisp, Haskell or Qi/Prolog to know if the solutions so far can be modified to do this. The Python program is too slow for large sets. Walter Kehowski -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Dirk Thierbach wrote: If more time during preprocessing is allowed, another idea is to treat the wildcard expressions as regular expressions, convert each into a finite state machine, construct the intersection of all these state machines, minimize it and then swap final and non-final states. Given the requirements, did you mean taking the *union* and swapping states? Or maybe swapping states first, and then taking the intersection? Then you can use the resulting automaton to efficiently enumerate S^n - W. In the above case, the resulting FSM would have just three states. I don't see immediately how exactly this is going to work. Unless I'm very much mistaken, a FSA in the classical sense will accept or reject only after the whole sequence has been consumed, and this spells exponential times. For improved asymptotic complexity in this case, you need to be able to at least reject in mid-sequence, and that calls for a slightly different concept of a FSA -- is this what you meant? Cheers, Dinko -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
[EMAIL PROTECTED] schreef: The solution that would have the most utility would be one where the elements are generated one-by-one, loop-like, so that they can be used in the body of a loop, and to avoid the fact that even with exclusion the cardinality of the target set EX^n could be in the millions even with a full list of wc's, that is, a list containing at least one wc of every length in 2..(n-1). I don't know enough Lisp, Haskell or Qi/Prolog to know if the solutions so far can be modified to do this. The Python program is too slow for large sets. Use a bitmapping, see also news:[EMAIL PROTECTED] Detect the exclusions with a bitwise AND. -- Affijn, Ruud Gewoon is een tijger. echo 014C8A26C5DB87DBE85A93DBF |perl -pe 'tr/0-9A-F/JunkshoP cartel,/' -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Dinko Tenev [EMAIL PROTECTED] wrote: Dirk Thierbach wrote: If more time during preprocessing is allowed, another idea is to treat the wildcard expressions as regular expressions, convert each into a finite state machine, construct the intersection of all these state machines, minimize it and then swap final and non-final states. Given the requirements, did you mean taking the *union* and swapping states? Or maybe swapping states first, and then taking the intersection? Whatever the requirements were. Take your pick. :-) Then you can use the resulting automaton to efficiently enumerate S^n - W. In the above case, the resulting FSM would have just three states. I don't see immediately how exactly this is going to work. Unless I'm very much mistaken, a FSA in the classical sense will accept or reject only after the whole sequence has been consumed, and this spells exponential times. Exponential in respect to what? You just recursively walk the automaton for every word of length n, so at most you'll have to check every word in S^n once. Hence, the complexity is not worse than the generate and test approach (it's in fact better, since testing is trivial). However, if the result set is simple (as in the example), then the result FSM will have states that won't have transitions for every letters, so I guess the average case will be a lot better. For improved asymptotic complexity in this case, you need to be able to at least reject in mid-sequence, One can do that if there is no valid transition out of some state. I guess one could even optimize on that: In the minimal automaton, every state has some transition sequence that will end up in a final state. Annotate each state with the minimum number of steps for such a sequence. While walking the automaton, keep the maximum number of letters to produce in a variable. If this number is small then the number in the annotation, stop and backtrace. This should guarantee that only those states are visited that really produce some output. - Dirk -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Dirk Thierbach wrote: Dinko Tenev [EMAIL PROTECTED] wrote: Dirk Thierbach wrote: If more time during preprocessing is allowed, another idea is to treat the wildcard expressions as regular expressions, convert each into a finite state machine, construct the intersection of all these state machines, minimize it and then swap final and non-final states. Given the requirements, did you mean taking the *union* and swapping states? Or maybe swapping states first, and then taking the intersection? Whatever the requirements were. Take your pick. :-) OK :) Then you can use the resulting automaton to efficiently enumerate S^n - W. In the above case, the resulting FSM would have just three states. I don't see immediately how exactly this is going to work. Unless I'm very much mistaken, a FSA in the classical sense will accept or reject only after the whole sequence has been consumed, and this spells exponential times. Exponential in respect to what? With respect to sequence length. In the example above (S = {a, b}, W = {*a*b, *b*a}) you have to enumerate |S^n| potential sequences to get to the couple (a^n, b^n) that are of particular interest. You just recursively walk the automaton for every word of length n, so at most you'll have to check every word in S^n once. That's right -- the sky is the limit ;) Hence, the complexity is not worse than the generate and test approach (it's in fact better, since testing is trivial). Testing per sequence will be faster, yes, but the overall running time will still be a disaster, just like with the generate and test solutions. The Python program tries to do better than that, and it succeeds some of the time. However, if the result set is simple (as in the example), then the result FSM will have states that won't have transitions for every letters, so I guess the average case will be a lot better. I don't believe this case should ever arise out of the current definition of the problem: labels are specified explicitly only for the excluded subset, and you have to accept everything else by default. For improved asymptotic complexity in this case, you need to be able to at least reject in mid-sequence, One can do that if there is no valid transition out of some state. One possibly can, but such states should never be encountered (see above.) Looking at the minimal DFA for the above example, it may be more appropriate to detect being in a state from which there's no path to a final state: S: a - A, b - B A: a - A, b - F B: a - F, b - B F: a - F, b - F Here, S is the starting state, {S, A, B} are final, and F is non-final (after swapping.) Obviously, the smart thing to do is to bail out as soon as you're in F. The point is, though, are things guaranteed to be as simple in the general case? :) I guess one could even optimize on that: In the minimal automaton, every state has some transition sequence that will end up in a final state. Annotate each state with the minimum number of steps for such a sequence. While walking the automaton, keep the maximum number of letters to produce in a variable. If this number is small then the number in the annotation, stop and backtrace. I don't think this could cut efficiently for patterns like *a*b. Cheers, Dinko -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
On 2006-03-23, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: The solution that would have the most utility would be one where the elements are generated one-by-one, loop-like, so that they can be used in the body of a loop, and to avoid the fact that even with exclusion the cardinality of the target set EX^n could be in the millions even with a full list of wc's, that is, a list containing at least one wc of every length in 2..(n-1). I don't know enough Lisp, Haskell or Qi/Prolog to know if the solutions so far can be modified to do this. The Python program is too slow for large sets. In Haskell you can get this essentially for free, due to its laziness. -- Aaron Denney -- -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Call a wc 'free' if it satisfies the propery that every letter 'a' in it appears only in the form '*a*', and 'anchored' otherwise. What if all wc's are free? How does this affect the DFA? Does it minimize nontrivially? Keep in mind I'm new to DFA theory. Walter Kehowski -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Dinko Tenev [EMAIL PROTECTED] wrote: I don't see immediately how exactly this is going to work. Unless I'm very much mistaken, a FSA in the classical sense will accept or reject only after the whole sequence has been consumed, and this spells exponential times. Exponential in respect to what? With respect to sequence length. But you cannot get rid of this. Consider S = {a, b}, W = {a}. Then there are |S|^n result elements for n 1, and you have to enumerate all of them. The best thing one can hope for is to just actually process those elements that are really in the result set. With the FSM, you're getting at least close to that. However, if the result set is simple (as in the example), then the result FSM will have states that won't have transitions for every letters, so I guess the average case will be a lot better. I don't believe this case should ever arise out of the current definition of the problem: labels are specified explicitly only for the excluded subset, and you have to accept everything else by default. If the result set is {a^n, b^n | n \in N}, then you have an automaton where exactly this happens. Since minimum automatons are unique up to renaming of states, that's the result you'll get. For improved asymptotic complexity in this case, you need to be able to at least reject in mid-sequence, One can do that if there is no valid transition out of some state. One possibly can, but such states should never be encountered (see above.) The automaton is: S: a - A, b - B A: a - A B: b - B All the states are final. A and B have just one transition, so you'll be able to generate either a^n or b^n efficiently. Looking at the minimal DFA for the above example, it may be more appropriate to detect being in a state from which there's no path to a final state: This cannot happen, because minimizing removes all those states. (Or, in case you have a different definition of automaton in mind: There'll be just one stuck state, where all transitions that never go to a final state end up. Remove this state, and you'll end up with the other definition). I guess one could even optimize on that: In the minimal automaton, every state has some transition sequence that will end up in a final state. Annotate each state with the minimum number of steps for such a sequence. While walking the automaton, keep the maximum number of letters to produce in a variable. If this number is small then the number in the annotation, stop and backtrace. I don't think this could cut efficiently for patterns like *a*b. The point is not to cut efficiently, the point is to enumerate only those words that are actually in the result set. If you are enumerating all words shorter than a given length, the above method should guarantee this. - Dirk -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
funkyj wrote: How about the other iterator characteristics? when there is a huge solution space can I ask the prolog version to give me the first 1000 solutions? Geoffrey's post above offers one way to do this from within a REPL. Within a program, as soon as you accept a solution, you're potentially out of the loop (it is possible to use cut (!) to make this certain.) The next 1000 solutions (i.e. 1000 - 1999)? I am not quite sure how exactly you propose to do this in e.g. Python, but if you mean to enumerate and skip over the first 1000 and then continue with the rest, yes, this is also possible. If you can do that then it would appear that generators have no advantage over your prolog solution. To be fair, the Python implementation has one significant advantage -- it offers reduced runtime complexity for quite a number of practical cases (of course, this improvement is achievable in Prolog too,) though I'm still inclined to think that it can be made to run in exponential time. Also, my Prolog version is non-compliant in that it treats a wildcard as matching a single position, rather than matching any sequence. Geoffrey's implementation does the right thing though. Cheers, Dinko -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
OK, here's a case that will make your program run in exponential time: S = { a, b }, W = { *a*b, *b*a } -- on my machine, it starts getting ugly as soon as n is 15 or so. Note that S^n - W = { a^n, b^n }. In general, whenever all the patterns in the set match against the last position, your current implementation is guaranteed to have to sift through all of S^n. I'd say the very idea of checking against a blacklist is fundamentally flawed, as far as performance is concerned. -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Mark Carter wrote: At the risk of being labelled a troll One thing I just discovered, and by which I mean *really* discovered ... is that Lisp is an interactive environment. I am working on trying to verify the contents of disks. I noticed that the input formats are slightly wrong, and needed correction. In fact, there's a whole host of jiggery pokery that I need to do in order to massage and build up everything the way it needs to be. A programmers mindset is usually geared towards writing applications. What I'm currently doing in Lisp is building up functions as I need them. Using emacs, I can just C-x C-e to make my functions live, and when it's time to stop for the day, save my working image so that I can use it the next day. It seems to me that only Forth or Scheme really matches this capability. Ruby and Python come kinda close - they do have a REPL, but it's kinda clunky to try to create functions on the fly, plus of course they don't support the idea of an image. -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Mark Carter [EMAIL PROTECTED] writes: A programmers mindset is usually geared towards writing applications. What I'm currently doing in Lisp is building up functions as I need them. Using emacs, I can just C-x C-e to make my functions live, and when it's time to stop for the day, save my working image so that I can use it the next day. It seems to me that only Forth or Scheme really matches this capability. Not really. I'd say matlab's and squeak's (and presumably most smalltalks') interactive environment is quite superior to cl+slime. Emacs lisp is also better and I'd also suspect erlang to be better in some ways, but I haven't used it yet. APL and descendants (J and K) also tend to have quite reasonable interactive facilities and so do CAS systems. Ruby and Python come kinda close - they do have a REPL, but it's kinda clunky to try to create functions on the fly, plus of course they don't support the idea of an image. I don't think interactively creating functions is much harder in python but changing module and class definitions is. Actually although cl+slime is rather nice it is inferior to the much less sophisticated and capable ipython+emacs combo in some regards (runtime error reporting in CL tends to be pretty crap; (i)python docstrings, ease of interactive testing and shell integration are superior). It's also much easier to serialize stuff in python than it is in common lisp (never mind pickle; you can't even readably print hashtables -- and unless I'm missing something this is not user-fixable which really sucks). Of course it's *much* easier to supply a nice interactive experience for a quite limited language such as matlab than it is for CL (mostly because CL is more powerful and geared towards efficient code generation). So CL might well have the best interactiveness/(expressiveness*speed) ratio. 'as -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
[Had to drop alt.comp.lang.haskell, otherwise my newsserver doesn't accept it] Dinko Tenev [EMAIL PROTECTED] wrote: OK, here's a case that will make your program run in exponential time: S = { a, b }, W = { *a*b, *b*a } -- on my machine, it starts getting ugly as soon as n is 15 or so. Note that S^n - W = { a^n, b^n }. In general, whenever all the patterns in the set match against the last position, your current implementation is guaranteed to have to sift through all of S^n. I'd say the very idea of checking against a blacklist is fundamentally flawed, as far as performance is concerned. If more time during preprocessing is allowed, another idea is to treat the wildcard expressions as regular expressions, convert each into a finite state machine, construct the intersection of all these state machines, minimize it and then swap final and non-final states. Then you can use the resulting automaton to efficiently enumerate S^n - W. In the above case, the resulting FSM would have just three states. And it doesn't really matter what language you use to implement this algorithm, it's the idea that counts. Notation aside, all implementations will be quite similar. - Dirk -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
And it doesn't really matter what language you use to implement this algorithm, it's the idea that counts. Notation aside, all implementations will be quite similar. I'll guess I'll get out my Turing tape. ;) What are some good references for finite state machines? Minimization? -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
[EMAIL PROTECTED] [EMAIL PROTECTED] wrote: What are some good references for finite state machines? Minimization? A classic is Introduction to automata theory, languages and computation by Hopcroft and Ullman. But any other book about finite state machines should cover these topics, too. There are good chances that you can just google for a detailed explanation. - Dirk -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
[EMAIL PROTECTED] wrote: After the basic fact of generating the exclusion - a considerable achievement - the program should be interactive. What if the target set has thousands or millions of elements? There should be a loop-like way ('do' in Haskell, for example) to peel off the elements one-by-one and then terminate. Um...interactivity is a bit tricky in Prolog ;) As Geoffrey pointed out in his posting, the solutions are generated effectively one at a time. Following is a typical example of using the generator: generate_member( X, ... ), do_something_with( X ), fail. The underlying semantics is, roughly, 1) bind X, 2) do_something_with( X ), 3) fail, meaning reject this binding of X and backtrack. In this particular case, backtracking is tantamount to going back to 1). You can regard ( generate_member( X, ... ) ... fail ) as the equivalent of a loop construct, and do_something_with( X ) as the loop body. At any given time that the goal is being evaluated, there is only one binding for X in effect. On a side note, Haskell's do notation isn't really about loops. If you're referring to Tomasz's code, it's rather mapM_ that can sort of be thought of as looping through the list of values returned by generateNotMatching :) Cheers, Dinko -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Dinko Tenev wrote: Speculation: the time for building-up a smart structure to speed-up enumeration, together with the time for enumerating the set using that structure, should sum up to roughly Theta( n*|S^n| ), even with a really smart algorithm. OK, maybe not. This might be the worst case, looking at S^n - W only, but it's not quite clear what worst case means in the context of concrete implementations. Surely, one can clog the program with zillions of wildcards to test, so we can produce an arbitrarily bad case :) -- but such a case is obviously of little or no practical importance. It appears that, to make any sensible statements about performance in the relevant cases, we have to take into account the size of the pattern set used to specify W as well. [...] here's another speculation: such structure would likely take up Theta( |S^n| ) space in memory, in the worst case. ...and similarly for worst case here. Cheers, Dinko -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
[EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] After the basic fact of generating the exclusion - a considerable achievement - the program should be interactive. What if the target set has thousands or millions of elements? There should be a loop-like way ('do' in Haskell, for example) to peel off the elements one-by-one and then terminate. There is...(QD) 1 ?- generate_member(X,[1,2],3,[[and, [*,2], [or, [2,1,*], [1,2,*), write(X),nl,write('Is this the term you were looking for? (y/n):'), get(Y), ((Y is 121) - true; fail). % 121 = 'y' [1, 1, 1] Is this the term you were looking for? (y/n):n [1, 1, 2] Is this the term you were looking for? (y/n):|: n [1, 2, 1] Is this the term you were looking for? (y/n):|: y Yes 2 ?- --- Geoff -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Dinko Tenev wrote: Doug Quale wrote: Hmmm...storage is not an issue in the Prolog version. It generates a candidate solution, then checks membership in the wildcard set, then backtracks (backtracking is caused by fail in the test goal.) On backtracking, it effectively forgets the last solution, so the memory is freed up (or at least free to be reclaimed through GC.) How about the other iterator characteristics? when there is a huge solution space can I ask the prolog version to give me the first 1000 solutions? The next 1000 solutions (i.e. 1000 - 1999)? If you can do that then it would appear that generators have no advantage over your prolog solution. -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Doug Quale wrote: funkyj [EMAIL PROTECTED] writes: One advantage of a generator over filtering the full product is that I, as the user of the generator, am not obligated to iterate over the entire solution space. Are there other _practical_ advantages of generators over mapping filtering complete sets? Storage. You can iterate over problem spaces much too large to fit in memory. Also generate + iterate can be faster because of reduced memory pressure. Hmmm...storage is not an issue in the Prolog version. It generates a candidate solution, then checks membership in the wildcard set, then backtracks (backtracking is caused by fail in the test goal.) On backtracking, it effectively forgets the last solution, so the memory is freed up (or at least free to be reclaimed through GC.) Cheers, Dinko -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
[EMAIL PROTECTED] wrote: It would seem that your program is just filtering the full cartesian product, right? The solution I'm looking for generates the elements one-by-one so that it could be used in a loop. OK, having read some of the comments so far, I have the feeling that I may be missing the point in more than one way, so let's set this straight: If I understand correctly, for an alphabet S, and a subset W of S* specified by the wildcards, you expect the enumeration of sequences of length n to run in Theta( n*|S^n - W| ) instead of Theta( n*|S^n| ). First, this doesn't seem to hold for your Python program. Try, for example, S = { a, b, c }, W = { *a*b*, *b*c*, *c*a*, *b*a*, *c*b*, *a*c* }, with some large values of n. Theta( n*|S^n - W| ) predicts that the enumeration time should grow linearly with n, as |S^n - W| = 3, but if you take some measurements, you'd notice that it grows faster than that. Second, my current bet is that such an improvement in asymptotic complexity is not possible, if we consider *both* pre-processing of the wildcard set and subsequent enumeration. Speculation: the time for building-up a smart structure to speed-up enumeration, together with the time for enumerating the set using that structure, should sum up to roughly Theta( n*|S^n| ), even with a really smart algorithm. Even if you're willing to pay up-front for tighter loop execution later, and you build a suitable structure for this purpose, you would have to consider the structure's size, so here's another speculation: such structure would likely take up Theta( |S^n| ) space in memory, in the worst case. I would really appreciate it if you could pour some light into what you're trying to do exactly, and possibly point out anything that I might have missed so far. Cheers, Dinko -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Hi, You wrote into the Qilang News group with your problem. This is a solution in 17 lines of Qi for any n-product = 2. It falls short of your complete requirement since it uses generate and then test, rather than interleaving the two. (define challenge Patterns N X - (filter (/. Y (member Y Patterns)) (n-product N X))) (define n-product 2 X - (cartesian-product l X X) N X - (cartesian-product c X (n-product (- N 1) X))) (define cartesian-product _ [ ] _ - [ ] c [X | Y] Z - (append (map (/. W [X | W]) Z) (cartesian-product c Y Z)) l [X | Y] Z - (append (map (/. W [X W]) Z) (cartesian-product l Y Z))) (define filter _ [] - [] F [X | Y] - (filter F Y) where (F X) F [X | Y] - [X | (filter F Y)]) (define member _ [] - false X [Pattern | _] - truewhere (query-prolog [[= Pattern X]]) X [_ | Patterns] - (member X Patterns)) Notes: Pattern filtering is done by a unification test within the member function. You can do this most easily by calling Qi Prolog using query-prolog. Here's a test. (42 -) (n-product 3 [a b c]) [[a a a] [a a b] [a a c] [a b a] [a b b] [a b c] [a c a] [a c b] [a c c] [b a a] [b a b] [b a c] [b b a] [b b b] [b b c] [b c a] [b c b] [b c c] [c a a] [c a b] [c a c] [c b a] [c b b] [c b c] [c c a] [c c b] [c c c]] OK, remove all lists beginning [a a ]. (43-) (challenge [[a a | X]] 3 [a b c]) [[a b a] [a b b] [a b c] [a c a] [a c b] [a c c] [b a a] [b a b] [b a c] [b b a] [b b b] [b b c] [b c a] [b c b] [b c c] [c a a] [c a b] [c a c] [c b a] [c b b] [c b c] [c c a] [c c b] [c c c]] Remove all lists beginning with a or b. (51-) (challenge [[a | X] [b | X]] 3 [a b c]) [[c a a] [c a b] [c a c] [c b a] [c b b] [c b c] [c c a] [c c b] [c c c]] Mark -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
I'd like to propose a coding challenge of my own. The challenge is to reproduce the TEA (Tiny Encryption Algorith): http://www.simonshepherd.supanet.com/tea.htm in your language of choice. Here's the code, just two simple functions: void encipher(unsigned long *const v,unsigned long *const w, const unsigned long *const k) { register unsigned long y=v[0],z=v[1],sum=0,delta=0x9E3779B9, a=k[0],b=k[1],c=k[2],d=k[3],n=32; while(n--0) { sum += delta; y += (z 4)+a ^ z+sum ^ (z 5)+b; z += (y 4)+c ^ y+sum ^ (y 5)+d; } w[0]=y; w[1]=z; } void decipher(unsigned long *const v,unsigned long *const w, const unsigned long *const k) { register unsigned long y=v[0],z=v[1],sum=0xC6EF3720, delta=0x9E3779B9,a=k[0],b=k[1], c=k[2],d=k[3],n=32; /* sum = delta5, in general sum = delta * n */ while(n--0) { z -= (y 4)+c ^ y+sum ^ (y 5)+d; y -= (z 4)+a ^ z+sum ^ (z 5)+b; sum -= delta; } w[0]=y; w[1]=z; } I had a crack at it in Lisp. My version doesn't work - but of greater concern to me is that it doesn't appear nearly as compact as the C version. Anyway, here's my Lisp code (no prizes for guessing that I'm a noob to Lisp): (defconstant delta 2654435769 ) ; delta= 0x9E3779B9 (defun floorn (n) (nth-value 0 (floor n))) (defun (val num-bytes) Right-shift positive integer val by num-bytes (let* (t1 t2) (setf t1 (expt 2 num-bytes)) (setf t2 (/ val t1)) (floor t2))) (defun (val num-bytes) Left-shift positive integer v by num-bytes (* val (expt 2 num-bytes))) (defun 4 (i) ( i 4)) (defun byte-n (v n) Return the nth byte of a value v (let* ((bits-to-shift (* 8 (1- n))) (shifted-value ( v bits-to-shift))) (logand shifted-value 256))) (defun transform (v1 v2 v3 v4) (let (t1 t2 t3) (setf t1 (4 v1)) (setf t2 (expt v2 v1)) (setf t3 (expt v3 ( v2 5))) (+ t1 t2 t3 v4))) (defun pack64 (b1 b2) (+ ( b1 32) b2)) (defun encipher (v k) (let ((sum 0) (a (byte-n k 3)) ; a=k[0] (b (byte-n k 2)) ; b=k[1] (c (byte-n k 1)) ; c=k[2] (d (byte-n k 0)) ; d=k[3] (y (byte-n v 1)) ; y=v[4] (z (byte-n v 0))) ; z=v[1] (loop for n from 0 to 31 do ;n=32, while(n--0) (incf sum delta);sum += delta; (incf y (transform z a sum b)) ; y += (z 4)+a ^ z+sum ^ (z 5)+b (incf z (transform y c sum d)) ;z += (y 4)+c ^ y+sum ^ (y 5)+d; ) (pack64 y z) ; w[0]=y; w[1]=z; )) (defun decipher (v k) (let ((sum 3337565984) ; 0xC6EF3720 (a (byte-n k 3)) ; a=k[0] (b (byte-n k 2)) ; b=k[1] (c (byte-n k 1)) ; c=k[2] (d (byte-n k 0)) ; d=k[3] (y (byte-n v 1)) ; y=v[4] (z (byte-n v 0))) ; z=v[1] (loop for n from 0 to 31 do ;n=32, while(n--0) (decf z (transform y c sum d)) ;z -= (y 4)+c ^ y+sum ^ (y 5)+d; (decf y (transform z a sum b)) ;y -= (z 4)+a ^ z+sum ^ (z 5)+b; (decf sum delta);sum -= delta; ) (pack64 y z) ; w[0]=y; w[1]=z; )) -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Interesting. But you probably need to post this as a new message, since it is a distinctly different problem from the one driving this thread. Mark -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Mark Tarver wrote: Interesting. At the risk of being labelled a troll, one thought that is occuring to me is that in Lisp it seems that sometimes it is difficult to achieve a simple thing in a simple way. To clarify ... recently, I had been trying to obtain md5 hashes of the files we had on our server (a different exercise than the one I mentioned in my OP, just in case you thought that I didn't understand the difference between encryption and hashing). There is an md5 package for Lisp available on the web, which I used with CLISP. I had a file that contained a non-standard character, causing CLISP to throw an error when it tried to print it. Well, I suppose I could have tried to figure out a way to cajole CLISP into printing something it didn't want to print, but I was keen to give Corman Lisp 2.5 a try-out anyway, so I tried the package on it. EXCEPT, for some reason when you try to read a file with an :element-type of (unsigned-byte 8) (or something similar), Corman didn't like it. In the end, I hacked together an md5 DLL from some sources I found on the internet. You can get the package here, together with Corman Lisp bindings: http://www.markcarter.me.uk/computing/freeware/md5mc/md5mc.htm In the past, I had also employed a similar technique in order to get access to some console functions that I was interested in. My worry is that it seems to be a recurring theme with me ... get stumped in Lisp, realise that it is probably just plain easier in C, and then link the whole thing together in Lisp. Which is kinda less than expected. -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Dinko Tenev [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] [EMAIL PROTECTED] wrote: It would seem that your program is just filtering the full cartesian product, right? The solution I'm looking for generates the elements one-by-one so that it could be used in a loop. Oops...missed that part. It took me a while to study the exchange on this topic more thoroughly, and I now fully appreciate the fact that the problem calls for a much more sophisticated approach. Sorry for the hasty shot, I'll give it another shortly. I wouldn't worry about it, Prolog generated the elements one-by-one. The loop was the print,nl,fail line. Just beefing it up a bit, I didn't take the time to clean it up though. :-) gen(_,0,[]). gen(S,N,[H|T]):- N 0, N1 is N - 1, member(H,S), gen(S,N1,T). filter([],[]). filter([X|T],[X|T1]):- filter(T,T1). filter([*|T],L):- filter(T,L). filter([*|T],[_|T1]):- filter([*|T],T1). filter_list(L,[[and|T]|_]):- filter_and(L,T), !. filter_list(L,[[or|T]|_]):- filter_list(L,T), !. filter_list(L,[H|_]):- H \= [and|_], H \= [or|_], filter(H,L),!. filter_list(L,[H|T]):- H \= [and|_], H \= [or|_], filter_list(L,T). filter_and(_,[]) :- !. filter_and(L,[H|T]):- filter_list(L,[H]), filter_and(L,T). generate_member(X,S,N,[]):-gen(S,N,X). generate_member(X,S,N,[H|T]):-gen(S,N,X),\+ filter_list(X,[H|T]). 1 ?- generate_member(X,[a,b],3,[[a,*,b],[b,*,a]]). X = [a, a, a] ; X = [a, b, a] ; X = [b, a, b] ; X = [b, b, b] ; No 2 ?- generate_member(X,[1,2],3,[[and, [*,2], [or, [2,1,*], [1,2,*). X = [1, 1, 1] ; X = [1, 1, 2] ; X = [1, 2, 1] ; X = [2, 1, 1] ; X = [2, 2, 1] ; X = [2, 2, 2] ; No --- Geoff -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Mark Carter [EMAIL PROTECTED] writes: I'd like to propose a coding challenge of my own. The challenge is to reproduce the TEA (Tiny Encryption Algorith): http://www.simonshepherd.supanet.com/tea.htm in your language of choice. Here's the code, just two simple functions: void encipher(unsigned long *const v,unsigned long *const w, const unsigned long *const k) { register unsigned long y=v[0],z=v[1],sum=0,delta=0x9E3779B9, a=k[0],b=k[1],c=k[2],d=k[3],n=32; while(n--0) { sum += delta; y += (z 4)+a ^ z+sum ^ (z 5)+b; z += (y 4)+c ^ y+sum ^ (y 5)+d; } w[0]=y; w[1]=z; } void decipher(unsigned long *const v,unsigned long *const w, const unsigned long *const k) { register unsigned long y=v[0],z=v[1],sum=0xC6EF3720, delta=0x9E3779B9,a=k[0],b=k[1], c=k[2],d=k[3],n=32; /* sum = delta5, in general sum = delta * n */ while(n--0) { z -= (y 4)+c ^ y+sum ^ (y 5)+d; y -= (z 4)+a ^ z+sum ^ (z 5)+b; sum -= delta; } w[0]=y; w[1]=z; } I get it shorter than in C: (defun op (x a b sum) (logxor (+ (ash x 4) a) (+ x sum) (+ (ash x -5) b))) (declaim (inline op)) (defmacro ciploop ((v w k y z a b c d (sum init-sum) delta) body body) `(let ((,y (aref ,v 0)) (,z (aref ,v 1)) (,sum ,init-sum) (,delta #x9E3779B9) (,a (aref ,k 0)) (,b (aref ,k 1)) (,c (aref ,k 2)) (,d (aref ,k 3))) (loop repeat 32 do ,@body finally (setf (aref ,w 0) ,y (aref ,w 1) ,z (defmacro c-incf (var expr) `(setf ,var (mod (+ ,var ,expr) #x1))) (defmacro c-decf (var expr) `(setf ,var (mod (- ,var ,expr) #x1))) (defun encipher (v w k) (ciploop (v w k y z a b c d (sum 0) delta) (c-incf sum delta) (c-incf y (op z a b sum)) (c-incf z (op y c d sum (defun decipher (v w k) (ciploop (v w k y z a b c d (sum #xC6EF3720) delta) (c-decf z (op y c d sum)) (c-decf y (op z a b sum)) (c-decf sum delta))) You can also easily modify it to implement the improved version of TEA... Note that this Lisp programs will work equally well on a 16-bit, 32-bit or 64-bit Common Lisp implementation. The same cannot be said of the C program above. ;; Let's add a testbed: (defun word (a b c d) (dpb a (byte 8 24) (dpb b (byte 8 16) (dpb c (byte 8 8) d (defun read-words (bits what) (loop for bytes = (progn (format *query-io* Please enter ~D bits of ~A: bits what) (ext:convert-string-to-bytes (read-line *query-io* nil nil) ext:*TERMINAL-ENCODING*)) while ( (* 8 (length bytes)) bits) finally (return (loop for i from 0 by 4 below (truncate (+ 7 bits) 8) collect (word (aref bytes (+ i 0)) (aref bytes (+ i 1)) (aref bytes (+ i 2)) (aref bytes (+ i 3))) into words finally (return (coerce words 'vector)) (defun test () (loop with code = (vector 0 0) with decr = (vector 0 0) for clear = (read-words 64 clear text) for key = (read-words 128 key) do (progn (encipher clear code key) (format t (encipher ~S ~S)~% -- ~S~% clear key code) (decipher code decr key) (format t (decipher ~S ~S)~% -- ~S~% code key decr) (unless (equalp clear decr) (format t !!! ERROR !!!~%) [11] (test) Please enter 64 bits of clear text: Hello World! Please enter 128 bits of key: John McCarthy invented LISP. (encipher #(1214606444 1864390511) #(1248815214 541942595 1634890856 2032167278)) -- #(913593965 183139965) (decipher #(913593965 183139965) #(1248815214 541942595 1634890856 2032167278)) -- #(1214606444 1864390511) Please enter 64 bits of clear text: Big Secret: LISP! Please enter 128 bits of key: A very very secure key. (encipher #(1114203936 1399153522) #(1092646501 1920540790 1702000928 1936024437)) -- #(319804 1851109064) (decipher #(319804 1851109064) #(1092646501 1920540790 1702000928 1936024437)) -- #(1114203936 1399153522) Please enter 64 bits of clear text: ^C -- __Pascal Bourguignon__ http://www.informatimago.com/ ATTENTION: Despite any other listing of product contents found herein, the consumer is advised that, in actuality, this product consists of 99.99% empty space. -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
I had a crack at it in Lisp. My version doesn't work - but of greater concern to me is that it doesn't appear nearly as compact as the C version. Anyway, here's my Lisp code (no prizes for guessing that I'm a noob to Lisp): Lot's of things you can write more compact. But compact is not always the best way to write source. For me the most important criteria is that I can return to some source after, say, a year absence and everything is clear and readable again. (defconstant delta 2654435769 ) ; delta= 0x9E3779B9 (defconstant +delta+ #x9E3779B9) (defun floorn (n) (nth-value 0 (floor n))) is above used? (defun (val num-bytes) Right-shift positive integer val by num-bytes (let* (t1 t2) (setf t1 (expt 2 num-bytes)) (setf t2 (/ val t1)) (floor t2))) (defun (val num-bytes) Right-shift positive integer val by num-bytes (floor (/ val (expt 2 num-bytes (defun transform (v1 v2 v3 v4) (let (t1 t2 t3) (setf t1 (4 v1)) (setf t2 (expt v2 v1)) (setf t3 (expt v3 ( v2 5))) (+ t1 t2 t3 v4))) (defun transform (v1 v2 v3 v4) (+ (4 v1) (expt v2 v1) (expt v3 ( v2 5)) v4)) and so on... -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
[EMAIL PROTECTED] writes: (defun (val num-bytes) Right-shift positive integer val by num-bytes (floor (/ val (expt 2 num-bytes or just (floor val (expt 2 num-bytes)) 'as -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
[ note followups ] Mark Carter [EMAIL PROTECTED] writes: I'd like to propose a coding challenge of my own. The challenge is to reproduce the TEA (Tiny Encryption Algorith): http://www.simonshepherd.supanet.com/tea.htm in your language of choice. Here's mine, in Common Lisp. (defmacro define-tea-pair ((encrypt decrypt) (delta n) (i1 j1 i2 j2)) `(macrolet ((32bitize (form) `(logand ,form #x)) (tea-kernel (x i j) `(logxor (+ (ash ,x 4) (aref key ,i)) (+ ,x sum) (+ (ash ,x -5) (aref key ,j (tea-loop ((text n sum) body body) `(let ((result (make-array 2 :element-type '(unsigned-byte 32))) (y (aref ,text 0)) (z (aref ,text 1))) (do ((n ,n (- n 1)) (sum ,sum)) ((= n 0) (prog1 result (setf (aref result 0) y (aref result 1) z))) ,@body (defun ,encrypt (plaintext key aux (delta ,delta)) (declare (type (simple-array (unsigned-byte 32) (2)) plaintext) (type (simple-array (unsigned-byte 32) (4)) key)) (tea-loop (plaintext ,n 0) (setq sum (32bitize (+ sum delta)) y (32bitize (+ y (tea-kernel z ,i1 ,j1))) z (32bitize (+ z (tea-kernel y ,i2 ,j2)) (defun ,decrypt (ciphertext key aux (delta ,delta)) (declare (type (simple-array (unsigned-byte 32) (2)) ciphertext) (type (simple-array (unsigned-byte 32) (4)) key)) (tea-loop (ciphertext ,n (32bitize (* ,n delta))) (setq z (32bitize (- z (tea-kernel y ,i2 ,j2))) y (32bitize (- y (tea-kernel z ,i1 ,j1))) sum (32bitize (- sum delta))) (define-tea-pair (encipher decipher) (#x9e3779b9 32) (0 1 2 3)) So far, so ordinary; only marginally shorter than the C version; I'm certainly nowhere near Pascal's 14 lines, although my version has the advantage over his that each constant is mentioned only once, and all quantities derived from it are computed at compile-time; I can define a different pair with (define-tea-pair (eprime dprime) (#xabcdef01) (3 1 2 0)) and the new functions are inverses of each other as before. The other thing that might be of interest is the inner loop. There are no declarations other than the argument declarations, and all the code that I have written is portable Common Lisp, and should work in any conforming implemenation. In SBCL (on the PowerPC; other platforms are similar), the inner loop for ENCIPHER is addis $nl0,$nl3,-25033 addi $nl3,$nl0,31161 rlwinm $nl5,$nl2,4,0,27 lwz $nl6,1($fdefn) add $nl6,$nl5,$nl6 add $nl0,$nl2,$nl3 xor $cfunc,$nl6,$nl0 rlwinm $nl0,$nl2,27,5,31 mr $nl5,$nl0 lwz $nl6,5($fdefn) add $nl6,$nl5,$nl6 xor $nl6,$cfunc,$nl6 add $nl1,$nl1,$nl6 rlwinm $nl5,$nl1,4,0,27 lwz $nl6,9($fdefn) add $nl6,$nl5,$nl6 add $nl0,$nl1,$nl3 xor $cfunc,$nl6,$nl0 rlwinm $nl0,$nl1,27,5,31 mr $nl5,$nl0 lwz $nl6,13($fdefn) add $nl6,$nl5,$nl6 xor $nl6,$cfunc,$nl6 add $nl2,$nl2,$nl6 addi $nl4,$nl4,-4 cmpwi cr0,$nl4,0 bt gt,l0 and while this may be opaque to some readers, the point is that it is pretty much comparable to the C code in performance (the differences between this disassembly and gcc -O2 lie in the fact that SBCL's instruction scheduler is pretty much nonexistent on the PowerPC architecture). Christophe -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
After the basic fact of generating the exclusion - a considerable achievement - the program should be interactive. What if the target set has thousands or millions of elements? There should be a loop-like way ('do' in Haskell, for example) to peel off the elements one-by-one and then terminate. -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Yes, the program is quite a jumble: but it works. And I didn't post to python newsgroup since I was limited to just 5 newsgroups and didn't feel like doing multiple postings to multiple newsgroups. -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Here is an nice intro to K: http://www.kuro5hin.org/?op=displaystory;sid=2002/11/14/22741/791 This is where K starts to set itself from apart from most of the common programming languages in use today. You rarely write loops in K (KDB is 100% loop-free), instead you use adverbs. An adverb modifies a function, returning another function, changing the ways it operates over its arguments and what it does with it's return values. How about an interactive loop-like version? Generating the target set is good for baby test cases but not if the cardinality of the target is large. Does that make the problem more intersesting? -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
When I run this I get through ghc I get C:\Documents and Settings\User\My Documents\wildcardghc ./wc-zielonka.hs compilation IS NOT required C:/Languages/ghc/ghc-6.4.1/libHSrts.a(Main.o)(.text+0x1d):Main.c: undefined refe rence to `__stginit_ZCMain' C:/Languages/ghc/ghc-6.4.1/libHSrts.a(Main.o)(.text+0x43):Main.c: undefined refe rence to `ZCMain_main_closure' collect2: ld returned 1 exit status Unless there's a command line option I'm missing? -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
The cardinality of excluding '*a*b*' from S^n should be (m-1)^(n-1)*(m+n-1), where m=|S|. For m=5 this becomes 4^(n-1)*(n+4), and your table fits this formula. As far as generating and testing, an 'ideal' solution would be to 'start from the ground up', as in excluding length 2 wc, and then length 3, etc, until all wc's have been excluded. The 'ideal' solution would intrinsically exclude wc's and not test against a background generation of all of S^n. Does that make sense? -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Nice! How to put it in a loop? I'm totally a newbie to Lisp myself, just gettng into Graham and Touretzky. Let's create a problem. Suppose after excluding I want to know if the digits sum to 12, say, like maybe they're part of a partition. S={0,..6}, S^5, excluding *1*5* and 1*2*3*, say. How would I do that? -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
[EMAIL PROTECTED] schrieb: This is where K starts to set itself from apart from most of the common programming languages in use today. You rarely write loops in K (KDB is 100% loop-free), instead you use adverbs. An adverb modifies a function, returning another function, changing the ways it operates over its arguments and what it does with it's return values. Doesn't sound too different from what closures do. Or lazy parameter passing. rant I'm not sure whether the K designer actually fits that description, but there are too many language designers around reinventing the wheel, arguing whether it should have seven, eight or thirteen sides... /rant Regards, Jo -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
OK, a bad case of RTFM. I saved your file as WildCartesian.hs and then 1) command line: ghci WildCartesian.hs 2) Get some loading messages 3) command line: test and it works! But how do I compile it to get a program with command line arguments? I'm looking through Daume's tutorial right now. -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Heh, here's a Prolog version: == gen( _, 0, [] ) :- !. gen( S, N, [H | T] ) :- member( H, S ), M is N - 1, gen( S, M, T ). == Yep, that's it :))) Here's how to test it: == 1 ?- gen([a, b, c], 3, X), print(X), nl, fail. [a, a, a] [a, a, b] [a, a, c] [a, b, a] [a, b, b] [a, b, c] [a, c, a] [a, c, b] [a, c, c] [b, a, a] [b, a, b] [b, a, c] [b, b, a] [b, b, b] [b, b, c] [b, c, a] [b, c, b] [b, c, c] [c, a, a] [c, a, b] [c, a, c] [c, b, a] [c, b, b] [c, b, c] [c, c, a] [c, c, b] [c, c, c] No 2 ?- gen([a, b, c], 3, X), not(member(X, [[a, _, _], [_, b, _], [_, _, c]])), print(X), nl, fail. [b, a, a] [b, a, b] [b, c, a] [b, c, b] [c, a, a] [c, a, b] [c, c, a] [c, c, b] No == -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
[EMAIL PROTECTED] wrote: -- The states are lists of regular expressions -- where [a,b,..] means match a or b or... I haven't run or studied your program yet myself but what I had in mind was that the list of wc's are *all* to be excluded, so the list [wc1..wcn] is to correspond generating all tuples matching not(wc1 and .. and wcn). Maybe you're already doing that. The wc's themselves could be logical statements among the 'primitive' wc's. That's why I named it the 'wildcard exclusion problem'. It's a lot easier to specify a list of simpler wc's than create a long logical expression. I missed any logical combination :-( It would be quite easy to fix my first program, but I don't have the time to do it right now. Best regards Tomasz -- I am searching for programmers who are good at least in (Haskell || ML) (Linux || FreeBSD || math) for work in Warsaw, Poland -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
You may be interested in, or already know about http://www.lambdassociates.org/ http://www.lambdassociates.org/aboutqi.htm http://www.lambdassociates.org/webbook/contents.htm http://www.lambdassociates.org/prolog.htm Let me know what you think. -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
It would seem that your program is just filtering the full cartesian product, right? The solution I'm looking for generates the elements one-by-one so that it could be used in a loop. -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
[EMAIL PROTECTED] said: I haven't run or studied your program yet myself but what I had in mind was that the list of wc's are *all* to be excluded I think it already does what you want. You might want to change run n alphabet pat = generate terminal null transition [pat] n alphabet to to run n alphabet pat = generate terminal null transition pat n alphabet to allow lists of patterns where the patterns in a list are ORed together. -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
[EMAIL PROTECTED] wrote: It would seem that your program is just filtering the full cartesian product, right? The solution I'm looking for generates the elements one-by-one so that it could be used in a loop. Oops...missed that part. It took me a while to study the exchange on this topic more thoroughly, and I now fully appreciate the fact that the problem calls for a much more sophisticated approach. Sorry for the hasty shot, I'll give it another shortly. Cheers, Dinko -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
sa wrote: in k: cp:{[c;n;p]+(n#c)_vs(!_ c^n)_dvl,/{2_sv+(,/,/:\:)/(),/:@[x;x=-1;:[;!c]]}'p} That one goes a long way as a proof of eg evolution theory, you know, monkeys reproducing shakespeare with a typewriter k-board and all that :) examples: cp[2;3;,0 -1 1] (0 0 0 0 1 0 1 0 0 1 0 1 1 1 0 1 1 1) cp[2;3;(0 -1 1;1 -1 0)] (0 0 0 0 1 0 1 0 1 1 1 1) cp[2;3;(0 -1 1;1 -1 1)] (0 0 0 0 1 0 1 0 0 1 1 0) arguments of cp: c = cardinality of the input set n = power p = list of patterns (-1 = wildcard) the algorithm directly computes the target set. in other words, it does not generate the set, then filter the matches from the target. modifying cp to accept s instead of the cardinality of s, patterns expressed in terms of elements of s, c. adds nothing of interest to the problem. -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Tomasz Zielonka said: I've implemented the same concept yesterday evening... It's uncanny reading such similar code coming from another person! -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Wade Humeniuk wrote: [EMAIL PROTECTED] wrote: What I have in mind is the efficient, enumerated generation of the complement S^n/WC(S^n). A good program should initialize, generate, and terminate. T=cartprodex(S,n,WC); //initialize for all i in T do what you want with i test to see if any more terminate if not and it should do this without explicitly generating WC and then complementing. For example, if the cardinality of S is m, and the WC is just '*a*b*', with a != b, then EX(S^n):=S^n\WC(S^n) has cardinality (m-1)^(n-1)*(m+n-1). Specifically, if m=5 and n=10, then |EX|=3670016 while |S^10|=9765625, so that |EX|/|S^10| is about 0.3758. In general the program should directly generate EX from arbitrary WC. Of course, in practice the WC should themselves occur in a logically consistent manner, but let's just assume they're a given. Another attempt. I have made no special attempt to create an exclusion language, just used an anonymous lambda predicate. FWIW, here's my Q-and-D pattern matcher (only partially tested). (defun match(list pattern optional (test #'eql)) Match a list of atoms against a pattern list using :all as a 0-to-many wildcard, :single as a 1-to-1 wildcard, a list of elements or a single element to match a specific place. Optional argument test for comparing elements (default eql). Returns: T if match is made, NIL otherwise. Examples: (match '(0 1 2 3 4 5) '(:all (2 3) 3 :single 5 :all)) = T (match '(0 1 2 3 4 5) '(:all (2 3) 3 :single 5 :single)) = NIL (let ((current (first pattern)) (next-pattern (rest pattern)) (candidate (first list))) (cond ((and (null pattern) (null list)) t) ((and (eq :single current) candidate) (match (rest list) next-pattern test)) ((eq :all current) (loop for new-list on list when (match new-list next-pattern test) do (return-from match t)) (null next-pattern)) ; last case null remainder ((if(atom current) (funcall test candidate current) (member candidate current :test test)) (match (rest list) next-pattern test) -- Geoff -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
[EMAIL PROTECTED] [EMAIL PROTECTED] writes: What I have in mind is the efficient, enumerated generation of the complement S^n/WC(S^n). A good program should initialize, generate, and terminate. T=cartprodex(S,n,WC); //initialize for all i in T do what you want with i test to see if any more terminate if not and it should do this without explicitly generating WC and then complementing. For example, if the cardinality of S is m, and the WC is just '*a*b*', with a != b, then EX(S^n):=S^n\WC(S^n) has cardinality (m-1)^(n-1)*(m+n-1). Specifically, if m=5 and n=10, then |EX|=3670016 while |S^10|=9765625, so that |EX|/|S^10| is about 0.3758. In general the program should directly generate EX from arbitrary WC. Of course, in practice the WC should themselves occur in a logically consistent manner, but let's just assume they're a given. The following code doesn't build a data structure. It recurses n deep and those n stack frames contain state that tracks progress through the product. But it still generates and tests, taking time proportional to S^n. Is this what you had in mind? ;; A macro to let you loop over the S^n possibilities ;; without building a big data structure ;; The macro is structured as syntactic sugar on ;; a higher order function ;; (defmacro cartesian-power ((variable set exponent) body code) (let ((body-function (gensym))) `(flet ((,body-function(,variable),@code)) (cartesian-power-hof ,set ,exponent '() (function ,body-function) ;; The standard idea of using recursion to implement a nest ;; of loops of indefinite depth ;; (defun cartesian-power-hof (set exponent prefix f) (if (zerop exponent) (funcall f prefix) (dolist (item set) (cartesian-power-hof set (- exponent 1) (cons item prefix) f ;; A simple recursive pattern match ;; I haven't thought through the implications ;; I guess that it is exponentially slow on some long ;; patterns ;; (defun wild-match (pattern data) (cond ((endp pattern) (endp data)) ((endp data) (or (endp pattern) (equal pattern '(:wild ((eql (car pattern) :wild) (or (null (cdr pattern)) (wild-match (cdr pattern) data) (wild-match pattern (cdr data ('literal-pattern (and (eql (car pattern) (car data)) (wild-match (cdr pattern) (cdr data)) ;; close over a data item to get a function ;; suitable for checking several patterns (defun match-data (data) (lambda(pattern) (wild-match pattern data))) ;; Use the macro and the utilities to count how many are not excluded (defun count-remainder (set exponent rest exclusions) (let ((count 0)) (cartesian-power (item set exponent) (when (notany (match-data item) exclusions) (incf count))) count)) CL-USER (loop for i from 3 to 10 do (format t ~~4D~10D i (count-remainder '(a b c d e) i '(:wild a :wild b :wild 3 112 4 512 5 2304 6 10240 7 45056 8196608 9851968 10 3670016 I can see how a pattern such as (a b :wild) would knock out an element from each of the first two sets so reducing the task from m^n to (m-1)^2 * m^(n-2). Also (:wild a :wild) knocks it down from m^n to (m-1)^n However I can only see the exploitation of (:wild a :wild b :wild) as a hairy special case. Pick one of n places for the first a. Pick earlier elements from the set excluding a, pick later elements from the set excluding b. Add in all the items with a missing altogether (so b can be used freely). I cannot see what algorithm exploits the constraints more generally. Is there a standard technique, page nn of Knuth or the like? Is that what you are actually wanting to see coded? Alan Crowe Edinburgh Scotland -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
[EMAIL PROTECTED] wrote: The wildcard exclusion problem is interesting enough to have many distinct, elegant solutions in as many languages. In that case, you should have crossposted to comp.lang.python also. Your program looks like a dog's breakfast. -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
[EMAIL PROTECTED] wrote: It would seem that your program is just filtering the full cartesian product, right? The solution I'm looking for generates the elements one-by-one so that it could be used in a loop. One advantage of a generator over filtering the full product is that I, as the user of the generator, am not obligated to iterate over the entire solution space. Are there other _practical_ advantages of generators over mapping filtering complete sets? -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
funkyj [EMAIL PROTECTED] writes: One advantage of a generator over filtering the full product is that I, as the user of the generator, am not obligated to iterate over the entire solution space. Are there other _practical_ advantages of generators over mapping filtering complete sets? Storage. You can iterate over problem spaces much too large to fit in memory. Also generate + iterate can be faster because of reduced memory pressure. -- http://mail.python.org/mailman/listinfo/python-list
Programming challenge: wildcard exclusion in cartesian products
The python code below generates a cartesian product subject to any logical combination of wildcard exclusions. For example, suppose I want to generate a cartesian product S^n, n=3, of [a,b,c,d] that excludes '*a*b*' and '*c*d*a*'. See below for details. CHALLENGE: generate an equivalent in ruby, lisp, haskell, ocaml, or in a CAS like maple or mathematica. #--- # Short algorithm description # using function _genAll the program generates # cartesian product without sets, which match # some wildcarts # Sets generation uses recursion - # first of all sets will be generated with dimension 1 and than filtered through wildcarts # then sets will be generated with dimension 2 and filtered again # until the required set dimension is reached # Program avoids explicit generation of some part of CP sets # if the end of whildcart is asterics (*) and if the first part of whildcart (without astrics) # matches current set = then this set will be filtered out and won't be used in # higher dimension set generation # example *,1,*,2,* [1,2] dim = 10 # by dimension 2 only arrays [1,1],[2,1],[2,2] are will be generated # = array [1,2] won't be used in next recursion levels #--- # To obtaine result use function # CPWithoutWC first parameter is a list of any elements (char,int,string,class exemplar , any type) # secont param is CP dimension # other parameters are wildcarts = lists with any values then may include # special value ALL - asterics equivalent #Example of usage: command line # import cartesianProduct as cp # for i in cp.CPWithoutWC([1,2],3,[1,cp.ALL,2]): # print i # [1, 1, 1] # [1, 2, 1] # [2, 1, 1] # [2, 1, 2] # [2, 2, 1] # [2, 2, 2] # for i in cp.CPWithoutWC(['a','b'],3,['a',cp.ALL,'b'],['b',cp.ALL,'a']): # print i # ['a', 'a', 'a'] # ['a', 'b', 'a'] # ['b', 'a', 'b'] # ['b', 'b', 'b'] # for i in cp.CPWithoutWC([1,2],3,[1,cp.ALL,2],[2,cp.ALL,1]): #print i # [1, 1, 1] # [1, 2, 1] # [2, 1, 2] # [2, 2, 2] # # for i in cp.CPWithoutWC([1,2],121212,[1,cp.ALL],[2,cp.ALL,1]): #print i ## execute immediately # # if You don't want to print cp. before ALL and CPWithoutWC use import like this: # from cartesianProduct import ALL,CPWithoutWC # CPWithoutWC is a python generator. Which means that it returns values # immediately and generate next in next cycle. # Program example # ## from cartesianProduct import ALL,CPWithoutWC ## def main(): ## for i in cp.CPWithoutWC(['a','b'],3,['a',cp.ALL,'b'],['b',cp.ALL,'a']): ## ## do what You want with current value ## . ## ## go back to for statement and generate new ## if __name__ == __main__: ## main() # Using logical combinations of WC: 1) It's possible to pass on to the function CPWithoutWC any number of wildcarts after first two parameters, for example: CPWithoutWC([1,2],121212,[1,cp.ALL],[2,cp.ALL,1],...) where ... - is any other wildcart's additional function parameters. Number of additional WC is not limited. Function will filter out all combinations, which match any passed on WC. It's equal to WC1 | WC2 | , where | is python analog of OR logical operations. 2) To use more complex WC combinations follow these steps a) First of all create all needed WC b) Then use operators |, and braces () to create combinations required and then pass it on to function CPWithoutWCEx as the third parameter. Don't use or and and python statement, otherwise program will work improper. First two parameters of this function are the same as of CPWithoutWC function - set of elements and CP dimension. An example of what was described above in command line: from cartesianProduct import ALL,CPWithoutWC,CPWithoutWCEx,WC a = WC([ALL,1,ALL]) b = WC([ALL,2,ALL]) c = a b #filter out all sets which match a and b for i in CPWithoutWCEx([1,2],3,c) : print i [1, 1, 1] [2, 2, 2] # all sets where both 1 and 2 are present will be filtered out d = a | b for i in CPWithoutWCEx([1,2],3,d) : print i # returns nothing for i in CPWithoutWCEx([1,2,3],3,d) : print i [3, 3, 3] a = WC([2,1,ALL]) b = WC([1,2,ALL]) c = WC([ALL,2]) d = ( a | b ) c for i in CPWithoutWCEx([1,2],3,d) : print i [1, 1, 1] [1, 1, 2] [1, 2, 1] [2, 1, 1] [2, 2, 1] [2, 2, 2] # filters out all combinations which start with [1,2] or [2,1] and end with 2 Number of WC, which are used to form logical combinations is not limited. 13.02.2006 a)Two new function - CPWithoutWCEx_L and CPWithoutWC_L are added. Their interface is the same as of CPWithoutWCEx and CPWithoutWC accordingly, except that the third parameter is WC list and they accept strictly three parameters. As You can see these functions are very simple because python is quite flexible = def s(x,y):
Re: Programming challenge: wildcard exclusion in cartesian products
[EMAIL PROTECTED] wrote: The python code below generates a cartesian product subject to any logical combination of wildcard exclusions. For example, suppose I want to generate a cartesian product S^n, n=3, of [a,b,c,d] that excludes '*a*b*' and '*c*d*a*'. See below for details. CHALLENGE: generate an equivalent in ruby, lisp, haskell, ocaml, or in a CAS like maple or mathematica. What is your goal? You want to learn or to cause a flamewar? ;-) Anyway, I found the problem entertaining, so here you go, here is my Haskell code. It could be shorter if I didn't care about performance and wrote in specification style. It's not very efficient either, because it will generate all lists matching the given patterns. In GHCi you can test it by: $ ghci :l WildCartesian.hs test I apologise for the lack of comments. 88888888 module WildCartesian where import Data.Set (Set) import qualified Data.Set as Set import Control.Monad import Control.Exception (assert) import Maybe import List data Pat a = All | Lit a deriving Show generateMatching :: (Ord a) = Int - Set a - [Pat a] - [[a]] generateMatching 0 _[]= [[]] generateMatching 0 _(_:_) = [] generateMatching len alphabet (Lit x : ps) | x `Set.member` alphabet = [ (x : xs) | xs - generateMatching (len - 1) alphabet ps ] | otherwise = [ ] generateMatching len alphabet (All : ps) = [ (x : xs) | x - Set.toList alphabet , xs - unionSorted (generateMatching (len - 1) alphabet ps) (generateMatching (len - 1) alphabet (All : ps)) ] `unionSorted` generateMatching len alphabet ps generateMatching _ _[] = [] generateNotMatching :: (Ord a) = [a] - Int - [[Pat a]] - [[a]] generateNotMatching alphabet len patterns = generateMatching len alphaSet [All] `subtractSorted` foldr unionSorted [] (map (generateMatching len alphaSet . simplifyPat) patterns) where alphaSet = Set.fromList alphabet simplifyPat (All : All : ps) = simplifyPat (All : ps) simplifyPat (p : ps) = p : simplifyPat ps simplifyPat [] = [] joinSorted :: Ord a = [a] - [a] - [(Maybe a, Maybe a)] joinSorted (x1:x2:_) _ | assert (x1 x2) False = undefined joinSorted _ (y1:y2:_) | assert (y1 y2) False = undefined joinSorted (x:xs) (y:ys) = case x `compare` y of LT - (Just x, Nothing) : joinSorted xs (y:ys) EQ - (Just x, Just y) : joinSorted xs ys GT - (Nothing, Just y) : joinSorted (x:xs) ys joinSorted (x:xs) [] = (Just x, Nothing) : joinSorted xs [] joinSorted [] (y:ys) = (Nothing, Just y) : joinSorted [] ys joinSorted [] [] = [] unionSorted :: Ord a = [a] - [a] - [a] unionSorted xs ys = catMaybes (map (uncurry mplus) (joinSorted xs ys)) subtractSorted :: Ord a = [a] - [a] - [a] subtractSorted xs ys = catMaybes (map f (joinSorted xs ys)) where f (Just x, Nothing) = Just x f _ = Nothing test = do t [1,2] 3 [[Lit 1, All, Lit 2]] t ['a','b'] 3 [[Lit 'a', All, Lit 'b'], [Lit 'b', All, Lit 'a']] t [1,2] 3 [[Lit 1, All, Lit 2], [Lit 2, All, Lit 1]] where t a b c = do putStrLn (concat (intersperse [generateMatching, show a, show b, show c])) mapM_ (putStrLn . ( ++) . show) (generateNotMatching a b c) 88888888 Best regards Tomasz -- I am searching for programmers who are good at least in (Haskell || ML) (Linux || FreeBSD || math) for work in Warsaw, Poland -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Tomasz Zielonka wrote: putStrLn (concat (intersperse [generateMatching, show a, show b, show c])) Minor correction: it should be generateNotMatching. Best regards Tomasz -- I am searching for programmers who are good at least in (Haskell || ML) (Linux || FreeBSD || math) for work in Warsaw, Poland -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Major correction (missing case): Tomasz Zielonka wrote: generateMatching :: (Ord a) = Int - Set a - [Pat a] - [[a]] generateMatching 0 _[]= [[]] generateMatching 0 alphabet (All:ps) = generateMatching 0 alphabet ps generateMatching 0 _(_:_) = [] Best regards Tomasz -- I am searching for programmers who are good at least in (Haskell || ML) (Linux || FreeBSD || math) for work in Warsaw, Poland -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Flame war? Absolutely not. My reason is to learn. There are many sites dedicated to reasonably objective comparisons between languages. Here are two examples: http://www.smallscript.org/Language%20Comparison%20Chart.asp http://www.jvoegele.com/software/langcomp.html The wildcard exclusion problem is interesting enough to have many distinct, elegant solutions in as many languages. It would be interesting to see if they converge to roughly the same solution or if there are essential differences. And your code is a crash course in Haskell! Tossing aside the 'flame war' inquiry your code response is my only goal. I hope many others find the problem and responses as fascinating as I do. -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
The point is to submit elegant code that showcases the features of each language. And the problem is, just to clarify, given a set WC of wildcards in any logical combination, and if WC(S^n) is the set all s in S^n that matches the wildcards, then efficiently generate the complement S^n\WC(S^n). You are free to restate the problem in any equivalent way. -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
[EMAIL PROTECTED] schreef: There are many sites dedicated to reasonably objective comparisons between languages. Here are two examples: http://www.smallscript.org/Language%20Comparison%20Chart.asp http://www.jvoegele.com/software/langcomp.html http://shootout.alioth.debian.org/ -- Affijn, Ruud Gewoon is een tijger. echo 014C8A26C5DB87DBE85A93DBF |perl -pe 'tr/0-9A-F/JunkshoP cartel,/' -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Without much testing. Common Lisp Pattern exclusions are made lispy. (defun all-lists (list length) (unless (zerop length) (if (= length 1) (mapcar #'list list) (loop for elt in list nconc (mapcar (lambda (rest) (cons elt rest)) (loop for rest on list nconc (all-lists rest (1- length (defun cp-without-wc (source-list rest patterns) (let* ((length (length (first patterns))) (all-lists (all-lists source-list length))) (dolist (pattern patterns) (setf all-lists (set-difference all-lists (mapcar (lambda (insertion) (let ((cp (copy-list pattern))) (loop for place on cp when (eql :any (car place)) do (setf (car place) (pop insertion))) cp)) (all-lists source-list (count :any pattern))) :test #'equal))) (remove-duplicates all-lists :test #'equal))) CL-USER 22 (cp-without-wc '(a b) '(a :any b) '(b :any a)) ((A A A) (A B A) (B A B) (B B B)) CL-USER 23 (cp-without-wc '(abc xyz) '(abc :any xyz)) ((XYZ XYZ XYZ) (XYZ XYZ ABC) (XYZ ABC XYZ) (XYZ ABC ABC) (ABC XYZ ABC) (ABC ABC ABC)) CL-USER 24 (cp-without-wc '(a b) '(a :any :any)) ((B B B) (B B A) (B A B) (B A A)) CL-USER 25 (cp-without-wc '(a b) '(a :any :any) '(b :any :any)) NIL CL-USER 26 (cp-without-wc '(a b) '(:any :any b)) ((B B A) (B A A) (A B A) (A A A)) CL-USER 27 Wade -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
What I have in mind is the efficient, enumerated generation of the complement S^n/WC(S^n). A good program should initialize, generate, and terminate. T=cartprodex(S,n,WC); //initialize for all i in T do what you want with i test to see if any more terminate if not and it should do this without explicitly generating WC and then complementing. For example, if the cardinality of S is m, and the WC is just '*a*b*', with a != b, then EX(S^n):=S^n\WC(S^n) has cardinality (m-1)^(n-1)*(m+n-1). Specifically, if m=5 and n=10, then |EX|=3670016 while |S^10|=9765625, so that |EX|/|S^10| is about 0.3758. In general the program should directly generate EX from arbitrary WC. Of course, in practice the WC should themselves occur in a logically consistent manner, but let's just assume they're a given. -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
here is my version of the same. REPL output: CL-USER (tests) set = (1 2) n= 3 patterns = ((1 ANY 2)) --- (1 1 1) (1 2 1) (2 1 1) (2 1 2) (2 2 1) (2 2 2) set = (A B) n= 3 patterns = ((A ANY B) (B ANY A)) --- (A A A) (A B A) (B A B) (B B B) set = (1 2) n= 3 patterns = ((1 ANY 2) (2 ANY 1)) --- (1 1 1) (1 2 1) (2 1 2) (2 2 2) NIL CL-USER source: cartesian products minus wildcard patterns per: Newsgroups: comp.lang.lisp, etc... Subject: Programming challenge: wildcard exclusion in cartesian products Date: 16 Mar 2006 03:14:23 -0800 (defun show-me (x) (format t ~A~% x)) (defun set^n (fn set n optional acc) call `fn' on each permutation of `set' raised to the `n' power (if (= n 0) (funcall fn (reverse acc)) (dolist (e set) (set^n fn set (- n 1) (cons e acc) ;; test set^n by printing and visually inspecting the result (defun pr-set^n (set n) (set^n #'show-me set n)) ;; curry `set^n' so that `fn' is the only parameter (defun set^n-gen (set n) (lambda (fn) (set^n fn set n))) (defun mk-matchl-p (pat-list) return a function that tests a value against the patterns in `pat-list' (labels ((matchp (pat val) (cond ((null pat) t) ((or (eq (car pat) (car val)) (eq (car pat) :any)) (matchp (cdr pat) (cdr val)) (lambda (val) predicate: return true if val matches any pattern in `pat-list' (dolist (p pat-list) (if (matchp p val) (return t)) (defun not-fp (f-pred) return the complement of predicate `f-pred' (lambda (x) (not (funcall f-pred x ;; f-gen is a generator of the form returned by set^n-gen (defun accumulate-if (f-gen f-pred) accumulate values generated by f-gen that satisfy f-pred (let (acc) (funcall f-gen (lambda (x) (if (funcall f-pred x) (push x acc (nreverse acc))) ;; `pr-set^n-withoutWC' is the lisp equivalent (more or less) of ;; python code: ;;for i in cp.CPWithoutWC(x,y,z): print i (defun pr-set^n-withoutWC (set n pat-list) (format t ~%~%set = ~A~%n= ~A~%patterns = ~A~%~A~% set n pat-list ---) (dolist (e (accumulate-if (set^n-gen set n) (not-fp (mk-matchl-p pat-list (format t ~A~% e))) (defun tests () generate test output per the original problem examples (pr-set^n-withoutWC '(1 2) 3 '((1 :any 2))) (pr-set^n-withoutWC '(a b) 3 '((a :any b) (b :any a))) (pr-set^n-withoutWC '(1 2) 3 '((1 :any 2) (2 :any 1 -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
NOTE: I am a lisp newbie. I'm sure our resident lisp experts can create much better (both faster, shorter and clearer) solutions than the one above. Even I could have created something shorter but I thought it would be fun to apply the utility function approach in decomposing the problem. --jfc -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
in k: cp:{[c;n;p]+(n#c)_vs(!_ c^n)_dvl,/{2_sv+(,/,/:\:)/(),/:@[x;x=-1;:[;!c]]}'p} examples: cp[2;3;,0 -1 1] (0 0 0 0 1 0 1 0 0 1 0 1 1 1 0 1 1 1) cp[2;3;(0 -1 1;1 -1 0)] (0 0 0 0 1 0 1 0 1 1 1 1) cp[2;3;(0 -1 1;1 -1 1)] (0 0 0 0 1 0 1 0 0 1 1 0) arguments of cp: c = cardinality of the input set n = power p = list of patterns (-1 = wildcard) the algorithm directly computes the target set. in other words, it does not generate the set, then filter the matches from the target. modifying cp to accept s instead of the cardinality of s, patterns expressed in terms of elements of s, c. adds nothing of interest to the problem. [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] The python code below generates a cartesian product subject to any logical combination of wildcard exclusions. For example, suppose I want to generate a cartesian product S^n, n=3, of [a,b,c,d] that excludes '*a*b*' and '*c*d*a*'. See below for details. CHALLENGE: generate an equivalent in ruby, lisp, haskell, ocaml, or in a CAS like maple or mathematica. #--- # Short algorithm description # using function _genAll the program generates # cartesian product without sets, which match # some wildcarts # Sets generation uses recursion - # first of all sets will be generated with dimension 1 and than filtered through wildcarts # then sets will be generated with dimension 2 and filtered again # until the required set dimension is reached # Program avoids explicit generation of some part of CP sets # if the end of whildcart is asterics (*) and if the first part of whildcart (without astrics) # matches current set = then this set will be filtered out and won't be used in # higher dimension set generation # example *,1,*,2,* [1,2] dim = 10 # by dimension 2 only arrays [1,1],[2,1],[2,2] are will be generated # = array [1,2] won't be used in next recursion levels #--- # To obtaine result use function # CPWithoutWC first parameter is a list of any elements (char,int,string,class exemplar , any type) # secont param is CP dimension # other parameters are wildcarts = lists with any values then may include # special value ALL - asterics equivalent #Example of usage: command line # import cartesianProduct as cp # for i in cp.CPWithoutWC([1,2],3,[1,cp.ALL,2]): # print i # [1, 1, 1] # [1, 2, 1] # [2, 1, 1] # [2, 1, 2] # [2, 2, 1] # [2, 2, 2] # for i in cp.CPWithoutWC(['a','b'],3,['a',cp.ALL,'b'],['b',cp.ALL,'a']): # print i # ['a', 'a', 'a'] # ['a', 'b', 'a'] # ['b', 'a', 'b'] # ['b', 'b', 'b'] # for i in cp.CPWithoutWC([1,2],3,[1,cp.ALL,2],[2,cp.ALL,1]): #print i # [1, 1, 1] # [1, 2, 1] # [2, 1, 2] # [2, 2, 2] # # for i in cp.CPWithoutWC([1,2],121212,[1,cp.ALL],[2,cp.ALL,1]): #print i ## execute immediately # # if You don't want to print cp. before ALL and CPWithoutWC use import like this: # from cartesianProduct import ALL,CPWithoutWC # CPWithoutWC is a python generator. Which means that it returns values # immediately and generate next in next cycle. # Program example # ## from cartesianProduct import ALL,CPWithoutWC ## def main(): ## for i in cp.CPWithoutWC(['a','b'],3,['a',cp.ALL,'b'],['b',cp.ALL,'a']): ## ## do what You want with current value ## . ## ## go back to for statement and generate new ## if __name__ == __main__: ## main() # Using logical combinations of WC: 1) It's possible to pass on to the function CPWithoutWC any number of wildcarts after first two parameters, for example: CPWithoutWC([1,2],121212,[1,cp.ALL],[2,cp.ALL,1],...) where ... - is any other wildcart's additional function parameters. Number of additional WC is not limited. Function will filter out all combinations, which match any passed on WC. It's equal to WC1 | WC2 | , where | is python analog of OR logical operations. 2) To use more complex WC combinations follow these steps a) First of all create all needed WC b) Then use operators |, and braces () to create combinations required and then pass it on to function CPWithoutWCEx as the third parameter. Don't use or and and python statement, otherwise program will work improper. First two parameters of this function are the same as of CPWithoutWC function - set of elements and CP dimension. An example of what was described above in command line: from cartesianProduct import ALL,CPWithoutWC,CPWithoutWCEx,WC a = WC([ALL,1,ALL]) b = WC([ALL,2,ALL]) c = a b #filter out all sets which match a and b for i in CPWithoutWCEx([1,2],3,c) : print i [1, 1, 1] [2, 2, 2] # all sets where both 1 and 2 are present will be filtered out d = a | b for i in CPWithoutWCEx([1,2],3,d) : print i
Re: Programming challenge: wildcard exclusion in cartesian products
[EMAIL PROTECTED] [EMAIL PROTECTED] writes: The python code below generates a cartesian product subject to any logical combination of wildcard exclusions. For example, suppose I want to generate a cartesian product S^n, n=3, of [a,b,c,d] that excludes '*a*b*' and '*c*d*a*'. See below for details. I'm afraid that different programs in this thread has understood the asterisk differently: that it matches any single element, or that it matches any sequence of elements. -- __( Marcin Kowalczyk \__/ [EMAIL PROTECTED] ^^ http://qrnik.knm.org.pl/~qrczak/ -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
The asterisk '*' matches any sequence of elements, not just one element. The wildcard '*1*2*' would then correspond to a tuple with a 1 preceding a 2 in any positions. The wc '1*2' would correspond to a starting 1 and an ending 2 with anything in between. The wc *12* would correspond to a 1 adjacent to a 2 with the pair in any position. Possibilities like '*a*a*b*' and '*a*a*a*' of any length are also allowed. If n is the dimension, then any n-tuple wc is just a point. My apologies for the confusion. -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Is this Haskell implementation what you want? It does the wildcard matching through a state machine and it essentially threads the state machine through the cartesian product, switching to the ordinary cartesian product when possible as an optimisation. The execution of the state machine is shared by strings with the same prefix making it reasonably efficient even though the state machine itself isn't optimised. If it doesn't work, I'm sure it's only a few typos away... -- generate strings of length n from alphabet l such that -- the state machine, with transition function t, is not on -- a final state (determined by function f) at the -- end of the string. -- If the state is ever 'unmatchable' (as determined by u) -- we just return the cartesian product as no rejection -- can take place. generate f u t s 0 l = if f s then [] else [[]] generate f u t s n l | u s = sequence (replicate n l) | otherwise = [a:b | a - l, let s' = t s a, b - generate f u t s' (n-1) l] -- The states are lists of regular expressions -- where [a,b,..] means match a or b or... -- This is the transition function for our machine. transition pat a = pat = d a where -- Brzozowski derivative d a [] = [] d a p@('*':pat) = p:d a pat d a (p:pat) | a==p = [pat] | otherwise = [] -- A terminal state is one that matches the null string terminal p = or $ map terminal' p where terminal' = True terminal' ('*':p) = terminal' p terminal' _ = False run n alphabet pat = generate terminal null transition [pat] n alphabet test = run 3 abc aa*a -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
[EMAIL PROTECTED] wrote: What I have in mind is the efficient, enumerated generation of the complement S^n/WC(S^n). A good program should initialize, generate, and terminate. T=cartprodex(S,n,WC); //initialize for all i in T do what you want with i test to see if any more terminate if not and it should do this without explicitly generating WC and then complementing. For example, if the cardinality of S is m, and the WC is just '*a*b*', with a != b, then EX(S^n):=S^n\WC(S^n) has cardinality (m-1)^(n-1)*(m+n-1). Specifically, if m=5 and n=10, then |EX|=3670016 while |S^10|=9765625, so that |EX|/|S^10| is about 0.3758. In general the program should directly generate EX from arbitrary WC. Of course, in practice the WC should themselves occur in a logically consistent manner, but let's just assume they're a given. Another attempt. I have made no special attempt to create an exclusion language, just used an anonymous lambda predicate. ;; Wade Humeniuk (defclass odometer () ((base :initform 0 :accessor base) (meter :initform nil :accessor meter) (n-digits :initarg :n-digits :accessor n-digits) (digit-set :initarg :digit-set :accessor digit-set))) (defmethod initialize-instance :after ((obj odometer) rest initargs) (setf (base obj) (length (digit-set obj)) (meter obj) (make-array (n-digits obj) :initial-element 0) (digit-set obj) (coerce (digit-set obj) 'vector))) (defun inc-odometer (odometer) (loop with carry = 1 for i from (1- (n-digits odometer)) downto 0 for digit = (incf (aref (meter odometer) i) carry) if (= digit (base odometer)) do (setf (aref (meter odometer) i) 0) (setf carry 1) else do (setf carry 0) while (not (zerop carry (defun zero-meter-p (odometer) (every #'zerop (meter odometer))) (defmethod next-set ((obj odometer)) (prog1 (map 'list (lambda (digit) (aref (digit-set obj) digit)) (meter obj)) (inc-odometer obj))) (defclass cs-with-wc (odometer) ((exclusion :initarg :exclusion :accessor exclusion) (at-end :initform nil :accessor at-end))) (defmethod next-set ((obj odometer)) (tagbody :next (unless (at-end obj) (let ((set (call-next-method))) (when (zero-meter-p obj) (setf (at-end obj) t)) (if (not (funcall (exclusion obj) set)) (return-from next-set set) (go :next)) (defun print-all-cs (set length exclusion) (let ((cs-with-wc (make-instance 'cs-with-wc :n-digits length :digit-set set :exclusion exclusion))) (loop for set = (next-set cs-with-wc) while set do (print set CL-USER 134 (cs-with-wc '(a b) 3 (lambda (set) (destructuring-bind (x y z) set (or (and (eql x 'a) (eql z 'b)) (and (eql x 'b) (eql z 'a)) (A A A) (A B A) (B A B) (B B B) NIL CL-USER 135 (cs-with-wc '(a b) 3 (lambda (set) (eql (second set) 'a))) (A B A) (A B B) (B B A) (B B B) NIL CL-USER 136 (cs-with-wc '(abc xyz) 3 (lambda (set) (and (eql (first set) 'abc) (eql (third set) 'xyz (ABC ABC ABC) (ABC XYZ ABC) (XYZ ABC ABC) (XYZ ABC XYZ) (XYZ XYZ ABC) (XYZ XYZ XYZ) NIL -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
Oops, problems cutting an pasting, should be, ;; Wade Humeniuk (defclass odometer () ((base :initform 0 :accessor base) (meter :initform nil :accessor meter) (n-digits :initarg :n-digits :accessor n-digits) (digit-set :initarg :digit-set :accessor digit-set))) (defmethod initialize-instance :after ((obj odometer) rest initargs) (setf (base obj) (length (digit-set obj)) (meter obj) (make-array (n-digits obj) :initial-element 0) (digit-set obj) (coerce (digit-set obj) 'vector))) (defun inc-odometer (odometer) (loop with carry = 1 for i from (1- (n-digits odometer)) downto 0 for digit = (incf (aref (meter odometer) i) carry) if (= digit (base odometer)) do (setf (aref (meter odometer) i) 0) (setf carry 1) else do (setf carry 0) while (not (zerop carry (defun zero-meter-p (odometer) (every #'zerop (meter odometer))) (defmethod next-set ((obj odometer)) (prog1 (map 'list (lambda (digit) (aref (digit-set obj) digit)) (meter obj)) (inc-odometer obj))) (defclass cs-with-wc (odometer) ((exclusion :initarg :exclusion :accessor exclusion) (at-end :initform nil :accessor at-end))) (defmethod next-set ((obj cs-with-wc)) (tagbody :next (unless (at-end obj) (let ((set (call-next-method))) (when (zero-meter-p obj) (setf (at-end obj) t)) (if (not (funcall (exclusion obj) set)) (return-from next-set set) (go :next)) (defun print-all-cs (set length exclusion) (let ((cs-with-wc (make-instance 'cs-with-wc :n-digits length :digit-set set :exclusion exclusion))) (loop for set = (next-set cs-with-wc) while set do (print set CL-USER 7 (print-all-cs '(a b) 3 (lambda (set) (destructuring-bind (x y z) set (or (and (eql x 'a) (eql z 'b)) (and (eql x 'b) (eql z 'a)) (A A A) (A B A) (B A B) (B B B) NIL CL-USER 8 (print-all-cs '(abc xyz) 3 (lambda (set) (and (eql (first set) 'abc) (eql (third set) 'xyz (ABC ABC ABC) (ABC XYZ ABC) (XYZ ABC ABC) (XYZ ABC XYZ) (XYZ XYZ ABC) (XYZ XYZ XYZ) NIL CL-USER 9 -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming challenge: wildcard exclusion in cartesian products
-- The states are lists of regular expressions -- where [a,b,..] means match a or b or... I haven't run or studied your program yet myself but what I had in mind was that the list of wc's are *all* to be excluded, so the list [wc1..wcn] is to correspond generating all tuples matching not(wc1 and .. and wcn). Maybe you're already doing that. The wc's themselves could be logical statements among the 'primitive' wc's. That's why I named it the 'wildcard exclusion problem'. It's a lot easier to specify a list of simpler wc's than create a long logical expression. Thanks to all who are making this an interesting thread. -- http://mail.python.org/mailman/listinfo/python-list