Once in a while I show some comparisons of code translated in different languages.

A challenge from Reddit-Dailyprogrammer:

http://www.reddit.com/r/dailyprogrammer/comments/ywm28/8272012_challenge_92_difficult_bags_and_balls/

The purpose of this post of mine is to compare a nice looking Haskell solution of this problem to its Python/D translation. This means this post is about language design matters and it's not about problem solving :-)

(I have used a little program from that Reddit because this allows to show in this post complete runnable programs. Usually you can't do this with real-world code. But I can attest that the lazyness and other things shown in this program and this post are present and used/useful in much larger programs too.)

------------------------------

The little problem:

Compute all the permutations of placing 9 balls into 4 bags such that each bag contains an odd number of balls. Ball count is transient when placing bags within one another. For example, a bag containing 3 balls is placed inside a bag containing 2 balls. The inner bag contains 3 balls and the outer bag contains 5 balls. Some example permutations:

((((9))))
(8(((1))))
(1)(1)((7))

------------------------------

The Haskell solution written by Ledrug, a little reformatted:


bags :: Int -> Int -> Int -> Int -> [String]
bags 0 _ 0 _ = [""]
bags b c n m
    | b <= 0 || n <= 0 || c <= 0 || m <= 0 = []
    | otherwise = [l ++ r |
                    n1 <- [1 .. m],
                    b1 <- if n1 == m then [1 .. c] else [1 .. b],
                    l <- bag b1 n1,
                    r <- bags (b - b1) b1 (n - n1) n1]

bag :: Int -> Int -> [String]
bag 0 0 = [""]
bag b n
    | b <= 0 || n <= 0 = []
    | even n = []
    | otherwise = ["(" ++ replicate n1 '*' ++ chain ++ ")" |
                    n1 <- [0 .. n],
chain <- bags (b - 1) (b - 1) (n - n1) (n - n1)]

pick :: Int -> Int -> [String]
pick b n = bags b b n n

main = do
    mapM_ putStrLn $ (pick 4 9)

------------------------------

A direct translation to Python2 of the Haskell code comes very similar, it's quite lazy, and it's short enough:


def pick(nbags, nballs):
    def bag(b, n):
        if b == 0 and n == 0:
            return [""]
        if b <= 0 or n <= 0 or n % 2 == 0:
            return []
        return ("(" + '*' * n1 + chain + ")"
                for n1 in xrange(n + 1)
                for chain in bags(b - 1, b - 1, n - n1, n - n1))

    def bags(b, c, n, m):
        if b == 0 and n == 0:
            return [""]
        if b <= 0 or n <= 0 or c <= 0 or m <= 0:
            return []
        return (l + r
                for n1 in xrange(1, m + 1)
for b1 in (xrange(1, c+1) if n1 == m else xrange(1, b+1))
                for l in bag(b1, n1)
                for r in bags(b - b1, b1, n - n1, n1))

    return bags(nbags, nbags, nballs, nballs)

def main():
    for sol in pick(4, 9):
        print sol

main()


I have used inner functions because here they keep the global namespace more clean. The user needs to see just the pick() function.

------------------------------

This is a direct translation of the Python2 code to D (but uses eager arrays, that currently are more idiomatic in D):


import std.stdio, std.array, std.range;

string[] pick(in int nbags, in int nballs) /*pure nothrow*/ {
    static struct Namespace {
        static string[] bag(in int b, in int n) /*pure nothrow*/ {
            if (b == 0 && n == 0)
                return [""];
            if (b <= 0 || n <= 0 || n % 2 == 0)
                return [];
            typeof(return) result;
            foreach (n1; 0 .. n + 1)
foreach (chain; bags(b - 1, b - 1, n - n1, n - n1)) result ~= "(" ~ std.array.replicate("*", n1) ~ chain ~ ")";
            return result;
        }

static string[] bags(in int b, in int c, in int n, in int m) /*pure nothrow*/ {
            if (b == 0 && n == 0)
                return [""];
            if (b <= 0 || n <= 0 || c <= 0 || m <= 0)
                return [];
            typeof(return) result;
            foreach (n1; 1 .. m + 1)
                // iota is not pure, nor nothrow
foreach (b1; (n1 == m) ? iota(1, c+1) : iota(1, b + 1))
                    foreach (l; bag(b1, n1))
                        foreach (r; bags(b - b1, b1, n - n1, n1))
                            result ~= l ~ r;
            return result;
        }
    }

    return Namespace.bags(nbags, nbags, nballs, nballs);
}

void main() {
    writefln("%-(%s\n%)", pick(4, 9));
}


In this D code there are few temporary problems:
- Iota is not pure nor nothrow, so none of the functions can be pure nothrow. This Phobos problem will be fixed. - I have had to fully qualify the path of std.array.replicate, to avoid a name clash. Maybe this little problem will be solved.

There are some small problems:
- I have used a static inner struct Namespace (Don's idea), to allow nested mutual recursion. It's not very nice, but for me it's acceptable, also because mutual recursion among inner functions is not so common. - "*".replicate(n1) is less nice and less short than the Python syntax "*" * n1, but it's not a big problem. - The "%-(%s\n%)" formatting string is complex, this syntax isn't so easy to remember, on the other hand here it has a voided one foreach. So I think it's acceptable. - Currently the DMD front-end doesn't contain logic that specifically optimimizes better std.range.iota. Maybe in some years this will improve. In the meantime this is not a big problem.


A problem with inner functions is that in both D and Python they become hard(er) to unittest. In D I'd like this to be valid code:

void main() {
    int foo() { return 1; }
    unittest { assert(foo() == 1; }
}


There is a significant problem:
- This program generates a small sequence of solutions, so lazyness is not important. But sometimes the algoritms have to generate tons of things, and lazyness becomes useful. This D program is not lazy, unlike the Haskell and Python programs. Writing this program with lazyness is not too much hard, but the code becomes uglier, longer and more bug prone. I think C# designers did the right thing adding yield to their language, because it's a construct very often useful.

------------------------------

This is a lazy translation in D, using opApply:


import std.stdio, std.array, std.range;

auto pick(in int nbags, in int nballs) /*pure nothrow*/ {
    static struct Namespace {
        static struct Bag {
            int b, n;
            int opApply(int delegate(ref string) dg) {
                int result;
                string aux;
                if (b == 0 && n == 0) {
                    aux = "";
                    result = dg(aux);
                    if (result) goto END;
                }
                if (b <= 0 || n <= 0 || n % 2 == 0)
                    goto END;
                foreach (n1; 0 .. n + 1)
foreach (chain; Bags(b - 1, b - 1, n - n1, n - n1)) { aux = "(" ~ std.array.replicate("*", n1) ~ chain ~ ")";
                        result = dg(aux);
                        if (result) goto END;
                    }
                END: return result;
            }
        }

        static struct Bags {
            int b, c, n, m;
            int opApply(int delegate(ref string) dg) {
                int result;
                string aux;
                if (b == 0 && n == 0) {
                    aux = "";
                    result = dg(aux);
                    if (result) goto END;
                }
                if (b <= 0 || n <= 0 || c <= 0 || m <= 0)
                    goto END;
                foreach (n1; 1 .. m + 1)
foreach (b1; (n1 == m) ? iota(1, c+1) : iota(1, b + 1))
                        foreach (l; Bag(b1, n1))
foreach (r; Bags(b - b1, b1, n - n1, n1)) {
                                aux = l ~ r;
                                result = dg(aux);
                                if (result) goto END;
                            }

                END: return result;
            }
        }
    }

    return Namespace.Bags(nbags, nbags, nballs, nballs);
}

void main() {
    foreach (sol; pick(4, 9))
        writeln(sol);
}


In main() I have had to use a foreach because currently writeln() is not able to print all the items of an iterable implemented with opApply. I hope this will be fixed (http://d.puremagic.com/issues/show_bug.cgi?id=4264 ).

Both Walter and Andrei don't like opApply a lot, and I understand why they do (for uniformity, language simplicity, for the higher flexibility of ranges). On the other hand if you try to translate the Haskell code to D using Ranges it becomes long.

------------------------------

Your can object that this is not idiomatic D code, because I was translating (beautiful, short, readable, safe) Haskell/Python code. My answer is that this kind of lazy code is so commonly useful that I'd like it to be better supported in D :-)

Bye,
bearophile

Reply via email to