Phobos usability

bearophile Fri, 24 Dec 2010 00:35:29 -0800

Another post about Phobos usability. Again the relative code is a rosettacode 
Task:


>Assume we have a collection of numbers, and want to find the one with the 
>largest minimal prime factor (that is, the one that contains relatively large 
>factors). To speed up the search, the factorization should be done in parallel 
>using separate threads or processes, to take advantage of multi-core CPUs.<

The original D2 implementation (probably by feepinCreature), with little 
changes:
http://rosettacode.org/wiki/Parallel_calculations#D

This D2 version is not bad, but there are some ways to improve it.

------------------------------------

1) Finding the max or min of an iterable is a really common need. Probably I 
need to do it every 50-100 lines of code or less. This is how it is found in 
that program:

reduce!q{max(a, b)}(0L, minFactors));

I strongly suggest the max() and min() to find the max of an iterable by 
themselves. And to optionally accept a mapping function (as schwartzSort). See:
http://d.puremagic.com/issues/show_bug.cgi?id=4705

With that the line of code becomes more readable and shorter, and surely 
bug-free:
max(minFactors);

Similar considerations suggest the creation of a sum() function:
http://d.puremagic.com/issues/show_bug.cgi?id=4725

A sum() is not just a reduce, because it is allowed to use an optimization. See 
issue 4725.

------------------------------------

2) Clojure language shows how nice is a pmap(), parallel map.

If necessary two different pmap() may be created, one where the single mapping 
operation is slow enough, and one where it is very cheap. For this Task I use 
the pmap() for costly mapping function.

Using pmap() the D2 program may be shortened to (this loses the timings, but 
they are not required):

import std.stdio, std.math, std.algorithm, std.parallel;

pure ulong lowestFactor(immutable ulong n) {
    if (n % 2 == 0)
        return 2;
    else {
        immutable ulong limit = cast(ulong)sqrt(n) + 1;
        for (ulong i = 3; i < limit; i += 2)
            if (n % i == 0)
                return i;
    }
    return n;
}

void main() {
    auto numbers = [2UL^^61-1, 2UL^^61-1, 2UL^^61-1, 112272537195293,
                    115284584522153, 115280098190773, 115797840077099,
                    112582718962171, 112272537095293, 1099726829285419];

    writefln("Largest min. factor is %s.",
             max(pmap(&lowestFactor, numbers)));
}


pmap is able to see that the given mapping function is pure. Here 
lowestFactor() is strongly pure, so pmap() is free to ignore the execution 
order of the single lowestFactor functions.

An alternative syntax for pmap is similar to the map:
pmap!(&lowest_factor)(numbers)

------------------------------------

3) For debugging I may want to print the array of lowest factors:
writeln(pmap(&lowest_factor, numbers));

Or even just:
writeln(map!(&lowest_factor)(numbers));

writeln() is able to print a lazy iterable, but it prints it just as an array. 
This is bad, because it's important to give cues to the person that reads to 
the printout regarding the types of the types printed.

A simple way to tell apart lazy sequences from arrays is to use a semicolon 
instead a comma to tell apart items (lists are sometimes written using a 
semicolon in other functional languages, so this is not a new thing):

[0; 1; 2; 3; 4]

See:
http://d.puremagic.com/issues/show_bug.cgi?id=3813

Empty dynamic arrays, empty static arrays, empty associative arrays and empty 
lazy iterables need some kind of output when they are printed, I suggest a 
simple "[]" at least. With no output it's too much easy to confuse "no output" 
with "empty array" with "missing writeln" with "not run code path", etc. (In 
bugzilla there is a bug report about this, not written by me, I don't remember 
its number).

Bye,
bearophile

Phobos usability

Reply via email to