On 10/8/11 12:30 PM, Xinok wrote:
On 10/8/2011 12:47 PM, Andrei Alexandrescu wrote:
Nice writeup, but I found it quite difficult to get into. What would
help is anchoring it with already known stuff (if it's not, the reader
must assume it's unrelated, which makes things difficult). So it would
be great if you compared and contrasted range swap with the in-place
merge algorithm (e.g. http://www.sgi.com/tech/stl/inplace_merge.html),
STL's stable sort (http://www.sgi.com/tech/stl/stable_sort.html) which
is O(N log(N) log(N)), and possibly with std.algorithm.bringToFront.

Simply presenting a stylized implementation of swap range would be
helpful.

I didn't mean for this text to be anything official. I just felt it was
important to provide an explanation of my algorithm so others could
better understand my algorithm and it's implications. That's all.
There's also the issue of, "what if I'm not the first?" I couldn't find
anything similar to the "range swap", but that doesn't mean it didn't
already exist.

Writing papers isn't my forte, I'm a self taught programmer. So if my
algorithm ever gains popularity, I'll let the experts deal with it.

My attempt was to make a few suggestions that would improve your writeup, which in turn not only helps popularity of the algorithm, but also you to hone a useful skill.

Also there are a few oddities in the text:

* "- Constant additional memory (one memory allocation per thread)" ->
the parenthesis does not sustain the point. There could be one memory
allocation but it might allocate a non-constant amount.

I thought it was important to clarify that. My algorithm is easy to
parallelize, but it does require allocating a unique block of memory for
each thread. It is relatively constant as well, as it would make sense
that the number of running threads matches the number of cores in the
hardware. The only reason to allocate a non-constant amount is if you
include the optimization I mentioned, to allocate O(n/1024) space.

Then you may want to say:

* Constant additional memory and one call to an allocation routine per thread.

* All discussion about tail call optimization is unneeded. Tail calls
can be converted trivially to loops, so don't mention anything. Feel
free to convert to loops if needed.

I think it's an issue worth addressing though. Some programmers might
assume that, because it's a variant of merge sort, stack overflows won't
be an issue. When I originally implemented my algorithm, I didn't use
tail calls and I had problems with stack overflows on partially sorted
data. So it is an issue.

No, it's not. Please think through: tail recursion IS A loop, period. (BTW I see you use simple tail recursion as opposed to the more general tail calls.) It's cut and dried. So no discussion is needed. Just mention there is no risk of stack overflow, without ever mentioning tail calls. In fact there's not even a need to mention that because otherwise the algorithm would have non-constant space complexity, which you clarify it doesn't.


Andrei

Reply via email to