I asked where index origin 0 would help.
(I am familiar with Dijkstra's article on why counting should start
at 0 and agree whole-heartedly.  I am also convinced that "kicking
against the goads" is not helpful.)

Peter Wolf <[EMAIL PROTECTED]>
has provided two examples.

(1) Converting pseudo-code which assumes index origin 0.

    I've done this myself, although not yet in R.  I've also done it
    the other way, converting index origin 1 code to C.

    A good method, I've found, is to to start by converting to
    index-free form.  This applies whatever the source and target
    languages are.

(2) A sorting algorithm.

    sort.6 <- function (a) {
       n <- length(a)
       adapt <- function (i) {i+1}
       a <- c(0,a)
       for (i in 2:n) {
          j <- i-1
          a[adapt(0)] <- a[adapt(i)]
          while (a[adapt(j)] > a[adapt(0)]) {
             a[adapt(j+1)] <- a[adapt(j)]
             j <- j-1
          }
          a[adapt(j+1)] <- a[adapt(0)]
       }
       a[-1]
    }

    The really interesting thing here is that this basically is an
    index origin 1 algorithm.  The original array and the final result
    start at 1, not 0, and position 0 is used for a "sentinel".

    Let's convert it to 1-origin.  I'll skip the details of how I
    did it because that's not the point I want to make.

    sort.VI <- function (a) {
       a <- c(0, a)
       for (i in 3:length(a)) {
          j <- i-1
          a[1] <- a[i]
          while (a[j] > a[1]) {
             a[j+1] <- a[j]
             j <- j-1
          }
          a[j+1] <- a[1]
       }
       a[-1]
    }

    What do you get if you move up to index-free form?

    sort.six <- function (a) {
        s <- c()
        for (x in a) {
            f <- s <= x
            s <- c(s[f], x, s[!f])    # insert x stably into s
        }
        s
    }

    It's clear that sort.six is shorter, clearer, and easier to get
    right than sort.VI.  But how much do we have to pay for this?
    How much efficiency do we lose?

    > a <- runif(400)
    > system.time(sort.VI(a))
    [1]  3.64  0.02 12.56  0.00  0.00
    > system.time(sort.six(a))
    [1] 0.15 0.01 0.16 0.00 0.00

    We don't lose any efficiency at all.  We gain, considerably.
    (Not as much as we'd gain by using the built-in sort(), of course.)

I expect that this will happen fairly often:  rewriting the code to be
index-free will *usually* make it shorter and clearer, and will *always*
make it easier to adapt to a language with a different index origin.
When the target language is R or S, the result is likely to be faster
than a direct conversion.

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Reply via email to