Gustaf Neumann wrote:
This is most probably the best variabt so far, and not complicated, such a
optimizer can do "the right thing" easily. sorry for the many versions..
-gustaf


    { unsigned register int s = (size-1) >> 3;
      while (s>1) { s >>= 1; bucket++; }
    }

      if (bucket > NBUCKETS) {
    bucket = NBUCKETS;
      }

I don't think anyone has pointed this out yet, but this is a logarithm in base 2 (log2), and there are a fair number of implementations of this available; for maximum performance there are assembly implementations using 'bsr' on x86 architectures, such as this one from google's tcmalloc:

// Return floor(log2(n)) for n > 0.
#if (defined __i386__ || defined __x86_64__) && defined __GNUC__
static inline int LgFloor(size_t n) {
  // "ro" for the input spec means the input can come from either a
  // register ("r") or offsetable memory ("o").
  size_t result;
  __asm__("bsr  %1, %0"
          : "=r" (result)               // Output spec
          : "ro" (n)                    // Input spec
          : "cc"                        // Clobbers condition-codes
          );
  return result;
}
#else
// Note: the following only works for "n"s that fit in 32-bits, but
// that is fine since we only use it for small sizes.
static inline int LgFloor(size_t n) {
  int log = 0;
  for (int i = 4; i >= 0; --i) {
    int shift = (1 << i);
    size_t x = n >> shift;
    if (x != 0) {
      n = x;
      log += shift;
    }
  }
  ASSERT(n == 1);
  return log;
}
#endif

(Disclaimer - this comment is based on my explorations of zippy, not vt, so the logic may be entirely different) If this log2(requested_size) is used to translate directly index into the bucket table that necessarily restricts you to having power-of-2 bucket sizes, meaning you allocate on average nearly 50% more than requested (i.e., nearly 33% of allocated memory is overhead/wasted). Adding more, closer-spaced buckets adds to the base footprint but possibly reduces the max usage by dropping the wasted space. I believe tcmalloc uses buckets spaced so that the average waste is only 12.5%.

-J

Reply via email to