This is for Accumulo 1.6.  Suppose we have the table splits

   c
   g
   w

Does anyone know how to determine

   1. *the number of tablets assigned to each table split range?*
   For this example, that is the number of tablets in each of the ranges
   (-Inf,c), (c,g), (g,w), (w,Inf).  Or is the design 1-1, that is, does
   each table split range hold exactly one tablet?
   2. *the number of rows inside all the tablets occupying a table split
   range?*
   For this example, that is the total number of rows among all tablets in
   each of the ranges (-Inf,c), (c,g), (g,w), (w,Inf).

We use these counts to verify how well manually set table splits balance
the load across a table.
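
For concreteness, the per-range row count from item 2 can be sketched in
plain Python.  This is only a sketch, not Accumulo code: the helper names
split_ranges and count_rows_per_range are made up for illustration, the row
IDs are assumed to be in hand already as an in-memory list (in practice they
would come from a scan), and each range is assumed to exclude its lower
split point and include its upper one, with plain string comparison standing
in for Accumulo's byte-wise row ordering.

```python
from bisect import bisect_left


def split_ranges(splits):
    """Return the (lower, upper) bounds of each split range.

    None stands for an unbounded end, i.e. -Inf or Inf.
    """
    points = sorted(splits)
    bounds = [None] + points + [None]
    return list(zip(bounds[:-1], bounds[1:]))


def count_rows_per_range(splits, row_ids):
    """Count how many row IDs fall into each split range.

    Assumption: a row equal to a split point belongs to the range that
    ends at that split (upper bound inclusive, lower bound exclusive).
    """
    points = sorted(splits)
    counts = [0] * (len(points) + 1)
    for row in row_ids:
        # bisect_left returns the index of the first split >= row,
        # which is also the index of the range that contains the row.
        counts[bisect_left(points, row)] += 1
    return counts


if __name__ == "__main__":
    splits = ["c", "g", "w"]
    rows = ["a", "c", "d", "g", "h", "x", "z"]  # made-up sample row IDs
    for bounds, n in zip(split_ranges(splits), count_rows_per_range(splits, rows)):
        print(bounds, n)
```

With the splits c, g, w and the made-up rows a, c, d, g, h, x, z, this gives
counts of 2, 2, 1, 2 for the four ranges.  A heavily skewed list of counts
would suggest the manual splits are not balancing well.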

Some context: two years ago, while working on D4M with Accumulo 1.5, I
wrote functions that found these numbers.  I took the dark route of using
non-public Accumulo APIs to get TabletServer information, get TabletStats
information, and match them to a table's splits by scanning the extents
listed in the metadata table.  I can share the code if anyone is curious.
It's not pretty, but it did the job.

Moving forward as we aim to upgrade to Accumulo 1.6, we should determine
the tablet split information the right way, not by reverse engineering
Accumulo.  Any suggestions?

Thanks,
Dylan Hutchison

-- 
www.cs.stevens.edu/~dhutchis
