Thanks Billie/Josh! That indeed fixed the issue; the scan now returns
instantly!!
So when we scan the whole table and filter by column family, Accumulo
still has to go through all rows (ordered by key) and check whether each
entry has the specified column family, and in my case since the
Keith kicked off a Continuous Ingest test of 1.6.4 on a cluster of 17
tablet servers. No servers were killed (intentionally, or otherwise). No
data was lost (37B Key/Value pairs).
-Eric
On Fri, Oct 2, 2015 at 10:14 AM, Josh Elser wrote:
> This vote passes with 4 +1s and nothing else.
>
> Thank
Yup, that's exactly what my hunch was. You can try configuring a locality
group for your "slow" column families, compact the table and then rerun
your scans. They should be fast after you do this.
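
To illustrate why a locality group helps here, below is a toy Python model, not Accumulo's actual implementation. It assumes entries are simple (row, column family, value) tuples kept in sorted order, and that a locality group stores its families in a separate sorted run. The function names, the family names "big" and "rare", and the group name "small" are all hypothetical.

```python
from collections import defaultdict


def scan_without_groups(entries, wanted_cf):
    """All families interleaved in one sorted run: every entry is examined."""
    examined = 0
    hits = []
    for row, cf, value in sorted(entries):
        examined += 1
        if cf == wanted_cf:
            hits.append((row, cf, value))
    return hits, examined


def scan_with_groups(entries, groups, wanted_cf):
    """groups maps a group name to a set of column families.

    Families not listed in any group fall into a shared "default" group.
    Only the run holding wanted_cf is examined.
    """
    cf_to_group = {cf: name for name, cfs in groups.items() for cf in cfs}
    partitions = defaultdict(list)
    for entry in entries:
        partitions[cf_to_group.get(entry[1], "default")].append(entry)

    target = partitions[cf_to_group.get(wanted_cf, "default")]
    examined = 0
    hits = []
    for row, cf, value in sorted(target):
        examined += 1
        if cf == wanted_cf:
            hits.append((row, cf, value))
    return hits, examined


# Hypothetical data: one family with many rows, one with only a few.
entries = [(f"row{i:06d}", "big", "v") for i in range(100_000)]
entries += [(f"row{i:06d}", "rare", "v") for i in (10, 50_000, 99_999)]

hits_a, examined_a = scan_without_groups(entries, "rare")
hits_b, examined_b = scan_with_groups(entries, {"small": {"rare"}}, "rare")

print("entries examined without groups:", examined_a)
print("entries examined with groups:   ", examined_b)
```

In this model both scans return the same three "rare" entries, but the first examines all 100,003 entries while the second examines only 3. The compaction step matters because existing files keep their old layout until they are rewritten.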
On Oct 5, 2015 11:25 AM, "z11373" wrote:
> Hi Josh,
> I see there are 4 tablet files for that table,
Yes. In this case, I would suggest configuring the column families that
have very few rows to be in a separate locality group. You should be able
to do this in the shell with the command:
setgroups groupname=colf1,colf2,colf3 -t tablename
Here, groupname is an arbitrary name for the group; colf1
Hi Josh,
I see there are 4 tablet files for that table, and all of them are in range
from 730MB to 860MB in size.
The column families that have the problem are in 2 of those 4 tablet
files.
They have only a few rows, whereas the column families that have no
problem have millions of rows.