Yes, the data has not yet been ingested, and I control the table structure.
I hope to do that by integrating (or extending) the D4M schema.

I'm leaning towards using as part of
the ingest process. On startup, existing tables would be analyzed to
find their current cardinality. Then, as records are ingested, the
cardinality would be adjusted as needed. I don't yet know how to store
the cardinality information so that restarting the ingest process doesn't
require re-processing all the data. Still researching.
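
To make the restart question concrete, here is a rough sketch of the idea in
Python. It is not tied to any Accumulo API or to an existing HyperLogLog
library; the class name, the register count (p=12) and the to_bytes/from_bytes
helpers are all my own assumptions for illustration. The point is that the
whole estimator state is a small fixed-size byte array, so it could be written
out somewhere (for example as a single value) and reloaded on startup instead
of re-reading the data.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog sketch: p index bits give m = 2**p registers."""

    def __init__(self, p=12, registers=None):
        self.p = p
        self.m = 1 << p
        # One small integer per register; all zeros means "nothing seen yet".
        self.registers = bytearray(registers) if registers is not None else bytearray(self.m)

    def add(self, value):
        # Use the first 8 bytes of SHA-1 as a 64-bit hash of the value.
        h = int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest()[:8], "big")
        idx = h >> (64 - self.p)                # top p bits pick the register
        rest = h & ((1 << (64 - self.p)) - 1)   # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)   # bias constant for m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return round(self.m * math.log(self.m / zeros))
        return round(raw)

    def to_bytes(self):
        # Hypothetical persistence format: one byte for p, then the registers.
        return bytes([self.p]) + bytes(self.registers)

    @classmethod
    def from_bytes(cls, data):
        return cls(p=data[0], registers=data[1:])


# Usage with the NAME example from the original question.
hll = HyperLogLog()
for name in ["John", "Jake", "John", "Mary"]:
    hll.add(name)

saved = hll.to_bytes()                   # persist this before shutdown
restored = HyperLogLog.from_bytes(saved) # reload on restart, then keep ingesting
print("NAME", restored.count())          # an estimate of the distinct count
```

Two sketches built with the same p can also be merged by taking the
element-wise maximum of their registers, which might help if several ingest
processes each keep their own state.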

On Fri, May 16, 2014 at 4:19 PM, Corey Nolet <> wrote:

> Can we assume this data has not yet been ingested? Do you have control
> over the way in which you structure your table?
> On Fri, May 16, 2014 at 1:54 PM, David Medinets <> wrote:
>> If I have the following simple set of data:
>> NAME John
>> NAME Jake
>> NAME John
>> NAME Mary
>> I want to end up with the following:
>> NAME 3
>> I'm thinking that perhaps a HyperLogLog approach should work. See
>> for more information.
>> Has anyone done this before in Accumulo?
