Can the bloom filter functionality in Accumulo (1.5.x) be adapted to become
a counting bloom filter?
I would like to use a counting bloom filter (which uses an array of
counting bins rather than a single bit for each array position) to get the
number of matches that are encountered for each corres
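For readers unfamiliar with the structure being asked about, here is a minimal, generic sketch of a counting bloom filter. It is not Accumulo's bloom filter code or API: the class name, the salted SHA-256 hashing scheme, and the parameters num_bins and num_hashes are all assumptions chosen for illustration.

```python
import hashlib


class CountingBloomFilter:
    """Illustrative counting bloom filter: one integer counter per bin.

    Hypothetical sketch only; not related to Accumulo's implementation.
    """

    def __init__(self, num_bins, num_hashes):
        self.num_bins = num_bins
        self.num_hashes = num_hashes
        self.bins = [0] * num_bins  # counters instead of single bits

    def _positions(self, key):
        # Derive num_hashes bin positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bins

    def add(self, key):
        for pos in self._positions(key):
            self.bins[pos] += 1

    def estimate(self, key):
        # The smallest counter is an upper bound on how often key was added.
        return min(self.bins[pos] for pos in self._positions(key))

    def remove(self, key):
        # Only decrement when the key may be present, so counters stay >= 0.
        if self.estimate(key) == 0:
            return False
        for pos in self._positions(key):
            self.bins[pos] -= 1
        return True
```

The estimate can overcount when different keys share bins, but it never undercounts a key that was added and not removed.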
Geoffrey,
My quick answer is that I needed to adjust my container (Karaf in my case)
to export the JAAS packages because they come in the JRE. Then I needed to
make the hadoop bundle import them.
Also before I forget, Hadoop packages its default xml configurations
(core-site.xml, core-default.xml
All,
To what extent does the Accumulo Client rely on the Hadoop Client? I
apologize if the question is a bit obtuse, but I got into the dependency weeds
trying to get the Hadoop Client to work in OSGi. (See the Hadoop Client
woes below.) I am now wondering whether, if I OSGi-fied Accumulo's client, I would
encou
You could try sharding:
If your RowID is the ingest date (so that you can scan over recently
ingested data, as you describe), you could use a RowID of
"ShardID_IngestDate" instead, where:
ShardID = hash(row) % numShards
This will result in numShards rows for each IngestDate, and
is cho
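A minimal sketch of the "ShardID_IngestDate" scheme described above. The function names, the MD5-based hash, and the zero-padding of the shard ID are assumptions made for illustration; the original message only specifies ShardID = hash(row) % numShards.

```python
import hashlib


def shard_id(row, num_shards):
    """ShardID = hash(row) % numShards, using a hash that is stable across runs."""
    digest = hashlib.md5(row.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def _pad_width(num_shards):
    # Zero-pad shard IDs so they sort lexicographically (an assumption).
    return len(str(num_shards - 1))


def sharded_row_id(row, ingest_date, num_shards):
    """Build a RowID of the form ShardID_IngestDate."""
    width = _pad_width(num_shards)
    return f"{shard_id(row, num_shards):0{width}d}_{ingest_date}"


def scan_prefixes(ingest_date, num_shards):
    """All RowIDs to scan in order to read one ingest date across every shard."""
    width = _pad_width(num_shards)
    return [f"{s:0{width}d}_{ingest_date}" for s in range(num_shards)]
```

Reading everything for one ingest date then takes numShards scans, one per entry returned by scan_prefixes, which is the cost paid for spreading writes across tablets.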
Russ,
I experienced the same problem. In the end, we decided to take
another property, use it as a prefix, and then presplit the tables.
E.g. apples\0454316778
We still have situations where nodes run hot during peak usage, but we are able
to live with it.
Thanks,
Ariel
Hi,
I'm looking for advice re. the best way to structure my row IDs.
Monotonically increasing IDs have the very appealing property that I can
quickly scan all recently-ingested unprocessed rows, particularly because I
maintain a "checkpoint" of the most-recently processed row.
Of course, the prob