Definitely a good idea, Jeremy. Performance numbers always benefit the
community -- I'd love to make sure they get published prominently on the
Accumulo site.
While the value of a benchmark is really only in the workload it
performs, a good benchmark can be decomposed into a base set of
operations which should be generally applicable. I don't agree that
benchmarking Accumulo with D4M is only valid if you then use D4M.
As long as you state your performance requirements in a way that's
comparable to your benchmark, that's all that really matters.
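To make "comparable" concrete, here is a minimal sketch of putting a benchmark figure and an application requirement into the same units. It assumes the benchmark reports an aggregate rate across all tables (as the 4 million entries per second figure quoted further down does) and that each logical record fans out into a fixed number of entries. The fan-out of 8 and the function names are hypothetical, chosen only for illustration; they are not taken from the D4M paper.

```python
def records_per_second(aggregate_entries_per_sec, entries_per_record):
    """Convert an aggregate entries/sec rate (all tables) to logical records/sec.

    entries_per_record is the total number of entries one logical record
    produces across every table it is written to (hypothetical value here).
    """
    if entries_per_record <= 0:
        raise ValueError("entries_per_record must be positive")
    return aggregate_entries_per_sec / entries_per_record


def meets_requirement(aggregate_entries_per_sec, entries_per_record,
                      required_records_per_sec):
    """True if the benchmark rate, in record units, covers the requirement."""
    rate = records_per_second(aggregate_entries_per_sec, entries_per_record)
    return rate >= required_records_per_sec


if __name__ == "__main__":
    benchmark_rate = 4_000_000  # entries/sec across all tables, from the thread
    fan_out = 8                 # hypothetical entries written per logical record
    print(records_per_second(benchmark_rate, fan_out))          # records/sec
    print(meets_requirement(benchmark_rate, fan_out, 300_000))  # hypothetical need
```

The point is only that the requirement and the benchmark need to count the same thing. With a different schema, only the fan-out changes; the comparison is the same.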
On 3/9/14, 4:35 PM, Jeremy Kepner wrote:
Benchmarking Accumulo generally makes it look very good when
compared to competing technologies and so benchmarking is good for the Accumulo
community.
For any particular application, standard benchmarks can
be very helpful to verify your system is performing correctly.
If we have a performance issue with a system, often the first thing
we will do is run a benchmark on it to determine if the issue
is with the system or with how our application is using the system.
On Sun, Mar 09, 2014 at 03:55:31PM -0400, David Medinets wrote:
What is the goal of your benchmarking? To some extent, benchmarking
Accumulo can't provide any true answers because it won't be using your
real-world data. A lot depends on the schema that you use. The D4M
benchmark would only be applicable to you if you plan to use their schema.
On Sun, Mar 9, 2014 at 2:23 PM, Kepner, Jeremy - 0553 - MITLL <[email protected]> wrote:
On Mar 9, 2014, at 2:21 PM, Arshak Navruzyan <[email protected]> wrote:
The benchmark in the D4M paper is very helpful but perhaps you could
clarify a few things:
1. Does the 4 million entries per second figure pertain to the main table
only, or to the main table, transpose, and degree tables as well?
All tables.
2. Can you share your accumulo-site.xml settings for the test? In
particular the memory map size and compaction ratio settings.
On Thu, Mar 6, 2014 at 3:07 PM, Jeremy Kepner <[email protected]> wrote:
There is one in D4M :-)
On Thu, Mar 06, 2014 at 04:48:32PM -0500, Morris, Jason wrote:
Hey everyone,
I was wondering if anyone had a benchmark test for Accumulo? I could
write a map/reduce job that creates a bunch of tables, maybe pushes some
data, then drops them, but I was wondering if anyone had something better?
Thanks,
Jason