(spinning off from the other thread)
The backstory on Sean's testing can be found in [1]. Essentially, he
observed some cases with an unexplained batch write performance drop of
up to ~30% going from 1.6 to 1.7.
<quote>
Batch write performance for Accumulo 1.7.2-cdh5.5.0 shows a regression
of up to approximately 30 percent, depending on table shape, when
compared to Accumulo 1.6.0-cdh5.1.4. The performance decrease is more
severe for exceptionally large cells (100k and larger) or exceptionally
wide rows (10k columns). Carefully consider the performance impact for
your environment when deciding to upgrade to Accumulo 1.7.2-cdh5.5.0.
</quote>
Since it came up again, I was hoping we could put this concern to rest,
chalking it up to the WAL flush/sync calls that changed between 1.6 and
1.7 as documented by our Keith[2]. Hopefully, Sean's notes are
sufficient for us to reconstruct his environment :)
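
To make the flush/sync variable concrete for anyone re-running the
comparison, here is a minimal sketch, assuming a hypothetical table named
"perftest" and the Accumulo shell from 1.7 or later. The table.durability
property was introduced in 1.7; per [2], sync is the 1.7 default, while
flush is closer to what 1.6 did. Whether Sean's runs used exactly this
setting is not confirmed here.

```
# Accumulo shell, 1.7+. "perftest" is a hypothetical table name.
# Relax WAL durability for this table from sync to flush.
config -t perftest -s table.durability=flush
# Show the effective value to confirm the change took effect.
config -t perftest -f table.durability
```

Durability can also be chosen per writer on the client side with
BatchWriterConfig.setDurability(Durability.FLUSH), so a fair comparison
would need to check both the table setting and the test client.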
- Josh
[1] https://www.cloudera.com/documentation/other/accumulo/latest/PDF/Apache-Accumulo-Installation-Guide-1-7-2.pdf
[2] https://accumulo.apache.org/blog/2016/11/02/durability-performance.html
-------- Forwarded Message --------
Subject: Re: [DISCUSS] Question about 1.7 bugfix releases
Date: Tue, 6 Jun 2017 14:20:27 -0400
From: Josh Elser <josh.el...@gmail.com>
To: dev@accumulo.apache.org
On 6/6/17 2:13 PM, Sean Busbey wrote:
> On Tue, Jun 6, 2017 at 12:07 PM, Josh Elser <josh.el...@gmail.com> wrote:
>> On 6/6/17 12:39 PM, Sean Busbey wrote:
>>> For example, has anyone done perf comparisons between 1.7 and 1.8.z?
>>> When it came time for me to start telling folks that it was "safe" to
>>> upgrade to 1.7.z I ran into something like a 40-60% perf degradation
>>> on writes compared to 1.6 across the board. A little bit of this was
>>> already fixed in 1.8 at the time, but a substantial amount required a
>>> non-trivial refactoring because just no one had looked[1]. Even after
>>> all of that, I still had to caveat things because I still saw a
>>> ~15-30% perf drop on random writes in the presence of lots of columns.
>>
>> At the risk of de-railing otherwise good discussion on releases: do you
>> recall if you had accounted for the following, Sean? (notably, the last
>> code snippet)
>>
>> https://accumulo.apache.org/blog/2016/11/02/durability-performance.html
>
> I know that "set durability to flush and not sync" was one of the
> parameters for the comparison, but I don't remember what was done
> specifically during the testing back in September, tbh.
>
> I can probably dig it out if you'd like; I think we were pretty good
> at keeping notes. Probably something for a different thread?
Agreed. Just wanted to ask before I forgot again. Saw some relevance in
the worry of perf regressions 1.7->1.8 based on the existence of those
you saw 1.6->1.7, but def don't want to derail further here.
If you have the time and the notes, would be happy to review.