I read through all the internal notes I could find from that testing, and I don't see any mention of changing the durability settings on meta or root.
So that's a plausible source for the perf hit. I don't know when I'll have time to run through some tests to verify.

On Tue, Jun 6, 2017 at 2:43 PM, Josh Elser <josh.el...@gmail.com> wrote:
> (spinning off from the other thread)
>
> The backstory on Sean's testing can be found in [1]. Essentially, in his
> testing, he observed some cases where there was an unexplained ~30%
> performance impact.
>
> <quote>
> Batch write performance for Accumulo 1.7.2‐cdh5.5.0 shows a regression of up
> to approximately 30 percent, depending on table shape, when compared to
> Accumulo 1.6.0‐cdh5.1.4. The performance decrease is more severe for
> exceptionally large cells (100k and larger) or exceptionally wide rows (10k
> columns). Carefully consider the performance impact for your environment
> when deciding to upgrade to Accumulo 1.7.2‐cdh5.5.0.
> </quote>
>
> Since it came up again, I was hoping we could put this concern to rest,
> chalking it up to the WAL flush/sync calls that changed between 1.6 and 1.7
> as documented by our Keith[2]. Hopefully, Sean's notes are sufficient for us
> to reconstruct his environment :)
>
> - Josh
>
> [1] https://www.cloudera.com/documentation/other/accumulo/latest/PDF/Apache-Accumulo-Installation-Guide-1-7-2.pdf
> [2] https://accumulo.apache.org/blog/2016/11/02/durability-performance.html
>
>
> -------- Forwarded Message --------
> Subject: Re: [DISCUSS] Question about 1.7 bugfix releases
> Date: Tue, 6 Jun 2017 14:20:27 -0400
> From: Josh Elser <josh.el...@gmail.com>
> To: dev@accumulo.apache.org
>
> On 6/6/17 2:13 PM, Sean Busbey wrote:
>>
>> On Tue, Jun 6, 2017 at 12:07 PM, Josh Elser <josh.el...@gmail.com> wrote:
>>>
>>> On 6/6/17 12:39 PM, Sean Busbey wrote:
>>>>
>>>>
>>>> For example, has anyone done perf comparisons between 1.7 and 1.8.z?
>>>>
>>>> When it came time for me to start telling folks that it was "safe" to
>>>> upgrade to 1.7.z I ran into something like a 40-60% perf degradation
>>>> on writes compared to 1.6 across the board. A little bit of this was
>>>> already fixed in 1.8 at the time, but a substantial amount required a
>>>> non-trivial refactoring because just no one had looked[1]. Even after
>>>> all of that, I still had to caveat things because I still saw a
>>>> ~15-30% perf drop on random writes in the presence of lots of columns.
>>>
>>>
>>>
>>> At a risk of de-railing otherwise good discussion on releases: do you
>>> recall if you had accounted for the following, Sean? (notably, the last
>>> code snippet)
>>>
>>> https://accumulo.apache.org/blog/2016/11/02/durability-performance.html
>>
>>
>> I know that "set durability to flush and not sync" was one of the
>> parameters for the comparison, but I don't remember what was done
>> specifically during the testing back in September, tbh.
>>
>> I can probably dig it out if you'd like; I think we were pretty good
>> at keeping notes. Probably something for a different thread?
>>
>
> Agreed. Just wanted to ask before I forgot again. Saw some relevance in the
> worry of perf regressions 1.7->1.8 based on the existence of those you saw
> 1.6->1.7, but def don't want to derail further here.
>
> If you have the time and the notes, would be happy to review.

--
busbey