I don't know what could go wrong, so it's hard to say that it would be
obvious.

For what it's worth, if we can identify a performance bug, we can release a
fix in a 1.10.3, so you can upgrade instead of downgrade.

On Fri, Nov 4, 2022, 16:26 Logan Jones <lo...@codescratch.com> wrote:

> Thanks all,
>
> We have a test system that we could try rolling back. If something does
> break, will it be obvious?
>
> Dave, the ingest rates are slightly more spiky, but I think it's mostly
> because tservers are bouncing and the cluster is working to catch up.
> Nothing major jumps out as an increase in throughput: the ingest rate in
> operations per second seems roughly equivalent, and the same is true for
> the ingest rate in MB/s.
>
> On Fri, Nov 4, 2022 at 4:21 PM Christopher <ctubb...@apache.org> wrote:
>
> > I don't think it has any changes that would prevent rollback, but it's not
> > a scenario that has been tested to my knowledge.
> >
> > On Fri, Nov 4, 2022, 16:15 Dave Marion <dmario...@gmail.com> wrote:
> >
> > > It's going to take some time to review the changes[1], but I don't see
> > > changes in the default JVM sizes. I was wondering if maybe the issue is
> > > that it's running faster. You are loading the same amount of data, but is
> > > it going faster by chance? If so, you could be creating more garbage per
> > > unit time putting more pressure on the GC. Just a thought.
> > >
> > > [1] https://github.com/apache/accumulo/compare/rel/1.9.3..rel/1.10.2
> > >
> > > On Fri, Nov 4, 2022 at 4:02 PM Logan Jones <lo...@codescratch.com> wrote:
> > >
> > > > Yeah, our memory usage is drastically different since the upgrade.
> > > >
> > > > We are seeing spikes in heap utilization on tablet servers that weren't
> > > > happening before the upgrade despite our ingest load being roughly the
> > > > same. This increase in heap utilization seems to be causing long GC times.
> > > > Those GC times are long enough that the tablet servers lose their locks and
> > > > then die.
> > > >
> > > > Looking into the JVM options, we don't see anything obvious that changed
> > > > around the garbage collector, and looking at the Accumulo release notes
> > > > didn't leave us any indication that something like this should have
> > > > changed, but nevertheless we are seeing crashes of tservers. I'm mostly
> > > > trying to identify whether or not rollback is even an option.
> > > >
> > > > - Logan
> > > >
> > > > On Fri, Nov 4, 2022 at 3:49 PM Dave Marion <dmario...@gmail.com> wrote:
> > > >
> > > > > Are you running into an error or some other issue that is making you
> > > > > think that you have to roll back? I don't know that rolling back has been
> > > > > tested.
> > > > >
> > > > > On Fri, Nov 4, 2022 at 3:40 PM Logan Jones <lo...@codescratch.com> wrote:
> > > > >
> > > > > > Hello:
> > > > > >
> > > > > > We recently upgraded from Accumulo 1.9.3 to 1.10.2. Is it safe to roll back
> > > > > > to Accumulo 1.9.3?
> > > > > >
> > > > > > Thanks in advance,
> > > > > >
> > > > > > - Logan
> > > > > >
> > > > >
> > > >
> > >
> >
>
