Chris:

Agreed, it would be nice if the iterators were designed to handle their own
exceptions, and that is certainly the case for this particular iterator.
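
As a rough illustration of an iterator handling its own exceptions, here is a minimal sketch. It uses plain java.util.Iterator as a stand-in rather than Accumulo's own iterator interface, and every class and method name in it is hypothetical. Passing the failing entry through unchanged is only one possible policy; skipping the entry or substituting a marker value may suit other iterators better.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.UnaryOperator;

// Toy sketch: java.util.Iterator stands in for Accumulo's iterator interface.
// Every name below is hypothetical and exists only for illustration.
final class SelfHandlingIteratorSketch {

    /** Applies a transform to each entry and contains any failure it throws. */
    static final class SafeTransformIterator implements Iterator<String> {
        private final Iterator<String> source;
        private final UnaryOperator<String> transform;
        private int failureCount = 0;

        SafeTransformIterator(Iterator<String> source, UnaryOperator<String> transform) {
            this.source = source;
            this.transform = transform;
        }

        @Override
        public boolean hasNext() {
            return source.hasNext();
        }

        @Override
        public String next() {
            String entry = source.next();
            try {
                return transform.apply(entry);
            } catch (RuntimeException e) {
                // Contain the failure: count it and pass the entry through unchanged.
                failureCount++;
                return entry;
            }
        }

        int failureCount() {
            return failureCount;
        }
    }

    public static void main(String[] args) {
        SafeTransformIterator it = new SafeTransformIterator(
            List.of("a", "bad", "c").iterator(),
            s -> {
                if (s.equals("bad")) {
                    throw new IllegalStateException("simulated iterator bug");
                }
                return s.toUpperCase();
            });
        while (it.hasNext()) {
            System.out.println(it.next());
        }
        System.out.println("failures contained: " + it.failureCount());
    }
}
```

The point of the sketch is only that the exception never escapes next(), so whatever is driving the iterator does not see a failure at all.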

Dave:

The behavior does seem to indicate that the failure is not being propagated
back up. The transactions are never marked as failed. The behavior I'm
seeing (and the one you'll see in the linked GitHub repo from before) is
that a single transaction is created and remains IN_PROGRESS. As far as I
can tell, that compaction gets tried over and over again.
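
To make the symptom concrete, here is a toy model of the two behaviors. This is not Accumulo's FaTE code, and the enum, interface, and method names are all hypothetical. It only contrasts a retry loop that swallows failures (so the status can never leave IN_PROGRESS) with one that gives up after a bounded number of attempts and reports the failure.

```java
// Toy model only: this is not Accumulo's FaTE code, and every name is hypothetical.
final class RetryLoopSketch {

    enum Status { IN_PROGRESS, SUCCESSFUL, FAILED }

    /** One unit of work, e.g. "compact this tablet". */
    interface Step {
        void run() throws Exception;
    }

    /** Retries and swallows every failure, so the status can never become FAILED. */
    static Status retryWithoutPropagation(Step step, int attemptsObserved) {
        for (int i = 0; i < attemptsObserved; i++) {
            try {
                step.run();
                return Status.SUCCESSFUL;
            } catch (Exception e) {
                // The failure is dropped here; the caller only ever sees IN_PROGRESS.
            }
        }
        // A real loop of this shape would keep retrying; we stop to keep the demo finite.
        return Status.IN_PROGRESS;
    }

    /** Same loop, but gives up after maxAttempts and reports the failure. */
    static Status retryWithPropagation(Step step, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            try {
                step.run();
                return Status.SUCCESSFUL;
            } catch (Exception e) {
                // Fall through and try again until the attempt budget is spent.
            }
        }
        return Status.FAILED;
    }

    public static void main(String[] args) {
        Step alwaysFails = () -> {
            throw new IllegalStateException("simulated iterator init failure");
        };
        System.out.println(retryWithoutPropagation(alwaysFails, 5));
        System.out.println(retryWithPropagation(alwaysFails, 5));
    }
}
```

With a step that always throws, the first method never reports anything other than IN_PROGRESS, which matches what I'm observing; the second is the kind of outcome I would have expected.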

I wasn't aware of the Iterator Test Harness. I could try to replicate the
problem there, but after a quick glance at the documentation you linked,
I'm worried the limitation of "exercising delete keys" might be a problem.



On Thu, Jul 7, 2022 at 7:37 AM Dave Marion <dmario...@gmail.com> wrote:

> I think FaTE ensures that the transaction is started and it waits for it to
> finish. It must be the case that a failure is not being propagated back up
> to fail the transaction. Are you seeing FaTE restarting the same compaction
> over and over again, or are the multiple IN_PROGRESS transactions from
> different compactions (my guess is the latter)? It would be interesting to
> see if the Iterator Test Harness[1,2] exposes the issue in your iterator.
> You can delete the FaTE transactions, but you will need to shut down the
> Manager (Master) to do so.
>
> [1]
>
> https://accumulo.apache.org/1.10/accumulo_user_manual.html#_iterator_testing
> [2]
>
> https://accumulo.apache.org/docs/2.x/development/development_tools#iterator-test-harness
>
> On Wed, Jul 6, 2022 at 10:59 PM Christopher <ctubb...@apache.org> wrote:
>
> > The behavior in case of error is likely undefined, so I'm not entirely
> > surprised it's behaving this way. There may be things we can do to try to
> > handle errors more gracefully for user initiated compactions when an
> > iterator throws an exception, but it's definitely a good idea to write
> > custom iterators so that they handle their own errors as much as
> > possible.
> >
> > On Wed, Jul 6, 2022, 20:42 Logan Jones <lo...@codescratch.com> wrote:
> >
> > > Thanks Chris for the quick reply. I'll explain the behavior I'm seeing,
> > > and then maybe you all could either confirm this is the intended
> > > behavior, or decide it's maybe not that great.
> > >
> > > My understanding of the happy case for running a user-initiated
> > > compaction is that a fate/transaction gets created in ZooKeeper, and the
> > > Accumulo master node ends up farming out the compactions to the correct
> > > tablet servers. Once the tablets have been completed, the
> > > fates/transactions in ZooKeeper somehow get cleaned up.
> > >
> > > I experienced a problem, however, in the unhappy case for compactions,
> > > which I have since reproduced. We had a custom iterator configured for a
> > > table, and that custom iterator was in a bad state (i.e. it was always
> > > throwing an exception during initialization). What we noticed is that
> > > the fates are indefinitely stuck IN_PROGRESS and never go away in this
> > > case. Effectively we have a poison pill, and if you issue too many
> > > compactions against that table, you can cause other bad problems.
> > >
> > > I created a repo to demonstrate the problem as succinctly as I could
> > > manage:
> > >
> > > https://github.com/loganasherjones/accumulo-iterator-failures
> > >
> > > I thought initially that maybe it was due to the fact that our iterator
> > > was throwing an error during initialization, but this appears to be
> > > happening for any error on next, seek, or init calls.
> > >
> > > So my questions are
> > >
> > > 1. Is it expected that a failure in a seek, next, or init in an
> > > iterator during a user-initiated compaction would cause Accumulo to
> > > retry the compaction non-stop?
> > > 2. If so, could you help me understand why?
> > >
> > > Thanks in advance,
> > >
> > > - Logan
> > >
> > >
> > >
> > > On Wed, Jul 6, 2022 at 6:31 PM Christopher <ctubb...@apache.org> wrote:
> > >
> > > > Yes, either here (especially if it's related to a bug or proposed code
> > > > change) or at user@ would work, if it's more of a user question. Here
> > > > is fine if you're not sure.
> > > >
> > > > On Wed, Jul 6, 2022, 16:35 Logan Jones <lo...@codescratch.com> wrote:
> > > >
> > > > > Hello:
> > > > >
> > > > > I would like to discuss what happens when iterators cause
> > > > > user-initiated compactions to fail, specifically in relation to the
> > > > > fate transactions. Is this the right list for this discussion?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > - Logan
> > > > >
> > > >
> > >
> >
>
