I had a chance to talk with some users about how they would use this
feature. In their opinion, they didn't see the need to manage the
online/ondemand state at anything lower than the Tablet level. If we were
to do that, then that would potentially put a management burden on them to
make sure that Tablets were in the correct state. Talking further with
Christopher I got a better understanding of his viewpoint, which is that we
should make the on-demand functionality the default behavior of the online
table state, thus negating the need for a new ondemand table state.
Further, he suggested that maybe online/offline should be renamed to
enabled/disabled with disabled being a truly immutable table state. Talking
with the team, if my notes are correct, we came to the conclusion that the
currently merged on-demand feature should be modified such that:

1. Remove the on-demand table state and modify the online table state to
adopt the on-demand behavior
2. Potentially rename online/offline to enabled/disabled
3. The default tablet state for online/enabled tables would be unhosted in
keeping with the current on-demand behavior. However, this could be
modified later to have the default tablet state be hosted to maintain
backward compatibility.
4. There was a discussion about the need for an API or table property that
a user could use to override the new default behavior (see #3) for a range
of Tablets on a table.
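
To make the range override in item 4 a bit more concrete, here is a minimal
Java sketch of how a per-tablet goal state could be derived from a set of
user-specified ranges. It follows the framing in the quoted messages below
(default HOSTED, ranges marked ONDEMAND); under item 3 the default would
instead be unhosted and the ranges would mark the exceptions. Every name here
(GoalStateSketch, Range, Policy, goalState) is hypothetical and is not
existing Accumulo API. The two Policy values correspond to the two rules
raised in the thread for a tablet that straddles a range boundary.

```java
import java.util.List;

// Hypothetical sketch only; none of these types exist in Accumulo.
class GoalStateSketch {

  enum GoalState { HOSTED, ONDEMAND }

  // The two candidate rules for a tablet that straddles a range boundary.
  enum Policy { ANY_OVERLAP, FULLY_CONTAINED }

  // Row range (start, end]: start exclusive, end inclusive.
  // A null start stands for -inf and a null end stands for +inf.
  static final class Range {
    final String start;
    final String end;

    Range(String start, String end) {
      this.start = start;
      this.end = end;
    }

    // True when the two ranges share at least one row.
    boolean overlaps(Range other) {
      return below(start, other.end) && below(other.start, end);
    }

    // True when 'other' lies entirely inside this single range.
    boolean contains(Range other) {
      boolean startOk =
          start == null || (other.start != null && start.compareTo(other.start) <= 0);
      boolean endOk =
          end == null || (other.end != null && other.end.compareTo(end) <= 0);
      return startOk && endOk;
    }

    // Strict comparison of an exclusive start against an inclusive end.
    private static boolean below(String start, String end) {
      return start == null || end == null || start.compareTo(end) < 0;
    }
  }

  // Returns ONDEMAND if any user range matches the tablet under the given
  // policy; otherwise falls back to HOSTED (the assumed default here).
  static GoalState goalState(Range tablet, List<Range> ondemandRanges, Policy policy) {
    for (Range r : ondemandRanges) {
      boolean match =
          policy == Policy.ANY_OVERLAP ? r.overlaps(tablet) : r.contains(tablet);
      if (match) {
        return GoalState.ONDEMAND;
      }
    }
    return GoalState.HOSTED;
  }

  public static void main(String[] args) {
    // Table T1 from the thread: splits c,f,j,m,x with ondemand range (b,d].
    String[] splits = {"c", "f", "j", "m", "x"};
    List<Range> ondemand = List.of(new Range("b", "d"));
    String prev = null;
    for (int i = 0; i <= splits.length; i++) {
      String end = i < splits.length ? splits[i] : null;
      Range tablet = new Range(prev, end);
      System.out.println("(" + (prev == null ? "-inf" : prev) + ","
          + (end == null ? "+inf" : end) + "] "
          + goalState(tablet, ondemand, Policy.ANY_OVERLAP));
      prev = end;
    }
  }
}
```

With ANY_OVERLAP, the main method reproduces the goal states listed for T1
further down in the thread. With FULLY_CONTAINED, a tablet such as (b,f]
checked against { (-inf,d], (q,+inf] } falls back to HOSTED. The sketch
checks each range on its own, so a tablet covered only by two adjacent ranges
together would also fall back to HOSTED; whether that is desirable is one of
the open questions.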


On Tue, Apr 4, 2023 at 1:08 PM Keith Turner <ke...@deenlo.com> wrote:

> On Mon, Apr 3, 2023 at 3:33 PM Dave Marion <dmario...@gmail.com> wrote:
> >
> > I could see that working initially, but I think you would get some drift
> > over time as splits or merges happen. In your example, what happens when
>
> Drift is definitely something to consider in the design and users
> would definitely be changing the set of ondemand ranges for a table
> over time.  When a user specifies the set of ondemand ranges for a
> table, thinking disallowing overlapping ranges will help them avoid
> mistakes as they mutate the set of ondemand ranges.
>
> > later someone adds splits for a - z? How would we know to mark (-inf, a]
> > and (a, b] as HOSTED and (b,c] as ONDEMAND? In a table where the row is
>
> Thinking any tablet that overlaps a user specified ondemand range
> should end up with an ondemand goal state.  User should be able to use
> -inf and +inf when specifying ondemand ranges.  So a user could set
> the ondemand range to (-inf, +inf] to make an entire table on demand.
> They could set a table's ondemand ranges to the set { (-inf,d],
> (q,+inf] } and then as new splits are added below d or above q those
> new tablets would automatically have an ondemand goal state.  Any
> tablets between d and q would have a hosted goal state.
>
> > time and new splits are added daily or weekly, the range would have to be
> > updated at the same time that splits are created to keep the last N days
> > hosted.
>
> For the case where splits are being added at the end of a table they
> could use a range like (-inf,D] to make the tablets less than D
> ondemand and everything greater than D hosted.  Later they could
> change the ondemand range from (-inf,D] to (-inf,(D+10)].
>
> We could change the assumption that any tablet that overlaps a user
> specified ondemand range should end up with an ondemand goal state.
> Instead we could decide that tablets must fully fall within an
> ondemand range in order to end up with an ondemand goal state,
> otherwise it has the hosted goal state.  For example if the user sets
> ondemand ranges to { (-inf,d], (q,+inf] } then a tablet with the range
> (b,f] would overlap the ondemand range and hosted range, so need to
> decide which goal state the tablet should end up with.
>
> >
> > On Mon, Apr 3, 2023 at 2:13 PM Keith Turner <ke...@deenlo.com> wrote:
> >
> > > On Mon, Apr 3, 2023 at 10:45 AM Dave Marion <dmario...@gmail.com>
> wrote:
> > > >
> > > > Looking through the code to see what would have to change to remove
> the
> > > > ondemand table state, I'm struggling to find a way to implement this
> > > > without having an ondemand state. Currently, the ondemand table
> state is
> > >
> > > We could have tablet states instead of the ondemand table state.  So
> > > for a table in the online state, each tablet could have a state of
> > > HOSTED or ONDEMAND.  For compatibility the default state could be
> > > HOSTED.  We could provide a mechanism for users to indicate they want
> > > ranges of a table to be ondemand.  This would set the goal state for
> > > the tablets that fall within one of those ranges to ONDEMAND.
> > >
> > > What I am uncertain about is how users would manage these per table
> > > ranges. It could be via a table property, SPI, or custom API and I am
> > > not sure which way is best.
> > >
> > > For example if a user creates an online table T1 with splits
> > > c,f,j,m,x.  Then somehow they specify they want the range (b,d] to be
> > > ondemand.  This would cause the tablets in the table to have the
> > > following goal states.
> > >
> > > (-inf,c] ONDEMAND
> > > (c,f] ONDEMAND
> > > (f,j] HOSTED
> > > (j,m] HOSTED
> > > (m,x] HOSTED
> > > (x,+inf] HOSTED
> > >
> > > Then the manager and client code need to somehow know of the above
> > > goal states and act accordingly.  How the manager and client know
> > > about these goal states depends on how users specify the ranges.
> > >
> > > > set in ZooKeeper as the ZTABLE_STATE and both the client and the
> server
> > > use
> > > > it.
> > > >
> > > > When a tablet cannot be located in the TabletLocator, then it checks
> to
> > > see
> > > > if the table is in an ondemand state. If it is, then it tells the
> server
> > > > side to assign the tablets in the range that the client needs by
> placing
> > > an
> > > > "ondemand" column in the tablet metadata. This ensures that ondemand
> > > > tablets that were hosted as part of a client operation are hosted
> > > > again on a tablet server failure. Prior to the ondemand state, the
> > > > TabletLocator would attempt to find a tablet location for an online
> > > table,
> > > > and if it was not able to find a location, then it would wait - it
> > > assumes
> > > > that it will be hosted at some point. Do we modify the tablet
> locator to
> > > > send a signal to the Manager to assign all unlocated tablets for
> online
> > > > tables?
> > > >
> > > > We could attempt to remove the TableOperations.ondemand method,
> leaving
> > > > just online and offline. Then, we could have something periodically
> check
> > > > to see if the property TABLE_ONDEMAND_UNLOADER is set.  If it is,
> then
> > > that
> > > > could change the internal table state to ondemand. However, when that
> > > > property is unset, I'm not sure we have enough information to know
> > > whether
> > > > to set the table state back to online or offline.
> > > >
> > > > I'm wondering if trying to achieve a simpler user experience is
> > > outweighed
> > > > by the complexity added to the code to achieve it. Personally, I
> don't
> > > > think it's that hard to reason about, especially if the user reads
> the
> > > docs
> > > > and it is explained well.
> > > >
> > > > On Wed, Mar 29, 2023 at 1:04 PM Christopher <ctubb...@apache.org>
> wrote:
> > > >
> > > > > On Wed, Mar 29, 2023 at 5:33 AM Dave Marion <dmario...@gmail.com>
> > > wrote:
> > > > > >
> > > > > > > I think we should deprecate support for offline table scanning,
> > > since
> > > > > it
> > > > > > shouldn't be needed with the availability of ScanServers.
> > > > > >
> > > > > > Just making sure I understand your suggestion - you mean
> removing the
> > > > > > OfflineScanner and the ability to scan over offline tables in the
> > > > > MapReduce
> > > > > > code, but we should continue our efforts to allow Scan Servers to
> > > scan
> > > > > > offline tables, right?
> > > > >
> > > > >
> > > > > Yes to removing OfflineScanner. But the rest of that isn't quite
> what
> > > > > I was thinking. What I was trying to say is that with elasticity
> > > > > features, unless immediate consistency is required, the
> ScanServer's
> > > > > ability to scan should not depend on the tablets being "hosted" for
> > > > > live ingest. Using the ScanServer on the table's "unhosted"
> tablets is
> > > > > enough to replace the need for the OfflineScanner, I think.
> > > > >
> > > > > So, yes, we should continue our efforts to allow ScanServers to
> scan
> > > > > tables with "unhosted" tablets. Now, whether we say that is the
> > > > > ScanServer scanning an "offline" table or not, depends on how we're
> > > > > defining "online" and "offline".
> > > > >
> > > > > Currently, without elastic features in place, that would only
> happen
> > > > > if we mark the table in an "offline" state, but once all the
> elastic
> > > > > features are in place, I think this would still be considered
> > > > > "ondemand" or "online, as in available for use, but not pinned for
> > > > > live ingest / unhosted".
> > > > >
> > > > > A lot of this is more about how we communicate the state (naming,
> > > > > concepts, etc.), and depends on the rest of my email, rather than
> > > > > affecting the actual features we're supporting. We should still
> plan
> > > > > for ScanServer to scan "unhosted" tablets, regardless of what
> state we
> > > > > end up calling it.
> > > > >
> > > > > >
> > > > > > > As for "ondemand" table state, from a user perspective, I'm not
> > > sure
> > > > > what
> > > > > > it mean
> > > > > >
> > > > > > I have been thinking about it as "online" means always hosted,
> > > "ondemand"
> > > > > > means hosted as needed, and "offline" means never hosted.
> > > > >
> > > > > Rather than have a mapping from what these mean to how they
> behave, I
> > > > > think it would be better to have the names directly reflect the
> user
> > > > > experience. If we say "online" means "always hosted", then just
> call
> > > > > it "hosted".
> > > > >
> > > > > I think we really need the following states to match to the user
> > > > > experience:
> > > > >
> > > > > (online, live)
> > > > > (online, live-on-demand)
> > > > > (online)
> > > > > (offline / immutable)
> > > > >
> > > > > But, I think the first three states should really just be
> considered
> > > > > one state, with the "live"-ness being configurable.
> > > > >
> > > > > >
> > > > > > > is the "on-demand availability" applicable only for live
> ingest /
> > > > > > > immediate consistency? Is it still "always available" for bulk
> import
> > > /
> > > > > > ScanServers? Or does "on-demand availability" somehow apply to
> all
> > > > > > interactions, including bulk import and ScanServer reads?
> > > > > >
> > > > > > We tried to reason about that in
> > > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=247828052
> > > > >
> > > > > I think that matrix is useful as we iron out the implementation,
> but I
> > > > > don't think users should have to consult such a table in order to
> > > > > understand what they can and cannot do in a given state. That's
> why I
> > > > > think it's useful to have just the two states, with pretty much
> > > > > everything working in the "online" state and nothing working in the
> > > > > "offline" state, with the configurable "live"-ness that unlocks
> > > > > additional features (specifically, immediate consistency scanning
> and
> > > > > live ingest). Like buying a car, you get standard (online) or
> nothing
> > > > > (offline), or you can get standard plus extras by paying the cost
> of
> > > > > those extras (in this case, configuration of "live"-ness). It's a
> bit
> > > > > weird to be able to do things in an offline state, or to not be
> able
> > > > > to do things in an online state. But if it's framed as "configure
> the
> > > > > extra features to use them", it's a bit more intuitive to
> understand,
> > > > > because "online" still means "ready to go".
> > > > >
> > > > > If it helps make things even more clear, long before we had the
> > > > > "online" and "offline" states, I worked on an idea for calling the
> > > > > tables "enabled" and "disabled". I abandoned that idea (and the
> > > > > transition states that accompanied them) when these states were
> > > > > introduced. But, I still think those terms might be better, in that
> > > > > they don't imply any relationship to "hosted" or "unhosted"... just
> > > > > whether or not they were usable by the user, which is a better way
> of
> > > > > framing things, I think.
> > > > >
> > > > > If enabled/disabled terms were used in place of online/offline, the
> > > > > states could be:
> > > > >
> > > > > (enabled, hosted)
> > > > > (enabled, on-demand)
> > > > > (enabled, unhosted)
> > > > > (disabled)
> > > > >
> > > > > Or collapsing the first 3 again, just:
> > > > >
> > > > > (enabled) - hosted status is tunable/configurable
> > > > > (disabled)
> > > > >
> > > > >
> > > > > >
> > > > > > Regarding the rest of your email, I think removing the ondemand
> state
> > > > > would
> > > > > > be ok. The ondemand commits added a new property for the user to
> > > specify
> > > > > > which tablet unloader class[1] to use, with the default being
> [2]. We
> > > > > could
> > > > > > add a new default implementation that does not unload and users
> would
> > > > > have
> > > > > > to opt-in to unloading by setting the property for their online
> > > tables.
> > > > > > > However, there is some code surrounding the new ondemand state
> that we
> > > > > would
> > > > > > need to address. For example, when a TabletServer is low on
> memory it
> > > > > > doesn't call the specified TabletUnloader, it just unloads a
> Tablet.
> > > > > >
> > > > > > [1]
> > > > > >
> > > > >
> > >
> https://github.com/apache/accumulo/blob/elasticity/core/src/main/java/org/apache/accumulo/core/spi/ondemand/OnDemandTabletUnloader.java
> > > > > > [2]
> > > > > >
> > > > >
> > >
> https://github.com/apache/accumulo/blob/elasticity/core/src/main/java/org/apache/accumulo/core/spi/ondemand/DefaultOnDemandTabletUnloader.java
> > > > > >
> > > > > > On Tue, Mar 28, 2023 at 10:27 AM Christopher <
> ctubb...@apache.org>
> > > > > wrote:
> > > > > >
> > > > > > > I think we should deprecate support for offline table scanning,
> > > since
> > > > > > > it shouldn't be needed with the availability of ScanServers.
> Any
> > > > > > > MapReduce that previously relied on scanning offline tables
> could
> > > be
> > > > > > > made to use that instead.
> > > > > > >
> > > > > > > I agree there is a need to have an immutable table state, for
> > > which it
> > > > > > > is possible to read, but no changes can be made. However, even
> in
> > > that
> > > > > > > "locked" state, one should still be able to perform surgery on
> its
> > > > > > > metadata, or manually / surgically compact files (with the
> > > > > > > understanding that doing so will interfere with any concurrent
> > > export
> > > > > > > or scan operations that are relying on it being immutable,
> which I
> > > > > > > think is a tolerable amount of risk, when actually in a
> situation
> > > > > > > where such surgery is needed).
> > > > > > >
> > > > > > > As for "ondemand" table state, from a user perspective, I'm not
> > > sure
> > > > > > > what it means... is the "on-demand availability" applicable
> only
> > > for
> > > > > > > live ingest / immediate consistency? Is it still "always
> available"
> > > > > > > for bulk import / ScanServers? Or does "on-demand availability"
> > > > > > > somehow apply to all interactions, including bulk import and
> > > > > > > ScanServer reads?
> > > > > > >
> > > > > > > I think the "ondemand" state is confusing, because it's
> exposing
> > > > > > > internal state through to the user, and in a way that isn't as
> > > clear
> > > > > > > as the simple "online/offline" states used to be. Previously,
> users
> > > > > > > didn't need to understand what was going on internally...
> "online"
> > > > > > > just meant "I can interact with this table", and "offline"
> meant "I
> > > > > > > can't interact with this table". The user wasn't required to
> > > > > > > understand what a tablet was, or how it was hosted, or
> anything of
> > > > > > > that nature. As we started adding support for "offline"
> features,
> > > the
> > > > > > > lines separating "online and offline" meaning "available and
> > > > > > > unavailable" became blurred. As we proceed adding elasticity, I
> > > think
> > > > > > > we should work to make things more clear and explicit again...
> and
> > > I
> > > > > > > think "ondemand" as a table state, makes things even less clear
> > > when
> > > > > > > the concept is exposed to the user as a separate table state.
> > > > > > >
> > > > > > > I do think we need some kind of on-demand availability for
> > > live-ingest
> > > > > > > and immediate consistency in order to be more elastic, and
> from the
> > > > > > > discussion, it's obvious we need an immutable table state, but
> I
> > > think
> > > > > > > it's a mistake to expose the on-demand availability for
> live-ingest
> > > > > > > and immediate consistency as a new table state. I think that
> > > should be
> > > > > > > left as either some kind of automatic internal behavior, or as
> a
> > > > > > > secondary fine-grained control over an online table (like
> pinned
> > > > > > > tablets, either permanently pinned or temporally pinned, based
> on
> > > > > > > activity).
> > > > > > >
> > > > > > > On Tue, Mar 28, 2023 at 9:51 AM Drew Farris <d...@apache.org>
> > > wrote:
> > > > > > > >
> > > > > > > > On Mon, Mar 27, 2023 at 2:16 PM Keith Turner <
> ke...@deenlo.com>
> > > > > wrote:
> > > > > > > >
> > > > > > > > > > One realization that came out of examining the different table
> > > states
> > > > > is
> > > > > > > > > that export table currently relies on the fact that offline
> > > tables
> > > > > > > > > will not delete files.  If we enable compactions on offline
> > > tables
> > > > > > > > > then that could cause files to be deleted which would
> break the
> > > > > > > > > expectation of export table.
> > > > > > > > >
> > > > > > > >
> > > > > > > > This is a good point. I hadn't considered the potential
> breakage
> > > to
> > > > > > > export
> > > > > > > > table. I suspect another concern could be the hadoop input
> format
> > > > > that
> > > > > > > > operates over the rfiles in an offline table - and can do so
> > > > > relatively
> > > > > > > > safely
> > > > > > > > because the table is not expected to change while it is
> offline.
> > > > > > > >
> > > > > > > > So, it would seem that there is value in having an
> 'immutable'
> > > table
> > > > > > > state
> > > > > > > > in
> > > > > > > > the form of an offline table. Perhaps 'ondemand' is the
> alternate
> > > > > state
> > > > > > > > that
> > > > > > > > lets us do things like import, split, compact, merge, etc.
> > > > > > >
> > > > >
> > >
>
