On Mon, Apr 3, 2023 at 10:45 AM Dave Marion <dmario...@gmail.com> wrote:
>
> Looking through the code to see what would have to change to remove the
> ondemand table state, I'm struggling to find a way to implement this
> without having an ondemand state. Currently, the ondemand table state is

We could have tablet states instead of the ondemand table state.  So
for a table in the online state, each tablet could have a state of
HOSTED or ONDEMAND.  For compatibility the default state could be
HOSTED.  We could provide a mechanism for users to indicate they want
ranges of a table to be ondemand.  This would set the goal state for
the tablets that fall within one of those ranges to ONDEMAND.

What I am uncertain about is how users would manage these per table
ranges. It could be via a table property, SPI, or custom API and I am
not sure which way is best.

For example, suppose a user creates an online table T1 with splits
c,f,j,m,x and then somehow specifies that the range (b,d] should be
ondemand.  This would cause the tablets in the table to have the
following goal states.

(-inf,c] ONDEMAND
(c,f] ONDEMAND
(f,j] HOSTED
(j,m] HOSTED
(m,x] HOSTED
(x,+inf) HOSTED

Then the manager and client code need to somehow know of the above
goal states and act accordingly.  How the manager and client know
about these goal states depends on how users specify the ranges.
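
As a rough illustration of the mapping from user-specified ranges to
per-tablet goal states, here is a minimal Python sketch.  It is not
Accumulo code: the function names are hypothetical, rows are modeled as
plain strings rather than byte arrays, and ranges are assumed to be
(exclusive start, inclusive end] with None standing for -inf/+inf.

```python
from typing import List, Optional, Tuple

# (exclusive start, inclusive end]; None means unbounded on that side.
Range = Tuple[Optional[str], Optional[str]]


def tablets_from_splits(splits: List[str]) -> List[Range]:
    """Turn split points into the tablet extents they define."""
    tablets = []
    prev = None
    for end in sorted(splits) + [None]:
        tablets.append((prev, end))
        prev = end
    return tablets


def overlaps(a: Range, b: Range) -> bool:
    """True if two (start, end] ranges share at least one row."""
    a_start, a_end = a
    b_start, b_end = b
    a_before_b_end = a_start is None or b_end is None or a_start < b_end
    b_before_a_end = b_start is None or a_end is None or b_start < a_end
    return a_before_b_end and b_before_a_end


def goal_states(splits: List[str],
                ondemand_ranges: List[Range]) -> List[Tuple[Range, str]]:
    """Default every tablet to HOSTED; mark it ONDEMAND if it overlaps
    any user-specified ondemand range."""
    result = []
    for tablet in tablets_from_splits(splits):
        ondemand = any(overlaps(tablet, r) for r in ondemand_ranges)
        result.append((tablet, "ONDEMAND" if ondemand else "HOSTED"))
    return result


if __name__ == "__main__":
    # Table T1 from the example: splits c,f,j,m,x and ondemand range (b,d].
    for tablet, state in goal_states(["c", "f", "j", "m", "x"], [("b", "d")]):
        print(tablet, state)
```

For the T1 example this marks only the tablets ending at c and f as
ONDEMAND, matching the list above.  Whether the ranges come from a
table property, an SPI, or a custom API only changes where
ondemand_ranges is read from.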

> set in ZooKeeper as the ZTABLE_STATE and both the client and the server use
> it.
>
> When a tablet cannot be located in the TabletLocator, then it checks to see
> if the table is in an ondemand state. If it is, then it tells the server
> side to assign the tablets in the range that the client needs by placing an
> "ondemand" column in the tablet metadata. This ensures that ondemand
> tablets that were hosted as part of a client operation are hosted
> again after a tablet server failure. Prior to the ondemand state, the
> TabletLocator would attempt to find a tablet location for an online table,
> and if it was not able to find a location, then it would wait - it assumes
> that it will be hosted at some point. Do we modify the tablet locator to
> send a signal to the Manager to assign all unlocated tablets for online
> tables?
>
> We could attempt to remove the TableOperations.ondemand method, leaving
> just online and offline. Then, we could have something periodically check
> to see if the property TABLE_ONDEMAND_UNLOADER is set.  If it is, then that
> could change the internal table state to ondemand. However, when that
> property is unset, I'm not sure we have enough information to know whether
> to set the table state back to online or offline.
>
> I'm wondering if trying to achieve a simpler user experience is outweighed
> by the complexity added to the code to achieve it. Personally, I don't
> think it's that hard to reason about, especially if the user reads the docs
> and it is explained well.
>
> On Wed, Mar 29, 2023 at 1:04 PM Christopher <ctubb...@apache.org> wrote:
>
> > On Wed, Mar 29, 2023 at 5:33 AM Dave Marion <dmario...@gmail.com> wrote:
> > >
> > > > I think we should deprecate support for offline table scanning, since
> > it
> > > shouldn't be needed with the availability of ScanServers.
> > >
> > > Just making sure I understand your suggestion - you mean removing the
> > > OfflineScanner and the ability to scan over offline tables in the
> > MapReduce
> > > code, but we should continue our efforts to allow Scan Servers to scan
> > > offline tables, right?
> >
> >
> > Yes to removing OfflineScanner. But the rest of that isn't quite what
> > I was thinking. What I was trying to say is that with elasticity
> > features, unless immediate consistency is required, the ScanServer's
> > ability to scan should not depend on the tablets being "hosted" for
> > live ingest. Using the ScanServer on the table's "unhosted" tablets is
> > enough to replace the need for the OfflineScanner, I think.
> >
> > So, yes, we should continue our efforts to allow ScanServers to scan
> > tables with "unhosted" tablets. Now, whether we say that is the
> > ScanServer scanning an "offline" table or not, depends on how we're
> > defining "online" and "offline".
> >
> > Currently, without elastic features in place, that would only happen
> > if we mark the table in an "offline" state, but once all the elastic
> > features are in place, I think this would still be considered
> > "ondemand" or "online, as in available for use, but not pinned for
> > live ingest / unhosted".
> >
> > A lot of this is more about how we communicate the state (naming,
> > concepts, etc.), and depends on the rest of my email, rather than
> > affecting the actual features we're supporting. We should still plan
> > for ScanServer to scan "unhosted" tablets, regardless of what state we
> > end up calling it.
> >
> > >
> > > > As for "ondemand" table state, from a user perspective, I'm not sure
> > what
> > > > it means
> > >
> > > I have been thinking about it as "online" means always hosted, "ondemand"
> > > means hosted as needed, and "offline" means never hosted.
> >
> > Rather than have a mapping from what these mean to how they behave, I
> > think it would be better to have the names directly reflect the user
> > experience. If we say "online" means "always hosted", then just call
> > it "hosted".
> >
> > I think we really need the following states to match to the user
> > experience:
> >
> > (online, live)
> > (online, live-on-demand)
> > (online)
> > (offline / immutable)
> >
> > But, I think the first three states should really just be considered
> > one state, with the "live"-ness being configurable.
> >
> > >
> > > > is the "on-demand availability" applicable only for live ingest /
> > > > immediate consistency? Is it still "always available" for bulk import /
> > > ScanServers? Or does "on-demand availability" somehow apply to all
> > > interactions, including bulk import and ScanServer reads?
> > >
> > > We tried to reason about that in
> > >
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=247828052
> >
> > I think that matrix is useful as we iron out the implementation, but I
> > don't think users should have to consult such a table in order to
> > understand what they can and cannot do in a given state. That's why I
> > think it's useful to have just the two states, with pretty much
> > everything working in the "online" state and nothing working in the
> > "offline" state, with the configurable "live"-ness that unlocks
> > additional features (specifically, immediate consistency scanning and
> > live ingest). Like buying a car, you get standard (online) or nothing
> > (offline), or you can get standard plus extras by paying the cost of
> > those extras (in this case, configuration of "live"-ness). It's a bit
> > weird to be able to do things in an offline state, or to not be able
> > to do things in an online state. But if it's framed as "configure the
> > extra features to use them", it's a bit more intuitive to understand,
> > because "online" still means "ready to go".
> >
> > If it helps make things even more clear, long before we had the
> > "online" and "offline" states, I worked on an idea for calling the
> > tables "enabled" and "disabled". I abandoned that idea (and the
> > transition states that accompanied them) when these states were
> > introduced. But, I still think those terms might be better, in that
> > they don't imply any relationship to "hosted" or "unhosted"... just
> > whether or not they were usable by the user, which is a better way of
> > framing things, I think.
> >
> > If enabled/disabled terms were used in place of online/offline, the
> > states could be:
> >
> > (enabled, hosted)
> > (enabled, on-demand)
> > (enabled, unhosted)
> > (disabled)
> >
> > Or collapsing the first 3 again, just:
> >
> > (enabled) - hosted status is tunable/configurable
> > (disabled)
> >
> >
> > >
> > > Regarding the rest of your email, I think removing the ondemand state
> > would
> > > be ok. The ondemand commits added a new property for the user to specify
> > > which tablet unloader class[1] to use, with the default being [2]. We
> > could
> > > add a new default implementation that does not unload and users would
> > have
> > > to opt-in to unloading by setting the property for their online tables.
> > > > However, there is some code surrounding the new ondemand state that we
> > would
> > > need to address. For example, when a TabletServer is low on memory it
> > > doesn't call the specified TabletUnloader, it just unloads a Tablet.
> > >
> > > [1]
> > >
> > https://github.com/apache/accumulo/blob/elasticity/core/src/main/java/org/apache/accumulo/core/spi/ondemand/OnDemandTabletUnloader.java
> > > [2]
> > >
> > https://github.com/apache/accumulo/blob/elasticity/core/src/main/java/org/apache/accumulo/core/spi/ondemand/DefaultOnDemandTabletUnloader.java
> > >
> > > On Tue, Mar 28, 2023 at 10:27 AM Christopher <ctubb...@apache.org>
> > wrote:
> > >
> > > > I think we should deprecate support for offline table scanning, since
> > > > it shouldn't be needed with the availability of ScanServers. Any
> > > > MapReduce that previously relied on scanning offline tables could be
> > > > made to use that instead.
> > > >
> > > > I agree there is a need to have an immutable table state, for which it
> > > > is possible to read, but no changes can be made. However, even in that
> > > > "locked" state, one should still be able to perform surgery on its
> > > > metadata, or manually / surgically compact files (with the
> > > > understanding that doing so will interfere with any concurrent export
> > > > or scan operations that are relying on it being immutable, which I
> > > > think is a tolerable amount of risk, when actually in a situation
> > > > where such surgery is needed).
> > > >
> > > > As for "ondemand" table state, from a user perspective, I'm not sure
> > > > what it means... is the "on-demand availability" applicable only for
> > > > live ingest / immediate consistency? Is it still "always available"
> > > > for bulk import / ScanServers? Or does "on-demand availability"
> > > > somehow apply to all interactions, including bulk import and
> > > > ScanServer reads?
> > > >
> > > > I think the "ondemand" state is confusing, because it's exposing
> > > > internal state through to the user, and in a way that isn't as clear
> > > > as the simple "online/offline" states used to be. Previously, users
> > > > didn't need to understand what was going on internally... "online"
> > > > just meant "I can interact with this table", and "offline" meant "I
> > > > can't interact with this table". The user wasn't required to
> > > > understand what a tablet was, or how it was hosted, or anything of
> > > > that nature. As we started adding support for "offline" features, the
> > > > lines separating "online and offline" meaning "available and
> > > > unavailable" became blurred. As we proceed adding elasticity, I think
> > > > we should work to make things more clear and explicit again... and I
> > > > think "ondemand" as a table state, makes things even less clear when
> > > > the concept is exposed to the user as a separate table state.
> > > >
> > > > I do think we need some kind of on-demand availability for live-ingest
> > > > and immediate consistency in order to be more elastic, and from the
> > > > discussion, it's obvious we need an immutable table state, but I think
> > > > it's a mistake to expose the on-demand availability for live-ingest
> > > > and immediate consistency as a new table state. I think that should be
> > > > left as either some kind of automatic internal behavior, or as a
> > > > secondary fine-grained control over an online table (like pinned
> > > > tablets, either permanently pinned or temporally pinned, based on
> > > > activity).
> > > >
> > > > On Tue, Mar 28, 2023 at 9:51 AM Drew Farris <d...@apache.org> wrote:
> > > > >
> > > > > On Mon, Mar 27, 2023 at 2:16 PM Keith Turner <ke...@deenlo.com>
> > wrote:
> > > > >
> > > > > > > One realization that came out of examining the different table states
> > is
> > > > > > that export table currently relies on the fact that offline tables
> > > > > > will not delete files.  If we enable compactions on offline tables
> > > > > > then that could cause files to be deleted which would break the
> > > > > > expectation of export table.
> > > > > >
> > > > >
> > > > > This is a good point. I hadn't considered the potential breakage to
> > > > export
> > > > > table. I suspect another concern could be the hadoop input format
> > that
> > > > > operates over the rfiles in an offline table - and can do so
> > relatively
> > > > > safely
> > > > > because the table is not expected to change while it is offline.
> > > > >
> > > > > So, it would seem that there is value in having an 'immutable' table
> > > > state
> > > > > in
> > > > > the form of an offline table. Perhaps 'ondemand' is the alternate
> > state
> > > > > that
> > > > > lets us do things like import, split, compact, merge, etc.
> > > >
> >
