On Mon, Apr 3, 2023 at 3:33 PM Dave Marion <dmario...@gmail.com> wrote:
>
> I could see that working initially, but I think you would get some drift
> over time as splits or merges happen. In your example, what happens when

Drift is definitely something to consider in the design, and users
would certainly change the set of ondemand ranges for a table over
time.  When a user specifies the set of ondemand ranges for a table,
I think disallowing overlapping ranges would help them avoid
mistakes as they mutate that set.
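
To make the overlap check concrete, here is a rough sketch of how a
set of ondemand ranges could be rejected when two of them overlap.
All names here (OnDemandRangeValidator, RowRange, validate) are made
up for illustration and are not existing Accumulo APIs.  It assumes
ranges are (start, end] over String rows, with null standing in for
-inf or +inf.

```java
import java.util.List;

// Hypothetical helper for illustration only; not part of the Accumulo API.
class OnDemandRangeValidator {

  // Row range (start, end]; a null start means -inf, a null end means +inf.
  static final class RowRange {
    final String start;
    final String end;

    RowRange(String start, String end) {
      this.start = start;
      this.end = end;
    }

    @Override
    public String toString() {
      return "(" + (start == null ? "-inf" : start) + ","
          + (end == null ? "+inf" : end) + "]";
    }
  }

  // True when an exclusive start lies strictly below an inclusive end.
  static boolean startsBeforeEnd(String start, String end) {
    return start == null || end == null || start.compareTo(end) < 0;
  }

  // Two (start, end] ranges overlap when each one starts before the other ends.
  static boolean overlaps(RowRange a, RowRange b) {
    return startsBeforeEnd(a.start, b.end) && startsBeforeEnd(b.start, a.end);
  }

  // Rejects the whole set if any pair of ranges overlaps.
  static void validate(List<RowRange> ranges) {
    for (int i = 0; i < ranges.size(); i++) {
      for (int j = i + 1; j < ranges.size(); j++) {
        if (overlaps(ranges.get(i), ranges.get(j))) {
          throw new IllegalArgumentException("Overlapping ondemand ranges: "
              + ranges.get(i) + " and " + ranges.get(j));
        }
      }
    }
  }

  public static void main(String[] args) {
    // Disjoint ranges are accepted.
    validate(List.of(new RowRange(null, "d"), new RowRange("q", null)));
    // Adjacent ranges such as (a,d] and (d,f] share no rows, so they pass too.
    validate(List.of(new RowRange("a", "d"), new RowRange("d", "f")));
    System.out.println("ranges accepted");
  }
}
```

The pairwise check is quadratic, which seems fine for the handful of
ranges a user would set by hand; sorting by start row would be the
obvious change if the sets got large.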

> later someone adds splits for a - z? How would we know to mark (-inf, a]
> and (a, b] as HOSTED and (b,c] as ONDEMAND? In a table where the row is

I am thinking that any tablet that overlaps a user-specified ondemand
range should end up with an ondemand goal state.  Users should be able
to use -inf and +inf when specifying ondemand ranges.  So a user could
set the ondemand range to (-inf, +inf] to make an entire table
ondemand.  They could set a table's ondemand ranges to the set
{ (-inf,d], (q,+inf] }, and then as new splits are added below d or
above q those new tablets would automatically have an ondemand goal
state.  Any tablets between d and q would have a hosted goal state.
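
As a rough illustration of the overlap rule, the sketch below derives
a goal state for each tablet from the ondemand ranges
{ (-inf,d], (q,+inf] }.  The class, enum, and method names are
hypothetical and only meant to show the idea; rows are plain Strings
and null stands in for -inf or +inf.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch for illustration; none of these names exist in Accumulo.
class OverlapGoalStates {

  enum GoalState { HOSTED, ONDEMAND }

  // Row range (start, end]; a null start means -inf, a null end means +inf.
  static final class RowRange {
    final String start;
    final String end;

    RowRange(String start, String end) {
      this.start = start;
      this.end = end;
    }

    @Override
    public String toString() {
      return "(" + (start == null ? "-inf" : start) + ","
          + (end == null ? "+inf" : end) + "]";
    }
  }

  // True when an exclusive start lies strictly below an inclusive end.
  static boolean startsBeforeEnd(String start, String end) {
    return start == null || end == null || start.compareTo(end) < 0;
  }

  static boolean overlaps(RowRange a, RowRange b) {
    return startsBeforeEnd(a.start, b.end) && startsBeforeEnd(b.start, a.end);
  }

  // Overlap rule: a tablet touching any ondemand range gets the ONDEMAND goal.
  static GoalState goalState(RowRange tablet, List<RowRange> onDemandRanges) {
    for (RowRange range : onDemandRanges) {
      if (overlaps(tablet, range)) {
        return GoalState.ONDEMAND;
      }
    }
    return GoalState.HOSTED;
  }

  // Turns sorted split points into the tablets they define.
  static List<RowRange> tablets(List<String> sortedSplits) {
    List<RowRange> result = new ArrayList<>();
    String prev = null;
    for (String split : sortedSplits) {
      result.add(new RowRange(prev, split));
      prev = split;
    }
    result.add(new RowRange(prev, null));
    return result;
  }

  public static void main(String[] args) {
    List<RowRange> onDemand =
        List.of(new RowRange(null, "d"), new RowRange("q", null));
    for (RowRange tablet : tablets(List.of("b", "d", "k", "q", "t"))) {
      System.out.println(tablet + " " + goalState(tablet, onDemand));
    }
  }
}
```

With splits b,d,k,q,t the tablets (-inf,b], (b,d], (q,t] and (t,+inf]
come out ONDEMAND, while (d,k] and (k,q] come out HOSTED.  Because the
goal state is computed from the ranges rather than stored per tablet,
a split added later below d or above q picks up ONDEMAND without the
user touching the ranges.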

> time and new splits are added daily or weekly, the range would have to be
> updated at the same time that splits are created to keep the last N days
> hosted.

For the case where splits are being added at the end of a table, they
could use a range like (-inf,D] to make the tablets at or below D
ondemand and everything greater than D hosted.  Later they could
change the ondemand range from (-inf,D] to (-inf,(D+10)].

We could change the assumption that any tablet that overlaps a
user-specified ondemand range should end up with an ondemand goal
state.  Instead we could decide that a tablet must fall fully within
an ondemand range in order to end up with an ondemand goal state;
otherwise it has the hosted goal state.  For example, if the user sets
ondemand ranges to { (-inf,d], (q,+inf] }, then a tablet with the range
(b,f] would overlap both the ondemand range (-inf,d] and the hosted
range (d,q], so we need to decide which goal state that tablet should
end up with.
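
To show how the two rules differ for a straddling tablet, here is a
small sketch that evaluates both against the same ondemand ranges.
The names are hypothetical, and the containment check is simplified:
a tablet only counts as contained when a single ondemand range covers
it, not when several adjacent ranges cover it together.

```java
import java.util.List;

// Hypothetical sketch comparing the two candidate rules; not Accumulo code.
class ContainmentVsOverlap {

  // Row range (start, end]; a null start means -inf, a null end means +inf.
  static final class RowRange {
    final String start;
    final String end;

    RowRange(String start, String end) {
      this.start = start;
      this.end = end;
    }
  }

  // True when an exclusive start lies strictly below an inclusive end.
  static boolean startsBeforeEnd(String start, String end) {
    return start == null || end == null || start.compareTo(end) < 0;
  }

  static boolean overlaps(RowRange a, RowRange b) {
    return startsBeforeEnd(a.start, b.end) && startsBeforeEnd(b.start, a.end);
  }

  // True when every row of inner also lies in outer.
  static boolean contains(RowRange outer, RowRange inner) {
    boolean startOk = outer.start == null
        || (inner.start != null && outer.start.compareTo(inner.start) <= 0);
    boolean endOk = outer.end == null
        || (inner.end != null && inner.end.compareTo(outer.end) <= 0);
    return startOk && endOk;
  }

  // Rule 1: any overlap with an ondemand range makes the tablet ondemand.
  static boolean onDemandByOverlap(RowRange tablet, List<RowRange> ranges) {
    return ranges.stream().anyMatch(r -> overlaps(tablet, r));
  }

  // Rule 2: the tablet must sit entirely inside one ondemand range.
  static boolean onDemandByContainment(RowRange tablet, List<RowRange> ranges) {
    return ranges.stream().anyMatch(r -> contains(r, tablet));
  }

  public static void main(String[] args) {
    List<RowRange> onDemand =
        List.of(new RowRange(null, "d"), new RowRange("q", null));
    RowRange straddling = new RowRange("b", "f");
    System.out.println("overlap rule:     " + onDemandByOverlap(straddling, onDemand));
    System.out.println("containment rule: " + onDemandByContainment(straddling, onDemand));
  }
}
```

For the tablet (b,f] the overlap rule says ondemand and the
containment rule says hosted, so the choice between them only matters
for tablets that straddle a range boundary.  Once a split lands on the
boundary (d in this case) both rules agree again.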

>
> On Mon, Apr 3, 2023 at 2:13 PM Keith Turner <ke...@deenlo.com> wrote:
>
> > On Mon, Apr 3, 2023 at 10:45 AM Dave Marion <dmario...@gmail.com> wrote:
> > >
> > > Looking through the code to see what would have to change to remove the
> > > ondemand table state, I'm struggling to find a way to implement this
> > > without having an ondemand state. Currently, the ondemand table state is
> >
> > We could have tablet states instead of the ondemand table state.  So
> > for a table in the online state, each tablet could have a state of
> > HOSTED or ONDEMAND.  For compatibility the default state could be
> > HOSTED.  We could provide a mechanism for users to indicate they want
> > ranges of a table to be ondemand.  This would set the goal state for
> > the tablets that fall within one of those ranges to ONDEMAND.
> >
> > What I am uncertain about is how users would manage these per table
> > ranges. It could be via a table property, SPI, or custom API and I am
> > not sure which way is best.
> >
> > For example, suppose a user creates an online table T1 with splits
> > c,f,j,m,x and then somehow specifies that they want the range (b,d] to
> > be ondemand.  This would cause the tablets in the table to have the
> > following goal states.
> >
> > (-inf,c] ONDEMAND
> > (c,f] ONDEMAND
> > (f,j] HOSTED
> > (j,m] HOSTED
> > (m,x] HOSTED
> > (x,+inf] HOSTED
> >
> > Then the manager and client code need to somehow know of the above
> > goal states and act accordingly.  How the manager and client know
> > about these goal states depends on how users specify the ranges.
> >
> > > set in ZooKeeper as the ZTABLE_STATE and both the client and the server
> > > use it.
> > >
> > > When the TabletLocator cannot locate a tablet, it checks to see if the
> > > table is in an ondemand state. If it is, then it tells the server side
> > > to assign the tablets in the range that the client needs by placing an
> > > "ondemand" column in the tablet metadata. This ensures that ondemand
> > > tablets that were hosted as part of a client operation are hosted
> > > again after a tablet server failure. Prior to the ondemand state, the
> > > TabletLocator would attempt to find a tablet location for an online
> > > table, and if it was not able to find a location, then it would wait - it
> > > assumes that it will be hosted at some point. Do we modify the tablet
> > > locator to send a signal to the Manager to assign all unlocated tablets
> > > for online tables?
> > >
> > > We could attempt to remove the TableOperations.ondemand method, leaving
> > > just online and offline. Then, we could have something periodically check
> > > to see if the property TABLE_ONDEMAND_UNLOADER is set.  If it is, then
> > > that could change the internal table state to ondemand. However, when
> > > that property is unset, I'm not sure we have enough information to know
> > > whether to set the table state back to online or offline.
> > >
> > > I'm wondering if trying to achieve a simpler user experience is
> > > outweighed by the complexity added to the code to achieve it.
> > > Personally, I don't think it's that hard to reason about, especially if
> > > the user reads the docs and it is explained well.
> > >
> > > On Wed, Mar 29, 2023 at 1:04 PM Christopher <ctubb...@apache.org> wrote:
> > >
> > > > On Wed, Mar 29, 2023 at 5:33 AM Dave Marion <dmario...@gmail.com>
> > wrote:
> > > > >
> > > > > > I think we should deprecate support for offline table scanning,
> > since
> > > > it
> > > > > shouldn't be needed with the availability of ScanServers.
> > > > >
> > > > > Just making sure I understand your suggestion - you mean removing the
> > > > > OfflineScanner and the ability to scan over offline tables in the
> > > > MapReduce
> > > > > code, but we should continue our efforts to allow Scan Servers to
> > scan
> > > > > offline tables, right?
> > > >
> > > >
> > > > Yes to removing OfflineScanner. But the rest of that isn't quite what
> > > > I was thinking. What I was trying to say is that with elasticity
> > > > features, unless immediate consistency is required, the ScanServer's
> > > > ability to scan should not depend on the tablets being "hosted" for
> > > > live ingest. Using the ScanServer on the table's "unhosted" tablets is
> > > > enough to replace the need for the OfflineScanner, I think.
> > > >
> > > > So, yes, we should continue our efforts to allow ScanServers to scan
> > > > tables with "unhosted" tablets. Now, whether we say that is the
> > > > ScanServer scanning an "offline" table or not, depends on how we're
> > > > defining "online" and "offline".
> > > >
> > > > Currently, without elastic features in place, that would only happen
> > > > if we mark the table in an "offline" state, but once all the elastic
> > > > features are in place, I think this would still be considered
> > > > "ondemand" or "online, as in available for use, but not pinned for
> > > > live ingest / unhosted".
> > > >
> > > > A lot of this is more about how we communicate the state (naming,
> > > > concepts, etc.), and depends on the rest of my email, rather than
> > > > affecting the actual features we're supporting. We should still plan
> > > > for ScanServer to scan "unhosted" tablets, regardless of what state we
> > > > end up calling it.
> > > >
> > > > >
> > > > > > As for "ondemand" table state, from a user perspective, I'm not
> > sure
> > > > what
> > > > > it mean
> > > > >
> > > > > I have been thinking about it as "online" means always hosted,
> > "ondemand"
> > > > > means hosted as needed, and "offline" means never hosted.
> > > >
> > > > Rather than have a mapping from what these mean to how they behave, I
> > > > think it would be better to have the names directly reflect the user
> > > > experience. If we say "online" means "always hosted", then just call
> > > > it "hosted".
> > > >
> > > > I think we really need the following states to match to the user
> > > > experience:
> > > >
> > > > (online, live)
> > > > (online, live-on-demand)
> > > > (online)
> > > > (offline / immutable)
> > > >
> > > > But, I think the first three states should really just be considered
> > > > one state, with the "live"-ness being configurable.
> > > >
> > > > >
> > > > > > > is the "on-demand availability" applicable only for live ingest /
> > > > > > > immediate consistency? Is it still "always available" for bulk
> > > > > > > import / ScanServers? Or does "on-demand availability" somehow
> > > > > > > apply to all interactions, including bulk import and ScanServer
> > > > > > > reads?
> > > > >
> > > > > We tried to reason about that in
> > > > >
> > > >
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=247828052
> > > >
> > > > I think that matrix is useful as we iron out the implementation, but I
> > > > don't think users should have to consult such a table in order to
> > > > understand what they can and cannot do in a given state. That's why I
> > > > think it's useful to have just the two states, with pretty much
> > > > everything working in the "online" state and nothing working in the
> > > > "offline" state, with the configurable "live"-ness that unlocks
> > > > additional features (specifically, immediate consistency scanning and
> > > > live ingest). Like buying a car, you get standard (online) or nothing
> > > > (offline), or you can get standard plus extras by paying the cost of
> > > > those extras (in this case, configuration of "live"-ness). It's a bit
> > > > weird to be able to do things in an offline state, or to not be able
> > > > to do things in an online state. But if it's framed as "configure the
> > > > extra features to use them", it's a bit more intuitive to understand,
> > > > because "online" still means "ready to go".
> > > >
> > > > If it helps make things even more clear, long before we had the
> > > > "online" and "offline" states, I worked on an idea for calling the
> > > > tables "enabled" and "disabled". I abandoned that idea (and the
> > > > transition states that accompanied them) when these states were
> > > > introduced. But, I still think those terms might be better, in that
> > > > they don't imply any relationship to "hosted" or "unhosted"... just
> > > > whether or not they were usable by the user, which is a better way of
> > > > framing things, I think.
> > > >
> > > > If enabled/disabled terms were used in place of online/offline, the
> > > > states could be:
> > > >
> > > > (enabled, hosted)
> > > > (enabled, on-demand)
> > > > (enabled, unhosted)
> > > > (disabled)
> > > >
> > > > Or collapsing the first 3 again, just:
> > > >
> > > > (enabled) - hosted status is tunable/configurable
> > > > (disabled)
> > > >
> > > >
> > > > >
> > > > > > Regarding the rest of your email, I think removing the ondemand state
> > > > > > would be ok. The ondemand commits added a new property for the user to
> > > > > > specify which tablet unloader class[1] to use, with the default being
> > > > > > [2]. We could add a new default implementation that does not unload
> > > > > > and users would have to opt in to unloading by setting the property
> > > > > > for their online tables. However, there is some code surrounding the
> > > > > > new ondemand state that we would need to address. For example, when a
> > > > > > TabletServer is low on memory it doesn't call the specified
> > > > > > TabletUnloader, it just unloads a Tablet.
> > > > >
> > > > > [1]
> > > > >
> > > >
> > https://github.com/apache/accumulo/blob/elasticity/core/src/main/java/org/apache/accumulo/core/spi/ondemand/OnDemandTabletUnloader.java
> > > > > [2]
> > > > >
> > > >
> > https://github.com/apache/accumulo/blob/elasticity/core/src/main/java/org/apache/accumulo/core/spi/ondemand/DefaultOnDemandTabletUnloader.java
> > > > >
> > > > > On Tue, Mar 28, 2023 at 10:27 AM Christopher <ctubb...@apache.org>
> > > > wrote:
> > > > >
> > > > > > I think we should deprecate support for offline table scanning,
> > since
> > > > > > it shouldn't be needed with the availability of ScanServers. Any
> > > > > > MapReduce that previously relied on scanning offline tables could
> > be
> > > > > > made to use that instead.
> > > > > >
> > > > > > I agree there is a need to have an immutable table state, for
> > which it
> > > > > > is possible to read, but no changes can be made. However, even in
> > that
> > > > > > "locked" state, one should still be able to perform surgery on its
> > > > > > metadata, or manually / surgically compact files (with the
> > > > > > understanding that doing so will interfere with any concurrent
> > export
> > > > > > or scan operations that are relying on it being immutable, which I
> > > > > > think is a tolerable amount of risk, when actually in a situation
> > > > > > where such surgery is needed).
> > > > > >
> > > > > > As for "ondemand" table state, from a user perspective, I'm not
> > sure
> > > > > > what it means... is the "on-demand availability" applicable only
> > for
> > > > > > live ingest / immediate consistency? Is it still "always available"
> > > > > > for bulk import / ScanServers? Or does "on-demand availability"
> > > > > > somehow apply to all interactions, including bulk import and
> > > > > > ScanServer reads?
> > > > > >
> > > > > > I think the "ondemand" state is confusing, because it's exposing
> > > > > > internal state through to the user, and in a way that isn't as
> > clear
> > > > > > as the simple "online/offline" states used to be. Previously, users
> > > > > > didn't need to understand what was going on internally... "online"
> > > > > > just meant "I can interact with this table", and "offline" meant "I
> > > > > > can't interact with this table". The user wasn't required to
> > > > > > understand what a tablet was, or how it was hosted, or anything of
> > > > > > that nature. As we started adding support for "offline" features,
> > the
> > > > > > lines separating "online and offline" meaning "available and
> > > > > > unavailable" became blurred. As we proceed adding elasticity, I
> > think
> > > > > > we should work to make things more clear and explicit again... and
> > I
> > > > > > think "ondemand" as a table state, makes things even less clear
> > when
> > > > > > the concept is exposed to the user as a separate table state.
> > > > > >
> > > > > > I do think we need some kind of on-demand availability for
> > live-ingest
> > > > > > and immediate consistency in order to be more elastic, and from the
> > > > > > discussion, it's obvious we need an immutable table state, but I
> > think
> > > > > > it's a mistake to expose the on-demand availability for live-ingest
> > > > > > and immediate consistency as a new table state. I think that
> > should be
> > > > > > left as either some kind of automatic internal behavior, or as a
> > > > > > secondary fine-grained control over an online table (like pinned
> > > > > > tablets, either permanently pinned or temporally pinned, based on
> > > > > > activity).
> > > > > >
> > > > > > On Tue, Mar 28, 2023 at 9:51 AM Drew Farris <d...@apache.org>
> > wrote:
> > > > > > >
> > > > > > > On Mon, Mar 27, 2023 at 2:16 PM Keith Turner <ke...@deenlo.com>
> > > > wrote:
> > > > > > >
> > > > > > > > One realization that came out examining the different table
> > states
> > > > is
> > > > > > > > that export table currently relies on the fact that offline
> > tables
> > > > > > > > will not delete files.  If we enable compactions on offline
> > tables
> > > > > > > > then that could cause files to be deleted which would break the
> > > > > > > > expectation of export table.
> > > > > > > >
> > > > > > >
> > > > > > > This is a good point. I hadn't considered the potential breakage
> > to
> > > > > > export
> > > > > > > table. I suspect another concern could be the hadoop input format
> > > > that
> > > > > > > operates over the rfiles in an offline table - and can do so
> > > > relatively
> > > > > > > safely
> > > > > > > because the table is not expected to change while it is offline.
> > > > > > >
> > > > > > > So, it would seem that there is value in having an 'immutable'
> > table
> > > > > > state
> > > > > > > in
> > > > > > > the form of an offline table. Perhaps 'ondemand' is the alternate
> > > > state
> > > > > > > that
> > > > > > > lets us do things like import, split, compact, merge, etc.
> > > > > >
> > > >
> >
