Edit: In their opinion, they didn't see the need to manage the
online/ondemand state at anything lower than the Tablet level
Should read: In their opinion, they didn't see the need to manage the
online/ondemand state at anything lower than the Table level

On Wed, Apr 5, 2023 at 1:05 PM Dave Marion <dmario...@gmail.com> wrote:

>
>  I had a chance to talk with some users about how they would use this
> feature. In their opinion, they didn't see the need to manage the
> online/ondemand state at anything lower than the Tablet level. If we were
> to do that, then that would potentially put a management burden on them to
> make sure that Tablets were in the correct state. Talking further with
> Christopher I got a better understanding of his viewpoint, which is that we
> should make the on-demand functionality the default behavior of the online
> table state, thus negating the need for a new ondemand table state.
> Further, he suggested that maybe online/offline should be renamed to
> enabled/disabled with disabled being a truly immutable table state. Talking
> with the team, if my notes are correct, we came to the conclusion that the
> currently merged on-demand feature should be modified such that:
>
> 1. Remove the on-demand table state, modify online table state to adopt
> the on-demand behavior
> 2. Potentially rename online/offline to enabled/disabled
> 3. The default tablet state for online/enabled tables would be unhosted in
> keeping with the current on-demand behavior. However, this could be
> modified later to have the default tablet state be hosted to maintain
> backward compatibility.
> 4. There was a discussion about the need for an API or table property that
> a user could use to override the new default behavior (see #3) for a range
> of Tablets on a table.
>
>
> On Tue, Apr 4, 2023 at 1:08 PM Keith Turner <ke...@deenlo.com> wrote:
>
>> On Mon, Apr 3, 2023 at 3:33 PM Dave Marion <dmario...@gmail.com> wrote:
>> >
>> > I could see that working initially, but I think you would get some drift
>> > over time as splits or merges happen. In your example, what happens when
>>
>> Drift is definitely something to consider in the design and users
>> would definitely be changing the set of ondemand ranges for a table
>> over time.  When a user specifies the set of ondemand ranges for a
>> table, thinking disallowing overlapping ranges will help them avoid
>> mistakes as they mutate the set of ondemand ranges.
>>
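The idea of rejecting overlapping ondemand ranges can be sketched as follows. This is only an illustration under assumptions: the `Range` tuple and `find_overlap` helper are hypothetical names and not part of any Accumulo API, rows are plain strings, and each range is treated as (exclusive start, inclusive end] with `None` standing for -inf or +inf.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (exclusive start row, inclusive end row).
# None stands for -inf on the start side and +inf on the end side.
Range = Tuple[Optional[str], Optional[str]]


def find_overlap(ranges: List[Range]) -> Optional[Tuple[Range, Range]]:
    """Return one overlapping pair of ranges, or None if all are disjoint.

    Assumes every range is well formed (its start sorts before its end).
    """
    # Sort by start row, with -inf (None) first.
    ordered = sorted(ranges, key=lambda r: (r[0] is not None, r[0] or ""))
    for prev, nxt in zip(ordered, ordered[1:]):
        # Overlap if prev runs to +inf, nxt also starts at -inf,
        # or nxt starts strictly before prev ends.
        if prev[1] is None or nxt[0] is None or nxt[0] < prev[1]:
            return (prev, nxt)
    return None
```

With this sketch, the set { (-inf,d], (q,+inf] } would be accepted, while adding (c,k] to it would be reported as overlapping (-inf,d].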
>> > later someone adds splits for a - z? How would we know to mark (-inf, a]
>> > and (a, b] as HOSTED and (b,c] as ONDEMAND? In a table where the row is
>>
>> Thinking any tablet that overlaps a user specified ondemand range
>> should end up with an ondemand goal state.  User should be able to use
>> -inf and +inf when specifying ondemand ranges.  So a user could set
>> the ondemand range to (-inf, +inf] to make an entire table on demand.
>> They could set a table's ondemand ranges to the set { (-inf,d],
>> (q,+inf] } and then as new splits are added below d or above q those
>> new tablets would automatically have an ondemand goal state.  Any
>> tablets between d and q would have a hosted goal state.
>>
>> > time and new splits are added daily or weekly, the range would have to
>> be
>> > updated at the same time that splits are created to keep the last N days
>> > hosted.
>>
>> For the case where splits are being added at the end of a table they
>> could use a range like (-inf,D] to make the tablets less than D
>> ondemand and everything greater than D hosted.  Later they could
>> change the ondemand range from (-inf,D] to (-inf,(D+10)].
>>
>> We could change the assumption that any tablet that overlaps a user
>> specified ondemand range should end up with an ondemand goal state.
>> Instead we could decide that tablets must fully fall within an
>> ondemand range in order to end up with an ondemand goal state,
>> otherwise it has the hosted goal state.  For example if the user sets
>> ondemand ranges to { (-inf,d], (q,+inf] } then a tablet with the range
>> (b,f] would overlap the ondemand range and hosted range, so need to
>> decide which goal state the tablet should end up with.
>>
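The two candidate rules discussed above (any overlap versus full containment) can be compared with a small sketch. The function names and the `require_containment` flag are hypothetical and only illustrate the decision; nothing here reflects actual Accumulo code. Ranges are (exclusive start, inclusive end] tuples of string rows, with `None` for -inf or +inf.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (exclusive start row, inclusive end row).
Range = Tuple[Optional[str], Optional[str]]


def overlaps(tablet: Range, rng: Range) -> bool:
    """True if the two (start, end] ranges share at least one row."""
    def starts_before_end(start: Optional[str], end: Optional[str]) -> bool:
        return start is None or end is None or start < end
    return (starts_before_end(tablet[0], rng[1])
            and starts_before_end(rng[0], tablet[1]))


def contained_in(tablet: Range, rng: Range) -> bool:
    """True if the tablet's range lies entirely inside rng."""
    start_ok = rng[0] is None or (tablet[0] is not None and rng[0] <= tablet[0])
    end_ok = rng[1] is None or (tablet[1] is not None and tablet[1] <= rng[1])
    return start_ok and end_ok


def goal_state(tablet: Range, ondemand_ranges: List[Range],
               require_containment: bool = False) -> str:
    """Pick a goal state under either the overlap or the containment rule."""
    rule = contained_in if require_containment else overlaps
    return "ONDEMAND" if any(rule(tablet, r) for r in ondemand_ranges) else "HOSTED"
```

For the ondemand ranges { (-inf,d], (q,+inf] }, the tablet (b,f] comes out as ONDEMAND under the overlap rule and HOSTED under the containment rule, which is exactly the ambiguity that needs a decision.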
>> >
>> > On Mon, Apr 3, 2023 at 2:13 PM Keith Turner <ke...@deenlo.com> wrote:
>> >
>> > > On Mon, Apr 3, 2023 at 10:45 AM Dave Marion <dmario...@gmail.com>
>> wrote:
>> > > >
>> > > > Looking through the code to see what would have to change to remove
>> the
>> > > > ondemand table state, I'm struggling to find a way to implement this
>> > > > without having an ondemand state. Currently, the ondemand table
>> state is
>> > >
>> > > We could have tablet states instead of the ondemand table state.  So
>> > > for a table in the online state, each tablet could have a state of
>> > > HOSTED or ONDEMAND.  For compatibility the default state could be
>> > > HOSTED.  We could provide a mechanism for users to indicate they want
>> > > ranges of a table to be ondemand.  This would set the goal state for
>> > > the tablets that fall within one of those ranges to ONDEMAND.
>> > >
>> > > What I am uncertain about is how users would manage these per table
>> > > ranges. It could be via a table property, SPI, or custom API and I am
>> > > not sure which way is best.
>> > >
>> > > For example if a user creates an online table T1 with splits
>> > > c,f,j,m,x.  Then somehow they specify they want the range (b,d] to be
>> > > ondemand.  This would cause the tablets in the table to have the
>> > > following goal states.
>> > >
>> > > (-inf,c] ONDEMAND
>> > > (c,f] ONDEMAND
>> > > (f,j] HOSTED
>> > > (j,m] HOSTED
>> > > (m,x] HOSTED
>> > > (x,+inf] HOSTED
>> > >
>> > > Then the manager and client code need to somehow know of the above
>> > > goal states and act accordingly.  How the manager and client know
>> > > about these goals states depends on how users specify the ranges.
>> > >
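The T1 example above can be reproduced with a short sketch that derives tablets from split points and marks every tablet overlapping an ondemand range. The helper names are hypothetical and are not Accumulo APIs; rows are assumed to be plain strings, and each range is (exclusive start, inclusive end] with `None` for -inf or +inf.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: (exclusive start row, inclusive end row).
Range = Tuple[Optional[str], Optional[str]]


def tablets_from_splits(splits: List[str]) -> List[Range]:
    """Turn split points into consecutive (prev_split, split] tablet ranges."""
    bounds: List[Optional[str]] = [None] + sorted(splits) + [None]
    return list(zip(bounds[:-1], bounds[1:]))


def overlaps(a: Range, b: Range) -> bool:
    """True if the two (start, end] ranges share at least one row."""
    def starts_before_end(start: Optional[str], end: Optional[str]) -> bool:
        return start is None or end is None or start < end
    return starts_before_end(a[0], b[1]) and starts_before_end(b[0], a[1])


def goal_states(splits: List[str], ondemand_ranges: List[Range]) -> Dict[Range, str]:
    """Assign ONDEMAND to tablets overlapping any ondemand range, else HOSTED."""
    return {
        tablet: "ONDEMAND"
        if any(overlaps(tablet, r) for r in ondemand_ranges)
        else "HOSTED"
        for tablet in tablets_from_splits(splits)
    }


if __name__ == "__main__":
    # Table T1 with splits c,f,j,m,x and the ondemand range (b,d].
    for tablet, state in goal_states(["c", "f", "j", "m", "x"], [("b", "d")]).items():
        print(tablet, state)
```

Under the overlap rule assumed here, only the first two tablets end up ONDEMAND, matching the list above.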
>> > > > set in ZooKeeper as the ZTABLE_STATE and both the client and the
>> server
>> > > use
>> > > > it.
>> > > >
>> > > > When a tablet cannot be located in the TabletLocator, then it
>> checks to
>> > > see
>> > > > if the table is in an ondemand state. If it is, then it tells the
>> server
>> > > > side to assign the tablets in the range that the client needs by
>> placing
>> > > an
>> > > > "ondemand" column in the tablet metadata. This ensures that ondemand
>> > > > tablets that were hosted as part of a client operation are then
>> hosted
>> > > > again on a tablet server failure. Prior to the ondemand state, the
>> > > > TabletLocator would attempt to find a tablet location for an online
>> > > table,
>> > > > and if it was not able to find a location, then it would wait - it
>> > > assumes
>> > > > that it will be hosted at some point. Do we modify the tablet
>> locator to
>> > > > send a signal to the Manager to assign all unlocated tablets for
>> online
>> > > > tables?
>> > > >
>> > > > We could attempt to remove the TableOperations.ondemand method,
>> leaving
>> > > > just online and offline. Then, we could have something periodically
>> check
>> > > > to see if the property TABLE_ONDEMAND_UNLOADER is set.  If it is,
>> then
>> > > that
>> > > > could change the internal table state to ondemand. However, when
>> that
>> > > > property is unset, I'm not sure we have enough information to know
>> > > whether
>> > > > to set the table state back to online or offline.
>> > > >
>> > > > I'm wondering if trying to achieve a simpler user experience is
>> > > outweighed
>> > > > by the complexity added to the code to achieve it. Personally, I
>> don't
>> > > > think it's that hard to reason about, especially if the user reads
>> the
>> > > docs
>> > > > and it is explained well.
>> > > >
>> > > > On Wed, Mar 29, 2023 at 1:04 PM Christopher <ctubb...@apache.org>
>> wrote:
>> > > >
>> > > > > On Wed, Mar 29, 2023 at 5:33 AM Dave Marion <dmario...@gmail.com>
>> > > wrote:
>> > > > > >
>> > > > > > > I think we should deprecate support for offline table
>> scanning,
>> > > since
>> > > > > it
>> > > > > > shouldn't be needed with the availability of ScanServers.
>> > > > > >
>> > > > > > Just making sure I understand your suggestion - you mean
>> removing the
>> > > > > > OfflineScanner and the ability to scan over offline tables in
>> the
>> > > > > MapReduce
>> > > > > > code, but we should continue our efforts to allow Scan Servers
>> to
>> > > scan
>> > > > > > offline tables, right?
>> > > > >
>> > > > >
>> > > > > Yes to removing OfflineScanner. But the rest of that isn't quite
>> what
>> > > > > I was thinking. What I was trying to say is that with elasticity
>> > > > > features, unless immediate consistency is required, the
>> ScanServer's
>> > > > > ability to scan should not depend on the tablets being "hosted"
>> for
>> > > > > live ingest. Using the ScanServer on the table's "unhosted"
>> tablets is
>> > > > > enough to replace the need for the OfflineScanner, I think.
>> > > > >
>> > > > > So, yes, we should continue our efforts to allow ScanServers to
>> scan
>> > > > > tables with "unhosted" tablets. Now, whether we say that is the
>> > > > > ScanServer scanning an "offline" table or not, depends on how
>> we're
>> > > > > defining "online" and "offline".
>> > > > >
>> > > > > Currently, without elastic features in place, that would only
>> happen
>> > > > > if we mark the table in an "offline" state, but once all the
>> elastic
>> > > > > features are in place, I think this would still be considered
>> > > > > "ondemand" or "online, as in available for use, but not pinned for
>> > > > > live ingest / unhosted".
>> > > > >
>> > > > > A lot of this is more about how we communicate the state (naming,
>> > > > > concepts, etc.), and depends on the rest of my email, rather than
>> > > > > affecting the actual features we're supporting. We should still
>> plan
>> > > > > for ScanServer to scan "unhosted" tablets, regardless of what
>> state we
>> > > > > end up calling it.
>> > > > >
>> > > > > >
>> > > > > > > As for "ondemand" table state, from a user perspective, I'm
>> not
>> > > sure
>> > > > > what
>> > > > > > it means
>> > > > > >
>> > > > > > I have been thinking about it as "online" means always hosted,
>> > > "ondemand"
>> > > > > > means hosted as needed, and "offline" means never hosted.
>> > > > >
>> > > > > Rather than have a mapping from what these mean to how they
>> behave, I
>> > > > > think it would be better to have the names directly reflect the
>> user
>> > > > > experience. If we say "online" means "always hosted", then just
>> call
>> > > > > it "hosted".
>> > > > >
>> > > > > I think we really need the following states to match to the user
>> > > > > experience:
>> > > > >
>> > > > > (online, live)
>> > > > > (online, live-on-demand)
>> > > > > (online)
>> > > > > (offline / immutable)
>> > > > >
>> > > > > But, I think the first three states should really just be
>> considered
>> > > > > one state, with the "live"-ness being configurable.
>> > > > >
>> > > > > >
>> > > > > > > is the "on-demand availability" applicable only for live
>> ingest /
>> > > > > > immediate consistency? Is it still "always available" for bulk
>> import
>> > > /
>> > > > > > ScanServers? Or does "on-demand availability" somehow apply to
>> all
>> > > > > > interactions, including bulk import and ScanServer reads?
>> > > > > >
>> > > > > > We tried to reason about that in
>> > > > > >
>> > > > >
>> > >
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=247828052
>> > > > >
>> > > > > I think that matrix is useful as we iron out the implementation,
>> but I
>> > > > > don't think users should have to consult such a table in order to
>> > > > > understand what they can and cannot do in a given state. That's
>> why I
>> > > > > think it's useful to have just the two states, with pretty much
>> > > > > everything working in the "online" state and nothing working in
>> the
>> > > > > "offline" state, with the configurable "live"-ness that unlocks
>> > > > > additional features (specifically, immediate consistency scanning
>> and
>> > > > > live ingest). Like buying a car, you get standard (online) or
>> nothing
>> > > > > (offline), or you can get standard plus extras by paying the cost
>> of
>> > > > > those extras (in this case, configuration of "live"-ness). It's a
>> bit
>> > > > > weird to be able to do things in an offline state, or to not be
>> able
>> > > > > to do things in an online state. But if it's framed as "configure
>> the
>> > > > > extra features to use them", it's a bit more intuitive to
>> understand,
>> > > > > because "online" still means "ready to go".
>> > > > >
>> > > > > If it helps make things even more clear, long before we had the
>> > > > > "online" and "offline" states, I worked on an idea for calling the
>> > > > > tables "enabled" and "disabled". I abandoned that idea (and the
>> > > > > transition states that accompanied them) when these states were
>> > > > > introduced. But, I still think those terms might be better, in
>> that
>> > > > > they don't imply any relationship to "hosted" or "unhosted"...
>> just
>> > > > > whether or not they were usable by the user, which is a better
>> way of
>> > > > > framing things, I think.
>> > > > >
>> > > > > If enabled/disabled terms were used in place of online/offline,
>> the
>> > > > > states could be:
>> > > > >
>> > > > > (enabled, hosted)
>> > > > > (enabled, on-demand)
>> > > > > (enabled, unhosted)
>> > > > > (disabled)
>> > > > >
>> > > > > Or collapsing the first 3 again, just:
>> > > > >
>> > > > > (enabled) - hosted status is tunable/configurable
>> > > > > (disabled)
>> > > > >
>> > > > >
>> > > > > >
>> > > > > > Regarding the rest of your email, I think removing the ondemand
>> state
>> > > > > would
>> > > > > > be ok. The ondemand commits added a new property for the user to
>> > > specify
>> > > > > > which tablet unloader class[1] to use, with the default being
>> [2]. We
>> > > > > could
>> > > > > > add a new default implementation that does not unload and users
>> would
>> > > > > have
>> > > > > > to opt-in to unloading by setting the property for their online
>> > > tables.
>> > > > > > However, there is some code surrounding the new ondemand state
>> that we
>> > > > > would
>> > > > > > need to address. For example, when a TabletServer is low on
>> memory it
>> > > > > > doesn't call the specified TabletUnloader, it just unloads a
>> Tablet.
>> > > > > >
>> > > > > > [1]
>> > > > > >
>> > > > >
>> > >
>> https://github.com/apache/accumulo/blob/elasticity/core/src/main/java/org/apache/accumulo/core/spi/ondemand/OnDemandTabletUnloader.java
>> > > > > > [2]
>> > > > > >
>> > > > >
>> > >
>> https://github.com/apache/accumulo/blob/elasticity/core/src/main/java/org/apache/accumulo/core/spi/ondemand/DefaultOnDemandTabletUnloader.java
>> > > > > >
>> > > > > > On Tue, Mar 28, 2023 at 10:27 AM Christopher <
>> ctubb...@apache.org>
>> > > > > wrote:
>> > > > > >
>> > > > > > > I think we should deprecate support for offline table
>> scanning,
>> > > since
>> > > > > > > it shouldn't be needed with the availability of ScanServers.
>> Any
>> > > > > > > MapReduce that previously relied on scanning offline tables
>> could
>> > > be
>> > > > > > > made to use that instead.
>> > > > > > >
>> > > > > > > I agree there is a need to have an immutable table state, for
>> > > which it
>> > > > > > > is possible to read, but no changes can be made. However,
>> even in
>> > > that
>> > > > > > > "locked" state, one should still be able to perform surgery
>> on its
>> > > > > > > metadata, or manually / surgically compact files (with the
>> > > > > > > understanding that doing so will interfere with any concurrent
>> > > export
>> > > > > > > or scan operations that are relying on it being immutable,
>> which I
>> > > > > > > think is a tolerable amount of risk, when actually in a
>> situation
>> > > > > > > where such surgery is needed).
>> > > > > > >
>> > > > > > > As for "ondemand" table state, from a user perspective, I'm
>> not
>> > > sure
>> > > > > > > what it means... is the "on-demand availability" applicable
>> only
>> > > for
>> > > > > > > live ingest / immediate consistency? Is it still "always
>> available"
>> > > > > > > for bulk import / ScanServers? Or does "on-demand
>> availability"
>> > > > > > > somehow apply to all interactions, including bulk import and
>> > > > > > > ScanServer reads?
>> > > > > > >
>> > > > > > > I think the "ondemand" state is confusing, because it's
>> exposing
>> > > > > > > internal state through to the user, and in a way that isn't as
>> > > clear
>> > > > > > > as the simple "online/offline" states used to be. Previously,
>> users
>> > > > > > > didn't need to understand what was going on internally...
>> "online"
>> > > > > > > just meant "I can interact with this table", and "offline"
>> meant "I
>> > > > > > > can't interact with this table". The user wasn't required to
>> > > > > > > understand what a tablet was, or how it was hosted, or
>> anything of
>> > > > > > > that nature. As we started adding support for "offline"
>> features,
>> > > the
>> > > > > > > lines separating "online and offline" meaning "available and
>> > > > > > > unavailable" became blurred. As we proceed adding elasticity,
>> I
>> > > think
>> > > > > > > we should work to make things more clear and explicit
>> again... and
>> > > I
>> > > > > > > think "ondemand" as a table state, makes things even less
>> clear
>> > > when
>> > > > > > > the concept is exposed to the user as a separate table state.
>> > > > > > >
>> > > > > > > I do think we need some kind of on-demand availability for
>> > > live-ingest
>> > > > > > > and immediate consistency in order to be more elastic, and
>> from the
>> > > > > > > discussion, it's obvious we need an immutable table state,
>> but I
>> > > think
>> > > > > > > it's a mistake to expose the on-demand availability for
>> live-ingest
>> > > > > > > and immediate consistency as a new table state. I think that
>> > > should be
>> > > > > > > left as either some kind of automatic internal behavior, or
>> as a
>> > > > > > > secondary fine-grained control over an online table (like
>> pinned
>> > > > > > > tablets, either permanently pinned or temporally pinned,
>> based on
>> > > > > > > activity).
>> > > > > > >
>> > > > > > > On Tue, Mar 28, 2023 at 9:51 AM Drew Farris <d...@apache.org>
>> > > wrote:
>> > > > > > > >
>> > > > > > > > On Mon, Mar 27, 2023 at 2:16 PM Keith Turner <
>> ke...@deenlo.com>
>> > > > > wrote:
>> > > > > > > >
>> > > > > > > > > One realization that came out of examining the different
>> table
>> > > states
>> > > > > is
>> > > > > > > > > that export table currently relies on the fact that
>> offline
>> > > tables
>> > > > > > > > > will not delete files.  If we enable compactions on
>> offline
>> > > tables
>> > > > > > > > > then that could cause files to be deleted which would
>> break the
>> > > > > > > > > expectation of export table.
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > > This is a good point. I hadn't considered the potential
>> breakage
>> > > to
>> > > > > > > export
>> > > > > > > > table. I suspect another concern could be the hadoop input
>> format
>> > > > > that
>> > > > > > > > operates over the rfiles in an offline table - and can do so
>> > > > > relatively
>> > > > > > > > safely
>> > > > > > > > because the table is not expected to change while it is
>> offline.
>> > > > > > > >
>> > > > > > > > So, it would seem that there is value in having an
>> 'immutable'
>> > > table
>> > > > > > > state
>> > > > > > > > in
>> > > > > > > > the form of an offline table. Perhaps 'ondemand' is the
>> alternate
>> > > > > state
>> > > > > > > > that
>> > > > > > > > lets us do things like import, split, compact, merge, etc.
>> > > > > > >
>> > > > >
>> > >
>>
>
