Re: LOCAL vs TRANSACTIONAL indexes

Matthew Van Wely Fri, 23 Sep 2016 15:13:09 -0700

Thanks James,

In response to "If your table is mutable and not transactional": I'd
like to assume mutable tables are configured by default to "Disable mutable
indexes on write failure until consistency restored", but the documentation
points to the parameters below. I assume these properties are
configured per deployment (not per table). I will have to follow up with
my team :) to see what we have configured.


phoenix.index.failure.block.write
phoenix.index.failure.handling.rebuild
phoenix.index.failure.handling.rebuild.interval
phoenix.index.failure.handling.rebuild.overlap.time
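
For reference, here is a minimal sketch of how the four properties above might be rendered as hbase-site.xml style entries. The values are illustrative placeholders chosen for this sketch (not recommendations and not confirmed defaults), and the helper name render_properties is hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder values for illustration only; these are assumptions,
# not confirmed defaults for any Phoenix release.
INDEX_FAILURE_PROPS = {
    "phoenix.index.failure.block.write": "false",
    "phoenix.index.failure.handling.rebuild": "true",
    "phoenix.index.failure.handling.rebuild.interval": "10000",
    "phoenix.index.failure.handling.rebuild.overlap.time": "300000",
}


def render_properties(props):
    """Render name/value pairs as a <configuration> element (hypothetical helper)."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


xml_text = render_properties(INDEX_FAILURE_PROPS)
print(xml_text)
```

Whether these are read per deployment or can be overridden per table is exactly the open question here, so the sketch only shows the shape of the entries.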

--Matt


On Thu, Sep 22, 2016 at 4:55 PM, James Taylor <jamestay...@apache.org>
wrote:

> In all cases, the client throws if a write fails (i.e. data table or index
> table failure). The state your table and index are left in depends on 1) how
> you've configured your table, and 2) when the failure happened. This is
> described here[1] in detail.
>
> Here's a summary:
>
> - If your table is transactional, then you're always left in a consistent
> state. Upon usage of the data or index table in queries, neither will have
> the update applied.
> - If your table is immutable and not transactional, then your tables are
> left in a potentially inconsistent state and it's up to you to retry. It's
> "potentially" inconsistent because it depends on when the failure happened.
> If the write to the data table failed, then your index will still be
> consistent. If the write to the data table succeeded and the write to the
> index table failed, then you're in an inconsistent state.
> - If your table is mutable and not transactional, then your tables are
> left in a potentially inconsistent state. Your data table may be one commit
> ahead of your index table and there are some options to a) automatically
> "roll-forward" the index to get it back in sync in the background, b)
> disable writes to the data table and hide the prior failed commit until the
> index table is available again and caught up to the data table, c) mark
> the index inactive so it's not used by queries until it's automatically caught
> up, d) disable the index until it's manually rebuilt.
> - If you're using local indexes and you have a release that includes
> HBASE-15600 (not available in the open source world until HBase 1.3 is out, but likely
> available in HDP 2.5), then your data table and index will remain in a
> consistent state.
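
The four cases in the summary above can be read as a small decision table. The sketch below only restates that summary; the function name and the string labels are invented for illustration and are not part of any Phoenix API.

```python
def state_after_failed_write(table_kind, failed_step):
    """Summarize the data/index state after a failed write.

    table_kind:  "transactional", "immutable", "mutable", or
                 "local_index_hbase_15600" (labels invented for this sketch).
    failed_step: "data" or "index", i.e. which write failed.
    """
    if table_kind in ("transactional", "local_index_hbase_15600"):
        return "consistent"
    if table_kind == "immutable":
        # A failed data-table write leaves the index consistent; a failed
        # index write after a successful data write does not.
        if failed_step == "data":
            return "consistent"
        return "inconsistent: retry the write"
    if table_kind == "mutable":
        # The data table may be one commit ahead of the index.
        return "potentially inconsistent: index may lag by one commit"
    raise ValueError(f"unknown table kind: {table_kind}")


print(state_after_failed_write("immutable", "index"))
```

In every case the client still sees an exception; the function only describes what state is left behind.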
>
> It's complicated because it's a distributed system. We've made these
> various options for users who don't want the overhead of transactions, but
> if you're ok with the overhead, that's the simplest solution from the user's
> POV.
>
> Thanks,
> James
>
> [1] https://phoenix.apache.org/secondary_indexing.html#Consistency_Guarantees
>
> On Thu, Sep 22, 2016 at 10:53 AM, Matthew Van Wely <
> mvanw...@salesforce.com> wrote:
>
>> James, what are the "write failure scenarios" in this case? I can only
>> assume one: the index update fails and the client <does X> while trying to
>> rewrite to the index.
>>
>> How does the statement below fare when an index cannot be updated? Does
>> the client hang, throw an error (leaving inconsistent results), or does the
>> table write get rolled back?
>>
>> > "From the same client, there is no race condition. The upsert statement
>> is synchronous, so when control returns back to you, all of your data has
>> been written (both to the data and index table(s))."
>>
>> On Tue, Sep 20, 2016 at 10:47 PM, James Taylor <jamestay...@apache.org>
>> wrote:
>>
>>> Glad to help, Matt. Just to be clear, there are no race conditions from
>>> the same client. The "unlikely" scenarios come into play when there are
>>> multiple clients. Other things to think about are to what degree you want
>>> to guard against various write failure scenarios.
>>>
>>> Thanks,
>>> James
>>>
>>> On Tue, Sep 20, 2016 at 10:30 PM, Matthew Van Wely <
>>> mvanw...@salesforce.com> wrote:
>>>
>>>> Thanks James, knowing that there are no race conditions (or very
>>>> unlikely) from the same client on a mutable table is really helpful.
>>>>
>>>> Thx,
>>>> --Matt
>>>>
>>>> On Sat, Sep 17, 2016 at 4:26 PM, James Taylor <jamestay...@apache.org>
>>>> wrote:
>>>>
>>>>> On Fri, Sep 16, 2016 at 7:22 PM, Matthew Van Wely <
>>>>> mvanw...@salesforce.com> wrote:
>>>>>
>>>>>> All,
>>>>>>
>>>>>> I would like some guidance on LOCAL vs TRANSACTIONAL indexes and I
>>>>>> cannot quite get the details I need from the Phoenix site:
>>>>>> https://phoenix.apache.org/secondary_indexing.htm
>>>>>>
>>>>>> Transactional Tables
>>>>>>
>>>>>> <snip>
>>>>>> transactional tables with secondary indexes potentially lowers your
>>>>>> availability of being able to write to your data table, as both the
>>>>>> data table and its secondary index tables must be available as
>>>>>> otherwise the write will fail
>>>>>> </snip>
>>>>>>
>>>>>> 1) What is the likelihood that an index is not available?
>>>>>>
>>>>> This is rare and unlikely. If a region server goes down, HBase
>>>>> relocates the regions it was hosting to another region server. If you
>>>>> write data exactly when this happens, it's possible that you'll get an
>>>>> exception back if this relocation takes longer than your # of retries
>>>>> and timeout settings.
>>>>>
>>>>>
>>>>>>
>>>>>> 2) If rebuilding, is this on the order of minutes, hours?
>>>>>>
>>>>> Not sure what rebuilding you're asking about. For mutable, non
>>>>> transactional secondary indexes, Phoenix has the ability to partially
>>>>> rebuild them if a write failure occurs. This will be relatively faster
>>>>> because it only rebuilds index rows that were added after the writes began
>>>>> failing. See the options listed under
>>>>> https://phoenix.apache.org/secondary_indexing.html#Mutable_Tables
>>>>>
>>>>> If on the other hand you're asking how long does it take to completely
>>>>> rebuild the index, then that depends on how much data the table has (so
>>>>> then you're really asking how fast does HBase write).
>>>>>
>>>>>
>>>>>>
>>>>>> 3) Does Phoenix give an indication the write failed due to unavailable
>>>>>> table/index (because if so the client could handle this with other
>>>>>> write options)?
>>>>>>
>>>>>
>>>>> Yes, Phoenix throws an exception if the write fails. It never fails
>>>>> silently. If your data is immutable, then it's up to you to handle the
>>>>> write failure (usually by just continually retrying the failed write). If
>>>>> mutable, then Phoenix has some options that can automate catching the
>>>>> index up with the data table (see
>>>>> https://phoenix.apache.org/secondary_indexing.html#Consistency_Guarantees).
>>>>> If your table is transactional, then it cannot get out of sync with the
>>>>> index.
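
For the immutable case, "continually retrying the failed write" might look roughly like the sketch below. It assumes a hypothetical zero-argument callable, write, that performs the upsert and commit and raises on failure; the retry count and delay are arbitrary choices for illustration, and a bounded loop is used here rather than retrying forever.

```python
import time


def retry_write(write, max_attempts=5, base_delay_s=0.0):
    """Call write() until it succeeds or max_attempts is reached.

    write is a hypothetical zero-argument callable that performs the
    upsert and commit and raises an exception on failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return write()
        except Exception:
            if attempt == max_attempts:
                raise  # give up and surface the last failure to the caller
            time.sleep(base_delay_s * attempt)  # simple linear backoff


# Usage with a stand-in write that fails twice before succeeding.
attempts = []


def flaky_write():
    attempts.append(1)
    if len(attempts) < 3:
        raise IOError("index table unavailable")
    return "ok"


result = retry_write(flaky_write)
print(result, len(attempts))
```

In real use the callable would wrap whatever client call issues the write; nothing here depends on a specific driver.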
>>>>>
>>>>>
>>>>>>
>>>>>> Local Indexes
>>>>>>
>>>>>> <snip>
>>>>>> all local index data in the separate shadow column families in the
>>>>>> same data table. At read time when the local index is used, every
>>>>>> region must be examined for the data as the exact region location of
>>>>>> index data cannot be predetermined. Thus some overhead occurs at
>>>>>> read-time.
>>>>>> </snip>
>>>>>>
>>>>>> 4) Are there any requirements on table PK and index key regarding key
>>>>>> ordering?
>>>>>>
>>>>> No
>>>>>
>>>>>
>>>>>>
>>>>>> 5) How is something locally indexed if the keys are completely
>>>>>> mismatched? I get the sense that it doesn't matter given that "every
>>>>>> region must be examined".
>>>>>>
>>>>>
>>>>> The rows of a local index are sorted in each region. The client just
>>>>> has to do a merge sort between all the data it gets back for the scans
>>>>> over each region. This is very fast, so not too much overhead here.
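
The client-side merge described above can be illustrated with a small sketch. The per-region rows are made-up data, each list is assumed to be already sorted by index key, and Python's heapq.merge stands in for whatever the client actually does internally.

```python
import heapq

# Hypothetical per-region scan results: each list is already sorted by
# index key, mirroring how local index rows are sorted within a region.
region_results = [
    [("apple", 3), ("melon", 7)],
    [("banana", 1), ("pear", 9)],
    [("cherry", 5)],
]

# heapq.merge lazily merges already-sorted inputs into one sorted stream,
# so no full re-sort of the combined rows is needed.
merged = list(heapq.merge(*region_results))
print(merged)
```

Because each input is already sorted, the merge only ever compares the current head of each region's stream, which is why the overhead stays small.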
>>>>>
>>>>>
>>>>>>
>>>>>> Mutable Tables
>>>>>>
>>>>>> <snip>
>>>>>> indexes on non transactional mutable tables are only ever a single
>>>>>> batch of edits behind the primary table
>>>>>> </snip>
>>>>>>
>>>>>> 6) If my use case updates a table and then reads from an index, it
>>>>>> seems likely that there is a race condition on whether I can
>>>>>> read-my-write.
>>>>>>
>>>>>
>>>>> From the same client, there is no race condition. The upsert statement
>>>>> is synchronous, so when control returns back to you, all of your data has
>>>>> been written (both to the data and index table(s)).
>>>>>
>>>>> If the read happens from a different client than the write, with
>>>>> global, mutable, non transactional indexes, it's possible that a read
>>>>> could occur after the write to the data table but before the write to the
>>>>> index table(s) (since with global indexes, the regions for the index
>>>>> table are potentially on different region servers than the regions of
>>>>> the data table).
>>>>>
>>>>> With local indexes the above is even more unlikely because the writes
>>>>> are all occurring to the same region server, but in theory it's still
>>>>> possible. With the fix that was made as part of HBASE-15600, this wouldn't
>>>>> be possible at all, though.
>>>>>
>>>>> With transactional tables, this scenario isn't possible.
>>>>>
>>>>>
>>>>>>
>>>>>> 7) Would you be willing to bet that most reads are consistent with the
>>>>>> table and only in rare scenarios is the table/index out of sync?
>>>>>>
>>>>> Yes
>>>>>
>>>>>>
>>>>>> I appreciate your help and feedback on these questions.  Thanks,
>>>>>> --Matthew
>>>>>>
>>>>>
>>>>> Thanks,
>>>>> James
>>>>>
>>>>>
>>>>
>>>
>>
>
