Hi Stan,

Thanks for the extensive analysis. Alex G., could you please step in and share your thoughts? It seems it's time to revisit IEP-4 and prioritize the gaps.

--
Denis

On Wed, Oct 31, 2018 at 7:56 AM Stanislav Lukyanov <[email protected]> wrote:

> Hi Igniters,
>
> I've been looking into various scenarios of Partition Loss Policies usage
> recently, and found a number of issues in the current implementation.
>
> I'll start with an overview, but if you'd like to dive to a proposal I
> have right now then please feel free to scroll down to TLDR.
>
> The list of issues is below:
> https://issues.apache.org/jira/browse/IGNITE-10041: Partition loss
> policies work incorrectly with BLT
> https://issues.apache.org/jira/browse/IGNITE-10043: Lost partitions list
> is reset when only one server is alive in the cluster
> https://issues.apache.org/jira/browse/IGNITE-9841: SQL doesn't take lost
> partitions into account when persistence is enabled
> https://issues.apache.org/jira/browse/IGNITE-10057: SQL queries hang
> during rebalance if there are LOST partitions
> https://issues.apache.org/jira/browse/IGNITE-9902: ScanQuery doesn't take
> lost partitions into account
> https://issues.apache.org/jira/browse/IGNITE-10059: Local scan query
> against LOST partition fails
> https://issues.apache.org/jira/browse/IGNITE-10044: LOST partition is
> marked as OWNING after the owner rejoins with existing persistent data
> https://issues.apache.org/jira/browse/IGNITE-10058: resetLostPartitions()
> leaves an additional copy of a partition in the cluster
>
> I'm sure this is not a complete list, but this is what I could find by
> tackling how queries and persistence interact with current handling of
> partition loss.
>
> It seems that the issues - from this list and some others fixed recently -
> can be split into three categories:
> - corner case bugs - there are always some, and we can fix them as they
> show up
> - handling of lost partitions by different APIs - while the JCache API
> handles lost partitions fine, SQL and Scan queries have known issues;
> other APIs, such as different types of queries, DataStreamer, etc.,
> probably need more testing
> - Partition Loss Policies + BLT (
> https://issues.apache.org/jira/browse/IGNITE-10041) - BLT seems
> to be fundamentally conflicting with the pre-existing semantics of
> partition loss
>
> While the former two categories can be solved case-by-case, the last one
> needs a wider design effort.
> We need to reimagine our partition loss semantics for BLT, and change
> behavior accordingly.
> For now, most of the policies don't really work for a cache with BLT, with
> only READ_WRITE_SAFE and READ_ONLY_SAFE working correctly (good thing -
> these two are the most useful policies anyway).
>
> TLDR: we have issues with partition loss policies, and the largest one is
> that BLT semantics conflict with most partition loss policies, and we
> need to address this somehow.
>
> What I suggest doing right now:
> 1. Deny the configurations that don't work - e.g. just throw an exception
> if a cache starts with BLT and PartitionLossPolicy.IGNORE or others.
> 2. Change the default PartitionLossPolicy to READ_WRITE_SAFE *for
> persistent caches only*.
> This is what is effectively in place for the persistent caches already
> (since IGNORE semantics are not supported), so this shouldn't bring a lot
> of compatibility issues.
>
> I believe doing this will at least help us to protect the users from
> unexpected/inconsistent behavior.
> Actual design changes can be done later, e.g. as a part of IEP 4 Phase 2/3.
>
> WDYT?
>
> Thanks,
> Stan
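
For illustration, the two suggestions quoted above could be sketched roughly as below. This is a standalone sketch, not Ignite code: the class LossPolicyCheck, its methods, and the nested Policy enum are hypothetical and only mirror the constant names of PartitionLossPolicy. It also assumes that "persistent cache" and "cache with BLT" mean the same thing here, and that IGNORE stays the default for non-persistent caches.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch of the proposed checks; none of this is Ignite API. */
class LossPolicyCheck {
    /** Mirrors the constant names of Ignite's PartitionLossPolicy. */
    enum Policy { READ_ONLY_SAFE, READ_ONLY_ALL, READ_WRITE_SAFE, READ_WRITE_ALL, IGNORE }

    /** The two policies the email reports as working correctly with BLT. */
    static final Set<Policy> WORKS_WITH_BLT =
        EnumSet.of(Policy.READ_WRITE_SAFE, Policy.READ_ONLY_SAFE);

    /**
     * Suggestion 2: pick the default by persistence.
     * A null argument means the user did not configure a policy.
     */
    static Policy effectivePolicy(Policy configured, boolean persistent) {
        if (configured != null)
            return configured;

        // Assumption: IGNORE remains the default for in-memory caches.
        return persistent ? Policy.READ_WRITE_SAFE : Policy.IGNORE;
    }

    /** Suggestion 1: refuse to start a cache with a combination that doesn't work. */
    static void validate(String cacheName, Policy configured, boolean persistent) {
        Policy policy = effectivePolicy(configured, persistent);

        // Assumption: a persistent cache is the one that runs with BLT.
        if (persistent && !WORKS_WITH_BLT.contains(policy)) {
            throw new IllegalArgumentException("Cache '" + cacheName
                + "' uses BLT, which does not support partition loss policy " + policy);
        }
    }

    public static void main(String[] args) {
        // No explicit policy: the default depends on persistence.
        System.out.println(effectivePolicy(null, true));
        System.out.println(effectivePolicy(null, false));

        // An explicit IGNORE on a persistent cache is rejected at start.
        try {
            validate("myCache", Policy.IGNORE, true);
        }
        catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing at cache start keeps the unsupported combination from being discovered only after partitions are actually lost, which seems to be the point of suggestion 1.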
