Hi All,

I was chatting to Adam yesterday and I want to explore index-on-write
indexing for Mango a bit more. I know there has been some discussion
that we should only use a background process to build Mango indexes, but I
think that building indexes as documents are created/updated, combined
with background processing for existing documents, will be easier to
implement, especially in the beginning as we build the new FDB layer.

Below is the process for building a new index:

1. When a user defines a new index on an existing database, save the index
definition and also save the sequence that the index was added at. The
index should also be marked as being in a `building` phase so it won’t
be used to service queries yet. (I’ll come back to this later.)
2. Any write requests after that must read the new index definition and
update the index. When updating the new index, the writers should assume
that previous versions of the document have already been indexed.
3. At the same time a background process will start reading sections of the
changes feed and building the index. This background process will keep
processing changes until it reaches the sequence number that the index was
saved at. Once it reaches that point, the index is up to date and will be
marked as `active` and can be used to service queries.
4. There is some subtle behaviour around step 3 that is worth mentioning.
The background process is subject to the 5 second transaction limit, so it
will process the changes feed in smaller parts. This means that it won’t
have one consistent view of the changes feed throughout the index building
process, which will lead to conflicts. For example, the background process
may have a transaction adding a document to the index while at the same
time a write request has a transaction updating the same document. There
are two possible outcomes. If the background process wins, the write
request will get a conflict. At that point the write request will try to
process the document again, read the old values for that document, remove
them from the index and add the new values to the index. If the write
request wins and the background process gets a conflict, the background
process can try again. The document will have been removed from its old
position in the changes feed and moved to a later position, so the
background process won’t see the document and will move on to the next
one.
5. One other feature to add is an index progress tracker. We can do this
by using doc_count for the database, and then have a counter value that the
background workers increment with the number of documents they updated in
each batch. We would also have to update this counter on write requests
while the index is in building mode.
6. Something we can also explore is splitting the building of the index
across multiple workers. We can use the `get_boundary_keys` [1] API call on
the changes feed to get the full list of the changes feed keys grouped by
partition boundaries and then split that list across the workers.
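
To make steps 1 to 4 concrete, here is a minimal in-memory sketch in
Python. It does not touch FoundationDB at all: `Database`, `Index`,
`build_batch` and the `cursor` field are hypothetical names used only for
illustration, and each `build_batch` call stands in for one short
background transaction. Conflicts and retries are not modelled; the sketch
only shows how writes interleave with background batches and why a document
that moves in the changes feed is skipped by the background process.

```python
from dataclasses import dataclass, field


@dataclass
class Index:
    field_name: str
    created_at_seq: int                        # sequence the index was added at
    state: str = "building"                    # becomes "active" when caught up
    entries: set = field(default_factory=set)  # (value, doc_id) pairs
    cursor: int = 0                            # last seq the builder processed


class Database:
    """Toy stand-in for a database with a changes feed (illustration only)."""

    def __init__(self):
        self.docs = {}       # doc_id -> body
        self.changes = {}    # seq -> doc_id, one entry per document
        self.doc_seq = {}    # doc_id -> current seq
        self.seq = 0
        self.indexes = []

    def create_index(self, field_name):
        idx = Index(field_name, created_at_seq=self.seq)
        self.indexes.append(idx)
        return idx

    def write(self, doc_id, body):
        old = self.docs.get(doc_id)
        # An update moves the document to the end of the changes feed.
        if doc_id in self.doc_seq:
            del self.changes[self.doc_seq[doc_id]]
        self.seq += 1
        self.changes[self.seq] = doc_id
        self.doc_seq[doc_id] = self.seq
        self.docs[doc_id] = body
        for idx in self.indexes:
            # Assume the previous version was indexed: clear it, then insert.
            # Clearing an entry that is not there is a harmless no-op.
            if old is not None and idx.field_name in old:
                idx.entries.discard((old[idx.field_name], doc_id))
            if idx.field_name in body:
                idx.entries.add((body[idx.field_name], doc_id))

    def build_batch(self, idx, batch_size=2):
        """One background 'transaction': index a slice of the old changes."""
        pending = sorted(
            s for s in self.changes if idx.cursor < s <= idx.created_at_seq
        )
        for s in pending[:batch_size]:
            doc_id = self.changes[s]
            body = self.docs[doc_id]
            if idx.field_name in body:
                idx.entries.add((body[idx.field_name], doc_id))
            idx.cursor = s
        if len(pending) <= batch_size:
            idx.state = "active"               # reached created_at_seq
        return idx.state


db = Database()
for i in range(4):
    db.write(f"doc{i}", {"age": 20 + i})

idx = db.create_index("age")        # saved at seq 4, state "building"
first = db.build_batch(idx)         # background indexes doc0 and doc1
db.write("doc3", {"age": 99})       # not yet built; the writer indexes it
db.write("doc0", {"age": 50})       # already built; old value is cleared
while db.build_batch(idx) != "active":
    pass
print(idx.state, sorted(idx.entries))
```

In the run above, doc3 is updated before the background process reaches
it, so it leaves the range the background process is scanning and only the
writer indexes it.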

Adding a building and active state to indexes is a bit of a breaking
change, but I think it’s the right way to go. Currently with Mango, a user
can create an index and then immediately run a query that would use that
index. Mango then has to build the whole index before responding to that
request. In this new index-on-write process, Mango would ignore the new
index until it is active, which I think is the better way to go.
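
As a rough illustration of that query-side behaviour, here is a small
Python sketch. The dictionary layout (`name`, `fields`, `state`) and the
`choose_index` helper are assumptions made for this example and are not
Mango's actual index selection logic; the only point is that an index in
the `building` state is skipped.

```python
def choose_index(indexes, query_fields):
    """Return the first active index whose fields all appear in the query."""
    wanted = set(query_fields)
    for ix in indexes:
        if ix["state"] == "active" and set(ix["fields"]) <= wanted:
            return ix["name"]
    return None  # no usable index; the caller would fall back to a full scan


indexes = [
    {"name": "by_age", "fields": ["age"], "state": "building"},
    {"name": "by_name", "fields": ["name"], "state": "active"},
]

during_build = choose_index(indexes, ["age"])   # by_age is still building
indexes[0]["state"] = "active"                  # background build finished
after_build = choose_index(indexes, ["age"])
print(during_build, after_build)
```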

Finally, a big acknowledgment to Adam who is the major brains behind this
design.

What do you think? I would like to hear any thoughts, questions or
suggestions on this design.

Cheers
Garren

[1]
https://apple.github.io/foundationdb/api-python.html?highlight=boundary_keys#fdb.locality.fdb.locality.get_boundary_keys

On Mon, Apr 8, 2019 at 3:50 PM Garren Smith <gar...@apache.org> wrote:

>
>
> On Tue, Apr 2, 2019 at 3:14 AM Adam Kocoloski <kocol...@apache.org> wrote:
>
>> Hi Will, great comments, I have replies to a couple of them.
>>
>> > On Apr 1, 2019, at 5:21 AM, Will Holley <willhol...@gmail.com> wrote:
>> >
>> > 2. Does the ICU sort key have a bounded length? Mostly I'm wondering
>> > whether we can guarantee that the generated keys will fit within the
>> > maximum FDB key length or if there needs to be some thought as to the
>> > failure mode / workaround. As Adam mentioned, it seems fine to store an
>> > encoded key given Mango (currently) always fetches the associated
>> > document / fields from the primary index to filter on anyway. It might
>> > even be beneficial to have an additional layer of indirection and allow
>> > multiple docs to be associated with each row so that we can maintain
>> > compact keys.
>>
>> Interesting thought on that layer of indirection; it reminds me of an
>> optimization applied in the Record Layer’s text indexes. Would have to
>> compare whether the extra reads needed to maintain the index that way are
>> an acceptable tradeoff.
>>
>> Good point on the sort key sizes, I’ve not seen any way to place a
>> reliably safe upper bound on the size of one that might be generated. The
>> ICU folks have some hand-wavey guidance at
>> http://userguide.icu-project.org/collation/architecture#TOC-Sort-key-size,
>> but it seems like we might be able to dig a little deeper.
>>
>> I personally haven’t given much thought to a workaround where a
>> user-defined index key exceeds 10 KB. We’ll definitely need to handle that
>> failure mode safely even without the sort key complication — people try
>> crazy things :)
>>
>
> For the 10 KB error, I think we should just return an error. As a
> comparison, MongoDB has a 1024 Byte limit
> https://docs.mongodb.com/manual/reference/limits/#Index-Key-Limit
>
>
>> > 3. I don't immediately see how you clear previous values from the
>> > index when a doc is updated, but I could easily be missing something
>> > obvious :)
>>
>> Ah yeah, this part wasn’t explicit, was it?
>>
>> I think the idea is that these are simple indexes on specific fields of a
>> document, and we have a data model where those fields are already stored as
>> their own keys in FDB, so there’s no need (in the case of Mango) to
>> maintain a separate docid -> {viewid, [keys]} mapping like we do today in
>> each view group. Rather, the flow would go something like
>>
>> 1) Check which fields are supposed to be indexed
>> 2) Retrieve values for those fields in the ?DOCUMENTS space for the
>> parent revision
>> 3) Compare the parent values with the ones supplied in this transaction;
>> if any indexed values change, clear the old ones and insert the new ones
>>
>> with some additional caveats around checking that the supplied edit is
>> actually going to be the winning (and therefore indexed) version after the
>> commit succeeds.
>>
>> > 4. Regarding "Index on write" behaviour, is there something in the
>> > existing design (Mango overlaying mrview / lucene) that would prevent
>> > this? I can see some benefit for certain workloads (and headaches for
>> > others) but I don't see that it's necessarily coupled to the Mango
>> > design given background indexing of new/changed indexes needs to be
>> > supported anyway.
>>
>> I’m not sure I understand your question. In my mind the reason “index on
>> write” is more applicable for Mango JSON than for generalized views is
>> because in the view case batching is currently quite important to achieve
>> good throughput to the JS system. You’re of course correct that we need to
>> be able to re-generate Mango JSON indexes in the background as well.
>>
>> Adam
>>
>>
>>
