On Wed, Jun 24, 2020 at 6:47 PM Joan Touzet <woh...@apache.org> wrote:
> Hi Garren,
>
> If the "options" field is left out, what is the default behaviour?

All group_levels will be indexed. I imagine this is what most CouchDB
users will want.

> Is there no way to specify multiple group_levels to get results that
> match the original CouchDB behaviour? Your changed behaviour would be
> acceptable if I could do something like `?group_level=2,3,4,5`.

I imagine we could, but it would make the code a lot more complex. What
is the reason for that? I find the fact that we return multiple
group_levels for a set group_level very confusing. To me it feels like
the reason we return extra group_levels is because of how b-trees work
rather than it being a useful thing for a user.

> -Joan
>
> On 24/06/2020 08:03, Garren Smith wrote:
> > Quick note: I have a gist markdown version of this that might be
> > easier to read:
> > https://gist.github.com/garrensmith/1ad1176e007af9c389301b1b6b00f180
> >
> > Hi Everyone,
> >
> > The team at Cloudant have been relooking at reduce indexes for
> > CouchDB on FDB, and we want to simplify what we had initially planned
> > and change some of the reduce behaviour compared to CouchDB 3.x.
> >
> > Our initial design was to use a skip list. However, this hasn't
> > proven to be a particularly useful approach. It would take a very
> > long time to update, and I can't find a good algorithm to query the
> > skip list effectively.
> >
> > So instead I would like to propose a much simpler reduce
> > implementation. I would like to use this as the base for reduce, and
> > we can look at adding more functionality later if we need to.
> >
> > For the new reduce design, instead of creating a skip list, we will
> > instead create group_level indexes for a key.
> > For example, say we have the following keys we want to add to a
> > reduce index:
> >
> > ```
> > ([2019, 6, 1], 1)
> > ([2019, 6, 20], 1)
> > ([2019, 7, 3], 1)
> > ```
> >
> > We would then create the following group_level indexes:
> >
> > ```
> > Level 0:
> > (null, 3)
> >
> > Level 1:
> > ([2019], 3)
> >
> > Level 2:
> > ([2019, 6], 2)
> > ([2019, 7], 1)
> >
> > Level 3:
> > ([2019, 6, 1], 1)
> > ([2019, 6, 20], 1)
> > ([2019, 7, 3], 1)
> > ```
> >
> > All of these group_level indexes would form part of the reduce index
> > and would be updated at the same time. We don't need to know the
> > actual `group_levels` ahead of time: for any key we need to index, we
> > look at its length and add it to each group_level it belongs to.
> >
> > Another nice optimization we can do with this is that when a user
> > creates a view they can specify the group levels to index, e.g.:
> >
> > ```
> > {
> >   _id: "_design/my-ddoc",
> >   views: {
> >     one: {
> >       map: function (doc) { emit(doc.val, 1); },
> >       reduce: "_sum"
> >     },
> >     two: {
> >       map: function (doc) { emit(doc.age, 1); },
> >       reduce: "_count"
> >     }
> >   },
> >   options: { group_levels: [1, 3, 5] }
> > }
> > ```
> >
> > This gives the user the ability to trade off index build speed,
> > storage overhead and query performance.
> >
> > One caveat, for now, is that if a user changes the set of
> > `group_levels` to be indexed, the index is invalidated and we would
> > have to build it from scratch again. Later we could look at doing
> > some work around that so that isn't the case.
> >
> > This design will result in a behaviour change. Previously with
> > reduce, if you set `group_level=2`, it would return all results with
> > `group_level=2` and below.
> > E.g. given the following reduce key/values:
> >
> > ```
> > # group = true
> > {"key":1,"value":2},
> > {"key":2,"value":2},
> > {"key":3,"value":2},
> > {"key":[1,1],"value":1},
> > {"key":[1,2,6],"value":1},
> > {"key":[2,1],"value":1},
> > {"key":[2,3,6],"value":1},
> > {"key":[3,1],"value":1},
> > {"key":[3,1,5],"value":1},
> > {"key":[3,4,5],"value":1}
> > ```
> >
> > a query with group_level=2 returns:
> >
> > ```
> > # group_level = 2
> > {"rows":[
> > {"key":1,"value":2},
> > {"key":2,"value":2},
> > {"key":3,"value":2},
> > {"key":[1,1],"value":1},
> > {"key":[1,2],"value":1},
> > {"key":[2,1],"value":1},
> > {"key":[2,3],"value":1},
> > {"key":[3,1],"value":2},
> > {"key":[3,4],"value":1}
> > ]}
> > ```
> >
> > I want to **CHANGE** this behaviour, so that if a query specifies
> > `group_level=2` then **only** `group_level=2` rows are returned. E.g.
> > from the example above the results would be:
> >
> > ```
> > # group_level = 2
> > {"rows":[
> > {"key":[1,1],"value":1},
> > {"key":[1,2],"value":1},
> > {"key":[2,1],"value":1},
> > {"key":[2,3],"value":1},
> > {"key":[3,1],"value":2},
> > {"key":[3,4],"value":1}
> > ]}
> > ```
> >
> > ## Group_level=0
> >
> > `Group_level=0` queries would work as follows:
> >
> > 1. For a `group_level=0` query without a startkey/endkey, the
> >    group_level=0 index is used.
> > 2. For a `group_level=0` query with a startkey/endkey, or where
> >    `group_level=0` is not indexed, the query will look for the
> >    smallest indexed `group_level` and use that to calculate the
> >    `group_level=0` result.
> > 3. `group_level=0` queries with a startkey/endkey could be slow, and
> >    time out in some cases, because we have to do quite a lot of
> >    aggregation when reading keys. But I don't think that is much
> >    different from how it is done now.
> >
> > ## Group=true
> >
> > We will always build the `group=true` index.
> >
> > ## Querying non-indexed group_level
> >
> > What happens if a query has a `group_level` that is not indexed?
> > We can do two things here: CouchDB could use the nearest indexed
> > `group_level` to service the query, or it could return an error that
> > this `group_level` is not available to query. I would like to make
> > this configurable so that an admin can choose how reduce indexes
> > behave.
> >
> > ## Supported Builtin Reduces
> >
> > Initially, we would support reduces that can be updated by
> > calculating a delta change and applying it to all the group_levels.
> > That means we can support `_sum` and `_count` quite easily.
> > Initially, we won't implement `max` and `min`; however, I would like
> > to add them soon after.
> >
> > I would also later like to add support for the `_stats` reducer. The
> > best option I can think of is breaking up each field in `_stats` into
> > its own k/v row in FDB.
> >
> > I'm not sure how to handle `_approx_count_distinct`. At the moment I
> > don't understand the algorithm well enough to know if we could
> > support it.
> >
> > ## Custom Reduces
> >
> > Custom reduces will not be supported. This is because they would be
> > tricky to implement with this design and would not be very
> > performant. Later, if there is demand for this, we could look at
> > doing it, but with the caveat that performance will be pretty bad. My
> > view on custom reduces is that in 99% of cases they are a bad choice,
> > so I would prefer we don't support them.
> >
> > ## Replication
> >
> > Some basic replication rules:
> >
> > 1. If a design document is replicated across from an old database and
> >    does not have the group_levels option set, all `group_levels` will
> >    be supported.
> > 2. If a design document is replicated across with a custom reduce, or
> >    a reduce we don't support, we will return an error when that
> >    reduce index is queried. I'm thinking we would still build the map
> >    part of the index.
> >
> > ## Limits
> >
> > Reduce keys will have the same limit as map keys, which is 8KB.
> >
> > Hopefully that all makes sense. Let me know if I need to explain a
> > point further; I would love to hear your thoughts on this.
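The group_level rollup from the proposal's date-key example can be sketched in a few lines of Python. This is only an illustration, assuming a `_sum` reducer and array keys; `build_group_levels` and the dict-of-dicts layout are hypothetical stand-ins, not CouchDB's actual FDB key layout.

```python
from collections import defaultdict

def build_group_levels(rows):
    # Roll emitted (key, value) pairs up into one aggregate per group_level.
    # Keys are arrays (lists); level N groups rows by the first N elements
    # of the key, and level 0 keeps a single running total under a null key.
    levels = defaultdict(lambda: defaultdict(int))
    for key, value in rows:
        levels[0][None] += value
        for n in range(1, len(key) + 1):
            levels[n][tuple(key[:n])] += value
    return {n: dict(groups) for n, groups in levels.items()}

# The example keys from the proposal:
idx = build_group_levels([
    ([2019, 6, 1], 1),
    ([2019, 6, 20], 1),
    ([2019, 7, 3], 1),
])
# idx[2] == {(2019, 6): 2, (2019, 7): 1}
```

As the proposal says, no group_levels need to be known ahead of time: each key's length determines which levels it contributes to.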
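Because `_sum` and `_count` are delta-friendly, a document update only needs the change in the emitted value to refresh every group_level row a key touches, without re-reducing sibling keys. A hedged sketch of that update path, with a plain dict standing in for the per-level FDB subspaces (`apply_delta` is a hypothetical helper, not proposed API):

```python
def apply_delta(levels, key, delta):
    # Apply a _sum-style delta for one emitted array key to every
    # group_level it belongs to; each affected row is adjusted in place.
    level0 = levels.setdefault(0, {})
    level0[None] = level0.get(None, 0) + delta
    for n in range(1, len(key) + 1):
        level = levels.setdefault(n, {})
        prefix = tuple(key[:n])
        level[prefix] = level.get(prefix, 0) + delta

levels = {}
apply_delta(levels, [2019, 6, 1], 1)   # doc emits ([2019, 6, 1], 1)
apply_delta(levels, [2019, 6, 20], 1)  # doc emits ([2019, 6, 20], 1)
apply_delta(levels, [2019, 6, 1], -1)  # first doc deleted: subtract its value
# levels[2] == {(2019, 6): 1}
```

This is presumably why `max`/`min` are harder: a deletion can invalidate a stored maximum, so the delta alone is not enough to update the row.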
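The proposed exact-match query behaviour, side by side with the CouchDB 3.x behaviour it replaces, can be sketched as below. `query_group_level` is illustrative only and assumes a `_sum` reducer; `exact=False` reproduces the 3.x output from the group_level=2 example in the proposal, `exact=True` the proposed output.

```python
def query_group_level(rows, group_level, exact=True):
    # exact=True: proposed behaviour - only keys with at least
    # `group_level` elements contribute, truncated to that length.
    # exact=False: CouchDB 3.x behaviour - shorter keys pass through as-is.
    out = {}
    for key, value in rows:
        k = key if isinstance(key, list) else [key]
        if len(k) < group_level:
            if exact:
                continue
            grouped = tuple(k)
        else:
            grouped = tuple(k[:group_level])
        out[grouped] = out.get(grouped, 0) + value
    return out

# The group=true rows from the behaviour-change example:
rows = [
    (1, 2), (2, 2), (3, 2),
    ([1, 1], 1), ([1, 2, 6], 1),
    ([2, 1], 1), ([2, 3, 6], 1),
    ([3, 1], 1), ([3, 1, 5], 1), ([3, 4, 5], 1),
]
proposed = query_group_level(rows, 2, exact=True)
# proposed == {(1, 1): 1, (1, 2): 1, (2, 1): 1, (2, 3): 1, (3, 1): 2, (3, 4): 1}
```

With the per-level indexes of this design, the `exact=True` case becomes a plain range read of the level-2 subspace, which is what makes the proposed behaviour the cheap one.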