Hey all, good discussion, thanks!
I think I agree that we should keep the custom reduce feature, but I’m in favour of disabling it by default for the reasons stated in this thread. Or in voting terms:

> On 13. Oct 2020, at 22:21, Robert Samuel Newson <rnew...@apache.org> wrote:
>
> Nick, let's broaden the thread to two questions then;
>
> 1) Deprecate custom reduce functions

-1

> 2) Disable custom reduce functions by default, but don't deprecate them.

+1

Best
Jan
--

>
>> On 13 Oct 2020, at 21:16, Nick Vatamaniuc <vatam...@gmail.com> wrote:
>>
>> In case of _sum, like Joan mentioned, we can emit objects or arrays
>> and the built-in _sum will add the values of the fields together:
>>
>> So {"map": 'function(d){ emit(d._id, {"bar":1, "foo":2, "baz":3}); }',
>> "reduce" : '_sum' } for 10 docs would produce
>> {"bar": 10, "baz": 30, "foo": 20}.
>>
>> As for the deprecation, I wouldn't necessarily call for deprecation
>> but I can see leaving it disabled by default and letting the users
>> enable it if they want to. If we see that there is a good demand for
>> custom functions, and it is annoying for users to have to enable it,
>> we could revert it back to enabled by default or, like it was
>> discussed, try to add more built-in reducers.
>>
>> Cheers,
>> -Nick
>>
>> On Tue, Oct 13, 2020 at 3:38 PM Jonathan Hall <fli...@flimzy.com> wrote:
>>>
>>> So looking through the code that uses this, it looks like the main use
>>> I've had for custom reduce functions is summing multiple values at
>>> once. A rough equivalent of 'SELECT SUM(foo),SUM(bar),SUM(baz)'.
>>>
>>> The first thing that comes to mind to duplicate this functionality
>>> without a custom reduce function would mean building one unique index
>>> for each value that needs to be summed, which I expect would be a lot
>>> less efficient.
>>>
>>> But maybe I'm overlooking a more clever and efficient alternative.
>>>
>>> Jonathan
>>>
>>>
>>> On 10/13/20 6:31 PM, Robert Samuel Newson wrote:
>>>> Hi,
>>>>
>>>> Yes, that's what I'm referring to, the JavaScript reduce function.
>>>>
>>>> I'm curious what you do with custom reduce that isn't covered by the
>>>> built-in reduces?
>>>>
>>>> I also think if custom reduce was disabled by default that we would be
>>>> motivated to expand this set of built-in reduce functions.
>>>>
>>>> B.
>>>>
>>>>> On 13 Oct 2020, at 17:06, Jonathan Hall <fli...@flimzy.com> wrote:
>>>>>
>>>>> To be clear, by "custom reduce functions" you mean this
>>>>> (https://docs.couchdb.org/en/stable/ddocs/ddocs.html#reduce-and-rereduce-functions)?
>>>>>
>>>>> So by default, only built-in reduce functions could be used
>>>>> (https://docs.couchdb.org/en/stable/ddocs/ddocs.html#built-in-reduce-functions)?
>>>>>
>>>>> If my understanding is correct, I guess I find it a bit surprising. I've
>>>>> always thought of map/reduce as one of the core features of CouchDB, so
>>>>> to see half of that turned off (even if it can be re-enabled) makes me
>>>>> squint a bit. And it is a feature I use, so I would not be in favor of
>>>>> deprecating it entirely, without a clear proposal/documentation for an
>>>>> alternative/work-around.
>>>>>
>>>>> Based on the explanation below, it doesn't sound like there's a technical
>>>>> reason to deprecate it, but rather a user-experience reason. Is this
>>>>> correct?
>>>>>
>>>>> If my understanding is correct, I'm not excited about the proposal, but
>>>>> before I dive further into my thoughts, I'd like confirmation that I
>>>>> actually understand the proposal, and am not worried about something else
>>>>> ;)
>>>>>
>>>>> Jonathan
>>>>>
>>>>>
>>>>> On 10/13/20 5:48 PM, Robert Samuel Newson wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> As part of CouchDB 4.0, which moves the storage tier of CouchDB into
>>>>>> FoundationDB, we have struggled to reproduce the full map/reduce
>>>>>> functionality.
>>>>>> Happily this has now happened, and that work is now
>>>>>> merged to the couchdb main branch.
>>>>>>
>>>>>> This functionality includes the use of custom (JavaScript) reduce
>>>>>> functions. It is my experience that these are very often problematic, in
>>>>>> that much more often than not the functions do not significantly reduce
>>>>>> the input parameters into a smaller result (indeed, sometimes the output
>>>>>> is the same or larger than the input).
>>>>>>
>>>>>> To that end, I'm asking if we should deprecate the feature entirely.
>>>>>>
>>>>>> In scope for this thread is the middle ground proposal that Paul Davis
>>>>>> has written up here;
>>>>>>
>>>>>> https://github.com/apache/couchdb/pull/3214
>>>>>>
>>>>>> Where custom reduces are not allowed by default but can be enabled.
>>>>>>
>>>>>> The core _ability_ to do custom reduces will always be maintained,
>>>>>> this is intrinsic to the design of ebtree, the structure we use on top
>>>>>> of FoundationDB to hold and maintain intermediate reduce values.
>>>>>>
>>>>>> My view is that we should merge #3214 and disable custom reduces by
>>>>>> default.
>>>>>>
>>>>>> B.
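
For readers following the _sum discussion in the thread above, here is a rough sketch of the behaviour Nick describes: a map function emits an object per document, and the values are added together field by field. This is plain JavaScript written only for illustration. The helper name sumObjects, the local emit callback, and the document ids doc0 to doc9 are assumptions made up for this example; the real _sum is built into CouchDB and is not this code. The sketch handles only flat objects with numeric fields.

```javascript
// Hypothetical stand-in for the built-in _sum reducer applied to object
// values. Illustration only; this is not CouchDB's implementation.
function sumObjects(values) {
  const totals = {};
  for (const value of values) {
    for (const [field, n] of Object.entries(value)) {
      // Add each field to its running total, starting from 0.
      totals[field] = (totals[field] || 0) + n;
    }
  }
  return totals;
}

// Map function from the thread: every document emits the same object.
// emit is passed in explicitly here so the sketch runs outside CouchDB.
function map(doc, emit) {
  emit(doc._id, { bar: 1, foo: 2, baz: 3 });
}

// Ten made-up documents with ids doc0..doc9.
const docs = Array.from({ length: 10 }, (_, i) => ({ _id: "doc" + i }));

// Collect the emitted rows, as a view index would.
const emitted = [];
for (const doc of docs) {
  map(doc, (key, value) => emitted.push({ key, value }));
}

// Reduce all emitted values into one object of per-field totals.
const result = sumObjects(emitted.map((row) => row.value));
console.log(JSON.stringify(result)); // {"bar":10,"foo":20,"baz":30}
```

The totals match the figures in Nick's message (bar 10, foo 20, baz 30), and this is the same shape as Jonathan's 'SELECT SUM(foo),SUM(bar),SUM(baz)' use case served from a single index.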