On Aug 15, 2013, at 10:09, Robert Newson <rnew...@apache.org> wrote:

> A big +1 to Jason's clarification of "erlang" vs "native". CouchDB
> could have shipped an erlang view server that worked in a separate
> process and had the stdio overhead, to combine the slowness of the
> protocol with the obtuseness of erlang. ;)
>
> Evaluating JavaScript within the erlang VM process intrigues me, Jens,
> how is that done in your case? I've not previously found the assertion
> that V8 would be faster than SpiderMonkey for a view server compelling
> since the bottleneck is almost never in the code evaluation, but I do
> support CouchDB switching to it for the synergy effects of a closer
> binding with node.js, but if it's running in the same process, that
> would change (though I don't immediately see why the same couldn't be
> done for SpiderMonkey). Off the top of my head, I don't know a safe
> way to evaluate JS in the VM. A NIF-based approach would either be
> quite elaborate or would trip all the scheduling problems that
> long-running NIFs are now notorious for.
>
> At a step removed, the view server protocol itself seems like the
> thing to improve on, it feels like that's the principal bottleneck.

The code is here:

https://github.com/couchbase/couchdb/tree/master/src/mapreduce

I’d love for someone to pick this up and give CouchDB, say, a
./configure --enable-native-v8 option or a plugin that allows people to
opt into the speed improvements made there. :)

The choice for V8 was made because of its easier integration API and
more reliable releases as a standalone project, which I think was a
smart move.

IIRC it relies on a change to CouchDB-y internals that has not made it
back from Couchbase to CouchDB (Filipe will know, but I doubt he’s
reading this thread), but we should look into that and get us “native JS
views”, at least as an option or plugin.

CCing dev@.

Jan
--

>
> B.
>
> On 15 August 2013 08:22, Jason Smith <j...@apache.org> wrote:
>> Yes, to a first approximation, with a native view, CouchDB is basically
>> running eval() on your code. In my example, I took advantage of this to
>> build a nonstandard response to satisfy an application. (Instead of a
>> 404, we sent a designated fallback document body.)
>>
>> But, if you accumulate the list in a native view, a JavaScript view, or
>> a hypothetical Erlang view (i.e. a subprocess), from the operating
>> system's perspective, the memory for that list will be allocated
>> somewhere. Either the CouchDB process asks for X KB more memory, or its
>> subprocess will ask for it. So I think the total system impact is
>> probably low in practice.
>>
>> So I guess my point is not that native views are wrong, just they have
>> a cost so you should weigh the cost/benefit for your own project. In
>> the case of manage_couchdb, I wrote a JavaScript implementation; but
>> since sometimes I have an emergency and I must find conflicts ASAP, I
>> made an Erlang version because it is worth it.
>>
>>
>> On Thu, Aug 15, 2013 at 2:05 PM, Stanley Iriele <siriele...@gmail.com> wrote:
>>
>>> Whoa...OK...that I had no idea about...thanks for taking the time to
>>> go to that granularity, by the way.
>>>
>>> So does this mean that the process memory is shared? As opposed to
>>> living in its own space? So if someone accumulates a large JSON object
>>> in a list function it's chewing up CouchDB's memory?... I guess I'm a
>>> little confused about what's in the same process and what isn't now
>>> On Aug 14, 2013 11:57 PM, "Jason Smith" <j...@apache.org> wrote:
>>>
>>>> To me, an Erlang view is a view server which supports map, reduce,
>>>> show, update, list, etc. functions in the Erlang language. (Basically
>>>> it is implemented in Erlang.)
>>>>
>>>> A view server is a subprocess that runs beneath CouchDB which
>>>> communicates with it over standard i/o. It is a different process in
>>>> the operating system and only interfaces with the main server using
>>>> the view server protocol (basically a bunch of JSON messages going
>>>> back and forth).
>>>>
>>>> I do not know of an Erlang view server which works well and is
>>>> currently maintained.
>>>>
>>>> A native view (shipped by CouchDB but disabled by default) is some
>>>> corner-cutting. Code is evaluated directly by the primary CouchDB
>>>> server. Since CouchDB is Erlang, the native query server is
>>>> necessarily Erlang. The key difference is, your code is right there
>>>> in the eye of the storm. You can call couch_server:open("some_db")
>>>> and completely circumvent security and other invariants which CouchDB
>>>> enforces. You can leak memory until the kernel OOM killer terminates
>>>> CouchDB. It's not about the language, it's that it is running inside
>>>> the CouchDB process.
>>>>
>>>>
>>>>
>>>> On Thu, Aug 15, 2013 at 1:36 PM, Stanley Iriele <siriele...@gmail.com> wrote:
>>>>
>>>>> Wait....I'm a tad confused here..Jason what is the difference
>>>>> between native views and Erlang views?...
>>>>> On Aug 14, 2013 11:16 PM, "Jason Smith" <j...@apache.org> wrote:
>>>>>
>>>>>> Oh, also:
>>>>>>
>>>>>> They are **not** Erlang views. They are **native** views. We should
>>>>>> emphasize the latter to remind ourselves about the security and
>>>>>> reliability risks which Bob identifies.
>>>>>>
>>>>>> They are very powerful, but it is a trade-off. Once I had a
>>>>>> customer who had a basic "class" document describing common values.
>>>>>> All other documents were for modifications to the "base class" so
>>>>>> to speak. He needed to query by document ID, but if no such
>>>>>> document existed, return the "base class" document instead. The
>>>>>> product was already in the field and so the code could not change.
>>>>>> We had to change it in CouchDB.
>>>>>>
>>>>>> The fix was very simple: a _rewrite rule to a native _show
>>>>>> function. In the show function, if the Doc was null, then we used
>>>>>> the internal CouchDB API to fetch the default document. Voila.
>>>>>>
>>>>>>
>>>>>> On Thu, Aug 15, 2013 at 1:08 PM, Jason Smith <j...@apache.org> wrote:
>>>>>>
>>>>>>> On Thursday, August 15, 2013, Andrey Kuprianov wrote:
>>>>>>>
>>>>>>>> Doesn't server performance downgrade, while views are being
>>>>>>>> rebuilt? So the faster they are rebuilt, the better for you.
>>>>>>>
>>>>>>>
>>>>>>> If my view build would degrade total performance to cross an
>>>>>>> unacceptable threshold, then I am really riding the line! What
>>>>>>> about an unplanned compaction? What if one day the clients have a
>>>>>>> bug and load increases? What if an unplanned disaster happens and
>>>>>>> a backup must be performed urgently?
>>>>>>>
>>>>>>> I would evaluate view performance in the larger context of the
>>>>>>> entire application life cycle.
>>>>>>>
>>>>>>> Men seem to want to date beautiful women. It is a very high
>>>>>>> priority at the pub or whatever. But long-married men do not even
>>>>>>> think about their wife's attractiveness because that is a small,
>>>>>>> superficial part of a much larger story.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Besides, looks like it's possible to do the same 3 steps with
>>>>>>>> design doc views created in Erlang? Or is it just about using
>>>>>>>> require() in Node.js?
>>>>>>>>
>>>>>>>
>>>>>>> Actually, yes that is a fine point. I myself prefer Node.js but
>>>>>>> anyone can choose the best fit for them.
>>>>>>>
>>>>>>> And speaking more broadly, CouchDB is a very flexible platform so
>>>>>>> it is quite likely that my own policies do not apply to every use
>>>>>>> case. In fact if I'm honest I use native views myself, usually for
>>>>>>> unplanned troubleshooting, I want to find conflicts so I use
>>>>>>> manage_couchdb:
>>>>>>> http://github.com/iriscouch/manage_couchdb
>>>>>>>
>>>>>>> My main point is, any time somebody says "performance" ask
>>>>>>> yourself if it is really a "performance siren." Earlier in this
>>>>>>> thread, Jens raises some examples of plausible true performance
>>>>>>> requirements, not just siren songs.
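
To illustrate the view server protocol described in the quoted messages (a subprocess that exchanges JSON messages with CouchDB over standard I/O), here is a minimal sketch in Python. It is not CouchDB's own implementation and it handles only three commands (reset, add_fun, map_doc). The names handle, serve and map_fn are hypothetical, as is the convention that the submitted source defines a Python function called map_fn, and the shape of the error reply is an assumption. A real view server would also need reduce, show, list and update support, and would not exec untrusted source like this.

```python
import io
import json


def handle(command, state):
    """Process one decoded protocol message and return the reply object."""
    name, args = command[0], command[1:]
    if name == "reset":
        # Forget every map function registered so far.
        state["funs"] = []
        return True
    if name == "add_fun":
        # Hypothetical convention: the source text defines `map_fn(doc)`,
        # which yields (key, value) pairs. Illustration only, not safe.
        namespace = {}
        exec(args[0], namespace)
        state["funs"].append(namespace["map_fn"])
        return True
    if name == "map_doc":
        # One list of [key, value] rows per registered map function.
        doc = args[0]
        return [[[k, v] for k, v in fun(doc)] for fun in state["funs"]]
    # Assumed error shape; check the protocol documentation before relying on it.
    return {"error": "unknown_command", "reason": name}


def serve(stdin, stdout):
    """Read one JSON message per line and write one JSON reply per line."""
    state = {"funs": []}
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        reply = handle(json.loads(line), state)
        stdout.write(json.dumps(reply) + "\n")
        # Flush after every reply: the parent process waits for each answer.
        stdout.flush()
```

Run as a subprocess, this would be started with serve(sys.stdin, sys.stdout). Every document makes a round trip through JSON encoding, a pipe, and JSON decoding, which is the stdio overhead the thread refers to; a native view skips that round trip at the cost of running inside the CouchDB process.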