I was a little surprised by this, so I did a quick grep for spawn in
the CouchDB code base. It isn't used that much, and I wonder why. In
particular, I wonder why pmap isn't used for bulk doc updates (perhaps
I am missing something).

For my project I needed to write a custom updater, a pre-writer. (Great
modular design in 0.9, by the way: I was able to hook it all up, updater
and query server, from the ini file.) It calls up to a JavaScript
function that parses an incoming document (not JSON) into multiple JSON
documents, and I then use a pmap to write these to the db. It doesn't
seem that complicated, and I should get an increase in speed that
compensates for the initial JavaScript parsing delay.
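To make the shape of that pipeline concrete, here is a rough sketch in
Python rather than Erlang. None of it is CouchDB code: pmap,
parse_incoming, write_doc and update are made-up names, the key=value
input format is invented for illustration, and the thread pool only
stands in for spawning one Erlang process per document.

```python
from concurrent.futures import ThreadPoolExecutor
import json


def pmap(func, items, max_workers=4):
    """Apply func to every item concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, items))


def parse_incoming(raw):
    """Hypothetical parser: each 'key=value' line becomes one JSON-ready dict."""
    docs = []
    for line in raw.strip().splitlines():
        key, _, value = line.partition("=")
        docs.append({"_id": key.strip(), "value": value.strip()})
    return docs


def write_doc(doc):
    """Hypothetical stand-in for a db write; just returns the serialized doc."""
    return json.dumps(doc, sort_keys=True)


def update(raw):
    """Parse one incoming document, then write the resulting docs in parallel."""
    return pmap(write_doc, parse_incoming(raw))
```

Calling update("a=1\nb=2") parses two documents and "writes" both
concurrently. Whether this buys any real speed depends on whether the
writes themselves end up serialized further down.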
Could all (or most) uses of the plain map in CouchDB be replaced with a
pmap, or am I missing something?

Not to hijack this thread, but since I have written a pre-writer: are
there any plans to add filters (pre and post) to the CouchDB roadmap?

As a J2EE developer, I still see the benefit of the Erlang VM: it is
certainly easier to develop 'fast' applications in.

Thanks,
Norman

On Sun, Aug 9, 2009 at 8:13 PM, Nitin Borwankar<[email protected]> wrote:
> I believe there are at least 4 spindles. But I don't expect to be
> continually writing or writing while generating indexes - this is meant
> to be primarily a reference "datawarehouse" kind of read-only database -
> writes will happen in batch mode once a day for a couple of hours during
> which time no reads will be taking place. All initial view generation
> will happen when no writes are taking place.
> Having views on separate disk from db is a good suggestion I'll see
> about the symlinks part as well.
>
> Thanks much all for all the info - this has been extremely useful and
> helpful
>
> Nitin
>
> 37% of all statistics are made up on the spot
> -------------------------------------------------------------------------------------
> Nitin Borwankar
> [email protected]
>
>
> On Sun, Aug 9, 2009 at 2:42 PM, Chris Anderson <[email protected]> wrote:
>
>> On Sun, Aug 9, 2009 at 2:25 PM, Nitin Borwankar<[email protected]> wrote:
>> > On Sun, Aug 9, 2009 at 12:13 PM, Adam Kocoloski <[email protected]> wrote:
>> >
>> >> Hi Nitin, Jan's right, if you're only building views from a single
>> >> design doc you won't get much indexing speedup from multi-core at
>> >> the moment. We do spawn multiple couchjs processes (often one does
>> >> the map and the other the reduce), but we don't map docs out to them
>> >> simultaneously or anything.
>> >> Also, the Erlang process communicating with couchjs blocks and waits
>> >> for the results when it sends data out.
>> >> Best,
>> >>
>> >> Adam
>> >>
>> >
>> > Ok, so I could have multiple databases each with multiple design docs
>> > and make a number of requests and have a number of couchjs processes
>> > spread out across multiple cores right?
>> >
>> > I suppose what I am hearing is that views in a single design doc can
>> > become a bottleneck but views in multiple design docs will get the
>> > benefit of multiple couchjs processes.
>> >
>> > Since this part of my experiment is not couchapp based, I am
>> > completely fine having a) multiple databases and b) multiple design
>> > docs per database.
>> >
>>
>> Unless you have lots of disks, you won't gain performance from
>> multiple dbs. However, you might do well to put views onto a
>> different disk from the db file, if you'll be writing while generating
>> indexes. Or spread view files across disks with symlinks so disks
>> don't have to seek as much to record view rows.
>>
>> > So this gives me the OS level scheduling of couchjs processes across
>> > cores, I think.
>> >
>> > Nitin
>> >
>> > 37% of all statistics are made up on the spot
>> > -------------------------------------------------------------------------------------
>> > Nitin Borwankar
>> > [email protected]
>> >
>>
>> --
>> Chris Anderson
>> http://jchrisa.net
>> http://couch.io
