And that was 150 requests per second, of course, not per minute. Also, I tried my original test at about 200 requests per second and I think the problem goes away (at least I didn't see any problem for a couple of minutes).
Thanks,
Marco

On 11 May 2012 15:45, Marco Monteiro <[email protected]> wrote:
> Now the script really is attached. Promise.
>
> On 11 May 2012 15:43, Marco Monteiro <[email protected]> wrote:
>
>> I was trying nodeload but could not generate the load I need to trigger
>> the problem. I attached the script. Can you tell me how to change the
>> script to get to the load I need to trigger the problem?
>>
>> The attached script was making about 150 requests per minute.
>>
>> Thanks,
>> Marco
>>
>> On 11 May 2012 14:26, Robert Newson <[email protected]> wrote:
>>
>>> Can you reproduce this behavior with other benchmarking tools? ab,
>>> nodeload, etc.?
>>>
>>> B.
>>>
>>> On 11 May 2012 14:18, Marco Monteiro <[email protected]> wrote:
>>> > Each node.js process had multiple concurrent requests. I just tried
>>> > with sequential requests and the problem persists.
>>> >
>>> > So, now I have 8 node.js processes, each sending one write request
>>> > only after the previous one is done. And the problem remains.
>>> >
>>> > The machine is not under any kind of heavy load. Both top and iostat
>>> > report less than 10% machine use. The machines have 8-core Xeons with
>>> > 4 10,000 rpm hard disks in RAID 10 and 16 GB of RAM.
>>> >
>>> > Note that I'm testing with fewer than 500 requests per second at the
>>> > moment.
>>> >
>>> > One more thing: when the problem happens, it's not that the database
>>> > becomes slow. It just drops the requests. And reads also fail. For
>>> > example, trying to use Futon I get a "connection was reset" message
>>> > from Firefox.
>>> >
>>> > This is on CouchDB 1.2. I'm going to try 1.1.1 next.
>>> >
>>> > Thanks,
>>> > Marco
>>> >
>>> > On 11 May 2012 13:56, Robert Newson <[email protected]> wrote:
>>> >
>>> >> Perhaps CouchDB on this particular hardware isn't fast enough to cope
>>> >> with 4,000 writes per second?
>>> >>
>>> >> Does your node.js test send every update asynchronously or is it
>>> >> carefully controlling qps?
>>> >> For what it's worth, I've benchmarked successfully using a node.js
>>> >> library called nodeload (https://github.com/benschmaus/nodeload).
>>> >> It's been a while since I last used it, and node has changed a few
>>> >> dozen times since then, but it was pretty solid and sane when I was
>>> >> using it.
>>> >>
>>> >> B.
>>> >>
>>> >> On 11 May 2012 13:48, Marco Monteiro <[email protected]> wrote:
>>> >> > Thanks, Robert.
>>> >> >
>>> >> > Disabling delayed commits did make the problem start later, but it
>>> >> > is still there.
>>> >> >
>>> >> > It's funny that the first thing I checked when I first saw this
>>> >> > problem was to make sure that delayed commits were enabled.
>>> >> >
>>> >> > Thanks,
>>> >> > Marco
>>> >> >
>>> >> > On 11 May 2012 13:20, Robert Newson <[email protected]> wrote:
>>> >> >
>>> >> >> The first thing is to ensure you have disabled delayed commits:
>>> >> >>
>>> >> >> curl -X PUT -d '"false"' localhost:5984/_config/couchdb/delayed_commits
>>> >> >>
>>> >> >> This is the production setting anyway (though not the default
>>> >> >> because of complaints from incompetent benchmarkers). This will
>>> >> >> ensure an fsync for each write and, as a consequence, will greatly
>>> >> >> smooth your insert performance. Since you said you were inserting
>>> >> >> concurrently, you should not experience a slowdown either.
>>> >> >>
>>> >> >> B.
>>> >> >>
>>> >> >> On 11 May 2012 02:42, Marco Monteiro <[email protected]> wrote:
>>> >> >> > Hello!
>>> >> >> >
>>> >> >> > I'm running a load test on CouchDB. I have a cluster of 8
>>> >> >> > node.js servers writing to CouchDB. They write about 30,000
>>> >> >> > documents per minute (500 per second). There are multiple
>>> >> >> > concurrent requests from each server. There are no updates:
>>> >> >> > documents are created and not modified.
>>> >> >> >
>>> >> >> > I first tried CouchDB 1.1.1 from the Debian 6.4 apt repo.
>>> >> >> > After a few minutes, CouchDB starts freezing for 1 to 3 seconds
>>> >> >> > about every 10 seconds. It keeps this behavior for some time,
>>> >> >> > and eventually it starts freezing more frequently and for longer
>>> >> >> > periods. When the database has about 1.5 million documents,
>>> >> >> > CouchDB is freezing for more than 5 seconds each time.
>>> >> >> >
>>> >> >> > I then tried CouchDB 1.2, from build-couch. The freezes happen
>>> >> >> > with it also, but the behavior is much worse: in less than one
>>> >> >> > minute it's freezing for 5 seconds or more, and it spends more
>>> >> >> > time not doing anything than working.
>>> >> >> >
>>> >> >> > When testing with 1.1.1 I was writing only to one database. With
>>> >> >> > 1.2 I tried with one database and then with multiple databases,
>>> >> >> > but the problem was exactly the same.
>>> >> >> >
>>> >> >> > The documents have about 10 properties, only numbers or strings,
>>> >> >> > and the strings are small (about 20 chars each). The document
>>> >> >> > IDs are generated in the app and have the format
>>> >> >> >
>>> >> >> > <milliseconds since epoch>-<random 16 chars string>
>>> >> >> >
>>> >> >> > When CouchDB freezes, its processor use (from top) goes to zero.
>>> >> >> > It does not reply to read or write requests. The disk does not
>>> >> >> > seem to be the problem, as iostat reports near 0% utilization.
>>> >> >> > CPU is mostly idle, and of the 16 GB of RAM, some is free and
>>> >> >> > not even used to cache disk.
>>> >> >> >
>>> >> >> > There are no error messages in the CouchDB log.
>>> >> >> >
>>> >> >> > I tried this on two different machines and the problem is the
>>> >> >> > same on both.
>>> >> >> >
>>> >> >> > I did not change anything in the configuration files except
>>> >> >> > changing the database dir to use a RAID partition.
>>> >> >> >
>>> >> >> > Does anyone have any idea what the problem could be?
>>> >> >> >
>>> >> >> > Any help solving this issue is greatly appreciated.
>>> >> >> >
>>> >> >> > Thanks,
>>> >> >> > Marco
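The original post describes app-generated document IDs of the form `<milliseconds since epoch>-<random 16 chars string>`. A minimal sketch of that scheme (the alphabet is an assumption; the post doesn't say which characters the random suffix uses):

```javascript
// Sketch of the ID scheme described in the original post:
// <milliseconds since epoch>-<random 16 chars string>.
// The lowercase-alphanumeric alphabet is a guess, not from the thread.
const ALPHABET = 'abcdefghijklmnopqrstuvwxyz0123456789';

// Build a random suffix of the given length from the alphabet above.
function randomSuffix(len) {
  let s = '';
  for (let i = 0; i < len; i++) {
    s += ALPHABET[Math.floor(Math.random() * ALPHABET.length)];
  }
  return s;
}

// e.g. "1336743600000-k3f9a2xq7mbn0c4d"
function makeDocId() {
  return `${Date.now()}-${randomSuffix(16)}`;
}
```

The time prefix keeps inserts roughly ordered, which is generally friendlier to CouchDB's append-only b-tree than fully random IDs, so the ID scheme itself is an unlikely cause of the freezes described here.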
