Hi Stephan,

Welcome to CouchDB and asynchronous programming :)

Let’s annotate your code to see what’s happening:

// Get a nano instance: all good.
const nano = require('nano')("http://localhost:5984");
// Tell nano which db to use: all good.
const larch = nano.db.use('larch');

// Get a list of all docs, asynchronously. This makes one request to CouchDB,
// and CouchDB returns a large JSON object with one row per document (id and
// revision, not the body) in an array structure. Once nano has read it, the
// callback in `then()` is called with that result.
larch.list().then((allDocs) => {

   // Get the array length: all good.
   var len = allDocs.rows.length;
   // console.log: all good.
   console.log('total # of docs -> ' + len);

   // Now you are iterating over the result set from above. For each document
   // id in the result set, you send a single request to CouchDB
   // asynchronously and wait for the result.
   //
   // The clincher is here: the forEach loop is synchronous, and the .get()
   // is asynchronous. You might think the code runs as it reads: iterate
   // over my list, fetch one doc, iterate to the next item in the list,
   // fetch one doc, and so on until the whole list has been iterated over.
   //
   // What really happens is this: iterate over my list, start fetching one
   // doc, and register a `then()` callback to print the result when it
   // comes back. Then iterate to the next item.
   //
   // Now, "starting to fetch one doc" is a near-instant operation that just
   // sets everything up to make an HTTP request, so iterating over the whole
   // list is really quick. In your case, you are starting roughly 80,000
   // requests to CouchDB in a VERY short amount of time, and you quickly run
   // out of the operating system resources needed to make HTTP requests
   // (ENOBUFS, or no more buffers to do networking with).
   allDocs.rows.forEach((document) => {
       larch.get(document.id).then((body) => {
           console.log("id: " + body._id);
       });
   });
});
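
To see that ordering in isolation, here is a small sketch that runs without CouchDB. `fakeGet` is a made-up stand-in for `larch.get()` (an assumption for illustration only); it simply resolves on a later turn of the event loop, the way a real HTTP response would.

```javascript
// Hypothetical stand-in for larch.get(id): resolves asynchronously,
// after the currently running synchronous code has finished.
function fakeGet(id) {
  return new Promise((resolve) => setTimeout(() => resolve({ _id: id }), 0));
}

const events = [];   // records the order in which things happen
const pending = [];  // one promise per started request

['a', 'b', 'c'].forEach((id) => {
  events.push('start ' + id);
  // Starting the request returns immediately; the callback runs later.
  pending.push(fakeGet(id).then((body) => events.push('done ' + body._id)));
});
events.push('loop finished');

// Resolves with the full event log once every request has completed.
const finished = Promise.all(pending).then(() => events);
finished.then((log) => console.log(log.join('\n')));
```

All three `start` entries and `loop finished` are recorded before the first `done` entry. With roughly 80,000 rows instead of three, that means every request is already in flight before a single response has been handled.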

There are multiple ways to solve this in asynchronous programming. One would be 
a queue with a maximum parallel jobs setting that keeps the number of concurrent 
operations small, but eventually gets you all the results. I’ve used 
https://www.npmjs.com/package/promise-queue in the past for this.
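
As a rough illustration of the queue idea, here is a hand-rolled sketch rather than promise-queue itself, so that it has no dependencies. `mapWithLimit` and `fakeGet` are hypothetical names made up for this example; in your program the worker would be something like `(row) => larch.get(row.id)`.

```javascript
// Minimal sketch of a concurrency limit (hypothetical helper, not part of nano).
// `worker` returns a promise for one item; at most `limit` run at the same time.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each lane takes the next unprocessed index until the list is exhausted.
  async function lane() {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  const lanes = [];
  for (let n = 0; n < Math.min(limit, items.length); n++) {
    lanes.push(lane());
  }
  await Promise.all(lanes);
  return results;
}

// Hypothetical stand-in for larch.get(id) that also tracks how many
// calls are in flight, so the effect of the limit can be observed.
let inFlight = 0;
let maxInFlight = 0;
function fakeGet(id) {
  inFlight++;
  maxInFlight = Math.max(maxInFlight, inFlight);
  return new Promise((resolve) => {
    setTimeout(() => {
      inFlight--;
      resolve({ _id: id });
    }, 5);
  });
}

// With the real database this might look like (assumes a running CouchDB):
// mapWithLimit(allDocs.rows, 10, (row) => larch.get(row.id));
```

The limit of 10 in the commented usage line is an arbitrary choice; a small number keeps the operating system happy and still gets through the whole list.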

To get the same result from CouchDB in a single request, pass the `include_docs` 
parameter set to `true`. CouchDB will then fetch the doc bodies for you as part 
of the original `larch.list()` request and include them in the result set.
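
A minimal sketch of that approach, assuming `db` is a nano database handle such as `larch` from your code. The function name `printAllDocs` and the `log` parameter are made up for this example.

```javascript
// Sketch: fetch every document body in one request via include_docs.
// `db` is assumed to be a nano database handle (e.g. `larch`);
// `log` defaults to console.log and exists only to make the output testable.
async function printAllDocs(db, log = console.log) {
  const allDocs = await db.list({ include_docs: true });
  log('total # of docs -> ' + allDocs.rows.length);
  for (const row of allDocs.rows) {
    // With include_docs, each row carries the full document in `row.doc`.
    log('id: ' + row.doc._id);
  }
  return allDocs.rows.length;
}

// Usage against the database from the original code (needs a running CouchDB):
// printAllDocs(larch).catch((err) => console.error(err));
```

Note that this pulls all document bodies into memory in a single response, which may or may not be acceptable depending on how large your documents are.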

Best
Jan
—

> On 12. Feb 2019, at 21:09, Stephan Mühlstrasser 
> <[email protected]> wrote:
> 
> Hi,
> 
> I'm new to CouchDB/nano and asynchronous JavaScript, and I hope this is
> the right place to ask questions about how to use the nano API.
> 
> I'm trying to process all documents in a local CouchDB database with the
> following code using nano (in the real program I want to get access to
> all fields of each document):
> 
> const nano = require('nano')("http://localhost:5984");
> const larch = nano.db.use('larch');
> 
> larch.list().then((allDocs) => {
>    var len = allDocs.rows.length;
>    console.log('total # of docs -> ' + len);
> 
>    allDocs.rows.forEach((document) => {
>        larch.get(document.id).then((body) => {
>            console.log("id: " + body._id);
>        });
>    });
> });
> 
> The output is:
> 
> total # of docs -> 80973
> copydb.js:20
> (node:24704) UnhandledPromiseRejectionWarning: Error: connect ENOBUFS
> 127.0.0.1:5984 - Local (undefined:undefined)
> warning.js:18
>    at Object._errnoException (util.js:992:11)
>    at _exceptionWithHostPort (util.js:1014:20)
>    at internalConnect (net.js:960:16)
>    at defaultTriggerAsyncIdScope (internal/async_hooks.js:284:19)
>    at GetAddrInfoReqWrap.emitLookup [as callback] (net.js:1106:9)
>    at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:97:10)
> (node:24704) UnhandledPromiseRejectionWarning: Unhandled promise
> rejection. This error originated either by throwing inside of an async
> function without a catch block, or by rejecting a promise which was not
> handled with .catch(). (rejection id: 1)
> warning.js:18
> (node:24704) [DEP0018] DeprecationWarning: Unhandled promise rejections
> are deprecated. In the future, promise rejections that are not handled
> will terminate the Node.js process with a non-zero exit code.
> warning.js:18
> (node:24704) UnhandledPromiseRejectionWarning: Error: connect ENOBUFS
> 127.0.0.1:5984 - Local (undefined:undefined)
> ... and so on ...
> 
> This happens on a Windows 10 machine with CouchDB 2.1.1 and node
> v8.11.2. When I restrict the processing to a few rows by taking a slice
> of the allDocs.rows array then no errors occur.
> 
> I found some pointers on stackoverflow that this error may be caused by
> too many HTTP requests being done in parallel. So it looks like I'm
> using the API in a wrong way to process all documents in a database.
> 
> What would be the correct approach to avoid the "connect ENOBUFS" errors?
> 
> Thanks
> Stephan

-- 
Professional Support for Apache CouchDB:
https://neighbourhood.ie/couchdb-support/
