Hi,
I did some more testing with the following software and hardware, following
Jan’s instructions:
Setup
=====
CouchDB: v. 3.2.2-749474c-dirty (q=2 / n=1)
deno 1.28.2 (release, x86_64-apple-darwin) / v8 10.9.194.1 / typescript 4.8.3
macOS 10.13 / iMac Mid 2011
Timing
======
1M docs    Build time mrview sm    Build time mrview deno    Ratio
of size    in seconds              in seconds                sm/deno
100 B      492                     482                       1.02
1000 B     1521                    1531                      0.99
10000 B    4943                    5308                      0.93
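For anyone who wants to recompute the last column, here is a small sketch. The variable and function names are mine, picked for illustration; the numbers are the build times from the table.

```javascript
// Build times in seconds, copied from the timing table.
const timings = [
  { docSize: 100, sm: 492, deno: 482 },
  { docSize: 1000, sm: 1521, deno: 1531 },
  { docSize: 10000, sm: 4943, deno: 5308 },
];

// Ratio sm/deno: above 1 means deno finished first,
// below 1 means SpiderMonkey finished first.
function ratio(row) {
  return row.sm / row.deno;
}

// Round to two decimals, as in the table.
const ratios = timings.map((row) => ratio(row).toFixed(2));

for (const [i, row] of timings.entries()) {
  console.log(`${row.docSize} B: sm/deno = ${ratios[i]}`);
}
```

Read this way, the two query servers are within about 2% of each other up to 1000 B docs, and SpiderMonkey is about 7% faster at 10000 B.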
The docs are created with couchdyno [1]:
100-byte docs:
./venv/bin/couchdyno-setup -t 1000000 -s 100 -u 1000000 -w http://a:[email protected]:15984/deno-100b
1000-byte docs:
./venv/bin/couchdyno-setup -t 1000000 -s 1000 -u 1000000 -w http://a:[email protected]:15984/deno-1000b
10000-byte docs:
./venv/bin/couchdyno-setup -t 1000000 -s 10000 -u 1000000 -w http://a:[email protected]:15984/deno-10000b
I used the following ddoc for testing SpiderMonkey:
{
  "_id": "_design/sm",
  "views": {
    "sm": {
      "map": "function (doc) {\n emit([doc.data, doc.ts], Date.now());\n}"
    }
  },
  "language": "javascript"
}
and the ddoc for testing deno:
{
  "_id": "_design/dn",
  "views": {
    "dn": {
      "map": "function (doc) {\n emit([doc.data, doc.ts], Date.now());\n}"
    }
  },
  "language": "deno"
}
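The two ddocs should differ only in their name and their "language" field, so the map function is identical in both runs. A sketch that generates both from one template follows. The helper names (makeDesignDoc, uploadDesignDoc) and the upload parameters are my own placeholders, and the upload step is an assumption about how one might script this, not a description of how I created the ddocs.

```javascript
// The map source shared by both design docs.
const MAP_SOURCE =
  "function (doc) {\n emit([doc.data, doc.ts], Date.now());\n}";

// Build a design doc with a single view named after the ddoc itself.
// Only `name` and `language` vary between the two test documents.
function makeDesignDoc(name, language) {
  return {
    _id: `_design/${name}`,
    views: {
      [name]: { map: MAP_SOURCE },
    },
    language,
  };
}

// Hypothetical upload helper: PUTs the ddoc into the database at `dbUrl`
// using basic auth. `dbUrl`, `user` and `password` are placeholders.
async function uploadDesignDoc(dbUrl, ddoc, user, password) {
  const res = await fetch(`${dbUrl}/${ddoc._id}`, {
    method: "PUT",
    headers: {
      "Content-Type": "application/json",
      Authorization: "Basic " + btoa(`${user}:${password}`),
    },
    body: JSON.stringify(ddoc),
  });
  if (!res.ok) {
    throw new Error(`PUT ${ddoc._id} failed with status ${res.status}`);
  }
  return res.json();
}

const smDoc = makeDesignDoc("sm", "javascript");
const dnDoc = makeDesignDoc("dn", "deno");
console.log(JSON.stringify(dnDoc, null, 2));
```

Nothing here touches the network unless uploadDesignDoc is called explicitly.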
You can play with it on the test branch "couchdb-deno" [2].
You need to download and install deno and put it on your PATH.
Start CouchDB with the following (adjust the path to main-deno.js):
make && COUCHDB_QUERY_SERVER_DENO="deno run --allow-write /path/to/couchdb/share/server/main-deno.js" ./dev/run -a a:p -n 1
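For context on what the command above launches: CouchDB talks to a query server over stdin/stdout, one JSON array per line. Below is a minimal, map-only sketch of the command handling. It is not the code in main-deno.js; createQueryServer and handle are names I made up, and there is no stdio loop, sandboxing, reduce support or real error handling in it.

```javascript
// Map-only sketch of a query server command handler.
// Each call to handle() takes one JSON line and returns one JSON line.
function createQueryServer() {
  let mapFuns = [];
  let emitted = [];

  // `emit` is what the map source calls; it collects rows for
  // whichever map function is currently running.
  const emit = (key, value) => emitted.push([key, value]);

  return function handle(line) {
    const [cmd, ...args] = JSON.parse(line);
    switch (cmd) {
      case "reset":
        // Forget all previously added functions.
        mapFuns = [];
        return "true";
      case "add_fun":
        // Compile the function source with `emit` in scope.
        mapFuns.push(new Function("emit", `return (${args[0]});`)(emit));
        return "true";
      case "map_doc":
        // One array of [key, value] rows per registered map function.
        return JSON.stringify(
          mapFuns.map((fun) => {
            emitted = [];
            fun(args[0]);
            return emitted;
          }),
        );
      default:
        return JSON.stringify(["error", "unknown_command", String(cmd)]);
    }
  };
}

// Usage: feed it the lines CouchDB would send for a single document.
const handleLine = createQueryServer();
handleLine('["reset"]');
handleLine(JSON.stringify(["add_fun", "function (doc) { emit(doc._id, 1); }"]));
console.log(handleLine(JSON.stringify(["map_doc", { _id: "a" }])));
```

A real query server wraps this in a loop that reads lines from stdin and writes each response to stdout, which is the stdio path whose overhead is discussed in the quoted messages.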
/Ronny
[1] https://github.com/cloudant-labs/couchdyno
[2] https://github.com/apache/couchdb/tree/couchdb-deno
> On 15.05.2020 at 17:27, Nick Vatamaniuc <[email protected]> wrote:
>
> That's really cool, Jan! Thanks for sharing.
>
> Deno is based on v8 so it is interesting to see how it compares in
> performance. Though, you're probably right that it could be just stdio
> limited here.
>
> Definitely like the simplicity and 500LoC part, too.
>
> Cheers,
> -Nick
>
> On Fri, May 15, 2020 at 6:39 AM Jan Lehnardt <[email protected]> wrote:
>>
>> By way of not really scientifically benchmarking this, but for getting a
>> feel for things, I ran timing tests with three different document size
>> classes:
>>
>> - 100 bytes
>> - 512 bytes
>> - 1024 bytes
>>
>> I’m using our trusty benchbulk.sh[1] script, so the majority of the data is
>> a single long value field with loads of `0`s in them. In no way
>> representative, but quick to produce 1M docs.
>>
>> I’m running this on an 8 core Mac Mini with a very fast SSD, three runs per
>> version. This is on my regular work machine, so other things are going on,
>> but the timings are surprisingly stable to the second.
>>
>> I measured how long it takes to build an index over 1M documents in a q=2
>> database, to leave enough cores for CouchDB and whatever else is going on on
>> the box. At no point are CPU or RAM maxed out, neither is the disk IO
>> capacity.
>>
>> Repeated and interleaved runs should shake out any file system caching
>> variability (which I couldn’t observe anyway).
>>
>> 100 byte docs:
>>
>> couchjs is ~10% faster than deno while using ~43% less CPU (40% vs. 70%).
>> couchjs RAM usage is rather erratic: it jumps from 20MB to 180MB and back
>> periodically. deno RAM grows very slowly, maxing out at 110MB at the end of
>> the run, so I presume whatever long-generational GC isn’t even kicking in
>> yet.
>>
>> 512 byte docs:
>>
>> couchjs is ~5% faster than deno, same CPU and RAM profiles.
>>
>> 1024 byte docs:
>>
>> couchjs is 20% slower(!) than deno, same CPU and RAM profiles.
>>
>> At 512 and 1024 byte docs, deno makes beam.smp work a little harder, about
>> 5% CPU usage.
>>
>> All of the runs take between 1 and 2 minutes, so longer-running impacts
>> aren’t showing here.
>>
>> As you can see, this is very unscientific, but gives us an interesting
>> direction.
>>
>> Depending on the workload, the deno query server *might* lead to faster
>> indexing, on larger docs, while making potentially better use of available
>> CPU resources (or less euphemistically: at the expense of using more CPU
>> time), and with a lot more stable RAM profile.
>>
>> Given that I was able to put this together relatively quickly, and deno is
>> very new, I find this rather promising.
>>
>> In addition, since it is rather easy distributing this query server (install
>> deno, download the .js file, set an env var, done), this might be a nice
>> community alternative to couchjs for folks who see benefits.
>>
>> I’d also like to see us taking this to the deno folks to see if they have
>> anything up their sleeve in terms of speeding up stdio, or if there are
>> tricks we can pull on the JS side.
>>
>> * * *
>>
>> One more interesting point I didn’t mention last night: this is entirely in
>> JS based on an existing runtime. As opposed to couchjs, where we currently
>> maintain a C and a C++ integration layer that nobody likes touching.
>>
>> A pure-JS implementation, and my cleaned up (albeit less feature-full)
>> ~500LoC of relatively modern JS might lead to renewed innovation in the
>> space. Who doesn’t like a well-defined performance game :)
>>
>> Plus, it’d be interesting to see if the TypeScript compiler could add more
>> optimisations once the query server implementation is translated and
>> type-annotated to be proper TypeScript.
>>
>> Best
>> Jan
>> —
>> [1]: https://github.com/apache/couchdb/blob/master/test/bench/benchbulk.sh
>>
>>> On 14. May 2020, at 22:01, Jan Lehnardt <[email protected]> wrote:
>>>
>>> Hey all,
>>>
>>> I got nerd sniped by Joan this morning:
>>>
>>> <+Wohali> hmmmmm. https://github.com/denoland/deno
>>> <+Wohali> i know i know another runtime but it's focused on security
>>>
>>> I wondered what it would take to make a couchjs variant based on deno.
>>> Turns out: about a day if you cut some corners ;)
>>>
>>> One of the interesting aspects, as Joan notes, is its
>>> more-secure-by-default, so I have some hopes that this might work out
>>> better than our ill-fated nodejs query server experiment from a few years
>>> back.
>>>
>>> I started by hacking up a readily generated main.js, then ran `make` again,
>>> and did it all again. Overall, it is ~30 LoC of changes. Since there is no
>>> synchronous `readline()` available and JS code can be either sync or async,
>>> we can’t have a single source file that runs in both couchjs and deno.
>>>
>>> So I went ahead and ripped all the basics out of our main.js and modernised
>>> things a little bit along the way. The result is a main-deno.js that can
>>> run map/reduce/rereduce/filter/view_filter/validate_doc_update functions
>>> (as validated by the query server spec).
>>>
>>> https://gist.github.com/janl/c3139bc72efe663e35005d8864c4201f
>>>
>>> I intentionally left out the couchappy functions, as at least lists with
>>> the `getRow()` function won’t be implementable without an API break. I also
>>> left out legacy compat with esprima/escodegen to keep things more
>>> manageable. Oh and no lib/modules, given today’s JS packaging tooling, it’s
>>> an easy choice to leave out.
>>>
>>> I haven’t done any sort of benchmarking, but I’d love for someone here to
>>> give this a try. Here’s how to hack up `./dev/run` to add support for
>>> `deno` design docs:
>>>
>>> https://gist.github.com/janl/01559f8617ef44afd5ceec39ec8389e8
>>>
>>> If you want to run this on a regular CouchDB setup, set up this env var
>>> before launching CouchDB:
>>>
>>> COUCHDB_QUERY_SERVER_DENO="deno run --allow-write /path/to/main-deno.js"
>>>
>>> `--allow-write` is only required for the debug log (/tmp/deno-qs.log),
>>> but won’t be required during operation, adding to the sandboxed nature of
>>> it all.
>>>
>>> And some proof of operation:
>>>
>>> https://gist.github.com/janl/8636d469420a1fd2de481ae8f5780854
>>>
>>> It’d be nice to see how stable this is in practice and if there are any
>>> meaningful performance / resource-usage differences. Any takers? I’ll
>>> answer any and all setup questions.
>>>
>>> Now I’m passing the nerd-snipe torch to Paul:
>>>
>>> <+jan____> uh, and it is embeddable
>>> https://deno.land/manual/embedding_deno
>>>
>>> Best
>>> Jan
>>> —
>>