Indexer speedup (for non-native view servers)
---------------------------------------------

                 Key: COUCHDB-1334
                 URL: https://issues.apache.org/jira/browse/COUCHDB-1334
             Project: CouchDB
          Issue Type: Improvement
          Components: Database Core, JavaScript View Server, View Server Support
            Reporter: Filipe Manana
            Assignee: Filipe Manana
             Fix For: 1.2
         Attachments: 0001-More-efficient-view-updater-writes.patch, 
0002-More-efficient-communication-with-the-view-server.patch, 
master-0002-More-efficient-communication-with-the-view-server.patch

The following 2 patches significantly improve view index generation/update time 
and reduce CPU consumption.

The first patch makes the view updater's batching more efficient by ensuring 
that each btree bulk insertion adds/removes a minimum of N (=100) key/value 
pairs. It also makes the index file grow more slowly, because fewer obsolete 
btree nodes (old data) accumulate in it. The new indexer in master/trunk (by 
Paul Davis) already behaves this way.
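
To illustrate the batching idea described above, here is a minimal sketch in 
Python (not the Erlang code from the patch). The names BatchingWriter and 
bulk_write are hypothetical; only the minimum batch size N = 100 comes from the 
description. Key/value pairs are buffered and handed to the bulk writer only 
once at least N of them are pending, plus one final flush for the remainder.

```python
MIN_BATCH = 100  # N from the description above


class BatchingWriter:
    """Hypothetical stand-in for the view updater's write path."""

    def __init__(self, bulk_write, min_batch=MIN_BATCH):
        self.bulk_write = bulk_write  # callable taking a list of (key, value) pairs
        self.min_batch = min_batch
        self.pending = []

    def add(self, kvs):
        """Buffer pairs; write only once at least min_batch are pending."""
        self.pending.extend(kvs)
        if len(self.pending) >= self.min_batch:
            self.flush()

    def flush(self):
        """Write whatever is pending (called at the end of an update)."""
        if self.pending:
            self.bulk_write(self.pending)
            self.pending = []


if __name__ == "__main__":
    batches = []
    writer = BatchingWriter(batches.append)
    for i in range(250):
        writer.add([(i, None)])
    writer.flush()
    print([len(b) for b in batches])
```

Fewer, larger bulk writes mean fewer rewritten btree nodes per inserted pair, 
which matches the smaller view file reported in the benchmarks.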

The second patch maximizes throughput with an external view server (such as 
couchjs). It makes the pipe (Erlang port) communication between the Erlang VM 
(couch_os_process) and the view server more efficient, since both sides spend 
less time blocked reading from the pipe.
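
This message does not say how the patch reduces blocking, so the following is 
only a general illustration in Python, not the patch's technique: one common way 
to keep both ends of a pipe busy is to write a batch of requests, flush once, 
and then read all the replies, instead of doing a write/read round trip per 
line. The child process here is a toy that upper-cases each line; it stands in 
for a view server and is entirely hypothetical, as are the names map_batch and 
run.

```python
import subprocess
import sys

# Toy child (an assumption): replies to every input line with its upper-case form.
CHILD = """
import sys
for line in sys.stdin:
    sys.stdout.write(line.upper())
    sys.stdout.flush()
"""


def map_batch(proc, lines):
    """Send a whole batch with a single flush, then collect one reply per line."""
    proc.stdin.write("".join(line + "\n" for line in lines))
    proc.stdin.flush()
    return [proc.stdout.readline().rstrip("\n") for _ in lines]


def run(lines, batch_size=50):
    """Push all lines through the child in small batches and return the replies."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    replies = []
    try:
        for start in range(0, len(lines), batch_size):
            replies.extend(map_batch(proc, lines[start:start + batch_size]))
    finally:
        proc.stdin.close()
        proc.wait()
    return replies


if __name__ == "__main__":
    print(run(["doc1", "doc2", "doc3"], batch_size=2))
```

Batches are kept small here so that neither side can fill the pipe buffer while 
the other is not reading.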

Here follow some benchmarks.


test database at  http://fdmanana.iriscouch.com/test_db  (1 million documents)


branch 1.2.x

$ echo 3 > /proc/sys/vm/drop_caches
$ time curl http://localhost:5984/test_db/_design/test/_view/test1
{"rows":[
{"key":null,"value":1000000}
]}

real    2m45.097s
user    0m0.006s
sys     0m0.007s

view file size: 333 MB

CPU usage:

$ sar 1 60
22:27:20   %usr  %nice   %sys  %idle
22:27:21     38      0     12     50
(....)
22:28:21     39      0     13     49
Average:     39      0     13     47


branch 1.2.x + batch patch (first patch)

$ echo 3 > /proc/sys/vm/drop_caches
$ time curl http://localhost:5984/test_db/_design/test/_view/test1
{"rows":[
{"key":null,"value":1000000}
]}

real    2m12.736s
user    0m0.006s
sys     0m0.005s

view file size: 72 MB


branch 1.2.x + batch patch + os_process patch

$ echo 3 > /proc/sys/vm/drop_caches
$ time curl http://localhost:5984/test_db/_design/test/_view/test1
{"rows":[
{"key":null,"value":1000000}
]}

real    1m9.330s
user    0m0.006s
sys     0m0.004s

view file size: 72 MB

CPU usage:

$ sar 1 60
22:22:55   %usr  %nice   %sys  %idle
22:23:53     22      0      6     72
(....)
22:23:55     22      0      6     72
Average:     22      0      7     70



master/trunk

$ echo 3 > /proc/sys/vm/drop_caches
$ time curl http://localhost:5984/test_db/_design/test/_view/test1
{"rows":[
{"key":null,"value":1000000}
]}

real    1m57.296s
user    0m0.006s
sys     0m0.005s


master/trunk + os_process patch

$ echo 3 > /proc/sys/vm/drop_caches
$ time curl http://localhost:5984/test_db/_design/test/_view/test1
{"rows":[
{"key":null,"value":1000000}
]}

real    0m53.768s
user    0m0.006s
sys     0m0.006s
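
For reference, the speedups implied by the "real" times above can be worked out 
with a short Python sketch. The timings are copied from the runs above; the 
helper names to_seconds and speedup and the dictionary labels are just for 
illustration.

```python
import re

# "real" wall-clock times copied from the benchmark runs above.
TIMINGS = {
    "1.2.x": "2m45.097s",
    "1.2.x + batch": "2m12.736s",
    "1.2.x + batch + os_process": "1m9.330s",
    "master": "1m57.296s",
    "master + os_process": "0m53.768s",
}


def to_seconds(real):
    """Convert a `time` value such as '2m45.097s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", real)
    if match is None:
        raise ValueError("unexpected time format: %r" % real)
    return int(match.group(1)) * 60 + float(match.group(2))


def speedup(baseline, patched):
    """How many times faster the patched run is than the baseline run."""
    return to_seconds(TIMINGS[baseline]) / to_seconds(TIMINGS[patched])


if __name__ == "__main__":
    for base, patched in [
        ("1.2.x", "1.2.x + batch"),
        ("1.2.x", "1.2.x + batch + os_process"),
        ("master", "master + os_process"),
    ]:
        print("%s -> %s: %.2fx" % (base, patched, speedup(base, patched)))
```

With these numbers, both patches together make the 1.2.x run roughly 2.4 times 
faster, and the os_process patch alone makes the master/trunk run roughly 2.2 
times faster.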




