[ 
https://issues.apache.org/jira/browse/COUCHDB-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13147375#comment-13147375
 ] 

Paul Joseph Davis commented on COUCHDB-1334:
--------------------------------------------

@Filipe, Awesome, this is considerably better than the old version.

As to this part:

+        % Can throw badarg error, when OsProc Pid is dead.
+        (catch port_connect(OsProc#os_proc.port, Pid))

That looks like the key to what I hadn't managed to track down when I tried 
something similar. I'm pretty sure we should be fine with couchspawnkillable, 
but do we need to close the port here and/or ignore some port exit status 
messages in this process? The other thing that is a bit confusing is why 
OsProc's Pid would be dead while it's sitting idle. I haven't thought through 
all the implications here, I guess.
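
To make the failure mode concrete, here is a minimal standalone sketch (not 
the code from the patch) of what the catch guards against: port_connect/2 
raises badarg when the target Pid is no longer alive. The module name, the 
use of "cat" as the external program (so it assumes a Unix-like system), and 
the explicit port_close/1 afterwards are all my assumptions for illustration; 
whether the real code should close the port is exactly the open question above.

```erlang
-module(port_connect_sketch).
-export([run/0]).

%% Hypothetical demo: try to hand a port to a process that has already died.
run() ->
    %% Assumes a Unix-like system where "cat" is on the PATH.
    Port = open_port({spawn, "cat"}, [stream, binary]),
    %% Spawn a process that exits right away, and wait until it is gone.
    {Pid, Ref} = spawn_monitor(fun() -> ok end),
    receive {'DOWN', Ref, process, Pid, _Reason} -> ok end,
    %% port_connect/2 raises badarg for a dead Pid; catch turns the
    %% error into an {'EXIT', {badarg, Stacktrace}} tuple.
    Result = case catch port_connect(Port, Pid) of
        true -> connected;
        {'EXIT', {badarg, _}} -> owner_dead
    end,
    %% We still own the port after the failed connect, so close it
    %% explicitly here rather than leaving the external process around.
    true = port_close(Port),
    Result.
```

Calling port_connect_sketch:run() from a shell should take the owner_dead 
branch, since the target process is gone before the connect is attempted.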

Does this version maintain the same speedups as before? When I tried this 
approach I was actually doing it slightly differently, by having the doc reader 
process send docs to the port, which would then forward them directly to the 
writer process. There was some stuff that got a bit funky when I tried this, 
though. IIRC it was something like: I had to pass deleted docs' update_seqs 
directly to the writer process or it'd break if the last update was a deletion 
(because the writer would never see the update seq). Anyway, just a thought.

Good work on this.
                
> Indexer speedup (for non-native view servers)
> ---------------------------------------------
>
>                 Key: COUCHDB-1334
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1334
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core, JavaScript View Server, View Server 
> Support
>            Reporter: Filipe Manana
>            Assignee: Filipe Manana
>             Fix For: 1.2
>
>         Attachments: 0001-More-efficient-view-updater-writes.patch, 
> 0002-More-efficient-communication-with-the-view-server.patch, 
> master-0002-More-efficient-communication-with-the-view-server.patch, 
> master-2-0002-More-efficient-communication-with-the-view-server.patch
>
>
> The following 2 patches significantly improve view index generation/update 
> time and reduce CPU consumption.
> The first patch makes the view updater's batching more efficient, by ensuring 
> each btree bulk insertion adds/removes a minimum of N (=100) key/value 
> pairs. This also makes the index file grow more slowly, since less old data 
> (basically old btree nodes) accumulates. The new indexer in master/trunk 
> (by Paul Davis) already behaves this way.
> The second patch maximizes the throughput with an external view server (such 
> as couchjs). Basically it makes the pipe (Erlang port) communication between 
> the Erlang VM (couch_os_process basically) and the view server more efficient, 
> since the 2 sides spend less time blocked on reading from the pipe.
> Here follow some benchmarks.
> test database at  http://fdmanana.iriscouch.com/test_db  (1 million documents)
> branch 1.2.x
> $ echo 3 > /proc/sys/vm/drop_caches
> $ time curl http://localhost:5984/test_db/_design/test/_view/test1
> {"rows":[
> {"key":null,"value":1000000}
> ]}
> real  2m45.097s
> user  0m0.006s
> sys   0m0.007s
> view file size: 333Mb
> CPU usage:
> $ sar 1 60
> 22:27:20  %usr  %nice   %sys   %idle
> 22:27:21   38      0     12     50
> (....)
> 22:28:21   39      0     13     49
> Average:     39      0     13     47   
> branch 1.2.x + batch patch (first patch)
> $ echo 3 > /proc/sys/vm/drop_caches
> $ time curl http://localhost:5984/test_db/_design/test/_view/test1
> {"rows":[
> {"key":null,"value":1000000}
> ]}
> real  2m12.736s
> user  0m0.006s
> sys   0m0.005s
> view file size: 72Mb
> branch 1.2.x + batch patch + os_process patch
> $ echo 3 > /proc/sys/vm/drop_caches
> $ time curl http://localhost:5984/test_db/_design/test/_view/test1
> {"rows":[
> {"key":null,"value":1000000}
> ]}
> real  1m9.330s
> user  0m0.006s
> sys   0m0.004s
> view file size:  72Mb
> CPU usage:
> $ sar 1 60
> 22:22:55  %usr  %nice   %sys   %idle
> 22:23:53   22      0      6     72
> (....)
> 22:23:55   22      0      6     72
> Average:     22      0      7     70   
> master/trunk
> $ echo 3 > /proc/sys/vm/drop_caches
> $ time curl http://localhost:5984/test_db/_design/test/_view/test1
> {"rows":[
> {"key":null,"value":1000000}
> ]}
> real  1m57.296s
> user  0m0.006s
> sys   0m0.005s
> master/trunk + os_process patch
> $ echo 3 > /proc/sys/vm/drop_caches
> $ time curl http://localhost:5984/test_db/_design/test/_view/test1
> {"rows":[
> {"key":null,"value":1000000}
> ]}
> real  0m53.768s
> user  0m0.006s
> sys   0m0.006s


        
