[jira] [Commented] (COUCHDB-1461) replication timeout and loop
[ https://issues.apache.org/jira/browse/COUCHDB-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283325#comment-13283325 ] Filipe Manana commented on COUCHDB-1461:

Thanks for testing Benjamin. I will merge a small variant of that patch soon.

replication timeout and loop

Key: COUCHDB-1461
URL: https://issues.apache.org/jira/browse/COUCHDB-1461
Project: CouchDB
Issue Type: Bug
Affects Versions: 1.2, 1.3
Reporter: Benoit Chesneau
Attachments: 12x-0001-Avoid-possible-timeout-initializing-replications.patch, master-0001-Avoid-possible-timeout-initializing-replications.patch, test.py

When you try to run replications in both directions at the same time, they time out and then restart after 5 seconds. Sometimes they won't be able to recover well. Adding a sleep between the 2 replications possibly works around it, but it shouldn't be needed. Attached is a script using couchdbkit to reproduce the problem; SERVER_URI needs to be changed to point to your CouchDB node.

Log:
09:09:24.016 [info] 127.0.0.1 - - HEAD /testdb1/ 404
09:09:24.028 [info] 127.0.0.1 - - PUT /testdb1/ 201
09:09:24.033 [info] 127.0.0.1 - - HEAD /testdb2/ 404
09:09:24.046 [info] 127.0.0.1 - - PUT /testdb2/ 201
09:09:24.071 [info] 127.0.0.1 - - GET /_replicator/_all_docs?include_docs=true 200
09:09:28.110 [info] 127.0.0.1 - - PUT /_replicator/rep1 201
09:09:28.119 [info] 127.0.0.1 - - PUT /_replicator/rep2 201
09:09:28.121 [info] Attempting to start replication `23280770e617f3a82f398b8eca09aaef` (document `rep1`).
09:09:28.123 [info] Attempting to start replication `e42aaea4a0ceb931930834ecf7b79600` (document `rep2`).
09:09:28.169 [info] 127.0.0.1 - - HEAD /testdb2/ 200
09:09:28.172 [info] 127.0.0.1 - - GET /testdb2/ 200
09:09:28.176 [info] 127.0.0.1 - - GET /testdb2/_local/e42aaea4a0ceb931930834ecf7b79600 404
09:09:28.179 [info] 127.0.0.1 - - GET /testdb2/_local/f129a5531f82eb089a3e1ca9e80c9ad2 404
09:09:28.194 [info] Replication `e42aaea4a0ceb931930834ecf7b79600` is using:
	4 worker processes
	a worker batch size of 500
	20 HTTP connections
	a connection timeout of 3 milliseconds
	10 retries per request
	socket options are: [{keepalive,true},{nodelay,false}]
09:09:28.196 [info] 127.0.0.1 - - GET /testdb2/_changes?feed=normal&style=all_docs&since=0&heartbeat=1 200
09:09:28.202 [info] Document `rep2` triggered replication `e42aaea4a0ceb931930834ecf7b79600`
09:09:28.203 [info] starting new replication `e42aaea4a0ceb931930834ecf7b79600` at <0.262.0> (`http://localhost:15984/testdb2/` -> `testdb1`)
09:09:28.208 [info] 127.0.0.1 - - HEAD /testdb2/ 200
09:09:28.212 [info] 127.0.0.1 - - GET /testdb2/ 200
09:09:28.218 [info] 127.0.0.1 - - GET /testdb2/_local/23280770e617f3a82f398b8eca09aaef 404
09:09:28.219 [info] Replication `e42aaea4a0ceb931930834ecf7b79600` finished (triggered by document `rep2`)
09:09:28.223 [info] 127.0.0.1 - - GET /testdb2/_local/4b04e1e066f4ad1f988669036080ed9c 404
09:09:28.225 [info] Replication `23280770e617f3a82f398b8eca09aaef` is using:
	4 worker processes
	a worker batch size of 500
	20 HTTP connections
	a connection timeout of 3 milliseconds
	10 retries per request
	socket options are: [{keepalive,true},{nodelay,false}]
09:09:58.203 [error] gen_server <0.287.0> terminated with reason: killed
09:09:58.207 [error] CRASH REPORT Process <0.287.0> with 0 neighbours crashed with reason: {killed,[{gen_server,terminate,6,[{file,"gen_server.erl"},{line,737}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}
09:09:58.215 [error] Error in replication `23280770e617f3a82f398b8eca09aaef` (triggered by document `rep1`): timeout
Restarting replication in 5 seconds.
09:10:03.223 [info] 127.0.0.1 - - HEAD /testdb2/ 200
09:10:03.227 [info] 127.0.0.1 - - GET /testdb2/ 200
09:10:03.231 [info] 127.0.0.1 - - GET /testdb2/_local/23280770e617f3a82f398b8eca09aaef 404
09:10:03.235 [info] 127.0.0.1 - - GET /testdb2/_local/4b04e1e066f4ad1f988669036080ed9c 404
09:10:03.237 [info] Replication `23280770e617f3a82f398b8eca09aaef` is using:
	4 worker processes
	a worker batch size of 500
	20 HTTP connections
	a connection timeout of 3 milliseconds
	10 retries per request
	socket options are: [{keepalive,true},{nodelay,false}]
09:10:03.244 [info] Document `rep1` triggered replication `23280770e617f3a82f398b8eca09aaef`
09:10:03.245 [info] starting new replication `23280770e617f3a82f398b8eca09aaef` at <0.335.0> (`testdb1` -> `http://localhost:15984/testdb2/`)
09:10:03.253 [info] Replication `23280770e617f3a82f398b8eca09aaef` finished (triggered by document `rep1`)

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators:
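The attached test.py itself is not reproduced in this archive. A minimal sketch of the reproduction it describes, with plain dicts standing in for the couchdbkit calls (the helper name and doc layout below are assumptions, not the actual script), would create the two `_replicator` documents pointing in opposite directions:

```python
# Sketch of the reproduction described above (hypothetical helper; the
# real test.py uses couchdbkit). SERVER_URI must point at your node.

SERVER_URI = "http://localhost:15984"

def build_replicator_docs(server_uri):
    """Return the two _replicator docs (rep1, rep2) as plain dicts,
    replicating testdb1 and testdb2 into each other simultaneously."""
    rep1 = {
        "_id": "rep1",
        "source": "testdb1",
        "target": "%s/testdb2/" % server_uri,
    }
    rep2 = {
        "_id": "rep2",
        "source": "%s/testdb2/" % server_uri,
        "target": "testdb1",
    }
    return rep1, rep2

# Each doc would then be PUT to /_replicator with any HTTP client.
# Starting both replications at once is what triggers the timeout and
# the 5-second restart loop shown in the log.
```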
[jira] [Closed] (COUCHDB-1461) replication timeout and loop
[ https://issues.apache.org/jira/browse/COUCHDB-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana closed COUCHDB-1461.

Resolution: Fixed
Fix Version/s: 1.2.1

replication timeout and loop

Key: COUCHDB-1461
URL: https://issues.apache.org/jira/browse/COUCHDB-1461
Project: CouchDB
Issue Type: Bug
Affects Versions: 1.2, 1.3
Reporter: Benoit Chesneau
Fix For: 1.2.1
Attachments: 12x-0001-Avoid-possible-timeout-initializing-replications.patch, master-0001-Avoid-possible-timeout-initializing-replications.patch, test.py
[jira] [Commented] (COUCHDB-1461) replication timeout and loop
[ https://issues.apache.org/jira/browse/COUCHDB-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283383#comment-13283383 ] Filipe Manana commented on COUCHDB-1461:

Oops, I only saw your comment now. It is pushed anyway.

replication timeout and loop

Key: COUCHDB-1461
URL: https://issues.apache.org/jira/browse/COUCHDB-1461
Project: CouchDB
Issue Type: Bug
Affects Versions: 1.2, 1.3
Reporter: Benoit Chesneau
Fix For: 1.2.1
Attachments: 12x-0001-Avoid-possible-timeout-initializing-replications.patch, master-0001-Avoid-possible-timeout-initializing-replications.patch, test.py
[jira] [Commented] (COUCHDB-1289) heartbeats skipped when continuous changes feed filter function produces no results
[ https://issues.apache.org/jira/browse/COUCHDB-1289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108477#comment-13108477 ] Filipe Manana commented on COUCHDB-1289:

Good finding Bob. What about using a simple function that is called before processing each changes row and uses the process dictionary, to reduce code size/complexity, along these lines:

maybe_timeout(TimeoutFun) ->
    Now = now(),
    Before = case get(changes_timeout) of
    undefined ->
        Now;
    OldNow ->
        OldNow
    end,
    case timer:now_diff(Now, Before) >= Timeout of
    true ->
        Acc2 = TimeoutFun(get(changes_user_acc)),
        put(changes_timeout, Now),
        put(changes_user_acc, Acc2);
    false ->
        ok
    end.

Probably it misses some cases; I haven't thought about this issue before.

heartbeats skipped when continuous changes feed filter function produces no results
---
Key: COUCHDB-1289
URL: https://issues.apache.org/jira/browse/COUCHDB-1289
Project: CouchDB
Issue Type: Bug
Components: Database Core
Reporter: Bob Dionne
Assignee: Bob Dionne
Priority: Minor

If the changes feed has a filter function that produces no results, db_updated messages will still be sent and the heartbeat timeout will never be reached.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
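The Erlang helper proposed above is only a sketch. The same last-activity idea, checking before each changes row whether a full heartbeat interval has passed with nothing emitted, can be illustrated in Python (class and method names are mine, not CouchDB's; instance state stands in for the process dictionary):

```python
import time

class HeartbeatTracker:
    """Fire timeout_fun when no row has been emitted for `timeout`
    seconds, mirroring the maybe_timeout/1 idea above."""

    def __init__(self, timeout, timeout_fun, user_acc, clock=time.monotonic):
        self.timeout = timeout
        self.timeout_fun = timeout_fun  # e.g. sends a heartbeat newline
        self.user_acc = user_acc
        self.clock = clock
        self.last = clock()

    def on_row(self, emitted):
        """Call for each changes row. `emitted` is True when the filter
        let the row through to the client."""
        now = self.clock()
        if emitted:
            self.last = now
        elif now - self.last >= self.timeout:
            # The filter produced nothing for a full heartbeat
            # interval: give the caller a chance to send a heartbeat.
            self.user_acc = self.timeout_fun(self.user_acc)
            self.last = now
```

This keeps the heartbeat alive even when every db_updated row is filtered out, which is exactly the case the issue describes.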
[jira] [Resolved] (COUCHDB-1288) More efficient builtin filters _doc_ids and _design
[ https://issues.apache.org/jira/browse/COUCHDB-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana resolved COUCHDB-1288.

Resolution: Fixed
Fix Version/s: 1.2

Applied to trunk and branch 1.2.x

More efficient builtin filters _doc_ids and _design
---
Key: COUCHDB-1288
URL: https://issues.apache.org/jira/browse/COUCHDB-1288
Project: CouchDB
Issue Type: Improvement
Reporter: Filipe Manana
Fix For: 1.2
Attachments: couchdb_1288_2.patch, couchdb_1288_3.patch

We have the _doc_ids and _design _changes filters as of CouchDB 1.1.0. While they meet the expectations of applications/users, they're far from efficient for large databases. Basically the implementation folds the entire seq btree and then filters the values by document ID, causing too much IO and busting caches. This makes replication by doc IDs less efficient than it could be. The proposed patch avoids this by doing direct lookups in the ID btree for _doc_ids, and a ranged fold for _design. If there are no objections, I would apply it to branch 1.2.x besides trunk.
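The difference between the two code paths described above can be sketched with plain dicts standing in for the seq and ID btrees (hypothetical stand-ins; the real implementation works on couch_btree):

```python
# Old vs. new _doc_ids code path, with dicts standing in for the
# seq btree (update_seq -> doc_id) and id btree (doc_id -> update_seq).

def changes_by_seq_fold(seq_tree, doc_ids):
    """Old path: fold the ENTIRE seq btree and filter by doc ID --
    work proportional to database size, even for a handful of IDs."""
    wanted = set(doc_ids)
    return [doc_id for _, doc_id in sorted(seq_tree.items())
            if doc_id in wanted]

def changes_by_id_lookup(id_tree, doc_ids):
    """New path: one direct lookup in the id btree per requested ID,
    then sort the hits by update sequence."""
    hits = [(id_tree[d], d) for d in doc_ids if d in id_tree]
    return [doc_id for _, doc_id in sorted(hits)]
```

Both return the matching IDs in seq order; only the amount of IO differs — the second touches len(doc_ids) entries instead of folding the whole tree.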
[jira] [Updated] (COUCHDB-1288) More efficient builtin filters _doc_ids and _design
[ https://issues.apache.org/jira/browse/COUCHDB-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1288:

Attachment: (was: couchdb_1288_2.patch)
[jira] [Updated] (COUCHDB-1288) More efficient builtin filters _doc_ids and _design
[ https://issues.apache.org/jira/browse/COUCHDB-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1288:

Attachment: couchdb_1288_2.patch
[jira] [Updated] (COUCHDB-1288) More efficient builtin filters _doc_ids and _design
[ https://issues.apache.org/jira/browse/COUCHDB-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1288:

Attachment: (was: couchdb_1288.patch)
[jira] [Commented] (COUCHDB-1288) More efficient builtin filters _doc_ids and _design
[ https://issues.apache.org/jira/browse/COUCHDB-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107653#comment-13107653 ] Filipe Manana commented on COUCHDB-1288:

This still needs some small work for the continuous case and a test.
[jira] [Updated] (COUCHDB-1288) More efficient builtin filters _doc_ids and _design
[ https://issues.apache.org/jira/browse/COUCHDB-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1288:

Attachment: couchdb_1288_3.patch

Added a patch with a test case, including the case for continuous changes.
[jira] [Commented] (COUCHDB-1288) More efficient builtin filters _doc_ids and _design
[ https://issues.apache.org/jira/browse/COUCHDB-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108036#comment-13108036 ] Filipe Manana commented on COUCHDB-1288:

Thanks Bob. If it's a separate issue, unrelated to any changes from this patch, it should go into a separate patch/ticket :)
[jira] [Commented] (COUCHDB-1286) Cannot replicate with nginx as reverse proxy
[ https://issues.apache.org/jira/browse/COUCHDB-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107542#comment-13107542 ] Filipe Manana commented on COUCHDB-1286:

And does it work, with the same nginx configuration, with CouchDB 1.1.0 or older? If not, it's very likely a bad or missing nginx configuration.

Cannot replicate with nginx as reverse proxy

Key: COUCHDB-1286
URL: https://issues.apache.org/jira/browse/COUCHDB-1286
Project: CouchDB
Issue Type: Bug
Components: Replication
Affects Versions: 1.2
Reporter: Dale Harvey
Attachments: repl.log

Replicating between 2 CouchDB instances on the 1.2.x branch, where one sits behind nginx, a push replication to the one behind nginx fails with the attached log.
[jira] [Assigned] (COUCHDB-1286) Cannot replicate with nginx as reverse proxy
[ https://issues.apache.org/jira/browse/COUCHDB-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana reassigned COUCHDB-1286:

Assignee: Filipe Manana
[jira] [Commented] (COUCHDB-1286) Cannot replicate with nginx as reverse proxy
[ https://issues.apache.org/jira/browse/COUCHDB-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107544#comment-13107544 ] Filipe Manana commented on COUCHDB-1286:

Seems like nginx doesn't like PUT/POST requests with a chunked transfer encoding: http://www.atnan.com/2008/8/8/transfer-encoding-chunked-chunky-http

I'll change it to not use chunked encoding with _bulk_docs.
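The fix described here is to announce the `_bulk_docs` body length up front with `Content-Length` instead of streaming it with `Transfer-Encoding: chunked`, which some reverse proxies reject. A sketch of the two request shapes (illustrative only; the real replicator builds its requests via ibrowse):

```python
def bulk_docs_request(body: bytes, chunked: bool) -> bytes:
    """Build a minimal POST /_bulk_docs request. With chunked=False the
    body length is announced up front via Content-Length, which proxies
    accept; chunked bodies are what triggered the failure above."""
    head = [b"POST /testdb2/_bulk_docs HTTP/1.1",
            b"Host: localhost",
            b"Content-Type: application/json"]
    if chunked:
        head.append(b"Transfer-Encoding: chunked")
        # One chunk (hex size, data) followed by the terminating 0 chunk.
        payload = b"%x\r\n%s\r\n0\r\n\r\n" % (len(body), body)
    else:
        head.append(b"Content-Length: %d" % len(body))
        payload = body
    return b"\r\n".join(head) + b"\r\n\r\n" + payload
```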
[jira] [Resolved] (COUCHDB-1286) Cannot replicate with nginx as reverse proxy
[ https://issues.apache.org/jira/browse/COUCHDB-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana resolved COUCHDB-1286.

Resolution: Fixed
Fix Version/s: 1.2

Thanks Dale. Applied to trunk and branch 1.2.x
[jira] [Updated] (COUCHDB-1288) More efficient builtin filters _doc_ids and _design
[ https://issues.apache.org/jira/browse/COUCHDB-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1288:

Attachment: couchdb_1288.patch
[jira] [Updated] (COUCHDB-1288) More efficient builtin filters _doc_ids and _design
[ https://issues.apache.org/jira/browse/COUCHDB-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1288:

Attachment: couchdb_1288.patch
[jira] [Updated] (COUCHDB-1288) More efficient builtin filters _doc_ids and _design
[ https://issues.apache.org/jira/browse/COUCHDB-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1288:

Attachment: (was: couchdb_1288.patch)
[jira] [Updated] (COUCHDB-1288) More efficient builtin filters _doc_ids and _design
[ https://issues.apache.org/jira/browse/COUCHDB-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1288:

Attachment: couchdb_1288_2.patch

Second version of the patch: for _doc_ids, the optimized code path is only triggered if the number of doc IDs is not greater than 100. This is to avoid loading too many full_doc_info records into memory, which can be big if the rev trees are long and/or have many branches.
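The cap described in the comment above can be sketched as a simple dispatch (the threshold value comes from the comment; the function and path names are hypothetical):

```python
MAX_DIRECT_DOC_IDS = 100  # cap from the patch description above

def select_filter_path(doc_ids):
    """Pick the code path for the _doc_ids filter: direct id-btree
    lookups only for small requests, so we never hold too many
    (potentially large) full_doc_info records in memory at once."""
    if len(doc_ids) <= MAX_DIRECT_DOC_IDS:
        return "id_btree_lookup"   # optimized path
    return "seq_btree_fold"        # fall back to folding the seq tree
```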
[jira] [Commented] (COUCHDB-1283) Impossible to compact view groups when number of active databases >= max_dbs_open
[ https://issues.apache.org/jira/browse/COUCHDB-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13106692#comment-13106692 ] Filipe Manana commented on COUCHDB-1283: I forgot to paste it yesterday, here's the error I get when a view recompact happens with trunk: $ ./test/etap/run -v test/etap/201-view-group-shutdown.t test/etap/201-view-group-shutdown.t .. # Current time local 2011-09-16 00:10:22 # Using etap version 0.3.4 1..17 Apache CouchDB 0.0.0 (LogLevel=info) is starting. Apache CouchDB has started. Time to relax. [info] [<0.2.0>] Apache CouchDB has started on http://127.0.0.1:58243/ # View group updated ok 1 - Spawned writer 1 ok 2 - Spawned writer 2 ok 3 - Writer 1 opened his database ok 4 - Writer 2 opened his database # View group updated ok 5 - Spawned writer 3 ok 6 - Writer 3 got {error, all_dbs_active} when opening his database ok 7 - Writer 1 still alive ok 8 - Writer 2 still alive ok 9 - Writer 3 still alive [info] [<0.179.0>] Recompacting index couch_test_view_group_shutdown _design/foo at 20001 [error] [emulator] Error in process <0.190.0> with exit value: {undef,[{couch_index_updater,update,[couch_mrview_index,{mrst,<<16 bytes>>,<0.181.0>,<<30 bytes>>,<<11 bytes>>,<<10 bytes>>,[],{[]},[{mrview,0,0,0,[<<4 bytes>>,<<4 bytes>>,<<4 bytes>>,<<4 bytes>>,<<3 bytes>>],[],<<37 bytes>>,{btree,<0.181.0>,{29679751,{2,[]},29237352},#Fun<couch_btree.3.126133433>,#Fun<couch_btree.4.37628535>,#Fun<couch_ejson_compare.less_json_ids.2>,#Fun<couch_mrview_util.8.13864802>,snappy},[]}],{btree,<0.181.0>,{384486,2,384638},#Fun<couch_btree.3.126133433>,#Fun<couch_btree.4.37628535>,#Fun<couch_btree.5.9554535>,#Fun<couch_mrview_util.6.41372338>,snappy},20001,0,undefined,undefined,undefined,undefined,undefined,nil}]}]} =ERROR REPORT==== 16-Sep-2011::00:10:38 === Error in process <0.190.0> with exit value: {undef,[{couch_index_updater,update,[couch_mrview_index,{mrst,<<16 bytes>>,<0.181.0>,<<30 bytes>>,<<11 bytes>>,<<10 bytes>>,[],{[]},[{mrview,0,0,0,[<<4 bytes>>,<<4 bytes>>,<<4 bytes>>,<<4 bytes>>,<<3 bytes>>],[],<<37 bytes>>,{btree,<0.181.0>,{29679751,{2,[]},29237352},#Fun<couch_btree.3.126133433>,#Fun<couch_btree.4.37628535>,#Fun<couch_ejson_compare.less_json_ids.2>,#Fun<couch_mrview_util.8.13864802>,snappy},[]}],{btree,<0.181.0>,{384486,2,384638},#Fun<couch_btree.3.126133433>,#Fun<couch_btree.4.37628535>,#Fun<couch_btree.5.9554535>,#Fun<couch_mrview_util.6.41372338>,snappy},20001,0,undefined,undefined,undefined,undefined,undefined,nil}]}]} couch_index_updater:update/2 doesn't exist, but there's a 3-arity version of it; however, I'm not sure where to get the 3rd argument for it. Impossible to compact view groups when number of active databases >= max_dbs_open Key: COUCHDB-1283 URL: https://issues.apache.org/jira/browse/COUCHDB-1283 Project: CouchDB Issue Type: Bug Reporter: Filipe Manana Assignee: Paul Joseph Davis Fix For: 1.1.1, 1.2 Attachments: couchdb-1283_12x.patch, couchdb-1283_trunk.patch Mike Leddy recently reported this issue in the users mailing list: http://mail-archives.apache.org/mod_mbox/couchdb-user/201109.mbox/%3c1315949945.22123.22.ca...@mike.loop.com.br%3E The attached patch is the simplest solution I can think of - keeping the database open until the view compaction finishes. The patch includes a test case. It will need to be updated after Paul's view index refactoring (COUCHDB-1270).
[jira] [Closed] (COUCHDB-1284) Couchdb crashes when configuring replication filter
[ https://issues.apache.org/jira/browse/COUCHDB-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana closed COUCHDB-1284. -- Resolution: Duplicate Fix Version/s: 1.1.1 Hi, Your issue is the same as COUCHDB-1199 (I read the log file). It is already fixed for the next maintenance release 1.1.1 and 1.2.0. Couchdb crashes when configuring replication filter --- Key: COUCHDB-1284 URL: https://issues.apache.org/jira/browse/COUCHDB-1284 Project: CouchDB Issue Type: Bug Components: Replication Affects Versions: 1.1 Environment: Ubuntu 10.04 Desktop, i386 Reporter: Daniel Gonzalez Labels: replication_crash_with_filter Fix For: 1.1.1 Attachments: couchdb.log, filter, filter, replication_document Original Estimate: 72h Remaining Estimate: 72h As soon as I configure a replication with a certain filter, couchdb crashes. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (COUCHDB-1266) Add stats field to _active_tasks
[ https://issues.apache.org/jira/browse/COUCHDB-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13106979#comment-13106979 ] Filipe Manana commented on COUCHDB-1266: I just rebased the patch on current trunk, after Paul's view refactoring: https://github.com/fdmanana/couchdb/commit/9c5481157fc47090a20fe098eb18a7cdcc823231 I'll rename this ticket's title, as the purpose is no longer to add a stats field but rather to make _active_tasks more generic, less rigid, and allow different tasks to specify different properties. Add stats field to _active_tasks Key: COUCHDB-1266 URL: https://issues.apache.org/jira/browse/COUCHDB-1266 Project: CouchDB Issue Type: Improvement Reporter: Filipe Manana Assignee: Filipe Manana Fix For: 1.2 Attachments: alternate-api.patch This proposal is simply to add a stats field to the _active_tasks results. This field can be an arbitrary JSON value and each task can set it to whatever is appropriate for it. The following patch also defines some basic stats for the existing tasks: 1) database compaction - # changes done, total changes, # of revisions copied, # of attachments copied and progress (an integer percentage, same as what is exposed in the existing text field status); 2) view compaction - # of ids copied, total number of ids, # of kvs copied, total number of kvs and progress 3) view indexing - # changes done, total changes, # inserted kvs, # deleted kvs, progress 4) replication - # missing revisions checked, # missing revisions found, # docs read, # docs written, # doc write failures, source seq number, checkpointed source seq number, progress. A screenshot of Futon with 3 different tasks: http://dl.dropbox.com/u/25067962/active_tasks_stats.png Patch at: https://github.com/fdmanana/couchdb/compare/task_stats.diff
[jira] [Updated] (COUCHDB-1266) Make _active_tasks more flexible
[ https://issues.apache.org/jira/browse/COUCHDB-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1266: --- Description: This proposal is simply to allow the output of _active_tasks to be less rigid. Basically to allow each task to be able to output different JSON fields. Things like the status text simply go away; instead, applications can build it from the more granular fields provided by _active_tasks. Some examples: 1) progress (an integer percentage, for all tasks) 2) database (for compactions and indexer tasks) 3) design_document (for indexer and view compaction tasks) 4) source and target (for replications) 5) docs_read, docs_written, doc_write_failures, missing_revs_found, missing_revs_checked, source_seq, checkpointed_source_seq and continuous for replications was: This proposal is simply to add a stats field to the _active_tasks results. This field can be an arbitrary JSON value and each task can set it to whatever is appropriate for it. The following patch also defines some basic stats for the existing tasks: 1) database compaction - # changes done, total changes, # of revisions copied, # of attachments copied and progress (an integer percentage, same as what is exposed in the existing text field status); 2) view compaction - # of ids copied, total number of ids, # of kvs copied, total number of kvs and progress 3) view indexing - # changes done, total changes, # inserted kvs, # deleted kvs, progress 4) replication - # missing revisions checked, # missing revisions found, # docs read, # docs written, # doc write failures, source seq number, checkpointed source seq number, progress.
A screenshot of Futon with 3 different tasks: http://dl.dropbox.com/u/25067962/active_tasks_stats.png Patch at: https://github.com/fdmanana/couchdb/compare/task_stats.diff Summary: Make _active_tasks more flexible (was: Add stats field to _active_tasks) Make _active_tasks more flexible Key: COUCHDB-1266 URL: https://issues.apache.org/jira/browse/COUCHDB-1266 Project: CouchDB Issue Type: Improvement Reporter: Filipe Manana Assignee: Filipe Manana Fix For: 1.2 Attachments: alternate-api.patch This proposal is simply to allow the output of _active_tasks to be less rigid. Basically to allow each task to be able to output different JSON fields. Things like the status text simply go away; instead, applications can build it from the more granular fields provided by _active_tasks. Some examples: 1) progress (an integer percentage, for all tasks) 2) database (for compactions and indexer tasks) 3) design_document (for indexer and view compaction tasks) 4) source and target (for replications) 5) docs_read, docs_written, doc_write_failures, missing_revs_found, missing_revs_checked, source_seq, checkpointed_source_seq and continuous for replications
[jira] [Resolved] (COUCHDB-1266) Make _active_tasks more flexible
[ https://issues.apache.org/jira/browse/COUCHDB-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana resolved COUCHDB-1266. Resolution: Fixed Make _active_tasks more flexible Key: COUCHDB-1266 URL: https://issues.apache.org/jira/browse/COUCHDB-1266 Project: CouchDB Issue Type: Improvement Reporter: Filipe Manana Assignee: Filipe Manana Fix For: 1.2 Attachments: alternate-api.patch This proposal is simply to allow the output of _active_tasks to be less rigid. Basically to allow each task to be able to output different JSON fields. Things like the status text simply go away; instead, applications can build it from the more granular fields provided by _active_tasks. Some examples: 1) progress (an integer percentage, for all tasks) 2) database (for compactions and indexer tasks) 3) design_document (for indexer and view compaction tasks) 4) source and target (for replications) 5) docs_read, docs_written, doc_write_failures, missing_revs_found, missing_revs_checked, source_seq, checkpointed_source_seq and continuous for replications
[jira] [Commented] (COUCHDB-1266) Make _active_tasks more flexible
[ https://issues.apache.org/jira/browse/COUCHDB-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107008#comment-13107008 ] Filipe Manana commented on COUCHDB-1266: Applied to trunk and branch 1.2.x. Here's an _active_tasks example output: $ curl http://localhost:5984/_active_tasks [ {"pid": "<0.242.0>", "changes_done": 31209, "database": "indexer_test_3", "design_document": "_design/test", "progress": 5, "started_on": 1316228432, "total_changes": 551201, "type": "indexer", "updated_on": 1316228461}, {"pid": "<0.1156.0>", "database": "indexer_test_3", "design_document": "_design/test", "progress": 21, "started_on": 1316229336, "type": "view_compaction", "updated_on": 1316229352}, {"pid": "<0.1303.0>", "checkpointed_source_seq": 17333, "continuous": false, "doc_write_failures": 0, "docs_read": 17833, "docs_written": 17833, "missing_revisions_found": 17833, "progress": 3, "revisions_checked": 17833, "source": "http://fdmanana.couchone.com/indexer_test/", "source_seq": 551202, "started_on": 1316229471, "target": "indexer_test", "type": "replication", "updated_on": 1316230082} ] Make _active_tasks more flexible Key: COUCHDB-1266 URL: https://issues.apache.org/jira/browse/COUCHDB-1266 Project: CouchDB Issue Type: Improvement Reporter: Filipe Manana Assignee: Filipe Manana Fix For: 1.2 Attachments: alternate-api.patch This proposal is simply to allow the output of _active_tasks to be less rigid. Basically to allow each task to be able to output different JSON fields. Things like the status text simply go away; instead, applications can build it from the more granular fields provided by _active_tasks. Some examples: 1) progress (an integer percentage, for all tasks) 2) database (for compactions and indexer tasks) 3) design_document (for indexer and view compaction tasks) 4) source and target (for replications) 5) docs_read, docs_written, doc_write_failures, missing_revs_found, missing_revs_checked, source_seq, checkpointed_source_seq and continuous for replications
[jira] [Resolved] (COUCHDB-1271) Impossible to cancel replications in some scenarios
[ https://issues.apache.org/jira/browse/COUCHDB-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana resolved COUCHDB-1271. Resolution: Fixed Applied to trunk and branch 1.2.x Impossible to cancel replications in some scenarios --- Key: COUCHDB-1271 URL: https://issues.apache.org/jira/browse/COUCHDB-1271 Project: CouchDB Issue Type: Bug Components: Replication Reporter: Filipe Manana Assignee: Filipe Manana Fix For: 1.2 In some scenarios it's impossible to cancel a replication by posting to /_replicate, namely: 1) A filtered replication is started and the filter's code is then updated in the source database; a subsequent cancel request will no longer generate the old replication ID, as it gets a different filter code; 2) Dynamically changing the httpd port also makes it impossible to compute the right replication ID. Finally, it's also nicer for users to not need to remember the exact replication object posted before to /_replicate. The new approach, in addition to the current approach, allows something as simple as: POST /_replicate {"replication_id": "0a81b645497e6270611ec3419767a584+continuous+create_target", "cancel": true} The replication ID can be obtained from a continuous replication request's response (field _local_id), _active_tasks (field replication_id) or from the log. Aliases _local_id and id are allowed instead of replication_id. Patch: https://github.com/fdmanana/couchdb/commit/c8909ea9dbcbc1f52c5c0f87e1a95102a3edfa9f.diff (depends on COUCHDB-1266)
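The two cancelation styles can be illustrated as request bodies. In this sketch the source/target values are made up; only the field names (cancel, replication_id and its aliases) come from the ticket description:

```javascript
// Illustrative bodies for POST /_replicate cancelation (no HTTP done here).
// Current approach: repeat the exact replication object originally posted.
const cancelByObject = {
  source: "http://localhost:5984/testdb2/",  // made-up endpoint
  target: "testdb1",                         // made-up target
  continuous: true,
  cancel: true
};
// New approach: reference the replication by its ID instead.
const cancelById = {
  replication_id: "0a81b645497e6270611ec3419767a584+continuous+create_target",
  cancel: true
};
// Per the ticket, "_local_id" and "id" are accepted aliases of "replication_id".
const body = JSON.stringify(cancelById);
```

The advantage of the second form is that it survives filter-code changes and port changes, since the server no longer has to recompute the ID from the request.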
[jira] [Updated] (COUCHDB-1283) Impossible to compact view groups when number of active databases >= max_dbs_open
[ https://issues.apache.org/jira/browse/COUCHDB-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1283: --- Attachment: couchdb-1283.patch Impossible to compact view groups when number of active databases >= max_dbs_open Key: COUCHDB-1283 URL: https://issues.apache.org/jira/browse/COUCHDB-1283 Project: CouchDB Issue Type: Bug Reporter: Filipe Manana Fix For: 1.1.1, 1.2 Attachments: couchdb-1283.patch Mike Leddy recently reported this issue in the users mailing list: http://mail-archives.apache.org/mod_mbox/couchdb-user/201109.mbox/%3c1315949945.22123.22.ca...@mike.loop.com.br%3E The attached patch is the simplest solution I can think of - keeping the database open until the view compaction finishes. The patch includes a test case. It will need to be updated after Paul's view index refactoring (COUCHDB-1270).
[jira] [Created] (COUCHDB-1283) Impossible to compact view groups when number of active databases >= max_dbs_open
Impossible to compact view groups when number of active databases >= max_dbs_open Key: COUCHDB-1283 URL: https://issues.apache.org/jira/browse/COUCHDB-1283 Project: CouchDB Issue Type: Bug Reporter: Filipe Manana Fix For: 1.1.1, 1.2 Attachments: couchdb-1283.patch Mike Leddy recently reported this issue in the users mailing list: http://mail-archives.apache.org/mod_mbox/couchdb-user/201109.mbox/%3c1315949945.22123.22.ca...@mike.loop.com.br%3E The attached patch is the simplest solution I can think of - keeping the database open until the view compaction finishes. The patch includes a test case. It will need to be updated after Paul's view index refactoring (COUCHDB-1270).
[jira] [Assigned] (COUCHDB-1283) Impossible to compact view groups when number of active databases >= max_dbs_open
[ https://issues.apache.org/jira/browse/COUCHDB-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana reassigned COUCHDB-1283: -- Assignee: Filipe Manana Impossible to compact view groups when number of active databases >= max_dbs_open Key: COUCHDB-1283 URL: https://issues.apache.org/jira/browse/COUCHDB-1283 Project: CouchDB Issue Type: Bug Reporter: Filipe Manana Assignee: Filipe Manana Fix For: 1.1.1, 1.2 Attachments: couchdb-1283.patch Mike Leddy recently reported this issue in the users mailing list: http://mail-archives.apache.org/mod_mbox/couchdb-user/201109.mbox/%3c1315949945.22123.22.ca...@mike.loop.com.br%3E The attached patch is the simplest solution I can think of - keeping the database open until the view compaction finishes. The patch includes a test case. It will need to be updated after Paul's view index refactoring (COUCHDB-1270).
[jira] [Updated] (COUCHDB-1283) Impossible to compact view groups when number of active databases >= max_dbs_open
[ https://issues.apache.org/jira/browse/COUCHDB-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1283: --- Attachment: (was: couchdb-1283.patch) Impossible to compact view groups when number of active databases >= max_dbs_open Key: COUCHDB-1283 URL: https://issues.apache.org/jira/browse/COUCHDB-1283 Project: CouchDB Issue Type: Bug Reporter: Filipe Manana Assignee: Filipe Manana Fix For: 1.1.1, 1.2 Mike Leddy recently reported this issue in the users mailing list: http://mail-archives.apache.org/mod_mbox/couchdb-user/201109.mbox/%3c1315949945.22123.22.ca...@mike.loop.com.br%3E The attached patch is the simplest solution I can think of - keeping the database open until the view compaction finishes. The patch includes a test case. It will need to be updated after Paul's view index refactoring (COUCHDB-1270).
[jira] [Assigned] (COUCHDB-1283) Impossible to compact view groups when number of active databases >= max_dbs_open
[ https://issues.apache.org/jira/browse/COUCHDB-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana reassigned COUCHDB-1283: -- Assignee: Paul Joseph Davis (was: Filipe Manana) Impossible to compact view groups when number of active databases >= max_dbs_open Key: COUCHDB-1283 URL: https://issues.apache.org/jira/browse/COUCHDB-1283 Project: CouchDB Issue Type: Bug Reporter: Filipe Manana Assignee: Paul Joseph Davis Fix For: 1.1.1, 1.2 Attachments: couchdb-1283_12x.patch, couchdb-1283_trunk.patch Mike Leddy recently reported this issue in the users mailing list: http://mail-archives.apache.org/mod_mbox/couchdb-user/201109.mbox/%3c1315949945.22123.22.ca...@mike.loop.com.br%3E The attached patch is the simplest solution I can think of - keeping the database open until the view compaction finishes. The patch includes a test case. It will need to be updated after Paul's view index refactoring (COUCHDB-1270).
[jira] [Commented] (COUCHDB-1270) Rewrite the view engine
[ https://issues.apache.org/jira/browse/COUCHDB-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105805#comment-13105805 ] Filipe Manana commented on COUCHDB-1270: Paul, I have a few questions/remarks; all are minor things that I didn't notice before and I don't consider them blockers of any kind. 1) There was an _all_docs with ?include_docs=true optimization which avoided doing lookups in the id btree to get the documents. This seems to be gone (COUCHDB-1061). I can help here; 2) There's no information dumped to the log file when the updater starts, stops or checkpoints. At least when it starts and stops, I find it useful to have it in the log. Dunno if this was intentional or not; 3) When the view group is shut down, because the associated database was closed, there's no information logged anymore. I find the logging useful for this scenario, for example to diagnose COUCHDB-1283; 4) For the view index file, we used to have a ref counter for each. I don't see anything equivalent now. What happens when a client is folding a view and, before it finishes folding it, the view is compacted, the file switch happens and the original file is deleted? Once again, great work on this refactoring. Rewrite the view engine --- Key: COUCHDB-1270 URL: https://issues.apache.org/jira/browse/COUCHDB-1270 Project: CouchDB Issue Type: Improvement Components: JavaScript View Server Reporter: Paul Joseph Davis Attachments: 0001-Minor-changes-for-new-indexing-engine.patch, 0002-Create-the-couch_index-application.patch, 0003-Create-the-couch_mrview-application.patch, 0004-Remove-the-old-view-engine.patch The view engine has been creaky and cluttered. As shown by GeoCouch, adding new indexers basically involves copying the entire view engine and hacking the parts that are different. In short, the opposite of good engineering. Over the last couple weeks I've refactored the view engine and reimplemented the map/reduce view engine.
These changes are 100% internal and no external behavior has changed. Performance is just a tiny bit better than trunk. I did do some playing trying to improve view update times and there are some dances we could do, but for the time being I wanted to keep the same general architecture for updates so that the changes are minimal.
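The ref-counting pattern asked about in point 4 can be sketched as follows. This is an illustrative JavaScript model of deferred file deletion, not CouchDB's actual Erlang ref counter:

```javascript
// Sketch of a ref counter that defers deleting the pre-compaction index
// file until the last reader releases it (hypothetical names throughout).
function refCountedFile(name, deleteFn) {
  let refs = 0, pendingDelete = false;
  return {
    acquire() { refs++; },                           // a client starts folding
    release() {                                      // a client finished folding
      refs--;
      if (refs === 0 && pendingDelete) deleteFn(name);
    },
    scheduleDelete() {                               // compaction swapped files
      pendingDelete = true;
      if (refs === 0) deleteFn(name);
    }
  };
}
```

With this in place, a fold started before the compaction switch keeps the old file alive until it completes; only then does the delete actually run.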
[jira] [Commented] (COUCHDB-1267) Bignums in views aren't sorted correctly.
[ https://issues.apache.org/jira/browse/COUCHDB-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102168#comment-13102168 ] Filipe Manana commented on COUCHDB-1267: The corresponding OTP patch landed in OTP's dev branch, which means the fix will be in R14B04 (to be released by early October afaik): https://github.com/erlang/otp/commit/262a9af33d3ceb4cb032c434b100cea7d4b0d60e I just updated configure.ac to enable the NIF only if the OTP version is > R14B03. Leaving this ticket open for a while however. Bignums in views aren't sorted correctly. - Key: COUCHDB-1267 URL: https://issues.apache.org/jira/browse/COUCHDB-1267 Project: CouchDB Issue Type: Bug Components: JavaScript View Server Affects Versions: 1.1.1, 1.2 Reporter: Paul Joseph Davis Priority: Blocker Easily tested by creating a doc and hitting this as a temp view: function(doc) { emit(["bylatest", -1303085266000], null); emit(["bylatest", -1298817134000], null); emit(["bylatest", -1294536544000], null); emit(["bylatest", -1294505612000], null); emit(["bylatest", -117870480], null); }
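The ordering the view above relies on can be sketched with a toy comparator: keys are arrays compared element-wise, with numbers compared numerically, so 64-bit timestamps ("bignums" on the Erlang side) must not be truncated. This only mimics CouchDB's collation for these simple keys; it is not the real couch_ejson_compare:

```javascript
// Toy element-wise array-key comparator (illustrative, not CouchDB code).
function cmpKey(a, b) {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] !== b[i]) {
      if (typeof a[i] === "number" && typeof b[i] === "number") {
        return a[i] - b[i];                 // numeric comparison, no truncation
      }
      return a[i] < b[i] ? -1 : 1;          // string comparison for the tag
    }
  }
  return a.length - b.length;               // shorter array sorts first
}
const keys = [
  ["bylatest", -117870480],
  ["bylatest", -1294505612000],
  ["bylatest", -1294536544000],
  ["bylatest", -1298817134000],
  ["bylatest", -1303085266000]
];
const sorted = keys.slice().sort(cmpKey);
```

Sorted correctly, the most negative timestamp (the oldest entry) must come first; the reported bug is that truncated bignums break exactly this ordering.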
[jira] [Commented] (COUCHDB-1270) Rewrite the view engine
[ https://issues.apache.org/jira/browse/COUCHDB-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101666#comment-13101666 ] Filipe Manana commented on COUCHDB-1270: Paul, just one question, looking at the 3rd patch: diff --git a/share/www/script/test/view_compaction.js b/share/www/script/test/view_compaction.js index 4c75184..ebf6fe1 100644 --- a/share/www/script/test/view_compaction.js +++ b/share/www/script/test/view_compaction.js @@ -87,7 +87,7 @@ couchTests.view_compaction = function(debug) { T(data_size_before_compact < disk_size_before_compact, "data size < file size"); // compact view group - var xhr = CouchDB.request("POST", "/" + db.name + "/_compact" + "/foo"); + var xhr = CouchDB.request("POST", "/" + db.name + "/_design/foo/_compact"); T(JSON.parse(xhr.responseText).ok === true); resp = db.designInfo("_design/foo"); diff --git a/src/Makefile.am b/src/Makefile.am Has the URI for view compaction changed or is it just an alternative? Rewrite the view engine --- Key: COUCHDB-1270 URL: https://issues.apache.org/jira/browse/COUCHDB-1270 Project: CouchDB Issue Type: Improvement Components: JavaScript View Server Reporter: Paul Joseph Davis Attachments: 0001-Minor-changes-for-new-indexing-engine.patch, 0002-Create-the-couch_index-application.patch, 0003-Create-the-couch_mrview-application.patch, 0004-Remove-the-old-view-engine.patch The view engine has been creaky and cluttered. As shown by GeoCouch, adding new indexers basically involves copying the entire view engine and hacking the parts that are different. In short, the opposite of good engineering. Over the last couple weeks I've refactored the view engine and reimplemented the map/reduce view engine. These changes are 100% internal and no external behavior has changed. Performance is just a tiny bit better than trunk.
I did do some playing trying to improve view update times and there are some dances we could do, but for the time being I wanted to keep the same general architecture for updates so that the changes are minimal. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (COUCHDB-1271) Impossible to cancel replications in some scenarios
[ https://issues.apache.org/jira/browse/COUCHDB-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099191#comment-13099191 ] Filipe Manana commented on COUCHDB-1271: Robert, if one clause doesn't need the user context, I think it's simpler to not receive it. As for the different calls, they're necessary, as when the user specifies an id an additional step must be done (a permission check). Thanks for the opinion. Impossible to cancel replications in some scenarios --- Key: COUCHDB-1271 URL: https://issues.apache.org/jira/browse/COUCHDB-1271 Project: CouchDB Issue Type: Bug Components: Replication Reporter: Filipe Manana Assignee: Filipe Manana Fix For: 1.2 In some scenarios it's impossible to cancel a replication by posting to /_replicate, namely: 1) A filtered replication is started and the filter's code is then updated in the source database; a subsequent cancel request will no longer generate the old replication ID, as it gets a different filter code; 2) Dynamically changing the httpd port also makes it impossible to compute the right replication ID. Finally, it's also nicer for users to not need to remember the exact replication object posted before to /_replicate. The new approach, in addition to the current approach, allows something as simple as: POST /_replicate {"replication_id": "0a81b645497e6270611ec3419767a584+continuous+create_target", "cancel": true} The replication ID can be obtained from a continuous replication request's response (field _local_id), _active_tasks (field replication_id) or from the log. Aliases _local_id and id are allowed instead of replication_id. Patch: https://github.com/fdmanana/couchdb/commit/c8909ea9dbcbc1f52c5c0f87e1a95102a3edfa9f.diff (depends on COUCHDB-1266)
[jira] [Commented] (COUCHDB-1260) Fix CouchJS/SpiderMonkey compatibility
[ https://issues.apache.org/jira/browse/COUCHDB-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099882#comment-13099882 ] Filipe Manana commented on COUCHDB-1260: Tested with xulrunner-devel-1.9.2.18. All tests pass. Fix CouchJS/SpiderMonkey compatibility -- Key: COUCHDB-1260 URL: https://issues.apache.org/jira/browse/COUCHDB-1260 Project: CouchDB Issue Type: Improvement Components: JavaScript View Server Affects Versions: 1.2 Reporter: Paul Joseph Davis Fix For: 1.2 Attachments: COUCHDB-1260.patch I just finished refactoring couchjs so that I can make it compatible with all of the major versions of SpiderMonkey. I've run the tests against 1.7.0, 1.8.0rc1, Homebrew ~= 1.8.5, and 1.8.5. The only test that fails is view_sandboxing.js for 1.7.0 because sandboxing doesn't work there. I would appreciate it if people could pull this down and run the Futon test suite to check that tests pass for everyone. I'm specifically interested to hear from those of you using one of the XULRunner SpiderMonkeys or otherwise any SpiderMonkey that's not a tarball from the Mozilla FTP directory. I'll give people a couple days to report back but if I don't hear anything I'll push this in.
[jira] [Commented] (COUCHDB-1270) Rewrite the view engine
[ https://issues.apache.org/jira/browse/COUCHDB-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13097748#comment-13097748 ] Filipe Manana commented on COUCHDB-1270: Paul, All fine for me Rewrite the view engine --- Key: COUCHDB-1270 URL: https://issues.apache.org/jira/browse/COUCHDB-1270 Project: CouchDB Issue Type: Improvement Components: JavaScript View Server Reporter: Paul Joseph Davis Attachments: 0001-Minor-changes-for-new-indexing-engine.patch, 0002-Create-the-couch_index-application.patch, 0003-Create-the-couch_mrview-application.patch, 0004-Remove-the-old-view-engine.patch The view engine has been creaky and cluttered. As shown by GeoCouch, adding new indexers basically involves copying the entire view engine and hacking the parts that are different. In short, the opposite of good engineering. Over the last couple weeks I've refactored the view engine and reimplemented the map/reduce view engine. These changes are 100% internal and no external behavior has changed. Performance is just a tiny bit better than trunk. I did do some playing trying to improve view update times and there are some dances we could do, but for the time being I wanted to keep the same general architecture for updates so that the changes are minimal. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (COUCHDB-1266) Add stats field to _active_tasks
[ https://issues.apache.org/jira/browse/COUCHDB-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096969#comment-13096969 ] Filipe Manana commented on COUCHDB-1266: Patch updated, https://github.com/fdmanana/couchdb/compare/task_stats.diff The status text string in Futon is now constructed by JavaScript via the better structured _active_tasks output. Add stats field to _active_tasks Key: COUCHDB-1266 URL: https://issues.apache.org/jira/browse/COUCHDB-1266 Project: CouchDB Issue Type: Improvement Reporter: Filipe Manana Assignee: Filipe Manana Fix For: 1.2 Attachments: alternate-api.patch This proposal is simply to add a stats field to the _active_tasks results. This field can be an arbitrary JSON value and each task can set it to whatever is appropriate for it. The following patch also defines some basic stats for the existing tasks: 1) database compaction - # changes done, total changes, # of revisions copied, # of attachments copied and progress (an integer percentage, same as what is exposed in the existing text field status); 2) view compaction - # of ids copied, total number of ids, # of kvs copied, total number of kvs and progress 3) view indexing - # changes done, total changes, # inserted kvs, # deleted kvs, progress 4) replication - # missing revisions checked, # missing revisions found, # docs read, # docs written, # doc write failures, source seq number, checkpointed source seq number, progress. A screenshot of Futon with 3 different tasks: http://dl.dropbox.com/u/25067962/active_tasks_stats.png Patch at: https://github.com/fdmanana/couchdb/compare/task_stats.diff
[jira] [Commented] (COUCHDB-1270) Rewrite the view engine
[ https://issues.apache.org/jira/browse/COUCHDB-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096973#comment-13096973 ] Filipe Manana commented on COUCHDB-1270:

Also, can you share those performance test results?

Rewrite the view engine Key: COUCHDB-1270 URL: https://issues.apache.org/jira/browse/COUCHDB-1270 Project: CouchDB Issue Type: Improvement Components: JavaScript View Server Reporter: Paul Joseph Davis
[jira] [Commented] (COUCHDB-1266) Add stats field to _active_tasks
[ https://issues.apache.org/jira/browse/COUCHDB-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094143#comment-13094143 ] Filipe Manana commented on COUCHDB-1266:

Thanks Paul and Adam. I agree the current status text field could go away, since it can be built from this stats property; however, I would be more comfortable making that change for 2.0, for example. I'd like to see this in Futon. I agree it currently doesn't look very polished, so it could be improved by someone more comfortable with Futon/CSS/HTML. About the progress: right now it's trivial to compute (and it's more of an estimate, without a very high degree of accuracy), but I prefer to leave its computation to the tasks and not to applications. This is because how it's calculated can change in the future as the implementations of the tasks (compaction, indexing, replication) evolve. I agree that with setelement and #record.fieldname tricks some code size can be reduced.

Add stats field to _active_tasks Key: COUCHDB-1266 URL: https://issues.apache.org/jira/browse/COUCHDB-1266 Project: CouchDB Issue Type: Improvement Reporter: Filipe Manana Assignee: Filipe Manana Fix For: 1.2
[jira] [Commented] (COUCHDB-1266) Add stats field to _active_tasks
[ https://issues.apache.org/jira/browse/COUCHDB-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094194#comment-13094194 ] Filipe Manana commented on COUCHDB-1266:

I see what you mean now, Paul. I still think the progress field should be provided by couch; it is one of the fields people care most about. As for Futon, especially for replication tasks, I consider all (or most) of those stats meaningful to the normal user. Other than displaying the JSON object, I don't see a way to display fields that are not common to all tasks; I guess someone specialized in UIs would find a much better way.

Add stats field to _active_tasks Key: COUCHDB-1266 URL: https://issues.apache.org/jira/browse/COUCHDB-1266 Project: CouchDB Issue Type: Improvement Reporter: Filipe Manana Assignee: Filipe Manana Fix For: 1.2 Attachments: alternate-api.patch
[jira] [Commented] (COUCHDB-1266) Add stats field to _active_tasks
[ https://issues.apache.org/jira/browse/COUCHDB-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094203#comment-13094203 ] Filipe Manana commented on COUCHDB-1266:

Exactly, Paul. I'll be changing the type field to be easier for applications to handle: for example, for view compaction it is currently the string "View Group Compaction", whereas something like view_compaction would be easier for applications. Then Futon, using simple switch/if-then-else logic, can display "View Group Compaction" or something else.

Add stats field to _active_tasks Key: COUCHDB-1266 URL: https://issues.apache.org/jira/browse/COUCHDB-1266 Project: CouchDB Issue Type: Improvement Reporter: Filipe Manana Assignee: Filipe Manana Fix For: 1.2 Attachments: alternate-api.patch
[jira] [Commented] (COUCHDB-1265) Replication can introduce duplicates into the seq_btree.
[ https://issues.apache.org/jira/browse/COUCHDB-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093017#comment-13093017 ] Filipe Manana commented on COUCHDB-1265:

Paul, looks good. Just a side note: due to upgrade code in trunk for the branch sizes, instead of matching against {_Deleted, _DiskPos, OldTreeSeq, _Size}, it would likely be better to call is_tuple(Value) and element(3, Value) (or match against 3- and 4-element tuples).

Replication can introduce duplicates into the seq_btree. Key: COUCHDB-1265 URL: https://issues.apache.org/jira/browse/COUCHDB-1265 Project: CouchDB Issue Type: Bug Components: Database Core Reporter: Paul Joseph Davis Assignee: Paul Joseph Davis Attachments: COUCHDB-1265.patch, replication-frenzy.py

Full description, test, and patch to follow shortly.
[jira] [Commented] (COUCHDB-1258) eheap_alloc OOM errors when attempting to build selected views
[ https://issues.apache.org/jira/browse/COUCHDB-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090592#comment-13090592 ] Filipe Manana commented on COUCHDB-1258:

Robert, that solution works perfectly fine for my case. Memory consumption doesn't go beyond ~40 MB (for the whole Erlang VM). Perhaps a lower limit (to match the work queue's max items) could be generally better; I haven't tested it, however.

eheap_alloc OOM errors when attempting to build selected views
Key: COUCHDB-1258 URL: https://issues.apache.org/jira/browse/COUCHDB-1258 Project: CouchDB Issue Type: Bug Affects Versions: 1.1 Environment: CentOS 5.6, 1GB RAM (approx 800MB free), CouchDB 1.1.0, Erlang R14B-03, js lib 1.7.0-8 EPEL RPM build
Couch database exhibiting this behaviour: {"db_name":"activity_new","doc_count":593274,"doc_del_count":4743352,"update_seq":10559287,"purge_seq":0,"compact_running":false,"disk_size":3366396013,"instance_start_time":"1314097423985726","disk_format_version":5,"committed_update_seq":10559287}
Reporter: James Cohen Priority: Critical

We spotted OOM errors crashing our CouchDB instance when attempting to rebuild selected views. CouchDB was dying with the following messages (worth noting that the type reported varies between heap/old_heap):

eheap_alloc: Cannot allocate 478288480 bytes of memory (of type heap).
eheap_alloc: Cannot allocate 597860600 bytes of memory (of type heap).
eheap_alloc: Cannot allocate 747325720 bytes of memory (of type old_heap).
eheap_alloc: Cannot allocate 597860600 bytes of memory (of type old_heap).

By modifying the view I was able to find a view that could consistently crash the server and another that ran fine. They are as follows:

Runs out of memory very quickly:
{ "_id": "_design/cleanup", "_rev": "5-e004fbab278355e9d08763877e5a8295", "views": { "byDate": { "map": "function(doc) { if (!doc.action) emit([doc.date], doc); }" } } }

Runs fine with minimal memory usage (returns 88128 docs in the view):
{ "_id": "_design/cleanup", "_rev": "6-3823be6b72ca2441e235addfece6900c", "views": { "byDate": { "map": "function(doc) { if (doc.action) emit([doc.date], doc); }" } } }

The only difference between the two is the negation of the if conditional. Memory usage was monitored with top on the machine while the view was being built. Under correct behaviour I could see beam.smp using just 3 or 4% of the server's memory. With the problematic view, memory usage increased until the RAM/swap on the server was exhausted (as you can see from the error messages, around 500/700 MB).
[jira] [Commented] (COUCHDB-1153) Database and view index compaction daemon
[ https://issues.apache.org/jira/browse/COUCHDB-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089727#comment-13089727 ] Filipe Manana commented on COUCHDB-1153:

Paul, thanks for your long analysis. My test use case, with a fairly large number of databases, was a scenario where they're constantly being updated with new documents, with delayed commits set to true (which makes couch_db:is_idle/1 return false very often) and several indexers running at the same time (no more than 25 indexers). I agree this could be much better in all aspects, and that's the motivation for having it disabled by default and for making details such as the periodicity of the scans configurable. The allowed-period parameter also helps people ensure compactions happen only in low-activity periods (not an uncommon case). I would like to see all these details improved, together with a better configuration system (either using such a _meta object, or _local documents per database, or whatever else), but that is far beyond this feature.

Database and view index compaction daemon
Key: COUCHDB-1153 URL: https://issues.apache.org/jira/browse/COUCHDB-1153 Project: CouchDB Issue Type: New Feature Environment: trunk Reporter: Filipe Manana Assignee: Filipe Manana Priority: Minor Labels: compaction

I've recently written an Erlang process to automatically compact databases and their views based on some configurable parameters. These parameters can be global or per database and are: minimum database fragmentation, minimum view fragmentation, allowed period, and strict_window (whether an ongoing compaction should be canceled if it doesn't finish within the allowed period). These fragmentation values are based on the recently added data_size parameter of the database and view group information URIs (COUCHDB-1132). I've documented the .ini configuration, as a comment in default.ini, which I paste here:

[compaction_daemon]
; The delay, in seconds, between each check for which databases and view indexes
; need to be compacted.
check_interval = 60
; If a database or view index file is smaller than this value (in bytes),
; compaction will not happen. Very small files always have a very high
; fragmentation, therefore it's not worth compacting them.
min_file_size = 131072

[compactions]
; List of compaction rules for the compaction daemon.
; The daemon compacts databases and their respective view groups when all the
; condition parameters are satisfied. Configuration can be per database or
; global, and it has the following format:
;
; database_name = parameter=value [, parameter=value]*
; _default = parameter=value [, parameter=value]*
;
; Possible parameters:
;
; * db_fragmentation - If the ratio (as an integer percentage) of the amount
;     of old data (and its supporting metadata) over the database file size
;     is equal to or greater than this value, this database compaction
;     condition is satisfied. This value is computed as:
;
;         (file_size - data_size) / file_size * 100
;
;     The data_size and file_size values can be obtained when querying a
;     database's information URI (GET /dbname/).
;
; * view_fragmentation - If the ratio (as an integer percentage) of the amount
;     of old data (and its supporting metadata) over the view index (view
;     group) file size is equal to or greater than this value, then this view
;     index compaction condition is satisfied. This value is computed as:
;
;         (file_size - data_size) / file_size * 100
;
;     The data_size and file_size values can be obtained when querying a
;     view group's information URI (GET /dbname/_design/groupname/_info).
;
; * period - The period for which a database (and its view groups) compaction
;     is allowed. This value must obey the following format:
;
;         HH:MM - HH:MM (HH in [0..23], MM in [0..59])
;
; * strict_window - If a compaction is still running after the end of the
;     allowed period, it will be canceled if this parameter is set to yes.
;     It defaults to no and is meaningful only if the *period* parameter is
;     also specified.
;
; * parallel_view_compaction - If set to yes, the database and its views are
;     compacted in parallel. This is only useful on ;
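The fragmentation condition quoted above is simple integer arithmetic; a minimal Python sketch of the check (function names and the threshold default are illustrative, not part of CouchDB's API):

```python
def fragmentation(file_size, data_size):
    # (file_size - data_size) / file_size * 100, as an integer percentage,
    # matching the formula in the daemon's documented config.
    return (file_size - data_size) * 100 // file_size

def should_compact(file_size, data_size, db_fragmentation=70, min_file_size=131072):
    # Very small files always look highly fragmented, so skip them,
    # mirroring the min_file_size setting described above.
    if file_size < min_file_size:
        return False
    return fragmentation(file_size, data_size) >= db_fragmentation

# A 1 MB database file holding only 200 KB of live data is 80% fragmented.
print(fragmentation(1_000_000, 200_000))  # 80
```

With a db_fragmentation threshold of 70, that database would be scheduled for compaction; the same file below min_file_size would be skipped regardless of its ratio.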
[jira] [Commented] (COUCHDB-1259) Replication ID is not stable if local server has a dynamic port number
[ https://issues.apache.org/jira/browse/COUCHDB-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089836#comment-13089836 ] Filipe Manana commented on COUCHDB-1259:

Thanks, Jens, for filing this and for the detailed description. To be more clear, the idea would be to add a UUID to each server and use it as input to the replication ID generation instead of the local port number. It would be something similar to what is done for the cookie authentication handler: https://github.com/apache/couchdb/blob/trunk/src/couchdb/couch_httpd_auth.erl#L240 If such a UUID doesn't exist in the .ini (replicator section), we generate a new one and save it. Damien's thought about allowing the client to name them also sounds very simple, perhaps using the _replication_id field in replication documents (which already exists and is currently set automatically by the replication manager).

Replication ID is not stable if local server has a dynamic port number
Key: COUCHDB-1259 URL: https://issues.apache.org/jira/browse/COUCHDB-1259 Project: CouchDB Issue Type: Bug Components: Replication Affects Versions: 1.1 Reporter: Jens Alfke

I noticed that when Couchbase Mobile running on iOS replicates to/from a remote server (on iriscouch in this case), the replication has to fetch the full _changes feed every time it starts. Filipe helped me track down the problem: the replication ID is coming out different every time. The reason is that the local port number, which is one of the inputs to the hash that generates the replication ID, is randomly assigned by the OS (i.e. it uses a port number of 0 when opening its listener socket). This is because there could be multiple apps using Couchbase Mobile running on the same device, and we can't have their ports colliding.

The underlying problem is that CouchDB is attempting to generate a unique ID for a particular pair of {source, destination} databases, but it's basing it on attributes that aren't fundamental to the database and can change, like the hostname or port number. One solution, proposed by Filipe and me, is to assign each database (or each server?) a random UUID when it's created, and use that to generate replication IDs. Another solution, proposed by Damien, is to have CouchDB let the client work out the replication ID on its own, and set it as a property in the replication document (or the JSON body of a _replicate request). This is even more flexible and will handle tricky scenarios like full P2P replication, where there may be no low-level way to uniquely identify the remote database being synced with.
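The UUID proposal can be illustrated with a toy Python sketch: hash a persistent server-wide UUID plus the endpoint names, instead of host/port values that can change between runs. All names here are illustrative; CouchDB's actual implementation hashes serialized Erlang terms, not concatenated strings.

```python
import hashlib

def replication_id(server_uuid, source, target, filter_name=""):
    """Stable replication ID sketch: the inputs are a persistent UUID and
    the endpoint names, so a dynamically assigned local port cannot
    change the result between runs. Parameter names are hypothetical."""
    h = hashlib.md5()
    for part in (server_uuid, source, target, filter_name):
        h.update(part.encode("utf-8"))
    return h.hexdigest()

# Same inputs always yield the same ID, regardless of the local port.
rep_id = replication_id("server-uuid-example",
                        "http://localhost:15984/testdb2/", "testdb1")
```

A checkpoint document keyed by such an ID would survive restarts, which is exactly what the dynamic-port setup in the report breaks.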
[jira] [Commented] (COUCHDB-1153) Database and view index compaction daemon
[ https://issues.apache.org/jira/browse/COUCHDB-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088483#comment-13088483 ] Filipe Manana commented on COUCHDB-1153:

Thanks for your concerns, Robert. Regarding the percentage, I haven't seen any comment about it before. I think it's clearer to users what the ranges are when explicitly using percentages; either way this is a very minor thing imho. I've been testing this over the last 2 months or so on systems with a high number of databases where all of them are constantly being updated. It behaves fairly well, especially for the case where the number of databases is >= max_dbs_open. I haven't seen a case where a compaction lasts forever due to retries (2 to 5 retries was the worst scenario I think I ever saw). There's always a tradeoff. Asking the filesystem for a list of .couch files and opening each respective database to check if it needs to be compacted: there's a price to pay here, yes, but there are also costs to not compacting databases often. Databases (and views) with high amounts of wasted space are much less cache friendly, and the server can run out of disk space quickly, something clearly undesirable as well. This is far from perfect, yes (and so are many features of CouchDB or any other software). The periodicity of the scans is configurable, and doing such scans on systems with a small number of databases (<= max_dbs_open) is not a major problem. Such deployments aren't an uncommon case. If you have a patch or concrete proposal, I'll be happy to review it and get it into the repository.
Database and view index compaction daemon Key: COUCHDB-1153 URL: https://issues.apache.org/jira/browse/COUCHDB-1153 Project: CouchDB Issue Type: New Feature Environment: trunk Reporter: Filipe Manana Assignee: Filipe Manana Priority: Minor Labels: compaction
[jira] [Commented] (COUCHDB-1256) Incremental requests to _changes can skip revisions
[ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1303#comment-1303 ] Filipe Manana commented on COUCHDB-1256:

I think skipping foo's revision 1-0dc33db52a43872b6f3371cef7de0277 when asking with since=2 seems correct, since that revision corresponds to sequence 1. This way the replicator does not receive revisions it already received before. Am I missing something?

Incremental requests to _changes can skip revisions
Key: COUCHDB-1256 URL: https://issues.apache.org/jira/browse/COUCHDB-1256 Project: CouchDB Issue Type: Bug Components: Replication Affects Versions: 0.10, 0.10.1, 0.10.2, 0.11.1, 0.11.2, 1.0, 1.0.1, 1.0.2, 1.1, 1.0.3 Environment: confirmed on Apache CouchDB 1.1.0, bug appears to be present in 1.0.3 and trunk Reporter: Adam Kocoloski Assignee: Adam Kocoloski Priority: Blocker Fix For: 1.0.4, 1.1.1, 1.2

Requests to _changes with style=all_docs&since=N (requests made by the replicator) are liable to suppress revisions of a document.
The following sequence of curl commands demonstrates the bug:

curl -X PUT localhost:5985/revseq
{"ok":true}
curl -X PUT -Hcontent-type:application/json localhost:5985/revseq/foo -d '{"a":123}'
{"ok":true,"id":"foo","rev":"1-0dc33db52a43872b6f3371cef7de0277"}
curl -X PUT -Hcontent-type:application/json localhost:5985/revseq/bar -d '{"a":456}'
{"ok":true,"id":"bar","rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}
% stick a conflict revision in foo
curl -X PUT -Hcontent-type:application/json 'localhost:5985/revseq/foo?new_edits=false' -d '{"_rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a", "a":123}'
{"ok":true,"id":"foo","rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}
% request without since= gives the expected result
curl -Hcontent-type:application/json 'localhost:5985/revseq/_changes?style=all_docs'
{"results":[
{"seq":2,"id":"bar","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}]},
{"seq":3,"id":"foo","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"},{"rev":"1-0dc33db52a43872b6f3371cef7de0277"}]}
],
"last_seq":3}
% request starting from since=2 suppresses revision 1-0dc33db52a43872b6f3371cef7de0277 of foo
macbook:~ (master) $ curl 'localhost:5985/revseq/_changes?style=all_docs&since=2'
{"results":[
{"seq":3,"id":"foo","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}]}
],
"last_seq":3}

I believe the fix is something like this (though we could refactor further because Style is unused):

diff --git a/src/couchdb/couch_db.erl b/src/couchdb/couch_db.erl
index e8705be..65aeca3 100644
--- a/src/couchdb/couch_db.erl
+++ b/src/couchdb/couch_db.erl
@@ -1029,19 +1029,7 @@
 changes_since(Db, Style, StartSeq, Fun, Acc) ->
     changes_since(Db, Style, StartSeq, Fun, [], Acc).

 changes_since(Db, Style, StartSeq, Fun, Options, Acc) ->
-    Wrapper = fun(DocInfo, _Offset, Acc2) ->
-        #doc_info{revs=Revs} = DocInfo,
-        DocInfo2 =
-        case Style of
-        main_only ->
-            DocInfo;
-        all_docs ->
-            % remove revs before the seq
-            DocInfo#doc_info{revs=[RevInfo ||
-                #rev_info{seq=RevSeq}=RevInfo <- Revs, StartSeq < RevSeq]}
-        end,
-        Fun(DocInfo2, Acc2)
-    end,
+    Wrapper = fun(DocInfo, _Offset, Acc2) -> Fun(DocInfo, Acc2) end,
     {ok, _LastReduction, AccOut} = couch_btree:fold(by_seq_btree(Db), Wrapper,
         Acc, [{start_key, StartSeq + 1}] ++ Options),
     {ok, AccOut}.
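The filtering that the patch removes can be reproduced in a toy Python sketch, using plain dicts in place of CouchDB's #doc_info/#rev_info records (field names are illustrative):

```python
def filter_revs_all_docs(doc_info, since):
    # Pre-patch all_docs behaviour: keep only revision entries whose own
    # update seq is greater than `since`. Because a conflict revision keeps
    # its original (older) seq, it gets dropped even though the document
    # row itself appears after `since`.
    revs = [r for r in doc_info["revs"] if r["seq"] > since]
    return dict(doc_info, revs=revs)

foo = {"id": "foo", "seq": 3,
       "revs": [{"rev": "1-cc609831f0ca66e8cd3d4c1e0d98108a", "seq": 3},
                {"rev": "1-0dc33db52a43872b6f3371cef7de0277", "seq": 1}]}

# since=2 drops the seq-1 conflict revision, reproducing the suppression
# shown in the curl transcript above.
filtered = filter_revs_all_docs(foo, 2)
```

The proposed fix simply stops applying this per-revision filter, so a row at seq 3 reports all of the document's leaf revisions.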
[jira] [Commented] (COUCHDB-1238) CouchDB uses _users db for storing oauth credentials
[ https://issues.apache.org/jira/browse/COUCHDB-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088909#comment-13088909 ] Filipe Manana commented on COUCHDB-1238:

Pete, I rebased the patch against latest trunk and made some small reorganizations needed to make it more compliant with the current codebase. Let me know if you agree with it. I also added a documentation comment in the .ini file. Here's the modified patch: https://github.com/fdmanana/couchdb/compare/oauth_users_db I'll wait for peer review.

CouchDB uses _users db for storing oauth credentials Key: COUCHDB-1238 URL: https://issues.apache.org/jira/browse/COUCHDB-1238 Project: CouchDB Issue Type: New Feature Components: Database Core Affects Versions: 1.1 Reporter: Pete Vander Giessen Assignee: Filipe Manana Fix For: 1.2 Attachments: git_commits_as_patch.zip, oauth_users_db_patch.zip

We want to store oauth credentials in the _users db, rather than in the .ini.
[jira] [Commented] (COUCHDB-1256) Incremental requests to _changes can skip revisions
[ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088933#comment-13088933 ] Filipe Manana commented on COUCHDB-1256:

Adam, if the replicator checkpointed sequence 2, it means it received and processed revision 1-0dc33db52a43872b6f3371cef7de0277 of foo; otherwise it would be a bug in the replicator. If it crashes before processing seq 3 (which lists revision 1-cc609831f0ca66e8cd3d4c1e0d98108a of foo), then on restart, when requesting _changes?since=2&style=all_docs, it will receive the revision of foo it hasn't yet processed (1-cc609831f0ca66e8cd3d4c1e0d98108a).

Incremental requests to _changes can skip revisions Key: COUCHDB-1256 URL: https://issues.apache.org/jira/browse/COUCHDB-1256 Project: CouchDB Issue Type: Bug Components: Replication Affects Versions: 0.10, 0.10.1, 0.10.2, 0.11.1, 0.11.2, 1.0, 1.0.1, 1.0.2, 1.1, 1.0.3 Environment: confirmed on Apache CouchDB 1.1.0, bug appears to be present in 1.0.3 and trunk Reporter: Adam Kocoloski Assignee: Adam Kocoloski Priority: Blocker Fix For: 1.0.4, 1.1.1, 1.2 Attachments: jira-1256-test.diff

Requests to _changes with style=all_docs&since=N (requests made by the replicator) are liable to suppress revisions of a document.
The following sequence of curl commands demonstrates the bug:

curl -X PUT localhost:5985/revseq
{"ok":true}
curl -X PUT -H"content-type:application/json" localhost:5985/revseq/foo -d '{"a":123}'
{"ok":true,"id":"foo","rev":"1-0dc33db52a43872b6f3371cef7de0277"}
curl -X PUT -H"content-type:application/json" localhost:5985/revseq/bar -d '{"a":456}'
{"ok":true,"id":"bar","rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}
% stick a conflict revision in foo
curl -X PUT -H"content-type:application/json" localhost:5985/revseq/foo?new_edits=false -d '{"_rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a", "a":123}'
{"ok":true,"id":"foo","rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}
% request without since= gives the expected result
curl -H"content-type:application/json" localhost:5985/revseq/_changes?style=all_docs
{"results":[
{"seq":2,"id":"bar","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}]},
{"seq":3,"id":"foo","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"},{"rev":"1-0dc33db52a43872b6f3371cef7de0277"}]}
],
"last_seq":3}
% request starting from since=2 suppresses revision 1-0dc33db52a43872b6f3371cef7de0277 of foo
macbook:~ (master) $ curl localhost:5985/revseq/_changes?style=all_docs\&since=2
{"results":[
{"seq":3,"id":"foo","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}]}
],
"last_seq":3}

I believe the fix is something like this (though we could refactor further because Style is unused):

diff --git a/src/couchdb/couch_db.erl b/src/couchdb/couch_db.erl
index e8705be..65aeca3 100644
--- a/src/couchdb/couch_db.erl
+++ b/src/couchdb/couch_db.erl
@@ -1029,19 +1029,7 @@ changes_since(Db, Style, StartSeq, Fun, Acc) ->
     changes_since(Db, Style, StartSeq, Fun, [], Acc).
 
 changes_since(Db, Style, StartSeq, Fun, Options, Acc) ->
-    Wrapper = fun(DocInfo, _Offset, Acc2) ->
-        #doc_info{revs=Revs} = DocInfo,
-        DocInfo2 = case Style of
-        main_only ->
-            DocInfo;
-        all_docs ->
-            % remove revs before the seq
-            DocInfo#doc_info{revs=[RevInfo ||
-                #rev_info{seq=RevSeq}=RevInfo <- Revs, StartSeq < RevSeq]}
-        end,
-        Fun(DocInfo2, Acc2)
-    end,
+    Wrapper = fun(DocInfo, _Offset, Acc2) -> Fun(DocInfo, Acc2) end,
     {ok, _LastReduction, AccOut} = couch_btree:fold(by_seq_btree(Db),
         Wrapper, Acc, [{start_key, StartSeq + 1}] ++ Options),
     {ok, AccOut}.
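The effect of the filter removed by the diff can be modeled in a few lines. The sketch below is plain Python with illustrative names (Rev, changes_row) — it is not CouchDB code — and shows why dropping revisions whose rev seq is at or below since= loses the older conflict revision of foo:

```python
from dataclasses import dataclass

@dataclass
class Rev:
    rev: str
    seq: int  # update sequence at which this leaf revision was written

# foo has two leaf revisions: a conflict written at seq 3 and the
# original written at seq 1 (the document's changes seq is 3)
foo_revs = [Rev("1-cc609831f0ca66e8cd3d4c1e0d98108a", 3),
            Rev("1-0dc33db52a43872b6f3371cef7de0277", 1)]

def changes_row(revs, since, filter_old_revs):
    if filter_old_revs:
        # buggy behavior: drop leaf revisions written at or before `since`
        revs = [r for r in revs if r.seq > since]
    return [r.rev for r in revs]

# With since=2 the buggy filter suppresses the seq-1 revision ...
assert changes_row(foo_revs, since=2, filter_old_revs=True) == [
    "1-cc609831f0ca66e8cd3d4c1e0d98108a"]
# ... while the fixed code reports every leaf revision:
assert changes_row(foo_revs, since=2, filter_old_revs=False) == [
    "1-cc609831f0ca66e8cd3d4c1e0d98108a", "1-0dc33db52a43872b6f3371cef7de0277"]
```

A replicator resuming from a checkpoint needs that second, unfiltered behavior, since a revision it never processed can have a rev seq at or below its checkpointed sequence.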
[jira] [Commented] (COUCHDB-1256) Incremental requests to _changes can skip revisions
[ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088982#comment-13088982 ] Filipe Manana commented on COUCHDB-1256: Adam, with this later explanation it's clear to me now. I was thinking the first replication request got a seq 1 changes row. Thanks for pointing it out.
[jira] [Closed] (COUCHDB-1255) Deleting source db in continuous filtered replication crashes couchdb and prevents restart
[ https://issues.apache.org/jira/browse/COUCHDB-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana closed COUCHDB-1255. -- Resolution: Duplicate Fix Version/s: 1.1.1 Hi, this should no longer happen in the 1.1.x branch. You'll get an error message like the following in the log/console: [error] [0.104.0] Replication manager, error processing document `testrep1`: Could not open source database `dba`: {db_not_found,dba} If it still happens after trying out the patch, please reopen this ticket. Deleting source db in continuous filtered replication crashes couchdb and prevents restart -- Key: COUCHDB-1255 URL: https://issues.apache.org/jira/browse/COUCHDB-1255 Project: CouchDB Issue Type: Bug Affects Versions: 1.1 Environment: OSX Reporter: Simon Robson Fix For: 1.1.1 Steps to reproduce:

curl -X PUT http://user:pass@localhost:5984/dba
curl -X PUT http://user:pass@localhost:5984/dbb
curl -X PUT http://user:pass@localhost:5984/dba/_design/testapp -d '{"filters": {"rep_filter": "function(doc, req) {return true}"}}' -H "Content-Type: application/json"
curl -X POST http://user:pass@localhost:5984/_replicator -d '{"_id": "testrep1", "source": "dba", "target": "http://user:pass@localhost:5984/dbb", "continuous": true, "create_target": true, "filter": "testapp/rep_filter"}' -H "Content-Type: application/json"
# wait long enough for first replication checkpoint
sleep 10
curl -X DELETE http://user:pass@localhost:5984/dba

At this point couch crashes and can't be restarted. These issues are related: https://issues.apache.org/jira/browse/COUCHDB-1233 https://issues.apache.org/jira/browse/COUCHDB-1199 Maybe fixed in 1.1.1 by the patch in the above issue, but I don't have access to confirm.
[jira] [Commented] (COUCHDB-822) body_too_large error for external processes, when body is 3MB
[ https://issues.apache.org/jira/browse/COUCHDB-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087144#comment-13087144 ] Filipe Manana commented on COUCHDB-822: --- I think the right approach here is something like this: https://github.com/apache/couchdb/commit/cc2379ebbd45d842204642ac8a6a5a16669ffa4e The case where the request has neither a Content-Length header nor chunked transfer encoding should be very rare (it's not HTTP compliant), therefore it doesn't make sense to add a new setting. body_too_large error for external processes, when body is 3MB - Key: COUCHDB-822 URL: https://issues.apache.org/jira/browse/COUCHDB-822 Project: CouchDB Issue Type: Bug Components: HTTP Interface Affects Versions: 0.11 Environment: CouchDB 0.11 on Ubuntu 10.04 Reporter: Erik Edin Assignee: Robert Newson Priority: Minor Fix For: 1.2 I have a photo of around 3 MB that I'm trying to PUT to an external process on the CouchDB database. The external process is called _upload. I get an uncaught error {exit,{body_too_large,content_length}} in the logs when trying this. Smaller photos (around 60 kB) seem to work just fine. This only happens with external processes. I can upload the photo as an attachment directly to a document with no problems. The error is similar to an earlier bug in the mochiweb library that was fixed around Feb 2009, where mochiweb never used the max_document_size setting that was provided when calling mochiweb_request:recv_body. I believe, supported by the stack trace below, that the cause of this bug is that couch_httpd_external:json_req_obj calls mochiweb_request:recv_body/0, which uses the mochiweb default value for MaxBody, which is 1 MB. I think that couch_httpd_external:json_req_obj should call mochiweb_request:recv_body/1 instead, with the max_document_size setting as the argument.
Here are the error logs from one of my attempts: [Thu, 08 Jul 2010 18:49:53 GMT] [debug] [0.3738.0] 'PUT' /pillowfight/_upload/6b1908c352129ddda396fa69ac003d11 {1,1} Headers: [{'Accept',"*/*"}, {'Content-Length',"3093976"}, {'Content-Type',"image/jpg"}, {"Expect","100-continue"}, {'Host',"localhost:5984"}, {'User-Agent',"curl/7.19.7 (i486-pc-linux-gnu) libcurl/7.19.7 OpenSSL/0.9.8k zlib/1.2.3.3 libidn/1.15"}] [Thu, 08 Jul 2010 18:49:53 GMT] [debug] [0.3738.0] OAuth Params: [] [Thu, 08 Jul 2010 18:49:53 GMT] [error] [0.3738.0] Uncaught error in HTTP request: {exit,{body_too_large,content_length}} [Thu, 08 Jul 2010 18:49:53 GMT] [info] [0.3738.0] Stacktrace: [{mochiweb_request,stream_body,5}, {mochiweb_request,recv_body,2}, {couch_httpd_external,json_req_obj,3}, {couch_httpd_external,process_external_req,3}, {couch_httpd_db,do_db_req,2}, {couch_httpd,handle_request_int,5}, {mochiweb_http,headers,5}, {proc_lib,init_p_do_apply,3}]
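The failure mode described above — a streaming body read that enforces a hard-coded 1 MB cap instead of the configured max_document_size — can be sketched in plain Python. This is an illustrative model, not mochiweb's or CouchDB's actual API; the function name recv_body is borrowed from the report only for readability:

```python
def recv_body(chunks, max_body_size):
    """Accumulate a request body from an iterable of byte chunks,
    failing with body_too_large once the cap is exceeded."""
    received = bytearray()
    for chunk in chunks:
        received.extend(chunk)
        if len(received) > max_body_size:
            raise ValueError("body_too_large")
    return bytes(received)

# A ~3 MB upload fails under a hard-coded 1 MB cap ...
three_mb = [b"x" * (1024 * 1024)] * 3
try:
    recv_body(three_mb, max_body_size=1024 * 1024)
    raised = False
except ValueError:
    raised = True
assert raised
# ... but succeeds when the cap follows a configured limit (say 4 MB,
# standing in for max_document_size):
assert len(recv_body(three_mb, max_body_size=4 * 1024 * 1024)) == 3 * 1024 * 1024
```

The proposed fix amounts to passing the configured limit down to the body reader rather than relying on the library default.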
[jira] [Commented] (COUCHDB-1246) CouchJS process spawned and not killed on each Reduce Overflow Error
[ https://issues.apache.org/jira/browse/COUCHDB-1246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086826#comment-13086826 ] Filipe Manana commented on COUCHDB-1246: Thanks Paul, looks good to me. CouchJS process spawned and not killed on each Reduce Overflow Error Key: COUCHDB-1246 URL: https://issues.apache.org/jira/browse/COUCHDB-1246 Project: CouchDB Issue Type: Bug Components: JavaScript View Server Affects Versions: 1.1 Environment: Linux Debian Squeeze [query_server_config] reduce_limit = true os_process_limit = 25 Reporter: Michael Newman Attachments: COUCHDB-1246.patch, categories, os_pool_trunk.patch, os_pool_trunk.patch, os_pool_trunk.patch Running the attached view results in a reduce_overflow_error. For each reduce_overflow_error, a process of /usr/lib/couchdb/bin/couchjs /usr/share/couchdb/server/main.js starts running. Once this gets to 25, which is the default os_process_limit, all views result in a server error: timeout {gen_server,call,[couch_query_servers,{get_proc,javascript}]} As far as I can tell, these processes and the non-response from the views will continue until couch is restarted.
[jira] [Resolved] (COUCHDB-1246) CouchJS process spawned and not killed on each Reduce Overflow Error
[ https://issues.apache.org/jira/browse/COUCHDB-1246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana resolved COUCHDB-1246. Resolution: Fixed Fix Version/s: 1.2, 1.1.1 Applied to trunk and branch 1.1.x.
[jira] [Updated] (COUCHDB-1246) CouchJS process spawned and not killed on each Reduce Overflow Error
[ https://issues.apache.org/jira/browse/COUCHDB-1246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1246: --- Attachment: os_pool_trunk.patch Paul, as we were discussing on IRC, I have a reproducible case where querying a view just hangs and we get an "os pool full" condition, blocking subsequent requests. I attach here a WIP patch, which adds an etap test (your patch makes this test fail).
[jira] [Updated] (COUCHDB-1246) CouchJS process spawned and not killed on each Reduce Overflow Error
[ https://issues.apache.org/jira/browse/COUCHDB-1246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1246: --- Attachment: os_pool_trunk.patch Thanks Paul. I'll probably update the patch once or twice until then.
[jira] [Commented] (COUCHDB-1153) Database and view index compaction daemon
[ https://issues.apache.org/jira/browse/COUCHDB-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086096#comment-13086096 ] Filipe Manana commented on COUCHDB-1153: Paul, I'm addressing all the concerns raised before. Some of them are already done and can be tracked in individual commits: https://github.com/fdmanana/couchdb/commits/compaction_daemon The thing I'm not 100% sure about is how to do the config and the load/start of os_mon. I'm not that familiar with all those OTP structuring details. I've come up with this so far: http://friendpaste.com/R43WflJ8r75MupvXuS98v Using the args_file you mentioned makes sense, but we already have some stuff that could be moved into that new file, so I think it should go into a separate change. I'll help/do it, I just need to figure out exactly how to do it and integrate it into the build system / startup scripts. Database and view index compaction daemon - Key: COUCHDB-1153 URL: https://issues.apache.org/jira/browse/COUCHDB-1153 Project: CouchDB Issue Type: New Feature Environment: trunk Reporter: Filipe Manana Assignee: Filipe Manana Priority: Minor Labels: compaction I've recently written an Erlang process to automatically compact databases and their views based on some configurable parameters. These parameters can be global or per database and are: minimum database fragmentation, minimum view fragmentation, allowed period and strict_window (whether an ongoing compaction should be canceled if it doesn't finish within the allowed period). These fragmentation values are based on the recently added data_size parameter in the database and view group information URIs (COUCHDB-1132). I've documented the .ini configuration, as a comment in default.ini, which I paste here:

[compaction_daemon]
; The delay, in seconds, between each check for which database and view indexes
; need to be compacted.
check_interval = 60
; If a database or view index file is smaller than this value (in bytes),
; compaction will not happen. Very small files always have a very high
; fragmentation, therefore it's not worth compacting them.
min_file_size = 131072

[compactions]
; List of compaction rules for the compaction daemon.
; The daemon compacts databases and their respective view groups when all the
; condition parameters are satisfied. Configuration can be per database or
; global, and it has the following format:
;
; database_name = parameter=value [, parameter=value]*
; _default = parameter=value [, parameter=value]*
;
; Possible parameters:
;
; * db_fragmentation - If the ratio (as an integer percentage) of the amount
;                      of old data (and its supporting metadata) over the database
;                      file size is equal to or greater than this value, this
;                      database compaction condition is satisfied.
;                      This value is computed as:
;
;                      (file_size - data_size) / file_size * 100
;
;                      The data_size and file_size values can be obtained when
;                      querying a database's information URI (GET /dbname/).
;
; * view_fragmentation - If the ratio (as an integer percentage) of the amount
;                        of old data (and its supporting metadata) over the view
;                        index (view group) file size is equal to or greater than
;                        this value, then this view index compaction condition is
;                        satisfied. This value is computed as:
;
;                        (file_size - data_size) / file_size * 100
;
;                        The data_size and file_size values can be obtained when
;                        querying a view group's information URI
;                        (GET /dbname/_design/groupname/_info).
;
; * period - The period for which a database (and its view groups) compaction
;            is allowed. This value must obey the following format:
;
;            HH:MM - HH:MM  (HH in [0..23], MM in [0..59])
;
; * strict_window - If a compaction is still running after the end of the allowed
;                   period, it will be canceled if this parameter is set to yes.
;                   It defaults to no and it's meaningful only if the *period*
;                   parameter is also specified.
;
; * parallel_view_compaction - If set to yes, the database and its views are
;                              compacted in parallel. This is only useful on
;                              certain setups, like for example when the database
;                              and view index directories point to different
;                              disks. It defaults to no.
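The fragmentation condition documented above is simple arithmetic over the file_size and data_size fields from the database info URI. The following is an illustrative Python sketch of that check, not the daemon's actual code; the function names and the 70% threshold are assumptions chosen for the example:

```python
def fragmentation_pct(file_size, data_size):
    """(file_size - data_size) / file_size * 100, as an integer percentage."""
    return (file_size - data_size) * 100 // file_size

def should_compact(db_info, min_file_size=131072, db_fragmentation=70):
    """Apply the documented conditions: skip files below min_file_size,
    otherwise compact once fragmentation reaches the threshold."""
    if db_info["file_size"] < min_file_size:
        return False  # very small files are not worth compacting
    frag = fragmentation_pct(db_info["file_size"], db_info["data_size"])
    return frag >= db_fragmentation

# A 10 MB file with only 2 MB of live data is 80% fragmented -> compact:
assert should_compact({"file_size": 10 * 1024 * 1024,
                       "data_size": 2 * 1024 * 1024})
# A tiny file below min_file_size is skipped regardless of fragmentation:
assert not should_compact({"file_size": 4096, "data_size": 100})
```

The same ratio applies to view groups, using the file_size and data_size reported by GET /dbname/_design/groupname/_info.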
[jira] [Updated] (COUCHDB-1246) CouchJS process spawned and not killed on each Reduce Overflow Error
[ https://issues.apache.org/jira/browse/COUCHDB-1246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1246: --- Attachment: os_pool_trunk.patch
[jira] [Commented] (COUCHDB-1153) Database and view index compaction daemon
[ https://issues.apache.org/jira/browse/COUCHDB-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085507#comment-13085507 ] Filipe Manana commented on COUCHDB-1153: Thanks Paul. Not sure what you mean by the loop weirdness; it doesn't seem complicated to me: loop() -> do_stuff(), sleep(...), loop(). An alternative to starting os_mon (I really don't care which) is to list it as a dependency in the .app file. You're right about the couch_server. It's part of the reason why autocompaction is disabled by default. However, I haven't yet seen a big issue with about ~1000 databases. One approach would perhaps be to wait a bit before opening a db if it's not in the LRU cache. Certainly there's a lot of room for improvement in auto compaction, and an initial implementation will unlikely ever be perfect for all scenarios.
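The loop shape mentioned in the comment — do the work, sleep for check_interval, repeat — translates directly from the Erlang tail recursion. A minimal Python sketch (illustrative only, not the daemon's code; max_iterations exists just to make the loop testable):

```python
import time

def run_daemon(do_stuff, check_interval=60, max_iterations=None):
    """Python equivalent of Erlang's: loop() -> do_stuff(), sleep(...), loop()."""
    n = 0
    while max_iterations is None or n < max_iterations:
        do_stuff()  # e.g. check which databases/views need compacting
        n += 1
        if max_iterations is None or n < max_iterations:
            time.sleep(check_interval)

# Three iterations with a zero interval, to exercise the loop:
calls = []
run_daemon(lambda: calls.append(1), check_interval=0, max_iterations=3)
assert len(calls) == 3
```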
[jira] [Resolved] (COUCHDB-1227) Deleted validators are still run
[ https://issues.apache.org/jira/browse/COUCHDB-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana resolved COUCHDB-1227. Resolution: Fixed Fix Version/s: 1.1.1 Applied to trunk and branch 1.1.x. Deleted validators are still run Key: COUCHDB-1227 URL: https://issues.apache.org/jira/browse/COUCHDB-1227 Project: CouchDB Issue Type: Bug Components: Database Core Affects Versions: 1.1 Environment: Windows XP Reporter: James Howe Assignee: Filipe Manana Labels: delete, validation Fix For: 1.1.1 Attachments: couchdb-1227.patch If a design document is deleted by an update that adds _deleted while still retaining the validate_doc_update function, that function is still run on updates.
[jira] [Resolved] (COUCHDB-1218) Better logger performance
[ https://issues.apache.org/jira/browse/COUCHDB-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana resolved COUCHDB-1218. Resolution: Fixed Fix Version/s: 1.2 Applied to trunk Better logger performance - Key: COUCHDB-1218 URL: https://issues.apache.org/jira/browse/COUCHDB-1218 Project: CouchDB Issue Type: Improvement Reporter: Filipe Manana Assignee: Filipe Manana Fix For: 1.2 Attachments: 0001-Better-logger-performance.patch I made some experiments with OTP's disk_log module (available since at least 2001) to use it to manage the log file. It turns out I got better throughput by using it. Basically it adopts a strategy similar to the asynchronous couch_file Damien described in this thread: http://mail-archives.apache.org/mod_mbox/couchdb-dev/201106.mbox/%3c5c39fb5a-0aca-4ff9-bd90-2ebecf271...@apache.org%3E Here's a benchmark with relaximation, 50 writers, 100 readers, documents of 1 KB, delayed_commits set to false and the 'info' log level (default): http://graphs.mikeal.couchone.com/#/graph/9e19f6d9eeb318c70cabcf67bc013c7f The reads got better throughput (bottom graph, easier to visualize). The patch (also attached here), which has a descriptive comment, is at: https://github.com/fdmanana/couchdb/compare/logger_perf.patch
[jira] [Assigned] (COUCHDB-1238) CouchDB uses _users db for storing oauth credentials
[ https://issues.apache.org/jira/browse/COUCHDB-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana reassigned COUCHDB-1238: -- Assignee: Filipe Manana
[jira] [Resolved] (COUCHDB-1241) function_clause error when making HTTP request to external process
[ https://issues.apache.org/jira/browse/COUCHDB-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana resolved COUCHDB-1241. Resolution: Fixed Fix applied to trunk function_clause error when making HTTP request to external process -- Key: COUCHDB-1241 URL: https://issues.apache.org/jira/browse/COUCHDB-1241 Project: CouchDB Issue Type: Bug Affects Versions: 1.2 Environment: Couchbase Single Server 2.0.0-dev4 on OS X 10.7 Reporter: Nathan Vander Wilt Assignee: Filipe Manana A Python external helper based on the old/original protocol (http://wiki.apache.org/couchdb/ExternalProcesses) that worked fine under earlier versions of CouchDB but is exploding on Couchbase Server v2.0.0-dev4 (1.2.0a-b11df55-git): [error] [0.16764.1] function_clause error in HTTP request [info] [0.16764.1] Stacktrace: [{couch_httpd_external, parse_external_response, [{json, {\"json\": {\"allows_originals\": false, \"ok\": true}}}]}, {couch_httpd_external, send_external_response,2}, {couch_db_frontend,do_db_req,2}, {couch_httpd,handle_request_int,6}, {mochiweb_http,headers,5}, {proc_lib,init_p_do_apply,3}] [info] [0.16764.1] 127.0.0.1 - - POST /photos/_local_shutterstem/folder/imgsrc-abc123/update?utility=%2FVolumes%2FEOS_DIGITAL%2FDCIM%2FShutterStem%20-%20Import%20EOS%20350D.htmltoken=ABC123 500 My external process seems to be getting called successfully, but CouchDB gets unhappy when trying to forward the result to the client.
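A function_clause error like the one in the stacktrace means the response shape didn't match any clause of the parsing function. The sketch below models that failure mode in plain Python under stated assumptions — the function names and accepted shapes are illustrative, not CouchDB's actual parse_external_response clauses:

```python
def parse_strict(resp):
    """Models a function_clause: only one accepted response shape."""
    if set(resp) == {"code", "json"}:
        return resp["code"], resp["json"]
    raise TypeError("function_clause")  # no clause matches

def parse_lenient(resp):
    """Models a tolerant parser: missing fields get defaults."""
    return resp.get("code", 200), resp.get("json")

# An old-protocol response carrying only "json" crashes the strict parser ...
old_style = {"json": {"allows_originals": False, "ok": True}}
try:
    parse_strict(old_style)
    raised = False
except TypeError:
    raised = True
assert raised
# ... while the lenient one forwards it with a default status code:
assert parse_lenient(old_style) == (200, {"allows_originals": False, "ok": True})
```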
[jira] [Assigned] (COUCHDB-1227) Deleted validators are still run
[ https://issues.apache.org/jira/browse/COUCHDB-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana reassigned COUCHDB-1227: -- Assignee: Filipe Manana
[jira] [Updated] (COUCHDB-1227) Deleted validators are still run
[ https://issues.apache.org/jira/browse/COUCHDB-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1227: --- Attachment: couchdb-1227.patch
[jira] [Assigned] (COUCHDB-1241) function_clause error when making HTTP request to external process
[ https://issues.apache.org/jira/browse/COUCHDB-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana reassigned COUCHDB-1241: -- Assignee: Filipe Manana function_clause error when making HTTP request to external process -- Key: COUCHDB-1241 URL: https://issues.apache.org/jira/browse/COUCHDB-1241 Project: CouchDB Issue Type: Bug Affects Versions: 1.2 Environment: Couchbase Single Server 2.0.0-dev4 on OS X 10.7 Reporter: Nathan Vander Wilt Assignee: Filipe Manana A Python external helper based on the old/original protocol (http://wiki.apache.org/couchdb/ExternalProcesses) that worked fine under earlier versions of CouchDB but is exploding on Couchbase Server v2.0.0-dev4 (1.2.0a-b11df55-git): [error] [0.16764.1] function_clause error in HTTP request [info] [0.16764.1] Stacktrace: [{couch_httpd_external, parse_external_response, [{json, {"json": {"allows_originals": false, "ok": true}}}]}, {couch_httpd_external, send_external_response,2}, {couch_db_frontend,do_db_req,2}, {couch_httpd,handle_request_int,6}, {mochiweb_http,headers,5}, {proc_lib,init_p_do_apply,3}] [info] [0.16764.1] 127.0.0.1 - - POST /photos/_local_shutterstem/folder/imgsrc-abc123/update?utility=%2FVolumes%2FEOS_DIGITAL%2FDCIM%2FShutterStem%20-%20Import%20EOS%20350D.html&token=ABC123 500 My external process seems to be getting called successfully but CouchDB gets unhappy when trying to forward the result to the client. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (COUCHDB-1241) function_clause error when making HTTP request to external process
[ https://issues.apache.org/jira/browse/COUCHDB-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081143#comment-13081143 ] Filipe Manana commented on COUCHDB-1241: This is my fault here, and it certainly affects trunk. Nathan, in case you're able to patch and build from source, here's a quick fix: http://friendpaste.com/6g6tQIIvjnBEi7a2vkIQDc function_clause error when making HTTP request to external process -- Key: COUCHDB-1241 URL: https://issues.apache.org/jira/browse/COUCHDB-1241 Project: CouchDB Issue Type: Bug Affects Versions: 1.2 Environment: Couchbase Single Server 2.0.0-dev4 on OS X 10.7 Reporter: Nathan Vander Wilt A Python external helper based on the old/original protocol (http://wiki.apache.org/couchdb/ExternalProcesses) that worked fine under earlier versions of CouchDB but is exploding on Couchbase Server v2.0.0-dev4 (1.2.0a-b11df55-git): [error] [0.16764.1] function_clause error in HTTP request [info] [0.16764.1] Stacktrace: [{couch_httpd_external, parse_external_response, [{json, {"json": {"allows_originals": false, "ok": true}}}]}, {couch_httpd_external, send_external_response,2}, {couch_db_frontend,do_db_req,2}, {couch_httpd,handle_request_int,6}, {mochiweb_http,headers,5}, {proc_lib,init_p_do_apply,3}] [info] [0.16764.1] 127.0.0.1 - - POST /photos/_local_shutterstem/folder/imgsrc-abc123/update?utility=%2FVolumes%2FEOS_DIGITAL%2FDCIM%2FShutterStem%20-%20Import%20EOS%20350D.html&token=ABC123 500 My external process seems to be getting called successfully but CouchDB gets unhappy when trying to forward the result to the client. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
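For context, a response under the old externals protocol linked above is a one-line JSON envelope that may carry `code`, `headers`, and either `json` or `body` for the payload. A minimal Python handler in that shape (the payload contents are made up for illustration):

```python
import json
import sys

def handle(req):
    """Build a response envelope in the shape the old externals protocol
    expects: a 'code', optional 'headers', and either 'json' or 'body'
    for the payload (this payload is made up)."""
    return {
        "code": 200,
        "headers": {"Content-Type": "application/json"},
        "json": {"ok": True, "allows_originals": False},
    }

def serve(stream_in, stream_out):
    # One JSON request per line in, one JSON response per line out.
    for line in stream_in:
        stream_out.write(json.dumps(handle(json.loads(line))) + "\n")
        stream_out.flush()

# serve(sys.stdin, sys.stdout) would run it under CouchDB; here we just
# show the envelope for a dummy request.
resp = handle({"path": ["photos", "doc"], "query": {}})
print(json.dumps(resp))
```

The stacktrace above shows parse_external_response receiving the envelope's `json` payload, so helpers producing well-formed envelopes like this still hit the server-side bug until the linked fix is applied.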
[jira] [Commented] (COUCHDB-1234) Named Document Replication does not replicate the _deleted revision
[ https://issues.apache.org/jira/browse/COUCHDB-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081155#comment-13081155 ] Filipe Manana commented on COUCHDB-1234: Ok, I was thinking that by named document replication you were talking about the replicator database (1.1), but this doesn't apply since it's version 1.0.2. Your issue should no longer happen in 1.1.0. Named Document Replication does not replicate the _deleted revision --- Key: COUCHDB-1234 URL: https://issues.apache.org/jira/browse/COUCHDB-1234 Project: CouchDB Issue Type: Bug Components: Replication Affects Versions: 1.0.2 Environment: CentOS 5.5 Reporter: Hans-D. Böhlau I would like to use Named Document Replication to replicate changes on a number of docs. I expect ALL changes of those docs to be replicated from source-db (test1) towards the target-db (test2). as-is: If a document changes its revision because of a normal modification, it works perfectly and fast. If a document (hdb1) changes its revision because of its deletion, the replicator logs an error. The document in my target-database remains alive. couch.log: [error] [0.6676.3] Replicator: error accessing doc hdb1 at http://vm-dmp-del1:5984/test1/, reason: not_found I expected: - ... Named Document Replication to mark a document as deleted in the target-db if it has been deleted in the source-db. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (COUCHDB-1238) CouchDB uses _users db for storing oauth credentials
[ https://issues.apache.org/jira/browse/COUCHDB-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13078322#comment-13078322 ] Filipe Manana commented on COUCHDB-1238: Thanks again Pete. Yes, a manual merge is going to be needed in order to extract that optimization :( CouchDB uses _users db for storing oauth credentials Key: COUCHDB-1238 URL: https://issues.apache.org/jira/browse/COUCHDB-1238 Project: CouchDB Issue Type: New Feature Components: Database Core Affects Versions: 1.1 Reporter: Pete Vander Giessen Fix For: 1.2 Attachments: git_commits_as_patch.zip We want to store oauth credentials in the _users db, rather than in the .ini. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (COUCHDB-1238) CouchDB uses _users db for storing oauth credentials
[ https://issues.apache.org/jira/browse/COUCHDB-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073780#comment-13073780 ] Filipe Manana commented on COUCHDB-1238: Pete, Thank you very much for doing it and testing it in the field. I'll look at the patches soon and, if there are no objections, I'll commit them to trunk only (it's quite a big new feature for 1.1.1). CouchDB uses _users db for storing oauth credentials Key: COUCHDB-1238 URL: https://issues.apache.org/jira/browse/COUCHDB-1238 Project: CouchDB Issue Type: New Feature Components: Database Core Affects Versions: 1.1.1 Reporter: Pete Vander Giessen Fix For: 1.1.1 Attachments: git_commits_as_patch.zip We want to store oauth credentials in the _users db, rather than in the .ini. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (COUCHDB-1238) CouchDB uses _users db for storing oauth credentials
[ https://issues.apache.org/jira/browse/COUCHDB-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13075988#comment-13075988 ] Filipe Manana commented on COUCHDB-1238: Pete, I had a quick look at them. It seems you extracted an early version that uses 2 separate views. Later on (and after having added other ubuntuone specific features), it was simplified to use a single view, that maps consumer key and access token to secrets and username. The relevant commit, branch oauth_delegation, is this one: https://github.com/fdmanana/ubuntuone-couchdb-server/commit/83d5045f867d019345fc5fd078710eb83189b12b#L2R405 Do you think you can integrate it as well? :) Anything that's related to delegation should go away, as it's a completely different feature. I know it might be hairy to extract this optimization from the branch :( Thanks a lot, and let me know if you need some guidance CouchDB uses _users db for storing oauth credentials Key: COUCHDB-1238 URL: https://issues.apache.org/jira/browse/COUCHDB-1238 Project: CouchDB Issue Type: New Feature Components: Database Core Affects Versions: 1.1 Reporter: Pete Vander Giessen Fix For: 1.2 Attachments: git_commits_as_patch.zip We want to store oauth credentials in the _users db, rather than in the .ini. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (COUCHDB-1234) Named Document Replication does not replicate the _deleted revision
[ https://issues.apache.org/jira/browse/COUCHDB-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13072415#comment-13072415 ] Filipe Manana commented on COUCHDB-1234: And if you don't trigger a named document replication, doesn't the same happen? Named Document Replication does not replicate the _deleted revision --- Key: COUCHDB-1234 URL: https://issues.apache.org/jira/browse/COUCHDB-1234 Project: CouchDB Issue Type: Bug Components: Replication Affects Versions: 1.0.2 Environment: CentOS 5.5 Reporter: Hans-D. Böhlau I would like to use Named Document Replication to replicate changes on a number of docs. I expect ALL changes of those docs to be replicated from source-db (test1) towards the target-db (test2). as-is: If a document changes its revision because of a normal modification, it works perfectly and fast. If a document (hdb1) changes its revision because of its deletion, the replicator logs an error. The document in my target-database remains alive. couch.log: [error] [0.6676.3] Replicator: error accessing doc hdb1 at http://vm-dmp-del1:5984/test1/, reason: not_found I expected: - ... Named Document Replication to mark a document as deleted in the target-db if it has been deleted in the source-db. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (COUCHDB-1233) Invalid filter in _replicator entry cause CouchDB crash
[ https://issues.apache.org/jira/browse/COUCHDB-1233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana closed COUCHDB-1233. -- Resolution: Duplicate Fixed in COUCHDB-1199. Will be on 1.1.1. Invalid filter in _replicator entry cause CouchDB crash --- Key: COUCHDB-1233 URL: https://issues.apache.org/jira/browse/COUCHDB-1233 Project: CouchDB Issue Type: Bug Components: Replication Affects Versions: 1.1 Environment: Max OSX Reporter: Martin Higham Attachments: crashreport.txt If I write an entry to the _replication database that contains a filter entry that doesn't exist in the source then CouchDB crashes. The _replicator entry is not updated to indicate an error but vanishes. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (COUCHDB-1226) Replication causes CouchDB to crash. I *suspect* a memory leak of some kind
[ https://issues.apache.org/jira/browse/COUCHDB-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069302#comment-13069302 ] Filipe Manana commented on COUCHDB-1226: Thanks for testing and reporting James Replication causes CouchDB to crash. I *suspect* a memory leak of some kind Key: COUCHDB-1226 URL: https://issues.apache.org/jira/browse/COUCHDB-1226 Project: CouchDB Issue Type: Bug Components: Replication Affects Versions: 1.1 Environment: Gentoo Linux, CouchDB built using standard ebuild. Rebuilt July 2011. Reporter: James Marca Attachments: topcouch.log When replicating databases (pull replication), CouchDB will silently crash. I suspect a memory leak is leading to the crash, because I watch the beam process slowly creep up in RAM usage, then the server dies. For the crashing server, the log on debug doesn't seem very helpful. It says (with manually scrubbed server address): [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [0.10054.0] didn't find a replication log for http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [0.10054.0] didn't find a replication log for http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [0.10054.0] didn't find a replication log for vdsdata/d12/2007/1210882 [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [0.10054.0] didn't find a replication log for vdsdata/d12/2007/1210882 [Mon, 18 Jul 2011 16:23:20 GMT] [info] [0.10032.0] starting new replication 431a3f5bae52a6b27da72e42dc7b9fe3+create_target at 0.10054.0 [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [0.10070.0] missing_revs updating committed seq to 1 [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [0.83.0] New task status for 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ - vdsdata/d12/2007/1210882: W Processed source update #1 [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [0.10070.0] missing_revs updating committed seq to 2 [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [0.83.0] New task 
status for 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ - vdsdata/d12/2007/1210882: W Processed source update #2 [Mon, 18 Jul 2011 16:23:23 GMT] [debug] [0.10070.0] missing_revs updating committed seq to 10 [Mon, 18 Jul 2011 16:23:23 GMT] [debug] [0.83.0] New task status for 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ - vdsdata/d12/2007/1210882: W Processed source update #10 [Mon, 18 Jul 2011 16:23:24 GMT] [debug] [0.83.0] New task status for 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ - vdsdata/d12/2007/1210882: W Processed source update #14 [Mon, 18 Jul 2011 16:23:24 GMT] [debug] [0.10070.0] missing_revs updating committed seq to 14 [Mon, 18 Jul 2011 16:23:24 GMT] [debug] [0.10070.0] missing_revs updating committed seq to 20 [Mon, 18 Jul 2011 16:23:24 GMT] [debug] [0.83.0] New task status for 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ - vdsdata/d12/2007/1210882: W Processed source update #20 [Mon, 18 Jul 2011 16:23:25 GMT] [debug] [0.10054.0] target doesn't need a full commit [Mon, 18 Jul 2011 16:23:36 GMT] [info] [0.10054.0] recording a checkpoint for http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ - vdsdata/d12/2007/1210882 at source update_seq 20 Then, when I restart CouchDB, and restart the node.js program that is setting up the replication jobs, the crashed replication job picks up where it left off and completes just fine. 
Again, I scrubbed my server addresses in this log snippet.: [Mon, 18 Jul 2011 17:22:53 GMT] [debug] [0.3562.0] 'POST' /_replicate {1,1} from 128.*.*.* Headers: [{'Authorization',Basic amFtZXM6bWdpY24wbWIzcg==}, {'Connection',close}, {'Content-Type',application/json}, {'Host',***[pullserver]***.edu}, {'Transfer-Encoding',chunked}] [Mon, 18 Jul 2011 17:22:53 GMT] [debug] [0.3562.0] OAuth Params: [] [Mon, 18 Jul 2011 17:22:53 GMT] [debug] [0.3580.0] found a replication log for http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ [Mon, 18 Jul 2011 17:22:53 GMT] [debug] [0.3580.0] found a replication log for vdsdata/d12/2007/1210882 [Mon, 18 Jul 2011 17:22:53 GMT] [info] [0.3562.0] starting new replication 431a3f5bae52a6b27da72e42dc7b9fe3+create_target at 0.3580.0 [Mon, 18 Jul 2011 17:22:56 GMT] [debug] [0.3595.0] missing_revs updating committed seq to 22 [Mon, 18 Jul 2011 17:22:56 GMT] [debug] [0.83.0] New task status for 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ - vdsdata/d12/2007/1210882: W Processed source update #22 [Mon, 18 Jul 2011 17:22:56 GMT] [debug] [0.3595.0] missing_revs updating committed seq to 37 [Mon, 18 Jul 2011
[jira] [Commented] (COUCHDB-1226) Replication causes CouchDB to crash. I *suspect* a memory leak of some kind
[ https://issues.apache.org/jira/browse/COUCHDB-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067263#comment-13067263 ] Filipe Manana commented on COUCHDB-1226: Which Erlang OTP version? Also, have you tried compacting the databases before replication to see if it helps? (might have some relation to COUCHDB-968) Replication causes CouchDB to crash. I *suspect* a memory leak of some kind Key: COUCHDB-1226 URL: https://issues.apache.org/jira/browse/COUCHDB-1226 Project: CouchDB Issue Type: Bug Components: Replication Affects Versions: 1.1 Environment: Gentoo Linux, CouchDB built using standard ebuild. Rebuilt July 2011. Reporter: James Marca Attachments: topcouch.log When replicating databases (pull replication), CouchDB will silently crash. I suspect a memory leak is leading to the crash, because I watch the beam process slowly creep up in RAM usage, then the server dies. For the crashing server, the log on debug doesn't seem very helpful. It says (with manually scrubbed server address): [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [0.10054.0] didn't find a replication log for http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [0.10054.0] didn't find a replication log for http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [0.10054.0] didn't find a replication log for vdsdata/d12/2007/1210882 [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [0.10054.0] didn't find a replication log for vdsdata/d12/2007/1210882 [Mon, 18 Jul 2011 16:23:20 GMT] [info] [0.10032.0] starting new replication 431a3f5bae52a6b27da72e42dc7b9fe3+create_target at 0.10054.0 [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [0.10070.0] missing_revs updating committed seq to 1 [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [0.83.0] New task status for 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ - vdsdata/d12/2007/1210882: W Processed source update #1 [Mon, 18 Jul 2011 16:23:20 GMT] 
[debug] [0.10070.0] missing_revs updating committed seq to 2 [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [0.83.0] New task status for 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ - vdsdata/d12/2007/1210882: W Processed source update #2 [Mon, 18 Jul 2011 16:23:23 GMT] [debug] [0.10070.0] missing_revs updating committed seq to 10 [Mon, 18 Jul 2011 16:23:23 GMT] [debug] [0.83.0] New task status for 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ - vdsdata/d12/2007/1210882: W Processed source update #10 [Mon, 18 Jul 2011 16:23:24 GMT] [debug] [0.83.0] New task status for 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ - vdsdata/d12/2007/1210882: W Processed source update #14 [Mon, 18 Jul 2011 16:23:24 GMT] [debug] [0.10070.0] missing_revs updating committed seq to 14 [Mon, 18 Jul 2011 16:23:24 GMT] [debug] [0.10070.0] missing_revs updating committed seq to 20 [Mon, 18 Jul 2011 16:23:24 GMT] [debug] [0.83.0] New task status for 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ - vdsdata/d12/2007/1210882: W Processed source update #20 [Mon, 18 Jul 2011 16:23:25 GMT] [debug] [0.10054.0] target doesn't need a full commit [Mon, 18 Jul 2011 16:23:36 GMT] [info] [0.10054.0] recording a checkpoint for http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ - vdsdata/d12/2007/1210882 at source update_seq 20 Then, when I restart CouchDB, and restart the node.js program that is setting up the replication jobs, the crashed replication job picks up where it left off and completes just fine. 
Again, I scrubbed my server addresses in this log snippet.: [Mon, 18 Jul 2011 17:22:53 GMT] [debug] [0.3562.0] 'POST' /_replicate {1,1} from 128.*.*.* Headers: [{'Authorization',Basic amFtZXM6bWdpY24wbWIzcg==}, {'Connection',close}, {'Content-Type',application/json}, {'Host',***[pullserver]***.edu}, {'Transfer-Encoding',chunked}] [Mon, 18 Jul 2011 17:22:53 GMT] [debug] [0.3562.0] OAuth Params: [] [Mon, 18 Jul 2011 17:22:53 GMT] [debug] [0.3580.0] found a replication log for http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ [Mon, 18 Jul 2011 17:22:53 GMT] [debug] [0.3580.0] found a replication log for vdsdata/d12/2007/1210882 [Mon, 18 Jul 2011 17:22:53 GMT] [info] [0.3562.0] starting new replication 431a3f5bae52a6b27da72e42dc7b9fe3+create_target at 0.3580.0 [Mon, 18 Jul 2011 17:22:56 GMT] [debug] [0.3595.0] missing_revs updating committed seq to 22 [Mon, 18 Jul 2011 17:22:56 GMT] [debug] [0.83.0] New task status for 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ - vdsdata/d12/2007/1210882: W Processed source update #22 [Mon,
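Compaction, as suggested above, is triggered per database over HTTP; a sketch of building that request with only the standard library (server and database names are hypothetical; `urllib.request.urlopen(req)` would actually send it):

```python
import urllib.request

def compact_request(base_url, db):
    """Build the POST /{db}/_compact request that asks CouchDB to
    compact a database; the server answers 202 Accepted and compacts
    in the background."""
    return urllib.request.Request(
        url="%s/%s/_compact" % (base_url.rstrip("/"), db),
        data=b"",
        method="POST",
        headers={"Content-Type": "application/json"},
    )

# Hypothetical server and database names.
req = compact_request("http://localhost:5984", "mydb")
print(req.get_method(), req.full_url)
```

For slash-containing database names like the vdsdata/d12/... ones in the log, the name must be passed already URL-encoded (`%2f` for `/`), as the log lines show.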
[jira] [Commented] (COUCHDB-1218) Better logger performance
[ https://issues.apache.org/jira/browse/COUCHDB-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13064495#comment-13064495 ] Filipe Manana commented on COUCHDB-1218: Yep, disk_log was made with the purpose of logging into files. It has many features, like log rotation etc. It can log terms or raw data (I'm using the latter), and it can repair log files, etc. I'm using it in the simplest way possible, to achieve exactly what is currently done by couch_log. I haven't seen an increase in CPU and memory usage (via dstat and htop) compared to current trunk. The async API basically puts the messages into a queue, and a disk_log worker is constantly dequeueing from that queue and writing to the file. Something pluggable seems like a completely different issue and it's not what I'm trying to address here. Plus, depending on syslog, or something else external, doesn't seem like a good default to me - how would it work on Windows, or mobile? That said, I like the simplicity of our logger - nothing fancy, small code and plain text files. Better logger performance - Key: COUCHDB-1218 URL: https://issues.apache.org/jira/browse/COUCHDB-1218 Project: CouchDB Issue Type: Improvement Reporter: Filipe Manana Assignee: Filipe Manana Attachments: 0001-Better-logger-performance.patch I made some experiments with OTP's disk_log module (available since 2001 at least) to use it to manage the log file. It turns out I got better throughput by using it. 
Basically it adopts a strategy similar to the asynchronous couch_file Damien described in this thread: http://mail-archives.apache.org/mod_mbox/couchdb-dev/201106.mbox/%3c5c39fb5a-0aca-4ff9-bd90-2ebecf271...@apache.org%3E Here's a benchmark with relaximation, 50 writers, 100 readers, documents of 1Kb, delayed_commits set to false and 'info' log level (default): http://graphs.mikeal.couchone.com/#/graph/9e19f6d9eeb318c70cabcf67bc013c7f The reads got a better throughput (bottom graph, easier to visualize). The patch (also attached here), which has a descriptive comment, is at: https://github.com/fdmanana/couchdb/compare/logger_perf.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
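The asynchronous strategy described above (callers enqueue log messages without blocking on I/O; a single worker drains the queue and writes to the file) can be sketched in a few lines. A Python analogue of the idea, not the actual couch_log or disk_log code:

```python
import io
import queue
import threading

class AsyncLogger:
    """Callers enqueue messages without touching the disk; one worker
    thread drains the queue and writes them out in order."""

    def __init__(self, stream):
        self.q = queue.Queue()
        self.stream = stream
        self.worker = threading.Thread(target=self._drain, daemon=True)
        self.worker.start()

    def log(self, msg):
        self.q.put(msg)  # cheap for the caller; no I/O on this path

    def _drain(self):
        while True:
            msg = self.q.get()
            if msg is None:  # shutdown sentinel
                return
            self.stream.write(msg + "\n")

    def close(self):
        self.q.put(None)
        self.worker.join()

buf = io.StringIO()
logger = AsyncLogger(buf)
logger.log("[info] 127.0.0.1 - - GET /db/ 200")
logger.close()
print(buf.getvalue(), end="")
```

The throughput win comes from moving the disk write off the request path: many callers pay only a queue insert, and the single writer batches naturally under load.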
[jira] [Commented] (COUCHDB-1221) Replication Fails when trying to replicate to a database name that has previously been deleted
[ https://issues.apache.org/jira/browse/COUCHDB-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13064778#comment-13064778 ] Filipe Manana commented on COUCHDB-1221: Looking at the error, it looks to me like you're running Erlang OTP R14B02. Is that right? You can check it by running: $ erl -noshell -eval 'io:put_chars([erlang:system_info(otp_release), $\n]).' -s erlang halt Replication Fails when trying to replicate to a database name that has previously been deleted -- Key: COUCHDB-1221 URL: https://issues.apache.org/jira/browse/COUCHDB-1221 Project: CouchDB Issue Type: Bug Components: Replication Affects Versions: 1.0.2 Reporter: Chris Truskow Replication Fails when trying to replicate to a database name that had previously been deleted. Create a database xyz. Replicate anything to it. Delete xyz. Create database xyz. The following error appears if you try and replicate to it again. Replication failed: {error,{'EXIT',{badarg,[{erlang,apply,[gen_server,start_link,undefined]}, {supervisor,do_start_child,2}, {supervisor,handle_call,3}, {gen_server,handle_msg,5}, {proc_lib,init_p_do_apply,3}]}}} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (COUCHDB-1218) Better logger performance
Better logger performance - Key: COUCHDB-1218 URL: https://issues.apache.org/jira/browse/COUCHDB-1218 Project: CouchDB Issue Type: Improvement Reporter: Filipe Manana Assignee: Filipe Manana Attachments: 0001-Better-logger-performance.patch I made some experiments with OTP's disk_log module (available since 2001 at least) to use it to manage the log file. It turns out I got better throughput by using it. Basically it adopts a strategy similar to the asynchronous couch_file Damien described in this thread: http://mail-archives.apache.org/mod_mbox/couchdb-dev/201106.mbox/%3c5c39fb5a-0aca-4ff9-bd90-2ebecf271...@apache.org%3E Here's a benchmark with relaximation, 50 writers, 100 readers, documents of 1Kb, delayed_commits set to false and 'info' log level (default): http://graphs.mikeal.couchone.com/#/graph/9e19f6d9eeb318c70cabcf67bc013c7f The reads got a better throughput (bottom graph, easier to visualize). The patch (also attached here), which has a descriptive comment, is at: https://github.com/fdmanana/couchdb/compare/logger_perf.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (COUCHDB-1218) Better logger performance
[ https://issues.apache.org/jira/browse/COUCHDB-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1218: --- Attachment: 0001-Better-logger-performance.patch Better logger performance - Key: COUCHDB-1218 URL: https://issues.apache.org/jira/browse/COUCHDB-1218 Project: CouchDB Issue Type: Improvement Reporter: Filipe Manana Assignee: Filipe Manana Attachments: 0001-Better-logger-performance.patch I made some experiments with OTP's disk_log module (available since 2001 at least) to use it to manage the log file. It turns out I got better throughput by using it. Basically it adopts a strategy similar to the asynchronous couch_file Damien described in this thread: http://mail-archives.apache.org/mod_mbox/couchdb-dev/201106.mbox/%3c5c39fb5a-0aca-4ff9-bd90-2ebecf271...@apache.org%3E Here's a benchmark with relaximation, 50 writers, 100 readers, documents of 1Kb, delayed_commits set to false and 'info' log level (default): http://graphs.mikeal.couchone.com/#/graph/9e19f6d9eeb318c70cabcf67bc013c7f The reads got a better throughput (bottom graph, easier to visualize). The patch (also attached here), which has a descriptive comment, is at: https://github.com/fdmanana/couchdb/compare/logger_perf.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (COUCHDB-1186) Speedups in the view indexer
[ https://issues.apache.org/jira/browse/COUCHDB-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana resolved COUCHDB-1186. Resolution: Fixed Applied to trunk Speedups in the view indexer Key: COUCHDB-1186 URL: https://issues.apache.org/jira/browse/COUCHDB-1186 Project: CouchDB Issue Type: Improvement Reporter: Filipe Manana Assignee: Filipe Manana Fix For: 1.2 The patches at [1] and [2] do 2 distinct optimizations to the view indexer 1) Use a NIF to implement couch_view:less_json/2; 2) Multiple small optimizations to couch_view_updater - the main one is to decode the view server's JSON only in the updater's write process, avoiding 2 EJSON term copying phases (couch_os_process -> updater processes and writes work queue) [1] - https://github.com/fdmanana/couchdb/commit/3935a4a991abc32132c078e908dbc11925605602 [2] - https://github.com/fdmanana/couchdb/commit/cce325378723c863f05cca2192ac7bd58eedde1c Using these 2 patches, I've seen significant improvements to view generation time. 
Here I present as an example the databases at: A) http://fdmanana.couchone.com/indexer_test_2 B) http://fdmanana.couchone.com/indexer_test_3 ## Trunk ### Database A $ time curl http://localhost:5985/indexer_test_2/_design/test/_view/view1?limit=1 {"total_rows":1102400,"offset":0,"rows":[ {"id":"00d49881-7bcf-4c3d-a65d-e44435eeb513","key":["dwarf","assassin",2,1.1],"value":[{"x":174347.18,"y":127272.8},{"x":35179.93,"y":41550.55},{"x":157014.38,"y":172052.63},{"x":116185.83,"y":69871.73},{"x":153746.28,"y":190006.59}]} ]} real 19m46.007s user 0m0.024s sys 0m0.020s ### Database B $ time curl http://localhost:5985/indexer_test_3/_design/test/_view/view1?limit=1 {"total_rows":1102400,"offset":0,"rows":[ {"id":"00d49881-7bcf-4c3d-a65d-e44435eeb513","key":["dwarf","assassin",2,1.1],"value":[{"x":174347.18,"y":127272.8},{"x":35179.93,"y":41550.55},{"x":157014.38,"y":172052.63},{"x":116185.83,"y":69871.73},{"x":153746.28,"y":190006.59}]} ]} real 21m41.958s user 0m0.004s sys 0m0.028s ## Trunk + the 2 patches ### Database A $ time curl http://localhost:5984/indexer_test_2/_design/test/_view/view1?limit=1 {"total_rows":1102400,"offset":0,"rows":[ {"id":"00d49881-7bcf-4c3d-a65d-e44435eeb513","key":["dwarf","assassin",2,1.1],"value":[{"x":174347.18,"y":127272.8},{"x":35179.93,"y":41550.55},{"x":157014.38,"y":172052.63},{"x":116185.83,"y":69871.73},{"x":153746.28,"y":190006.59}]} ]} real 16m1.820s user 0m0.000s sys 0m0.028s (versus 19m46s with trunk) ### Database B $ time curl http://localhost:5984/indexer_test_3/_design/test/_view/view1?limit=1 {"total_rows":1102400,"offset":0,"rows":[ {"id":"00d49881-7bcf-4c3d-a65d-e44435eeb513","key":["dwarf","assassin",2,1.1],"value":[{"x":174347.18,"y":127272.8},{"x":35179.93,"y":41550.55},{"x":157014.38,"y":172052.63},{"x":116185.83,"y":69871.73},{"x":153746.28,"y":190006.59}]} ]} real 17m22.778s user 0m0.020s sys 0m0.016s (versus 21m41s with trunk) Repeating these tests, always clearing my OS/fs cache before running them (via `echo 3 > /proc/sys/vm/drop_caches`), I always get about the same relative differences. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (COUCHDB-1212) Newly created user accounts cannot sign-in after _user database crashes
[ https://issues.apache.org/jira/browse/COUCHDB-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1212: --- Attachment: couchdb-1212.patch Another gen_server timeout, likely because the pre-compaction database file is being deleted, making the system slower. I think it's safe to add infinity timeouts to all couch_db gen_server calls. Jan, do you think you can test the attached patch? Newly created user accounts cannot sign-in after _user database crashes Key: COUCHDB-1212 URL: https://issues.apache.org/jira/browse/COUCHDB-1212 Project: CouchDB Issue Type: Bug Components: Database Core, HTTP Interface Affects Versions: 1.0.2 Environment: Ubuntu 10.10, Erlang R14B02 (erts-5.8.3) Reporter: Jan van den Berg Priority: Critical Labels: _users, authentication Attachments: couchdb-1212.patch We have one (4,5 GB) couch database and we use the (default) _users database to store user accounts for a website. Once a week we need to restart couchdb because newly signed-up user accounts cannot log in any more. They get an HTTP status code 401 from the _session HTTP interface. We update and compact the database three times a day. This is the stacktrace I see in the couch database log prior to when these issues occur. 
--- couch.log --- [Wed, 29 Jun 2011 22:02:46 GMT] [info] [0.117.0] Starting compaction for db fbm [Wed, 29 Jun 2011 22:02:46 GMT] [info] [0.5753.79] 127.0.0.1 - - 'POST' /fbm/_compact 202 [Wed, 29 Jun 2011 22:02:46 GMT] [info] [0.5770.79] 127.0.0.1 - - 'POST' /fbm/_view_cleanup 202 [Wed, 29 Jun 2011 22:10:19 GMT] [info] [0.5773.79] 86.9.246.184 - - 'GET' /_session 200 [Wed, 29 Jun 2011 22:24:39 GMT] [info] [0.6236.79] 85.28.105.161 - - 'GET' /_session 200 [Wed, 29 Jun 2011 22:25:06 GMT] [error] [0.84.0] ** Generic server couch_server terminating ** Last message in was {open,fbm, [{user_ctx,{user_ctx,null,[],undefined}}]} ** When Server state == {server,/opt/couchbase-server/var/lib/couchdb, {re_pattern,0,0, 69,82,67,80,116,0,0,0,16,0,0,0,1,0,0,0,0,0, 0,0,0,0,0,0,40,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,93,0,72,25,77,0,0,0,0,0,0,0,0,0,0,0,0,254, 255,255,7,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 77,0,0,0,0,16,171,255,3,0,0,0,128,254,255, 255,7,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,69,26, 84,0,72,0}, 100,2,Sat, 18 Jun 2011 14:00:44 GMT} ** Reason for termination == ** {timeout,{gen_server,call,[0.116.0,{open_ref_count,0.10417.79}]}} [Wed, 29 Jun 2011 22:25:06 GMT] [error] [0.84.0] {error_report,0.31.0, {0.84.0,crash_report, [[{initial_call,{couch_server,init,['Argument__1']}}, {pid,0.84.0}, {registered_name,couch_server}, {error_info, {exit, {timeout, {gen_server,call, [0.116.0,{open_ref_count,0.10417.79}]}}, [{gen_server,terminate,6},{proc_lib,init_p_do_apply,3}]}}, {ancestors,[couch_primary_services,couch_server_sup,0.32.0]}, {messages,[]}, {links,[0.91.0,0.483.0,0.116.0,0.79.0]}, {dictionary,[]}, {trap_exit,true}, {status,running}, {heap_size,6765}, {stack_size,24}, {reductions,206710598}], []]}} [Wed, 29 Jun 2011 22:25:06 GMT] [error] [0.79.0] {error_report,0.31.0, {0.79.0,supervisor_report, [{supervisor,{local,couch_primary_services}}, {errorContext,child_terminated}, {reason, {timeout, {gen_server,call,[0.116.0,{open_ref_count,0.10417.79}]}}}, {offender, [{pid,0.84.0}, 
{name,couch_server}, {mfargs,{couch_server,sup_start_link,[]}}, {restart_type,permanent}, {shutdown,1000}, {child_type,worker}]}]}} [Wed, 29 Jun 2011 22:25:06 GMT] [error] [0.91.0] ** Generic server 0.91.0 terminating ** Last message in was {'EXIT',0.84.0, {timeout, {gen_server,call, [0.116.0, {open_ref_count,0.10417.79}]}}} ** When Server state == {db,0.91.0,0.92.0,nil,1308405644393791, 0.90.0,0.94.0, {db_header,5,91,0, {378285,{30,9}}, {380466,39},
[jira] [Resolved] (COUCHDB-1201) Add Started on and Last updated on datetime fields to _active_tasks API and Futon UI
[ https://issues.apache.org/jira/browse/COUCHDB-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana resolved COUCHDB-1201. Resolution: Fixed Applied to trunk Add Started on and Last updated on datetime fields to _active_tasks API and Futon UI Key: COUCHDB-1201 URL: https://issues.apache.org/jira/browse/COUCHDB-1201 Project: CouchDB Issue Type: Improvement Components: Futon, HTTP Interface Reporter: Filipe Manana Assignee: Filipe Manana Fix For: 1.2 Attachments: active_tasks_times.png, couchdb-1201.patch This is a very simple and useful feature to add. It gives an idea for how long a task has been running and its progress (time when the last status text was updated) - users can use this to find out if a task is progressing too slowly or if it hanged. Screenshot sample and patch attached. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
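The new _active_tasks fields can be consumed like this. A minimal sketch, assuming the patch exposes the start and last-update times as Unix-epoch seconds in fields named started_on and updated_on; the task entry below is hypothetical, not real server output.

```python
import time

# Hypothetical _active_tasks entry; on a live system this would come from
# GET /_active_tasks. Field names and epoch-seconds format are assumptions
# based on the feature description above.
task = {
    "type": "replication",
    "status": "Processed 500 / 1000 changes",
    "started_on": 1308405644,
    "updated_on": 1308405704,
}

def task_ages(task, now=None):
    """Return (seconds the task has been running,
    seconds since its status text last changed)."""
    now = int(time.time()) if now is None else now
    return now - task["started_on"], now - task["updated_on"]

running, since_update = task_ages(task, now=1308405764)
print(running, since_update)  # -> 120 60
```

A task whose since_update keeps growing while running increases is exactly the "progressing too slowly or hanged" case the feature is meant to surface.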
[jira] [Commented] (COUCHDB-857) no continuous replication with doc_ids param
[ https://issues.apache.org/jira/browse/COUCHDB-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058861#comment-13058861 ] Filipe Manana commented on COUCHDB-857: --- Yep, this was in fact added in 1.1.0. no continuous replication with doc_ids param Key: COUCHDB-857 URL: https://issues.apache.org/jira/browse/COUCHDB-857 Project: CouchDB Issue Type: Bug Affects Versions: 0.11.1, 0.11.2, 0.11.3, 1.0, 1.0.1, 1.0.2 Reporter: Benoit Chesneau I investigated more on this problem: http://markmail.org/message/bazjkhmwcrdp3kcf I've found that *continuous* replication isn't possible with the doc_ids parameter. I confirmed this by reading the code:

    DocIds = couch_util:get_value(doc_ids, PostProps, nil),
    ... snip ...
    case DocIds of
    List when is_list(List) ->
        % Fast replication using only a list of doc IDs to replicate.
        % Replication sessions, checkpoints and logs are not created
        % since the update sequence number of the source DB is not used
        % for determining which documents are copied into the target DB.
        ... snip ...
    _ ->
        % Replication using the _changes API (DB sequence update numbers).

Why can't we have continuous replication with doc_ids?
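Per the comment above, continuous replication with doc_ids works from 1.1.0 onward. A sketch of what such a /_replicate request body would look like; the database names and URL are hypothetical, and this only builds the JSON payload rather than talking to a server.

```python
import json

# Hypothetical replication request combining doc_ids with continuous mode.
# POSTing this to /_replicate is only expected to work on CouchDB >= 1.1.0,
# per the comment above.
body = {
    "source": "http://127.0.0.1:5984/source_db",
    "target": "target_db",
    "doc_ids": ["doc1", "doc2"],
    "continuous": True,
}
payload = json.dumps(body)
print(payload)
```

Note that doc_ids replication skips checkpoints and replication logs, since the source update sequence is not used to decide what to copy, exactly as the code comments in the snippet above state.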
[jira] [Commented] (COUCHDB-1205) Password of a pull-replicated db shown in stack trace
[ https://issues.apache.org/jira/browse/COUCHDB-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057178#comment-13057178 ] Filipe Manana commented on COUCHDB-1205: Which Couch version is it? Password of a pull-replicated db shown in stack trace - Key: COUCHDB-1205 URL: https://issues.apache.org/jira/browse/COUCHDB-1205 Project: CouchDB Issue Type: Bug Reporter: Jaakko Sipari Attachments: couchdb_passwd_log_entry.txt The full URL (including username and password) of a pull-replicated database can be displayed in the Erlang stack trace. This can be problematic when the party reading the logs for analysis purposes should not know or obtain the credentials.
[jira] [Closed] (COUCHDB-691) CouchDB pull replication from HTTPS does not recover after disconnect (hangs)
[ https://issues.apache.org/jira/browse/COUCHDB-691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana closed COUCHDB-691. - Resolution: Fixed Fix Version/s: 1.0.2 1.1 Assignee: Filipe Manana Thanks Simon. This should have been fixed in 1.0.2 and 1.1.0 CouchDB pull replication from HTTPS does not recover after disconnect (hangs) - Key: COUCHDB-691 URL: https://issues.apache.org/jira/browse/COUCHDB-691 Project: CouchDB Issue Type: Bug Components: Replication Affects Versions: 0.10.1 Environment: CouchDB 0.10.1 on Ubuntu 9.10 64bit with Nginx 0.7.62. Reporter: Simon Eisenmann Assignee: Filipe Manana Fix For: 1.1, 1.0.2 I have several CouchDB instances replicating through untrusted network space, so these instances sit behind an Nginx SSL proxy. Everything works fine, but when for whatever reason one of the connections breaks, the pull replication never recovers. Even restarting the replication job has no effect, despite not giving an error. Also in Futon the replication jobs are still reported as running (they never go away). I just set up a local test environment with two nodes replicating to each other. One of the nodes is behind Nginx with SSL, and the other is directly reachable unencrypted. When restarting the unencrypted instance, the pull replication on the other Couch recovers like a charm and things are in sync quickly again. Not so when I restart the instance behind HTTPS: that replication never results in any action again until the instance doing the pull replication is restarted. After a bit of debugging I found that the _changes feed is apparently never requested again from the just-restarted instance.
As soon as I restart the instance I get the following entry in the Nginx log: 10.1.1.201 - - [10/Mar/2010:17:40:50 +0100] GET /database_1/_changes?style=all_docs&heartbeat=1&since=3135&feed=continuous HTTP/1.1 200 408 - CouchDB/0.10.1 This means the long-running connection has just finished (this was the formerly working replication request). Afterwards I would expect the Couch to start such a request again, though this never happens.
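The long-running _changes request seen in the Nginx log can be reconstructed as follows; a small sketch that only builds the query string (the host and port are whatever the replicator was pointed at):

```python
from urllib.parse import urlencode

# Parameters of the continuous _changes request from the Nginx log above.
params = {
    "style": "all_docs",   # report all leaf revisions, not just winners
    "heartbeat": 1,        # keep-alive newline interval (ms)
    "since": 3135,         # resume from this update sequence
    "feed": "continuous",  # long-lived streaming response
}
url = "/database_1/_changes?" + urlencode(params)
print(url)
```

It is this request that, per the report, is never re-issued after the HTTPS-proxied source restarts.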
[jira] [Commented] (COUCHDB-1204) Cannot pull replicate with HTTPS (using native SSL) but works for same database with HTTP
[ https://issues.apache.org/jira/browse/COUCHDB-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13056558#comment-13056558 ] Filipe Manana commented on COUCHDB-1204: Is it OTP R13B0x you're running? If so, it's a known issue with that OTP release. Try R14. Cannot pull replicate with HTTPS (using native SSL) but works for same database with HTTP - Key: COUCHDB-1204 URL: https://issues.apache.org/jira/browse/COUCHDB-1204 Project: CouchDB Issue Type: Bug Components: Replication Affects Versions: 1.1 Environment: Ubuntu 10.04 64bit, CouchDB 1.1.0, Erlang 1:13.b.3-dfsg-2ubuntu2.1, Spidermonkey 3.5.15 Reporter: Simon Eisenmann Labels: SSL, pull, replication I just tried out native SSL in CouchDB 1.1. SSL pull replication works fine for very small databases, though for long-running older databases it never works. So with CouchDB 1.1.0 running non-SSL on port 5984 and SSL on 6984, replication works for the first URL but not for the second (same source database, same local target). This is the output for the non-SSL replication: {"session_id":"8703d846d4a90184b5cdc4358ebbdec4","start_time":"Tue, 28 Jun 2011 14:57:15 GMT","end_time":"Tue, 28 Jun 2011 14:57:40 GMT","start_last_seq":0,"end_last_seq":1303811,"recorded_seq":1303811,"missing_checked":0,"missing_found":14965,"docs_read":14984,"docs_written":14984,"doc_write_failures":0} And this is the error on the local CouchDB when trying it through SSL: <0.5735.8>, {error,{badinfo,{tcp,#Port<0.10031>. Futon outputs something like Replication failed: {bad_term,<0.897.9>} The remote Couch does not have any error in the log. To me this seems to be an issue on the receiving side. Both Couches are the same version and platform (Ubuntu 10.04 64bit).
[jira] [Resolved] (COUCHDB-1204) Cannot pull replicate with HTTPS (using native SSL) but works for same database with HTTP
[ https://issues.apache.org/jira/browse/COUCHDB-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana resolved COUCHDB-1204. Resolution: Not A Problem You're welcome Simon. Please reopen the ticket if you find other SSL-related issues when pull replicating. Cannot pull replicate with HTTPS (using native SSL) but works for same database with HTTP - Key: COUCHDB-1204 URL: https://issues.apache.org/jira/browse/COUCHDB-1204 Project: CouchDB Issue Type: Bug Components: Replication Affects Versions: 1.1 Environment: Ubuntu 10.04 64bit, CouchDB 1.1.0, Erlang 1:13.b.3-dfsg-2ubuntu2.1, Spidermonkey 3.5.15 Reporter: Simon Eisenmann Labels: SSL, pull, replication I just tried out native SSL in CouchDB 1.1. SSL pull replication works fine for very small databases, though for long-running older databases it never works. So with CouchDB 1.1.0 running non-SSL on port 5984 and SSL on 6984, replication works for the first URL but not for the second (same source database, same local target). This is the output for the non-SSL replication: {"session_id":"8703d846d4a90184b5cdc4358ebbdec4","start_time":"Tue, 28 Jun 2011 14:57:15 GMT","end_time":"Tue, 28 Jun 2011 14:57:40 GMT","start_last_seq":0,"end_last_seq":1303811,"recorded_seq":1303811,"missing_checked":0,"missing_found":14965,"docs_read":14984,"docs_written":14984,"doc_write_failures":0} And this is the error on the local CouchDB when trying it through SSL: <0.5735.8>, {error,{badinfo,{tcp,#Port<0.10031>. Futon outputs something like Replication failed: {bad_term,<0.897.9>} The remote Couch does not have any error in the log. To me this seems to be an issue on the receiving side. Both Couches are the same version and platform (Ubuntu 10.04 64bit).
[jira] [Created] (COUCHDB-1201) Add Started on and Last updated on datetime fields to _active_tasks API and Futon UI
Add Started on and Last updated on datetime fields to _active_tasks API and Futon UI Key: COUCHDB-1201 URL: https://issues.apache.org/jira/browse/COUCHDB-1201 Project: CouchDB Issue Type: Improvement Components: Futon, HTTP Interface Reporter: Filipe Manana Assignee: Filipe Manana Fix For: 1.2 This is a very simple and useful feature to add. It gives an idea for how long a task has been running and its progress (time when the last status text was updated) - users can use this to find out if a task is progressing too slowly or if it hanged. Screenshot sample and patch attached. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (COUCHDB-1201) Add Started on and Last updated on datetime fields to _active_tasks API and Futon UI
[ https://issues.apache.org/jira/browse/COUCHDB-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1201: --- Attachment: (was: new_futon_active_tasks.png) Add Started on and Last updated on datetime fields to _active_tasks API and Futon UI Key: COUCHDB-1201 URL: https://issues.apache.org/jira/browse/COUCHDB-1201 Project: CouchDB Issue Type: Improvement Components: Futon, HTTP Interface Reporter: Filipe Manana Assignee: Filipe Manana Fix For: 1.2 Attachments: active_tasks_times.png, couchdb-1201.patch This is a very simple and useful feature to add. It gives an idea for how long a task has been running and its progress (time when the last status text was updated) - users can use this to find out if a task is progressing too slowly or if it hanged. Screenshot sample and patch attached. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (COUCHDB-1201) Add Started on and Last updated on datetime fields to _active_tasks API and Futon UI
[ https://issues.apache.org/jira/browse/COUCHDB-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1201: --- Attachment: couchdb-1201.patch new_futon_active_tasks.png Add Started on and Last updated on datetime fields to _active_tasks API and Futon UI Key: COUCHDB-1201 URL: https://issues.apache.org/jira/browse/COUCHDB-1201 Project: CouchDB Issue Type: Improvement Components: Futon, HTTP Interface Reporter: Filipe Manana Assignee: Filipe Manana Fix For: 1.2 Attachments: active_tasks_times.png, couchdb-1201.patch This is a very simple and useful feature to add. It gives an idea for how long a task has been running and its progress (time when the last status text was updated) - users can use this to find out if a task is progressing too slowly or if it hanged. Screenshot sample and patch attached. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (COUCHDB-1201) Add Started on and Last updated on datetime fields to _active_tasks API and Futon UI
[ https://issues.apache.org/jira/browse/COUCHDB-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-1201: --- Attachment: active_tasks_times.png Add Started on and Last updated on datetime fields to _active_tasks API and Futon UI Key: COUCHDB-1201 URL: https://issues.apache.org/jira/browse/COUCHDB-1201 Project: CouchDB Issue Type: Improvement Components: Futon, HTTP Interface Reporter: Filipe Manana Assignee: Filipe Manana Fix For: 1.2 Attachments: active_tasks_times.png, couchdb-1201.patch This is a very simple and useful feature to add. It gives an idea for how long a task has been running and its progress (time when the last status text was updated) - users can use this to find out if a task is progressing too slowly or if it hanged. Screenshot sample and patch attached. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (COUCHDB-1197) trunk no longer builds on Windows
[ https://issues.apache.org/jira/browse/COUCHDB-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054344#comment-13054344 ] Filipe Manana commented on COUCHDB-1197: Dave, the yajl change, do you think you can contribute it upstream? trunk no longer builds on Windows - Key: COUCHDB-1197 URL: https://issues.apache.org/jira/browse/COUCHDB-1197 Project: CouchDB Issue Type: Bug Components: Build System, JavaScript View Server Environment: Windows 7 Enterprise x64 Cygwin MS Visual Studio 2008 Express Reporter: Dave Cottlehuber Labels: cygwin, windows Fix For: 1.2 Attachments: COUCHDB-1197_fix_ejson.patch, COUCHDB-1197_fix_ejson_v2.patch, COUCHDB-1197_fix_libcurl.patch ./configure fails - can no longer correctly find libcurl (after COUCHDB-1042) and instead compiles against cygwin's curl which is *bad*. Patch attached to resolve this. - finds jsapi.h correctly but can no longer use it. Work by dch to identify when it broke and how to fix this underway. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (COUCHDB-1197) trunk no longer builds on Windows
[ https://issues.apache.org/jira/browse/COUCHDB-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054410#comment-13054410 ] Filipe Manana commented on COUCHDB-1197: Dave, no idea at what version we are, but definitely we're behind version 2.0.0, which according to the release notes brings significant performance gains. I'm trying it out and see if no issues, maybe update trunk with v2.0.3 trunk no longer builds on Windows - Key: COUCHDB-1197 URL: https://issues.apache.org/jira/browse/COUCHDB-1197 Project: CouchDB Issue Type: Bug Components: Build System, JavaScript View Server Environment: Windows 7 Enterprise x64 Cygwin MS Visual Studio 2008 Express Reporter: Dave Cottlehuber Labels: cygwin, windows Fix For: 1.2 Attachments: COUCHDB-1197_fix_ejson.patch, COUCHDB-1197_fix_ejson_v2.patch, COUCHDB-1197_fix_libcurl.patch ./configure fails - can no longer correctly find libcurl (after COUCHDB-1042) and instead compiles against cygwin's curl which is *bad*. Patch attached to resolve this. - finds jsapi.h correctly but can no longer use it. Work by dch to identify when it broke and how to fix this underway. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (COUCHDB-1175) Improve content type negotiation for couchdb JSON responses
[ https://issues.apache.org/jira/browse/COUCHDB-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052438#comment-13052438 ] Filipe Manana commented on COUCHDB-1175: Hi Jason, The resulting list not only depends on the Q values but also on the order of the input given to it (when Qs are equal) and the specificity of the types. The caller must give full types as the parameter to the call. E.g. Req:accepted_content_types(["text/html", "application/json"]) For that call, if the Accept header is "text/*;q=0.8, */*;q=0.5" it will return ["text/html", "application/json"], meaning the client prefers text/html and then application/json. When the Accept header has both type/subtype and type/*, it will give preference to type/subtype if it matches any type given in the input list, for example:

    Req9 = mochiweb_request:new(nil, 'GET', "/foo", {1, 1},
        mochiweb_headers:make([{"Accept", "text/*;q=0.9, text/html;q=0.5, */*;q=0.7"}])),
    ?assertEqual(["application/json", "text/html"],
        Req9:accepted_content_types(["text/html", "application/json"])).

I guess this is what you were concerned with? Improve content type negotiation for couchdb JSON responses --- Key: COUCHDB-1175 URL: https://issues.apache.org/jira/browse/COUCHDB-1175 Project: CouchDB Issue Type: Improvement Affects Versions: 1.0.2 Reporter: Robert Newson Assignee: Robert Newson Priority: Blocker Fix For: 1.1.1, 1.2 Currently we ignore qvalues when negotiating between 'application/json' and 'text/plain' when returning JSON responses. Specifically, we test directly for 'application/json' or 'text/plain' in the Accept header. Different branches have different bugs, though. Trunk returns 'application/json' if 'application/json' is present at all, even if it's less preferred than 'text/plain' when qvalues are accounted for. We should follow the standard.
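The matching rule described above (explicit type/subtype beats a type/* or */* wildcard, then q values decide the ordering) can be sketched as follows. This is an illustrative reimplementation, not the actual mochiweb code:

```python
def parse_accept(header):
    """Parse an Accept header into [(type, subtype, q)] entries."""
    entries = []
    for part in header.split(","):
        bits = part.strip().split(";")
        type_, _, subtype = bits[0].partition("/")
        q = 1.0
        for param in bits[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                q = float(value)
        entries.append((type_, subtype, q))
    return entries

def accepted_content_types(provided, accept_header):
    """Order the full types in `provided` by client preference.
    The most specific matching Accept entry supplies the q value,
    mirroring the behaviour described in the comment above."""
    entries = parse_accept(accept_header)

    def effective_q(full_type):
        type_, _, subtype = full_type.partition("/")
        best = None  # (specificity, q); exact match > type/* > */*
        for t, s, q in entries:
            if t == type_ and s == subtype:
                spec = 2
            elif t == type_ and s == "*":
                spec = 1
            elif t == "*" and s == "*":
                spec = 0
            else:
                continue
            if best is None or spec > best[0]:
                best = (spec, q)
        return None if best is None else best[1]

    ranked = [(t, effective_q(t)) for t in provided]
    ranked = [(t, q) for t, q in ranked if q is not None and q > 0]
    ranked.sort(key=lambda tq: -tq[1])  # stable: input order breaks ties
    return [t for t, _ in ranked]

print(accepted_content_types(
    ["text/html", "application/json"],
    "text/*;q=0.9, text/html;q=0.5, */*;q=0.7"))
# -> ['application/json', 'text/html']
```

Here text/html takes q=0.5 from its exact entry (not 0.9 from text/*), while application/json falls through to */* with q=0.7, so JSON wins, matching the ?assertEqual in the comment.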
[jira] [Resolved] (COUCHDB-1199) Doc in _replicator db missing when using filter property
[ https://issues.apache.org/jira/browse/COUCHDB-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana resolved COUCHDB-1199. Resolution: Fixed Fix Version/s: 1.2 1.1.1 Fix applied to trunk and 1.1.x. Thanks Andrew Doc in _replicator db missing when using filter property Key: COUCHDB-1199 URL: https://issues.apache.org/jira/browse/COUCHDB-1199 Project: CouchDB Issue Type: Bug Components: Replication Affects Versions: 1.1 Environment: Ubuntu 10.10 Reporter: andrew henderson Assignee: Filipe Manana Fix For: 1.1.1, 1.2 Attachments: couchdb-1199-11x.patch Scenario1 below works as expected, Scenario2 fails.

Scenario1: curl -X POST 'http://127.0.0.1:5984/_replicator/' -H "Content-Type: application/json" -d '{"_id":"test1_to_test2","source":"my_culture","target":"http://127.0.0.1:5984/test2","create_target":true}' returns {"ok":true,"id":"test1_to_test2","rev":"1-df297eda4880633bc0442590724014ff"} Doc is created in _replicator Replication completes successfully

Scenario2: curl -X POST 'http://127.0.0.1:5984/_replicator/' -H "Content-Type: application/json" -d '{"_id":"test1_to_test2","source":"my_culture","target":"http://127.0.0.1:5984/test2","create_target":true,"filter":"http://127.0.0.1:5984/my_culture/profile/app_ddocs"}' returns {"ok":true,"id":"test1_to_test2","rev":"1-97641b372d500d842688d217f97081da"} Doc is not created in _replicator (in spite of the 'ok' response above) No replication occurs.

Now, I am not sure whether I got the right syntax in the filter property since I could find no documentation for it. In particular, whether the filter should be in the source db or _replicator. And do we use a full URL as above or just ddoc/filter name? The filter documented in Scenario2 does exist in the source db. In any case, the doc in _replicator ought not to be getting lost as does happen.
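To answer the syntax question in the report: the filter property takes the name of a filter function defined in a design document of the source database, written as "designdocname/filtername", not a URL. A sketch of such a _replicator document, with hypothetical design-doc and filter names guessed from the reporter's path:

```python
import json

# Hypothetical _replicator document. The filter value "profile/app_ddocs"
# assumes a design doc _design/profile in the source db "my_culture"
# containing a filter function named "app_ddocs".
rep_doc = {
    "_id": "test1_to_test2",
    "source": "my_culture",
    "target": "http://127.0.0.1:5984/test2",
    "create_target": True,
    "filter": "profile/app_ddocs",
}
body = json.dumps(rep_doc)
print(body)
```

One would PUT or POST this body to /_replicator; the bug here, fixed by the attached patch, was that a nonexistent filter crashed the replicator db gen_server instead of the document simply failing.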
[jira] [Commented] (COUCHDB-1175) Improve content type negotiation for couchdb JSON responses
[ https://issues.apache.org/jira/browse/COUCHDB-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051946#comment-13051946 ] Filipe Manana commented on COUCHDB-1175: This should help in choosing the content-type the request prefers when it accepts multiple (explicit, via type/* or */* and with or without q) https://github.com/mochi/mochiweb/pull/49 Improve content type negotiation for couchdb JSON responses --- Key: COUCHDB-1175 URL: https://issues.apache.org/jira/browse/COUCHDB-1175 Project: CouchDB Issue Type: Improvement Affects Versions: 1.0.2 Reporter: Robert Newson Assignee: Robert Newson Priority: Blocker Fix For: 1.1.1, 1.2 Currently we ignore qvalues when negotiation between 'application/json' and 'text/plain' when returning JSON responses. Specifically, we test directly for 'application/json' or 'text/plain' in the Accept header. Different branches have different bugs, though. Trunk returns 'application/json' if 'application/json' is present at all, even if it's less preferred than 'text/plain' when qvalues are accounted for. We should follow the standard. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (COUCHDB-1199) Doc in _replicator db missing when using filter property
[ https://issues.apache.org/jira/browse/COUCHDB-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051947#comment-13051947 ] Filipe Manana commented on COUCHDB-1199: Thanks Andrew. I was able to reproduce it. It happens only when the specified filter doesn't exist, which throws an exception and causes the replicator database gen_server to be restarted 10 times by the supervisor, after which it is not restarted again. I'll fix this asap. Doc in _replicator db missing when using filter property Key: COUCHDB-1199 URL: https://issues.apache.org/jira/browse/COUCHDB-1199 Project: CouchDB Issue Type: Bug Components: Replication Affects Versions: 1.1 Environment: Ubuntu 10.10 Reporter: andrew henderson Scenario1 below works as expected, Scenario2 fails.

Scenario1: curl -X POST 'http://127.0.0.1:5984/_replicator/' -H "Content-Type: application/json" -d '{"_id":"test1_to_test2","source":"my_culture","target":"http://127.0.0.1:5984/test2","create_target":true}' returns {"ok":true,"id":"test1_to_test2","rev":"1-df297eda4880633bc0442590724014ff"} Doc is created in _replicator Replication completes successfully

Scenario2: curl -X POST 'http://127.0.0.1:5984/_replicator/' -H "Content-Type: application/json" -d '{"_id":"test1_to_test2","source":"my_culture","target":"http://127.0.0.1:5984/test2","create_target":true,"filter":"http://127.0.0.1:5984/my_culture/profile/app_ddocs"}' returns {"ok":true,"id":"test1_to_test2","rev":"1-97641b372d500d842688d217f97081da"} Doc is not created in _replicator (in spite of the 'ok' response above) No replication occurs.

Now, I am not sure whether I got the right syntax in the filter property since I could find no documentation for it. In particular, whether the filter should be in the source db or _replicator. And do we use a full URL as above or just ddoc/filter name? The filter documented in Scenario2 does exist in the source db. In any case, the doc in _replicator ought not to be getting lost as does happen.
[jira] [Assigned] (COUCHDB-1199) Doc in _replicator db missing when using filter property
[ https://issues.apache.org/jira/browse/COUCHDB-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana reassigned COUCHDB-1199: -- Assignee: Filipe Manana Doc in _replicator db missing when using filter property Key: COUCHDB-1199 URL: https://issues.apache.org/jira/browse/COUCHDB-1199 Project: CouchDB Issue Type: Bug Components: Replication Affects Versions: 1.1 Environment: Ubuntu 10.10 Reporter: andrew henderson Assignee: Filipe Manana Scenario1 below works as expected, Scenario2 fails. Scenario1 curl -X POST 'http://127.0.0.1:5984/_replicator/' -H Content-Type: application/json -d {\_id\:\test1_to_test2\\,\source\:\my_culture\\,\target\:\http:\/\/127.0.0.1:5984\/test2\\,\create_target\:\true\}{ok:true,id:test1_to_test2,rev:1-df297eda4880633bc0442590724014ff} Doc is created in _replicator Replication completes successfully Scenario2 curl -X POST 'http://127.0.0.1:5984/_replicator/' -H Content-Type: application/json -d {\_id\:\test1_to_test2\\,\source\:\my_culture\\,\target\:\http:\/\/127.0.0.1:5984\/test2\\,\create_target\:\true\,\filter\:\http:\/\/127.0.0.1:5984\/my_culture\/profile\/app_ddocs\} {ok:true,id:test1_to_test2,rev:1-97641b372d500d842688d217f97081da} Doc is not created in _replicator (in spite of 'ok' response above) No replication occurs. Now, I am not sure whether I got the right syntax in the filter property since I could find no documentation for it. In particular, whether the filter should be in the source db or _replicator. And do we use a full URL as above or just ddoc/filter name? The filter documented in Scenario2 does exist in source db. In any case, the doc in _replicator ought not to be getting lost as does happen. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira