[jira] [Updated] (COUCHDB-1288) More efficient builtin filters _doc_ids and _design

2011-09-19 Thread Filipe Manana (JIRA)

 [ 
https://issues.apache.org/jira/browse/COUCHDB-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Filipe Manana updated COUCHDB-1288:
---

Attachment: couchdb_1288_3.patch

Added patch with test case, including the case for continuous changes.

> More efficient builtin filters _doc_ids and _design
> ---
>
> Key: COUCHDB-1288
> URL: https://issues.apache.org/jira/browse/COUCHDB-1288
> Project: CouchDB
> Issue Type: Improvement
> Reporter: Filipe Manana
> Attachments: couchdb_1288_2.patch, couchdb_1288_3.patch
>
>
> We have had the _doc_ids and _design _changes filters since CouchDB 1.1.0.
> While they meet the expectations of applications/users, they're far from 
> efficient for large databases.
> Basically, the implementation folds the entire seq btree and then filters 
> values by the document's ID, causing too much IO and busting caches. This 
> makes replication by doc IDs less efficient than it could be.
> The proposed patch avoids this by doing direct lookups in the ID btree for 
> _doc_ids, and a ranged fold for _design.
> If there are no objections, I would apply it to branch 1.2.x besides 
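
The difference between the two strategies in the description can be sketched with a toy model. This is a minimal illustration in Python, not CouchDB's Erlang implementation: a dict and sorted lists stand in for the ID btree and the seq btree, and every name here (`changes_full_fold`, `changes_by_doc_ids`, `changes_design`) is hypothetical. Returning results in sequence order is also an assumption of the sketch.

```python
from bisect import bisect_left

# Toy database: doc ID -> update sequence (stand-in for the ID btree).
by_id = {
    "_design/app": 4,
    "_design/reports": 2,
    "doc_a": 1,
    "doc_b": 5,
    "doc_c": 3,
}
# Stand-in for the seq btree: (seq, doc_id) pairs ordered by seq.
by_seq = sorted((seq, doc_id) for doc_id, seq in by_id.items())
# Sorted IDs, so a key range can be located by binary search.
sorted_ids = sorted(by_id)


def changes_full_fold(wanted_ids):
    """Old approach: visit every seq entry, keep those whose ID matches."""
    wanted = set(wanted_ids)
    return [(seq, doc_id) for seq, doc_id in by_seq if doc_id in wanted]


def changes_by_doc_ids(wanted_ids):
    """Patched approach for _doc_ids: one lookup per requested ID."""
    found = [(by_id[d], d) for d in set(wanted_ids) if d in by_id]
    return sorted(found)  # report in sequence order (assumption)


def changes_design():
    """Patched approach for _design: fold only the '_design/' key range."""
    start = bisect_left(sorted_ids, "_design/")
    end = bisect_left(sorted_ids, "_design0")  # '0' is the char after '/'
    return sorted((by_id[d], d) for d in sorted_ids[start:end])
```

Both `_doc_ids` variants return the same rows, but the full fold touches every entry in the database while the lookup variant touches only the requested IDs, which is where the IO saving comes from.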

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (COUCHDB-1288) More efficient builtin filters _doc_ids and _design

2011-09-18 Thread Filipe Manana (JIRA)

 [ 
https://issues.apache.org/jira/browse/COUCHDB-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Filipe Manana updated COUCHDB-1288:
---

Attachment: couchdb_1288_2.patch

> More efficient builtin filters _doc_ids and _design
> ---
>
> Key: COUCHDB-1288
> URL: https://issues.apache.org/jira/browse/COUCHDB-1288
> Project: CouchDB
> Issue Type: Improvement
> Reporter: Filipe Manana
> Attachments: couchdb_1288_2.patch





[jira] [Updated] (COUCHDB-1288) More efficient builtin filters _doc_ids and _design

2011-09-18 Thread Filipe Manana (JIRA)

 [ 
https://issues.apache.org/jira/browse/COUCHDB-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Filipe Manana updated COUCHDB-1288:
---

Attachment: (was: couchdb_1288.patch)

> More efficient builtin filters _doc_ids and _design
> ---
>
> Key: COUCHDB-1288
> URL: https://issues.apache.org/jira/browse/COUCHDB-1288
> Project: CouchDB
> Issue Type: Improvement
> Reporter: Filipe Manana
> Attachments: couchdb_1288_2.patch





[jira] [Updated] (COUCHDB-1288) More efficient builtin filters _doc_ids and _design

2011-09-18 Thread Filipe Manana (JIRA)

 [ 
https://issues.apache.org/jira/browse/COUCHDB-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Filipe Manana updated COUCHDB-1288:
---

Attachment: (was: couchdb_1288_2.patch)

> More efficient builtin filters _doc_ids and _design
> ---
>
> Key: COUCHDB-1288
> URL: https://issues.apache.org/jira/browse/COUCHDB-1288
> Project: CouchDB
> Issue Type: Improvement
> Reporter: Filipe Manana
> Attachments: couchdb_1288.patch





[jira] [Updated] (COUCHDB-1288) More efficient builtin filters _doc_ids and _design

2011-09-18 Thread Filipe Manana (JIRA)

 [ 
https://issues.apache.org/jira/browse/COUCHDB-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Filipe Manana updated COUCHDB-1288:
---

Attachment: couchdb_1288_2.patch

Second version of the patch. For _doc_ids, the optimized code path is only 
triggered if the number of doc IDs is not greater than 100. This is to avoid 
loading too many full_doc_info records into memory, which can be big if the rev 
trees are long and/or have many branches.
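
The cutoff described above amounts to a simple dispatch on the size of the ID list. The following is only a sketch in Python under that reading; the constant and function names are hypothetical and the actual patch is written in Erlang.

```python
MAX_DIRECT_LOOKUP_IDS = 100  # limit stated in the comment above


def choose_strategy(doc_ids):
    """Return which code path a _doc_ids request would take in this sketch."""
    if len(doc_ids) <= MAX_DIRECT_LOOKUP_IDS:
        # Few IDs: look each one up directly in the ID btree.
        return "id_lookup"
    # Many IDs: fall back to folding the seq btree, which avoids holding
    # a large number of full_doc_info records in memory at once.
    return "seq_fold"
```

With this rule a request for exactly 100 IDs still takes the lookup path, and 101 IDs falls back to the fold.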

> More efficient builtin filters _doc_ids and _design
> ---
>
> Key: COUCHDB-1288
> URL: https://issues.apache.org/jira/browse/COUCHDB-1288
> Project: CouchDB
> Issue Type: Improvement
> Reporter: Filipe Manana
> Attachments: couchdb_1288.patch, couchdb_1288_2.patch





[jira] [Updated] (COUCHDB-1288) More efficient builtin filters _doc_ids and _design

2011-09-18 Thread Filipe Manana (JIRA)

 [ 
https://issues.apache.org/jira/browse/COUCHDB-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Filipe Manana updated COUCHDB-1288:
---

Attachment: (was: couchdb_1288.patch)

> More efficient builtin filters _doc_ids and _design
> ---
>
> Key: COUCHDB-1288
> URL: https://issues.apache.org/jira/browse/COUCHDB-1288
> Project: CouchDB
> Issue Type: Improvement
> Reporter: Filipe Manana
> Attachments: couchdb_1288.patch





[jira] [Updated] (COUCHDB-1288) More efficient builtin filters _doc_ids and _design

2011-09-18 Thread Filipe Manana (JIRA)

 [ 
https://issues.apache.org/jira/browse/COUCHDB-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Filipe Manana updated COUCHDB-1288:
---

Attachment: couchdb_1288.patch

> More efficient builtin filters _doc_ids and _design
> ---
>
> Key: COUCHDB-1288
> URL: https://issues.apache.org/jira/browse/COUCHDB-1288
> Project: CouchDB
> Issue Type: Improvement
> Reporter: Filipe Manana
> Attachments: couchdb_1288.patch





[jira] [Updated] (COUCHDB-1288) More efficient builtin filters _doc_ids and _design

2011-09-18 Thread Filipe Manana (JIRA)

 [ 
https://issues.apache.org/jira/browse/COUCHDB-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Filipe Manana updated COUCHDB-1288:
---

Attachment: couchdb_1288.patch

> More efficient builtin filters _doc_ids and _design
> ---
>
> Key: COUCHDB-1288
> URL: https://issues.apache.org/jira/browse/COUCHDB-1288
> Project: CouchDB
> Issue Type: Improvement
> Reporter: Filipe Manana
> Attachments: couchdb_1288.patch
