Re: POST with _id
Another +1 here too - that has bitten me before...

> +1 from me, we've also hit that one

Kev
[jira] Updated: (COUCHDB-69) Allow selective retaining of older revisions to a document
[ https://issues.apache.org/jira/browse/COUCHDB-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Davies updated COUCHDB-69:

Attachment: history_revs.patch

First stab at allowing old revisions to survive compaction *and* be replicated. Note that this required a change to the by_seq B-tree. I'm still working on making this configurable: currently you need to set the HISTORY_ENABLED macro to true in couch_db.hrl for this to be turned on.

Allow selective retaining of older revisions to a document
--
Key: COUCHDB-69
URL: https://issues.apache.org/jira/browse/COUCHDB-69
Project: CouchDB
Issue Type: Improvement
Components: Database Core
Environment: All
Reporter: Jan Lehnardt
Assignee: Paul Joseph Davis
Priority: Minor
Fix For: 0.10
Attachments: history_revs.patch

At the moment, compaction gets rid of all old revisions of a document, and replication deals only with the latest revision. It would be nice if it were possible to specify a list of revisions to keep around that do not get compacted away and do get replicated.

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
[jira] Updated: (COUCHDB-69) Allow selective retaining of older revisions to a document
[ https://issues.apache.org/jira/browse/COUCHDB-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Davies updated COUCHDB-69:

Attachment: history_revs.2.patch

Updated patch allowing per-db configuration.

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
[jira] Updated: (COUCHDB-465) Produce sequential, but unique, document id's
[ https://issues.apache.org/jira/browse/COUCHDB-465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Newson updated COUCHDB-465:

Attachment: sequence_id.patch

Produce sequential, but unique, document id's
--
Key: COUCHDB-465
URL: https://issues.apache.org/jira/browse/COUCHDB-465
Project: CouchDB
Issue Type: Improvement
Reporter: Robert Newson
Attachments: sequence_id.patch

Currently, if the client does not specify an id (POST'ing a single document or using _bulk_docs) a random 16-byte value is created. This kind of key is particularly brutal on b+tree updates and the append-only nature of couchdb files. Attached is a patch to change this to a two-part identifier: the first part is a random 12-byte value and the remainder is a counter. The random prefix is re-randomized when the counter reaches its maximum. The rollover in the patch is at 16 million but can obviously be changed. The upshot is that the b+tree is updated in a better fashion, which should lead to performance benefits.

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
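The two-part id scheme described in COUCHDB-465 can be sketched as below. This is a hypothetical Ruby illustration, not the actual patch (which is Erlang); the class name, the 13-byte/24-bit split, and the rollover constant are assumptions chosen to match the "~16 million" figure in the description.

```ruby
require 'securerandom'

# Sketch of the sequence_id.patch idea: a random prefix plus an
# incrementing counter suffix; the prefix is re-randomized when the
# counter reaches its maximum (~16 million here, i.e. 2^24).
class SequentialIdGenerator
  ROLLOVER = 1 << 24

  def initialize
    reset
  end

  def next_id
    reset if @counter >= ROLLOVER
    id = @prefix + format('%06x', @counter)  # 26 + 6 = 32 hex chars
    @counter += 1
    id
  end

  private

  def reset
    @prefix = SecureRandom.hex(13)  # 26 hex chars of random prefix
    @counter = 0
  end
end
```

Consecutive ids share a prefix and differ only in the trailing counter, so b+tree inserts land next to each other until the prefix rolls over.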
[jira] Updated: (COUCHDB-449) Turn off delayed commits by default
[ https://issues.apache.org/jira/browse/COUCHDB-449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Kocoloski updated COUCHDB-449:

Attachment: delayed_commits_v1.patch

Here's a patch to make delayed_commits a server-wide config option. The setting looks like

[couchdb]
delayed_commits = true

and defaults to false. If finer-grained control is required, users can override the default by setting the X-Couch-Full-Commit header to true or false. Jan mentioned enabling delayed_commits for the test suite; I didn't do this.

Turn off delayed commits by default
--
Key: COUCHDB-449
URL: https://issues.apache.org/jira/browse/COUCHDB-449
Project: CouchDB
Issue Type: Bug
Components: Database Core
Affects Versions: 0.9, 0.9.1
Reporter: Jan Lehnardt
Priority: Blocker
Fix For: 0.10
Attachments: delayed_commits_v1.patch

Delayed commits make CouchDB significantly faster, but they also open a one-second window for data loss. In 0.9 and trunk, delayed commits are enabled by default and can be overridden with HTTP headers and an explicit API call to flush the write buffer. I suggest turning off delayed commits by default and using the same overrides to enable them per request. A per-database option is possible, too. One concern is developer workflow speed: the setting affects test suite performance significantly. I'd opt to change couch.js to set the appropriate header to enable delayed commits for tests. CouchDB should guarantee data safety first and speed second, with sensible overrides.

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
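The precedence the patch describes (per-request X-Couch-Full-Commit header beats the server-wide delayed_commits default) can be sketched as follows. The helper name is hypothetical, not CouchDB code:

```ruby
# Decide whether a write may be delay-committed for one request.
# config_delayed_commits: the [couchdb] delayed_commits ini default.
# full_commit_header: value of X-Couch-Full-Commit, or nil if absent.
def delay_commit?(config_delayed_commits, full_commit_header)
  case full_commit_header
  when 'true'  then false  # client demands an immediate full commit
  when 'false' then true   # client explicitly allows a delayed commit
  else config_delayed_commits  # no header: fall back to the config
  end
end
```

With the patched default (delayed_commits = false), only clients that send X-Couch-Full-Commit: false opt into the faster-but-riskier path.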
[jira] Created: (COUCHDB-466) couchdb oauth doesn't work behind reverse proxy
couchdb oauth doesn't work behind reverse proxy
--
Key: COUCHDB-466
URL: https://issues.apache.org/jira/browse/COUCHDB-466
Project: CouchDB
Issue Type: Improvement
Components: HTTP Interface
Affects Versions: 0.10
Reporter: Benoit Chesneau
Fix For: 0.10
Attachments: x_forwarded_host.diff

Currently oauth doesn't work behind a reverse proxy because the signature is based on the Host header. Reverse proxies like Apache and lighttpd pass the proxied server a header that tells it which host was forwarded: Apache sends X-Forwarded-For, lighttpd sends X-Host. The attached patch fixes this issue by testing whether a custom forwarded-host header is present and using it as the Host. If it isn't present, it will use the Host header or fall back on socket detection, as it does currently. All tests pass.

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
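The fallback order the report describes can be sketched like this. The helper and the exact list of header names are illustrative assumptions (the real patch is Erlang), using the headers named in the report:

```ruby
# Pick the host value to use for OAuth signing: a proxy-supplied
# forwarded-host header if one is present, else the plain Host header,
# else whatever the socket reports.
FORWARDED_HEADERS = %w[X-Forwarded-Host X-Forwarded-For X-Host].freeze

def effective_host(headers, socket_host)
  FORWARDED_HEADERS.each do |name|
    value = headers[name]
    return value if value && !value.empty?
  end
  headers['Host'] || socket_host
end
```

The point is that the signed host matches what the client signed against, even when the proxy rewrites the Host header on the way in.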
[jira] Updated: (COUCHDB-466) couchdb oauth doesn't work behind reverse proxy
[ https://issues.apache.org/jira/browse/COUCHDB-466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benoit Chesneau updated COUCHDB-466:

Attachment: x_forwarded_host.diff

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
[jira] Commented: (COUCHDB-465) Produce sequential, but unique, document id's
[ https://issues.apache.org/jira/browse/COUCHDB-465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742907#action_12742907 ]

Adam Kocoloski commented on COUCHDB-465:

Nice work, Robert! I'm +1 on this patch. One concern is the guessability of IDs, but if users are really concerned about that they can always generate their own.

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
[jira] Commented: (COUCHDB-465) Produce sequential, but unique, document id's
[ https://issues.apache.org/jira/browse/COUCHDB-465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742911#action_12742911 ]

Robert Newson commented on COUCHDB-465:

Thanks! Guessability is a concern, which means this might need to be switchable. Perhaps couch_seq_generator becomes couch_id_generator and an ini file chooses between the two strategies, defaulting to the safest, but worst-case, new_uuid behavior. Getting good keys for b+tree insertion necessarily makes them more guessable, as they'd have to be close to existing keys by design.

I do owe some quantitative benchmarking to support the assertions in the description. I did a 10k insertion test with a small document, {"content": "hello"}, and the average insertion time per document was 2ms with random ids and 1ms with the patch. This was more to prove that I'd changed *something* than a measure of the actual improvement. I would expect to see improved insertion rates across a lot of scenarios, less difference between uncompacted and compacted size (barring document updates and deletes) as less of the b+tree is rewritten, and a smaller post-compaction size vs random. The exact extent of these improvements should be established by a decent benchmark.

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
[jira] Commented: (COUCHDB-296) A built-in conflicts view
[ https://issues.apache.org/jira/browse/COUCHDB-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742923#action_12742923 ]

Zachary Zolton commented on COUCHDB-296:

I think this is a duplicate of COUCHDB-462, "track conflict count in db_info (was built-in conflicts view)". So, we should probably close this out, right?

A built-in conflicts view
--
Key: COUCHDB-296
URL: https://issues.apache.org/jira/browse/COUCHDB-296
Project: CouchDB
Issue Type: New Feature
Components: Database Core
Affects Versions: 0.9
Reporter: Dirkjan Ochtman
Priority: Minor
Fix For: 0.10

It would be great if CouchDB came with a built-in db/_conflicts view. It could have code like the current test/view_conflicts.js.

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
[jira] Commented: (COUCHDB-422) PUT to _local/doc_id creates document '_local'
[ https://issues.apache.org/jira/browse/COUCHDB-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742939#action_12742939 ]

Adam Kocoloski commented on COUCHDB-422:

Thanks for this report, Eric. I think _local docs should behave like _design docs; that is, we should accept a PUT to _local/foo just fine.

PUT to _local/doc_id creates document '_local'
--
Key: COUCHDB-422
URL: https://issues.apache.org/jira/browse/COUCHDB-422
Project: CouchDB
Issue Type: Bug
Affects Versions: 0.10
Environment: Ubuntu 9.04
Reporter: eric casteleijn
Priority: Minor
Fix For: 0.10

After davisp's revision r796246, doing a PUT to a document id like '_local/doc_id' results in a visible, and apparently non-local, document with id '_local'. When escaping the slash as '%2F' everything works as before, and as expected, i.e. a local document with the above id is created.

To test:

curl -X PUT -d '{"foo": "bar"}' http://127.0.0.1:5987/db1/_local/yokal

result:

{"ok":true,"id":"_local","rev":"1-770307fe8d4210bab8ec65c59983e03c"}

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
[jira] Assigned: (COUCHDB-422) PUT to _local/doc_id creates document '_local'
[ https://issues.apache.org/jira/browse/COUCHDB-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Kocoloski reassigned COUCHDB-422:

Assignee: Paul Joseph Davis

Paul, can you check into this sometime?

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
[jira] Closed: (COUCHDB-419) Replication HTTP requests lack User-Agent and Accept headers
[ https://issues.apache.org/jira/browse/COUCHDB-419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Kocoloski closed COUCHDB-419.

Resolution: Fixed
Fix Version/s: 0.10

Thanks Henri, CouchDB should now be sending these headers with replication requests.

Replication HTTP requests lack User-Agent and Accept headers
--
Key: COUCHDB-419
URL: https://issues.apache.org/jira/browse/COUCHDB-419
Project: CouchDB
Issue Type: Bug
Components: Database Core
Reporter: Henri Bergius
Assignee: Adam Kocoloski
Fix For: 0.10

Currently, when making replication HTTP requests to a remote CouchDB instance, CouchDB makes the HTTP requests anonymously, without providing any information that it is a CouchDB instance and that it wants to receive JSON. Examples of what this could be:

User-Agent: CouchDB/0.9.0
Accept: application/json

See http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html

It would be good to add these headers to HTTP requests made by CouchDB, as it makes it easier for other systems like Midgard to support the replication protocol: http://bergie.iki.fi/blog/couchdb_and_midgard_talking_with_each_other/

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
[jira] Updated: (COUCHDB-427) Trying to replicate a database from an old format increase cpu usage up to 100%
[ https://issues.apache.org/jira/browse/COUCHDB-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Kocoloski updated COUCHDB-427:

Attachment: duplicate_attachments_crash.txt

Confirmed that this is still a problem on trunk. I'm attaching the full crash dump, but the key stacktrace is

[error] [<0.69.0>] ** Generic server <0.69.0> terminating
** Last message in was {'EXIT',<0.89.0>,
    {{nocatch,
      {bad_request,"Duplicate attachments"}},
     [{couch_db,check_dup_atts,1},
      {couch_db,sort_and_check_atts,1},
      {couch_db,'-update_docs/4-lc$^3/1-3-',2},
      {couch_db,'-update_docs/4-lc$^2/1-2-',2},
      {couch_db,'-update_docs/4-lc$^2/1-2-',2},
      {couch_db,update_docs,4},
      {couch_rep_writer,writer_loop,3}]}}

Trying to replicate a database from an old format increase cpu usage up to 100%
--
Key: COUCHDB-427
URL: https://issues.apache.org/jira/browse/COUCHDB-427
Project: CouchDB
Issue Type: Bug
Environment: osx, ubuntu, openbsd
Reporter: Benoit Chesneau
Priority: Critical
Fix For: 0.10
Attachments: duplicate_attachments_crash.txt

When you try to replicate a database from an old version of couchdb to the latest, cpu usage increases up to 100% and more instead of just hanging. You can try to replicate from http://benoitc.im/b to latest trunk or 0.9.1 to reproduce this issue.

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
[jira] Updated: (COUCHDB-461) attachments - handle correctly chunked encoding and Content-Length, send attachments unchunked
[ https://issues.apache.org/jira/browse/COUCHDB-461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benoit Chesneau updated COUCHDB-461:

Attachment: attachments_put_length.diff
            attachments_get_length.diff

I split the previous patch in two to make it easier. The first patch allows couchdb to send attachments (GET) without chunked encoding; Content-Length is set instead. The second patch improves the handling of PUT attachments and follows the standard: if a Transfer-Encoding header is present, couchdb will look at it first and, if it is chunked, use it; if there is no Transfer-Encoding header, Content-Length is used. With this patch CouchDB now reacts like current HTTP servers, which fixes problems with some clients that got error 500 from couchdb behind a proxy like the Apache server.

attachments - handle correctly chunked encoding and Content-Length, send attachments unchunked
--
Key: COUCHDB-461
URL: https://issues.apache.org/jira/browse/COUCHDB-461
Project: CouchDB
Issue Type: Bug
Reporter: Benoit Chesneau
Fix For: 0.10
Attachments: attachments_get_length.diff, attachments_put_length.diff, chunked.diff, chunked2.diff

This patch allows couchdb to send attachments unchunked; instead, Content-Length is set and the content is streamed. It also fixes attachment PUTs by first detecting whether the encoding is chunked and then testing the length, which is the standard way to do it.

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
Re: POST with _id
On Thu, Aug 13, 2009 at 12:21 AM, Kevin Jackson <foamd...@gmail.com> wrote:

> Another +1 here too - that has bitten me before...
>
>> +1 from me, we've also hit that one

Is there a Jira ticket open for this? I can easily imagine this thread being lost to the sands of time.

--
Chris Anderson
http://jchrisa.net
http://couch.io
Re: [jira] Created: (COUCHDB-465) Produce sequential, but unique, document id's
On Thu, Aug 13, 2009 at 09:06:14AM -0700, Robert Newson (JIRA) wrote:

> Attached is a patch to change this to a two-part identifier. The first
> part is a random 12 byte value and the remainder is a counter. The
> random prefix is rerandomized when the counter reaches its maximum.
> The rollover in the patch is at 16 million but can obviously be
> changed. The upshot is that the b+tree is updated in a better fashion,
> which should lead to performance benefits.

I'd like to suggest an alternative algorithm for consideration.

- the first 48 bits of the UUID are the time, in milliseconds, since 1 Jan 1970
- the remaining 80 bits start as a random value and increment from there, for example when doing a _bulk_docs insert (*)

I have been using this algorithm for a while, generated client-side - it's in my 'couchtiny' ruby client. I did it this way so as to get monotonically-increasing doc ids; a view with equal keys will sort them in order of insertion into the DB. It also avoids having to keep a separate created_at timestamp field, because you can just get it from the id.

  def created_at
    Time.at(id[0,12].to_i(16) / 1000.0) rescue nil
  end

Of course, the fact I generate uids like this demonstrates that there's no one-size-fits-all solution, but I just thought it was worth mentioning because you should get the B-tree insertion boost as a side-effect too.

Regards,

Brian.

(*) It's your choice whether you want to re-randomize this when the next millisecond comes along, or just leave it to increment as a serial number. Even if you have multiple servers inserting documents into the same database, the chances of them using the same serial number within the same millisecond are infinitesimal, as long as they all start from an independent random point within the 2^80 possibilities. Wrapping would be very rare, but what I currently do is re-randomize for each bulk insert, and choose a starting random value which is more than 2^32 away from the ceiling.
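Brian's time-prefixed scheme can be sketched end to end as below. This is an illustrative reconstruction, not couchtiny's actual code; the class and method names are assumptions, and only the 48-bit-millisecond / 80-bit-counter layout comes from the post:

```ruby
require 'securerandom'

# 48 bits of milliseconds since the epoch, then an 80-bit value that
# starts random and simply increments on each generated id.
class TimeUuidGenerator
  def initialize
    @suffix = SecureRandom.random_number(1 << 80)
  end

  def next_id
    millis = (Time.now.to_f * 1000).to_i
    id = format('%012x%020x', millis, @suffix)  # 12 + 20 = 32 hex chars
    @suffix = (@suffix + 1) % (1 << 80)
    id
  end
end

# As in the post, the creation time is recoverable from the id itself:
def created_at(id)
  Time.at(id[0, 12].to_i(16) / 1000.0)
end
```

Because the timestamp prefix is monotonically non-decreasing and the suffix increments, ids from one generator sort in insertion order, which is exactly the b+tree-friendly property under discussion.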
[jira] Commented: (COUCHDB-465) Produce sequential, but unique, document id's
[ https://issues.apache.org/jira/browse/COUCHDB-465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742982#action_12742982 ]

Robert Newson commented on COUCHDB-465:

Another interesting algorithm. I could change the patch so there's a couch_id_generator where the algorithm is configurable, defaulting to the current one, if that would move things along?

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
[jira] Updated: (COUCHDB-465) Produce sequential, but unique, document id's
[ https://issues.apache.org/jira/browse/COUCHDB-465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Newson updated COUCHDB-465:

Attachment: uuid_generator.patch

I renamed couch_seq_generator to couch_uuid_generator. It supports two algorithms: the original random one and the new random+sequential. It defaults to random. To configure it you need a new ini block:

[uuid]
algorithm = (random|sequence)

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
[jira] Commented: (COUCHDB-465) Produce sequential, but unique, document id's
[ https://issues.apache.org/jira/browse/COUCHDB-465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743029#action_12743029 ]

Joan Touzet commented on COUCHDB-465:

This is a great patch, and it solves the problem of having to do this in client-side logic. +1 from me too!

It looks like Brian's solution above is intended to allow _all_docs to return all documents in chronological order, thus getting a time-sorted view for free, i.e. without an extra field per document, an extra view to maintain and update, extra view storage on the disk, etc. I admit I did the same for myself ;) but it isn't necessarily a consideration for everyone. For example, in a replication situation, you'd need to be sure your clocks were well synchronized, and that you didn't have collisions in the prefix portion.

Perhaps providing a mechanism to declare your own function to override one of the two defaults (random, or rnewson's) would indeed be the best way forward, and the wiki could have a HOWTO with a set of small recipes on alternative approaches?

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
[jira] Commented: (COUCHDB-422) PUT to _local/doc_id creates document '_local'
[ https://issues.apache.org/jira/browse/COUCHDB-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743044#action_12743044 ]

Paul Joseph Davis commented on COUCHDB-422:

Checked this out, and I can fairly easily make the patch, with a caveat: is it just me, or do _local docs not allow attachments?

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
[jira] Commented: (COUCHDB-461) attachments - handle correctly chunked encoding and Content-Length, send attachments unchunked
[ https://issues.apache.org/jira/browse/COUCHDB-461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743047#action_12743047 ]

Paul Joseph Davis commented on COUCHDB-461:

Benoit, can you comment on how CouchDB is currently breaking the HTTP spec so we have a record of it? IIRC, it was that we would expect chunked transfers by default instead of rejecting them, but I'd like to have the reason written down.

Also, I'd suggest that for attachments we don't use chunked transfers, because we should never need them. Unless someone can give me a use case that absolutely requires receiving chunked attachments, I'd vote to remove them and use a straight-up Content-Length.

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
[jira] Commented: (COUCHDB-465) Produce sequential, but unique, document id's
[ https://issues.apache.org/jira/browse/COUCHDB-465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743049#action_12743049 ] Paul Joseph Davis commented on COUCHDB-465: --- Just to throw something out in the interest of complicating things: should we consider a query string override for the configured default algorithm as well? Produce sequential, but unique, document id's - Key: COUCHDB-465 URL: https://issues.apache.org/jira/browse/COUCHDB-465 Project: CouchDB Issue Type: Improvement Reporter: Robert Newson Attachments: sequence_id.patch, uuid_generator.patch Currently, if the client does not specify an id (POST'ing a single document or using _bulk_docs), a random 16-byte value is created. This kind of key is particularly brutal on b+tree updates and the append-only nature of couchdb files. Attached is a patch to change this to a two-part identifier. The first part is a random 12-byte value and the remainder is a counter. The random prefix is rerandomized when the counter reaches its maximum. The rollover in the patch is at 16 million but can obviously be changed. The upshot is that the b+tree is updated in a better fashion, which should lead to performance benefits. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
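The two-part id scheme described in the issue can be sketched roughly as follows (a minimal Python sketch, not the actual Erlang implementation from the patch; the class name, hex formatting, and 0xFFFFFF rollover constant are illustrative assumptions):

```python
import os

class SequentialIdGenerator:
    """Sketch of the proposed scheme: ids are a random 12-byte
    (24 hex char) prefix plus an incrementing counter suffix; when
    the counter rolls over, a fresh random prefix is drawn."""

    def __init__(self, rollover=0xFFFFFF):  # ~16 million, as in the patch
        self.rollover = rollover
        self._reseed()

    def _reseed(self):
        self.prefix = os.urandom(12).hex()
        self.counter = 0

    def next_id(self):
        if self.counter > self.rollover:
            self._reseed()
        doc_id = "%s%06x" % (self.prefix, self.counter)
        self.counter += 1
        return doc_id

gen = SequentialIdGenerator()
a, b = gen.next_id(), gen.next_id()
```

Because consecutive ids share a prefix and differ only in the counter suffix, inserts land near each other in the by-id b+tree, which is the performance benefit the patch is after.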
[jira] Commented: (COUCHDB-194) [startkey, endkey[: provide a right-open range selection method
[ https://issues.apache.org/jira/browse/COUCHDB-194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743063#action_12743063 ] Chris Anderson commented on COUCHDB-194: Eric, patches are totally welcome on this. _by_seq probably hasn't gotten much attention lately, as I think it's deprecated in favor of _changes. It's worth noting that docids are not collated with ICU when they are in the _all_docs view, so there are some places the collation rules can differ. If you are able to prepare a test case illustrating the change you'd like, it probably won't be hard to find someone to finish the patch. [startkey, endkey[: provide a right-open range selection method --- Key: COUCHDB-194 URL: https://issues.apache.org/jira/browse/COUCHDB-194 Project: CouchDB Issue Type: Improvement Components: HTTP Interface Affects Versions: 0.10 Reporter: Maximillian Dornseif Priority: Blocker Fix For: 0.10 While writing something about using CouchDB I came across the issue of slice indexes (called startkey and endkey in CouchDB lingo). I found no exact definition of startkey and endkey anywhere in the documentation. Testing reveals that on _all_docs and on views, documents are returned in the interval [startkey, endkey] (startkey <= k <= endkey). I don't know if this was a conscious design decision, but I would like to promote a slightly different interpretation (and thus an API change): [startkey, endkey[ (startkey <= k < endkey). Both approaches are valid and used in the real world. Ruby uses the inclusive (right-closed in math speak) first approach: l = [1,2,3,4]; l.slice(1,2) == [2, 3]. Python uses the exclusive (right-open in math speak) second approach: l = [1,2,3,4]; l[1:2] == [2]. For array indices both work fine and which one to prefer is mostly a matter of habit. In spoken language both approaches are used: "have the software done until Saturday" probably means right-open to the client and right-closed to the coder. 
But if you are working with keys that are more than array indexes, then right-open is much easier to handle, because otherwise you have to *guess* the biggest value you want to get. The Wiki at http://wiki.apache.org/couchdb/View_collation contains an example of that problem: it is suggested that you use startkey="_design/"&endkey="_design/Z" or startkey="_design/"&endkey="_design/\u" to get a list of all design documents - the replication system in the db core uses the same hack. This breaks if a design document is named "ZTop" or "Iñtërnâtiônàlizætiøn". Such names might be unlikely, but we are computer scientists; "unlikely" is a bad approach to software engineering. The thing is, what we really want to ask CouchDB for is all documents with keys starting with '_design/'. This is basically impossible to do with right-closed intervals. We could use startkey="_design/"&endkey="_design0" ('0' is the ASCII character after '/') and this will work fine ... until there is actually a document with the key "_design0" in the system. Unlikely, but ... To make selection by intervals reliable, clients currently have to guess the last key (the 'Z' approach) or use the first key not to include (the '_design0' approach) and then post-process the result to remove the last element returned if it exactly matches the given endkey value. If CouchDB changed to a right-open interval approach, post-processing would go away in most cases. See http://blogs.23.nu/c0re/2008/12/building-a-track-and-trace-application-with-couchdb/ for two real-world examples. At least for string keys and float keys, changing the meaning to [startkey, endkey[ would allow selections like * all strings starting with 'abc' * all numbers between 10.5 and 11. It also would hopefully not break too much existing code: since the notion of endkey already seems to be considered fishy (see the 'Z' approach), most code seems to try to avoid the issue. 
For example, 'startkey="_design/"&endkey="_design/Z"' would still work unless you have a design document named exactly 'Z'. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
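The right-open selection the issue asks for can be simulated over a sorted key list. A minimal Python sketch (prefix_range_query is a hypothetical helper; computing the successor by incrementing the last character is a simplification that ignores ICU collation):

```python
import bisect

def prefix_range_query(sorted_keys, prefix):
    """Right-open selection [prefix, successor(prefix)): return every
    key that starts with `prefix`, with no guessed 'largest key' and
    no post-processing of the last element."""
    # Successor: increment the last character of the prefix,
    # e.g. '_design/' -> '_design0' ('0' follows '/' in ASCII).
    end = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    lo = bisect.bisect_left(sorted_keys, prefix)
    hi = bisect.bisect_left(sorted_keys, end)   # end itself is excluded
    return sorted_keys[lo:hi]

keys = sorted(["_design/app", "_design/Z", "_design/ZTop", "_design0", "doc1"])
result = prefix_range_query(keys, "_design/")
```

Because the end of the interval is excluded, a document actually named "_design0" falls outside the result, which is exactly the post-processing the right-closed hack forces clients to do by hand.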
[jira] Commented: (COUCHDB-461) attachments - handle correctly chunked encoding and Content-Length, send attachments unchunked
[ https://issues.apache.org/jira/browse/COUCHDB-461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743064#action_12743064 ] Benoit Chesneau commented on COUCHDB-461: - Well, according to the spec: "All HTTP/1.1 applications MUST be able to receive and decode the chunked transfer-coding, and MUST ignore chunk-extension extensions they do not understand." http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html Also, for the precedence of chunked encoding over Content-Length, according to the spec: "Messages MUST NOT include both a Content-Length header field and a non-identity transfer-coding. If the message does include a non-identity transfer-coding, the Content-Length MUST be ignored." http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4 I think chunked encoding is still good for some purposes: it allows the client to size buffers dynamically and to ignore bad chunks, which could save some bandwidth/time. attachments - handle correctly chunked encoding and Content-Length, send attachments unchunked -- Key: COUCHDB-461 URL: https://issues.apache.org/jira/browse/COUCHDB-461 Project: CouchDB Issue Type: Bug Reporter: Benoit Chesneau Fix For: 0.10 Attachments: attachments_get_length.diff, attachments_put_length.diff, chunked.diff, chunked2.diff This patch allows CouchDB to send attachments unchunked: instead, Content-Length is set and the content is streamed. It also fixes attachment PUTs by first detecting whether the encoding is chunked and then testing the length, which is the standard way to do it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
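For reference, the chunked transfer-coding quoted from RFC 2616 can be decoded in a few lines. A minimal Python sketch (simplified: it assumes a well-formed body and ignores trailers, but it does skip chunk extensions as the spec requires):

```python
def decode_chunked(raw):
    """Decode an RFC 2616 chunked-encoded body: each chunk is
    '<hex size>[;extensions]\r\n<data>\r\n', terminated by a
    zero-size chunk. Chunk extensions are ignored, as the spec
    requires of extensions the application does not understand."""
    body = b""
    pos = 0
    while True:
        eol = raw.index(b"\r\n", pos)
        size_line = raw[pos:eol]
        size = int(size_line.split(b";")[0], 16)  # strip chunk extensions
        if size == 0:
            return body
        start = eol + 2
        body += raw[start:start + size]
        pos = start + size + 2  # skip the CRLF after the chunk data

wire = b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n"
decoded = decode_chunked(wire)
```

This also illustrates Benoit's point about dynamic buffering: each chunk announces its own size, so the receiver never needs the total length in advance.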
[jira] Commented: (COUCHDB-427) Trying to replicate a database from an old format increase cpu usage up to 100%
[ https://issues.apache.org/jira/browse/COUCHDB-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743068#action_12743068 ] Adam Kocoloski commented on COUCHDB-427: So we identified the document that has duplicate attachments as http://benoitc.im/b/7cb06e5de28327c7fc81c7028bece5a3 and indeed it does have three attachments with the same name. I'm not sure what the next step is here. We certainly don't want replication to break, but this seems like such an edge case that I'm not sure it's worth putting special code in the replicator to deal with it. Damien, the check_dup_atts code is your stuff, right? Do you know how Benoit could've ended up with 3 identically-named attachments in the past? Trying to replicate a database from an old format increase cpu usage up to 100% --- Key: COUCHDB-427 URL: https://issues.apache.org/jira/browse/COUCHDB-427 Project: CouchDB Issue Type: Bug Environment: osx, ubuntu, openbsd Reporter: Benoit Chesneau Priority: Critical Fix For: 0.10 Attachments: duplicate_attachments_crash.txt When you try to replicate a database from an old version of couchdb to the latest, cpu usage increases up to 100% and more instead of the replication just hanging. You can try to replicate from http://benoitc.im/b to latest trunk or 0.9.1 to reproduce this issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.