[jira] [Created] (COUCHDB-2327) Add string/array prefix match option, for view queries
Jens Alfke created COUCHDB-2327:

Summary: Add string/array prefix match option, for view queries
Key: COUCHDB-2327
URL: https://issues.apache.org/jira/browse/COUCHDB-2327
Project: CouchDB
Issue Type: Improvement
Security Level: public (Regular issues)
Components: HTTP Interface
Reporter: Jens Alfke

View querying provides no clean way to match a string prefix. The only advice I've seen is to set startkey to the prefix, and endkey to the prefix with some really high Unicode character appended, which is a total kludge*.

There's a similar issue with matching an array prefix, e.g. all keys that start with [2014, ...]. Here the solution is less kludgy (append a {} to the endkey) but it's still very unintuitive to people learning CouchDB. I've had to explain it to newbies many times.

I suggest adding an explicit query option to enable prefix matching. This doesn't need to mess with the actual query engine -- all it has to do is modify the endkey by appending an appropriate Unicode character (in the string case) or empty object (in the array case). If no `endkey` is given it will be based on the `startkey`.

I've already implemented a comparable feature for Couchbase Lite:
https://github.com/couchbase/couchbase-lite-ios/wiki/Query-Enhancements#prefix-matching

Note that I made the `prefix_match` parameter an integer, not a boolean. This is to support cases where you want to match a prefix of a _nested component_ of the key, for example all keys in 2014 whose product name starts with 'f', where the startkey would be [2014, "f"] and the prefix_match would be 2 to indicate that it's the nested string that should be prefix-matched, not the array. But in the common case you'd just set the value to 1 to indicate that the top-level key should be prefix-matched.

* Why is adding some high Unicode character a kludge? Because Unicode is so complicated and so inconsistently implemented. Doing this immediately opens the possibility of weird Unicode issues in your development language's string type, in its HTTP client library, and in Erlang's equivalents on the server side. Not to mention the swamp that is the Unicode specification itself -- for instance, I've seen advice to use a character like \uFFFE, which was correct until Unicode went 32-bit, and tended to work alright for a while after that, but will now fail with emoji characters (which are both very commonly used and well outside the 16-bit range). Actually, whether it fails depends on whether your string implementation operates on UTF-16 (very common) or true Unicode code points. Like I said, it's a kludge.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
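
The endkey rewriting proposed in this issue can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Couchbase Lite or CouchDB implementation: the function name `prefix_end_key` and the sentinel character `\ufff0` are hypothetical choices, and the sentinel is subject to exactly the Unicode caveats raised in the footnote.

```python
from typing import Any

# Assumption: a character expected to collate after any text in the keys.
# Which character is actually safe depends on the collation in use.
HIGH_SENTINEL = "\ufff0"


def prefix_end_key(key: Any, depth: int = 1) -> Any:
    """Derive an endkey that bounds every key having `key` as a prefix.

    depth=1 prefix-matches the top-level key; depth=2 prefix-matches the
    last element of a top-level array, and so on.
    """
    if depth < 1:
        return key  # prefix matching disabled
    if depth == 1:
        if isinstance(key, str):
            return key + HIGH_SENTINEL  # string case: append a high character
        if isinstance(key, list):
            return key + [{}]  # array case: append an empty object
        return key  # other key types have no prefix notion here
    if isinstance(key, list) and key:
        # Descend into the last element, leaving the leading elements fixed.
        return key[:-1] + [prefix_end_key(key[-1], depth - 1)]
    return key


if __name__ == "__main__":
    print(prefix_end_key([2014]))          # all keys starting with [2014, ...]
    print(prefix_end_key([2014, "f"], 2))  # 2014 keys whose second element starts with "f"
```

The integer depth mirrors the `prefix_match` parameter described in the issue: 1 for the top-level key, 2 for a nested component.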
[jira] [Commented] (COUCHDB-2052) Add API for discovering feature availability
[ https://issues.apache.org/jira/browse/COUCHDB-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894620#comment-13894620 ]

Jens Alfke commented on COUCHDB-2052:

I don't think I understand your example, Benoit.

> couchbase-lite can use _bulk_get on the couchbase sync gateway or _changes and other things on couchdb

Couchbase Lite always uses _changes and other things; those are core parts of the protocol. _bulk_get is simply an optimization to avoid lots of GET requests for individual documents. So it makes sense to ask whether the server supports _bulk_get, because the choice is to make one _bulk_get call or a series of GET /db/doc calls.

> 2 capabilities corresponding to 2 well defined api/protocols. Here why not something like REPCOUCHDB01 and REPCOUCHBASE01

There aren't two APIs or protocols. There's one, and there are simply some optional capabilities that can optimize it.

> Describing an expected behaviour is a way easier in my opinion than expecting that all applications are able to parse a message in time etc.

I don't know what this means.

> why not getting them by issuing an OPTIONS method to / ?

It's apparently not recommended to use OPTIONS (see the mailing list thread). In RFC 2616 the OPTIONS method is really vaguely defined; it seems it's really only useful for returning Allow: headers to show what methods are supported. I'd be wary of pushing it any further than that.

> Add API for discovering feature availability
>
> Key: COUCHDB-2052
> URL: https://issues.apache.org/jira/browse/COUCHDB-2052
> Project: CouchDB
> Issue Type: Improvement
> Security Level: public (Regular issues)
> Components: HTTP Interface
> Reporter: Jens Alfke
>
> I propose adding to the response of GET / a property called "features" or "extensions" whose value is an array of strings, each string being an agreed-upon identifier of a specific optional feature. For example:
> {"couchdb": "welcome", "features": ["_bulk_get", "persona"], "vendor": …}
>
> Rationale: Features are being added to CouchDB over time, plug-ins may add features, and there are compatible servers that may have nonstandard features (like _bulk_get). But there isn't a clear way for a client (which might be another server's replicator) to determine what features a server has. Currently a client looking at the response of a GET / has to figure out what server and version thereof it's talking to, and then has to consult hardcoded knowledge that version X of server Y supports feature Z.
>
> (True, you can often get away without needing to check, by assuming a feature exists but falling back to standard behavior if you get an error. But not all features may be so easy to detect -- the behavior of an unaware server might be to ignore the feature and do the wrong thing, rather than returning an error -- and anyway this adds extra round-trips that slow down the operation.)
[jira] [Commented] (COUCHDB-2052) Add API for discovering feature availability
[ https://issues.apache.org/jira/browse/COUCHDB-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894643#comment-13894643 ]

Jens Alfke commented on COUCHDB-2052:

This discussion is starting to smell like bike-shedding to me (and it's not the first time that's happened with CouchDB).

I raised a fairly straightforward issue -- given different versions and implementations that might not all support the same functionality, how does a client know whether or not it can use a particular feature/function? -- and proposed a straightforward solution. My solution might not be perfect, but it's clearly specified and very easy to implement and to use.

The responses here seem to be digressing, and no one is proposing anything concrete. The other ideas here also sound like they'd be significantly more complex.

Basically, if someone else has an alternative proposal for how to do this, then specify it clearly and post it here.
[jira] [Commented] (COUCHDB-2052) Add API for discovering feature availability
[ https://issues.apache.org/jira/browse/COUCHDB-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894677#comment-13894677 ]

Jens Alfke commented on COUCHDB-2052:

> The replicator attempts to use optimized endpoints of source and target nodes and, on a 404, falls back to less optimal endpoints

This is definitely elegant and Web-like, but not optimal over slow mobile connections with high latency. If multiple resources need to be preflighted this way, it can start to add noticeable delay to the replication.

I'm not saying that this approach is unacceptable; just be aware that it has a cost (one that CouchDB developers may not be keeping in mind since they're used to thinking of replication as being between two _servers_ with fast Ethernet connections :)
[jira] [Commented] (COUCHDB-2052) Add API for discovering feature availability
[ https://issues.apache.org/jira/browse/COUCHDB-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894697#comment-13894697 ]

Jens Alfke commented on COUCHDB-2052:

OK, here are examples from the Couchbase Sync Gateway:

(1) WebSocket transport for the changes feed. The query param ?feed=websocket is recognized and triggers a change-over to the WebSocket protocol, with the server sending a message for every change. If you send such a request to CouchDB it will ignore the unrecognized feed type and send back a 'normal' changes feed instead; this can trigger a lot of wasted data transfer.

(2) The body of a PUT/POST request sent to the server may be gzip-encoded (Content-Encoding: gzip). As far as I can tell the HTTP spec doesn't provide any way for the client to discover whether the server supports this. If the server doesn't, and also doesn't check the Content-Encoding request header, it'll end up trying to read the raw data and either barfing on it or (worse) storing it without any indication that it's zipped.

(3) When the Gateway parses an incoming document in multipart/related format, it looks at the attachment bodies' Content-Disposition headers to discover which named attachments they correspond to, instead of (as CouchDB does or used to) assuming that they appear in the same order as the objects in the JSON body's `_attachments` dictionary. A client might not be able to generate the MIME bodies in exactly the same order as in the dictionary, so it might need to check for this capability and abort the replication if it's not there, since the alternative is getting the attachments mixed up.
[jira] [Commented] (COUCHDB-2052) Add API for discovering feature availability
[ https://issues.apache.org/jira/browse/COUCHDB-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894701#comment-13894701 ]

Jens Alfke commented on COUCHDB-2052:

Benoit:

> I am thinking that the faster answer to your problem would be listing all the URI we have on a node associated to the vendor id and version.

This is not just about knowing what URIs are supported. See my previous post for three issues that can't be resolved this way.
[jira] [Commented] (COUCHDB-2052) Add API for discovering feature availability
[ https://issues.apache.org/jira/browse/COUCHDB-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894880#comment-13894880 ]

Jens Alfke commented on COUCHDB-2052:

Well, CouchDB doesn't implement HTTP 1.1 correctly then...

(1) I tried sending a GET request for the changes feed with an Upgrade header and feed=websocket; it ignored the header and sent back the entire changes feed in normal format.

(2) I tried sending it a gzip-encoded PUT request body, and it failed with a 400 status with message "invalid_json". Apparently it ignored the Content-Encoding header. (I'm still on version 1.4, though.)

(3) You're right that one could check the version, but part of the reason for this proposal was to avoid having to have hardcoded knowledge of versions. It's not just one version check either -- IrisCouch and Cloudant (and BigCouch?) have independent version numbers, so you'd have to know what versions they incorporated the fix into.

I don't mean to pick on CouchDB. I'm sure most web server engines, aside from the big ones like Apache, don't pay attention to all the more obscure edges of the HTTP spec, like honoring Upgrade and Content-Encoding headers in requests.
[jira] [Created] (COUCHDB-2054) Content-Encoding on requests is ignored; should decode or return 415
Jens Alfke created COUCHDB-2054:

Summary: Content-Encoding on requests is ignored; should decode or return 415
Key: COUCHDB-2054
URL: https://issues.apache.org/jira/browse/COUCHDB-2054
Project: CouchDB
Issue Type: Bug
Security Level: public (Regular issues)
Components: HTTP Interface
Reporter: Jens Alfke

CouchDB (as of 1.4) seems to ignore the Content-Encoding header on requests, and just parses the request body with no decoding. This causes incorrect behavior -- most often it tries to parse the encoded data as JSON, fails, and returns 400.

CouchDB should either decode the request body (e.g. unzip it), or else return a 415 status. Decoding would be quite useful: requests with large JSON bodies (like _revs_diff or the POST form of _all_docs) can have their size cut in half by gzip encoding.

HTTP 1.1 spec for Content-Encoding:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.11
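
The "decode or return 415" behavior requested in this issue can be illustrated with a small Python sketch. It is not CouchDB code (CouchDB is written in Erlang); the function name `decode_request_body` and the choice to answer 400 for a corrupt gzip stream are assumptions made for the example.

```python
import gzip
import zlib
from typing import Dict, Tuple


def decode_request_body(headers: Dict[str, str], body: bytes) -> Tuple[int, bytes]:
    """Return (status, decoded_body) for a request body.

    200 means the body is ready to be parsed as JSON; 415 means the
    Content-Encoding is not supported; 400 means the gzip data was corrupt.
    """
    encoding = ""
    for name, value in headers.items():
        if name.lower() == "content-encoding":  # header names are case-insensitive
            encoding = value.strip().lower()

    if encoding in ("", "identity"):
        return 200, body  # nothing to decode
    if encoding in ("gzip", "x-gzip"):
        try:
            return 200, gzip.decompress(body)
        except (OSError, EOFError, zlib.error):
            return 400, b""  # claimed gzip but the data is not valid
    return 415, b""  # unknown encoding: refuse rather than parse raw bytes


if __name__ == "__main__":
    import json

    payload = json.dumps({"keys": ["doc1", "doc2"]}).encode("utf-8")
    status, decoded = decode_request_body(
        {"Content-Encoding": "gzip"}, gzip.compress(payload)
    )
    print(status, json.loads(decoded))
```

The point of the 415 branch is that the server never tries to parse encoded bytes as JSON, which is the failure mode reported above.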
[jira] [Commented] (COUCHDB-2052) Add API for discovering feature availability
[ https://issues.apache.org/jira/browse/COUCHDB-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894971#comment-13894971 ]

Jens Alfke commented on COUCHDB-2052:

Filed COUCHDB-2054 for the Content-Encoding handling.
[jira] [Commented] (COUCHDB-2052) Add API for discovering feature availability
[ https://issues.apache.org/jira/browse/COUCHDB-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894944#comment-13894944 ]

Jens Alfke commented on COUCHDB-2052:

> An HTTP server is free to ignore an Upgrade request for a protocol it doesn't support.

It still seems ugly to me that the server will start generating the changes feed and sending out changes until the client sees the non-101 response code and closes the socket.

> That's a bug to fix.

Yeah, but it tells me that it's perhaps not realistic to rely on subtleties of HTTP negotiation for detecting features.

> I don't see much difference between checking version foo vs features.contains

In this example there are at least three different vendor-and-version tests involved. Possibly more; for instance I don't know if rcouch has an independent version numbering scheme, and if so which version merged in the patch that fixed multipart parsing.
[jira] [Created] (COUCHDB-2052) Add API for discovering feature availability
Jens Alfke created COUCHDB-2052:

Summary: Add API for discovering feature availability
Key: COUCHDB-2052
URL: https://issues.apache.org/jira/browse/COUCHDB-2052
Project: CouchDB
Issue Type: Improvement
Security Level: public (Regular issues)
Components: HTTP Interface
Reporter: Jens Alfke

I propose adding to the response of GET / a property called "features" or "extensions" whose value is an array of strings, each string being an agreed-upon identifier of a specific optional feature. For example:
{"couchdb": "welcome", "features": ["_bulk_get", "persona"], "vendor": …}

Rationale: Features are being added to CouchDB over time, plug-ins may add features, and there are compatible servers that may have nonstandard features (like _bulk_get). But there isn't a clear way for a client (which might be another server's replicator) to determine what features a server has. Currently a client looking at the response of a GET / has to figure out what server and version thereof it's talking to, and then has to consult hardcoded knowledge that version X of server Y supports feature Z.

(True, you can often get away without needing to check, by assuming a feature exists but falling back to standard behavior if you get an error. But not all features may be so easy to detect -- the behavior of an unaware server might be to ignore the feature and do the wrong thing, rather than returning an error -- and anyway this adds extra round-trips that slow down the operation.)
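
On the client side, the proposed check could look roughly like the following Python sketch. It assumes the property is named "features" (the issue leaves the choice between "features" and "extensions" open), and the helper names `server_features` and `choose_fetch_strategy` are hypothetical. A server that predates the proposal simply yields an empty set, so the client falls back to standard behavior without an extra round-trip.

```python
import json
from typing import FrozenSet


def server_features(welcome_body: str) -> FrozenSet[str]:
    """Extract the advertised feature identifiers from a GET / response body."""
    doc = json.loads(welcome_body)
    features = doc.get("features") if isinstance(doc, dict) else None
    if not isinstance(features, list):
        return frozenset()  # older server: no feature list advertised
    return frozenset(f for f in features if isinstance(f, str))


def choose_fetch_strategy(welcome_body: str) -> str:
    """Pick one _bulk_get call when available, else individual GETs."""
    if "_bulk_get" in server_features(welcome_body):
        return "bulk_get"
    return "individual_gets"


if __name__ == "__main__":
    body = '{"couchdb": "welcome", "features": ["_bulk_get", "persona"]}'
    print(choose_fetch_strategy(body))
    print(choose_fetch_strategy('{"couchdb": "welcome", "version": "1.4.0"}'))
```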
[jira] [Commented] (COUCHDB-2052) Add API for discovering feature availability
[ https://issues.apache.org/jira/browse/COUCHDB-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893472#comment-13893472 ]

Jens Alfke commented on COUCHDB-2052:

This isn't just about expressing what URL endpoints exist. There are features that don't just add URLs; they might add query options to an existing URL, or they might manifest in other ways entirely, like the 'channels' feature of the Couchbase Sync Gateway.
[jira] [Created] (COUCHDB-1938) API doc for _replicate doesn't describe dictionary form of source/target
Jens Alfke created COUCHDB-1938:

Summary: API doc for _replicate doesn't describe dictionary form of source/target
Key: COUCHDB-1938
URL: https://issues.apache.org/jira/browse/COUCHDB-1938
Project: CouchDB
Issue Type: Bug
Components: Documentation
Reporter: Jens Alfke

The API documentation for replication only describes the "source" and "target" properties as being strings containing database names or URLs. It doesn't describe the enhanced form where these are dictionaries/objects that can contain additional parameters like authentication settings or extra HTTP headers.

The current docs that are missing this info:
http://docs.couchdb.org/en/latest/api/server/common.html#replicate

Unofficial docs from the wiki that describe the enhanced form:
http://wiki.apache.org/couchdb/Replication#Authentication
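
For readers unfamiliar with the two forms, here is a hedged Python sketch that builds a _replicate request body with a dictionary-form source. The key names "url" and "headers" are given as I understand the enhanced form described on the wiki and should be verified there; the helper name `replication_body`, the host, and the header value are made up for illustration.

```python
import json
from typing import Dict, Optional


def replication_body(
    source_url: str,
    target: str,
    source_headers: Optional[Dict[str, str]] = None,
) -> str:
    """Build a JSON body for POST /_replicate.

    The target uses the plain string form; the source uses the dictionary
    form so that extra HTTP headers can travel with it.
    """
    source: Dict[str, object] = {"url": source_url}  # assumed key name
    if source_headers:
        source["headers"] = dict(source_headers)  # assumed key name
    return json.dumps({"source": source, "target": target})


if __name__ == "__main__":
    print(
        replication_body(
            "https://example.com/db",  # hypothetical remote database
            "local_db",
            {"Authorization": "Basic <credentials>"},  # placeholder value
        )
    )
```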
[jira] [Commented] (COUCHDB-1479) Futon config UI won't allow WWW-Authenticate option to be added (name is lowercased)
[ https://issues.apache.org/jira/browse/COUCHDB-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790609#comment-13790609 ]

Jens Alfke commented on COUCHDB-1479:

This seems to have gotten worse in CouchDB 1.4. Now, even if I edit the config file by hand, the key gets lowercased as soon as CouchDB starts. So the file is rewritten with the lowercased key, and the config setting doesn't take effect at all.

So it appears that in CouchDB 1.4 the only way to configure HTTP auth is to go through Futon (or the REST API) to set this key -- it can't be set in the config file, and it won't persist across launches.

> Futon config UI won't allow WWW-Authenticate option to be added (name is lowercased)
>
> Key: COUCHDB-1479
> URL: https://issues.apache.org/jira/browse/COUCHDB-1479
> Project: CouchDB
> Issue Type: Bug
> Components: Futon
> Environment: Mac OS X
> Reporter: Jens Alfke
> Priority: Minor
>
> When using the config UI in Futon to add a new option, via the "Add a new section..." link at the bottom of the page, the name of the option is lowercased when written to the .ini file. (For some reason the case is preserved when altering the runtime configuration, though, so the problem doesn't manifest itself until the next time the server is restarted.)
>
> This causes trouble when attempting to enable HTTP basic auth by adding a "WWW-Authenticate" option (value "Basic") to the [httpd] section. The actual data written to the .ini file is:
>
> [httpd]
> www-authenticate = Basic
>
> which is not recognized when the server loads its configuration on restart.
[jira] [Commented] (COUCHDB-1824) Official documentation of replication algorithm?
[ https://issues.apache.org/jira/browse/COUCHDB-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780921#comment-13780921 ]

Jens Alfke commented on COUCHDB-1824:

Then do I need to apply for push access to the repo? Or is there a procedure where I submit a patch or something like a GitHub pull request?

> Official documentation of replication algorithm?
>
> Key: COUCHDB-1824
> URL: https://issues.apache.org/jira/browse/COUCHDB-1824
> Project: CouchDB
> Issue Type: Documentation
> Components: Documentation
> Reporter: Nathan Vander Wilt
> Assignee: Alexander Shorin
> Fix For: 1.5.0
>
> Though it's in some ways an internal detail, it might be nice to provide a canonical description of CouchDB's replication protocol (algorithm, really) in the documentation.
> See links at: http://wiki.apache.org/couchdb/Replication#Protocol_Documentation

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (COUCHDB-1824) Official documentation of replication algorithm?
[ https://issues.apache.org/jira/browse/COUCHDB-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13690407#comment-13690407 ]

Jens Alfke commented on COUCHDB-1824:

They're both needed for interoperability. The HTTP calls used by the replicator are documented now, but only on the unofficial wiki, and only because I added documentation along the way while I wrote the TouchDB replicator. At the time a lot of the info existed only in the source code and in the heads of people like Filipe and Damien.

My unstated assumption here is that replication interoperability is a good thing, and that other database implementations should support the same protocol and algorithms so they can freely replicate with CouchDB and each other. That's very powerful, and I don't know of any other open protocols for replication.
[jira] [Commented] (COUCHDB-1824) Official documentation of replication algorithm?
[ https://issues.apache.org/jira/browse/COUCHDB-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689440#comment-13689440 ] Jens Alfke commented on COUCHDB-1824: BTW, this is not at all an internal detail, it's important for interoperability. Knowing how the replication protocol/algorithm works is crucial for any 3rd party software (like PouchDB / TouchDB / Couchbase Lite) that wants to be able to replicate with CouchDB. Official documentation of replication algorithm? Key: COUCHDB-1824 URL: https://issues.apache.org/jira/browse/COUCHDB-1824 Project: CouchDB Issue Type: Documentation Components: Documentation Reporter: Nathan Vander Wilt
[jira] [Commented] (COUCHDB-1824) Official documentation of replication algorithm?
[ https://issues.apache.org/jira/browse/COUCHDB-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689497#comment-13689497 ] Jens Alfke commented on COUCHDB-1824: I'm just pointing out that this is more significant than the original description implies. Official documentation of replication algorithm? Key: COUCHDB-1824 URL: https://issues.apache.org/jira/browse/COUCHDB-1824 Project: CouchDB Issue Type: Documentation Components: Documentation Reporter: Nathan Vander Wilt
[jira] [Commented] (COUCHDB-1824) Official documentation of replication algorithm?
[ https://issues.apache.org/jira/browse/COUCHDB-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13686861#comment-13686861 ] Jens Alfke commented on COUCHDB-1824: My latest docs are at https://github.com/couchbase/couchbase-lite-ios/wiki/Replication-Algorithm Feel free to reuse any of that text. Official documentation of replication algorithm? Key: COUCHDB-1824 URL: https://issues.apache.org/jira/browse/COUCHDB-1824 Project: CouchDB Issue Type: Bug Components: Documentation Reporter: Nathan Vander Wilt
[jira] [Created] (COUCHDB-1670) Replicator crashes if numbers in checkpoint docs are expressed in scientific notation
Jens Alfke created COUCHDB-1670: --- Summary: Replicator crashes if numbers in checkpoint docs are expressed in scientific notation Key: COUCHDB-1670 URL: https://issues.apache.org/jira/browse/COUCHDB-1670 Project: CouchDB Issue Type: Bug Components: Replication Reporter: Jens Alfke The CouchDB 1.2 replicator process crashes with an Erlang exception when parsing a checkpoint document read back from a remote database, if numbers in the document were JSON-encoded in scientific notation instead of as integers. This includes the properties source_last_seq, end_last_seq, start_last_seq. That is, the following encoding works fine: ..., source_last_seq: 1234567, ... whereas this completely-equivalent encoding causes an exception: ..., source_last_seq: 1.234567e+06, ... This issue raised its head as a result of a CouchDB-compatible engine I'm writing (the Couchbase Sync Gateway) which can serve as a passive replication endpoint. It's implemented in Go, and the Go JSON package has the side effect of (a) parsing all JSON numbers into type 'double', and (b) encoding all doubles into JSON using scientific notation if they're more than six digits long. The net effect is that when CouchDB stores a checkpoint into the Sync Adapter's database and then later reads it back, it barfs due to the scientific notation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
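The equivalence of the two encodings in this report can be illustrated with a short sketch. It uses Python's standard json module, not CouchDB's Erlang parser or Go's JSON package, and it assumes integer sequence numbers as in the report. The helper names normalize_seq and read_checkpoint_seq are hypothetical; they only show how a reader of checkpoint documents could accept either number form.

```python
import json


def normalize_seq(value):
    """Return an int for any JSON number that holds a whole value."""
    if isinstance(value, bool):
        raise TypeError("a boolean is not a sequence number")
    if isinstance(value, int):
        return value
    if isinstance(value, float) and value.is_integer():
        # 1.234567e+06 parses as the float 1234567.0; fold it back to an int.
        return int(value)
    raise ValueError("not a whole-number sequence: %r" % (value,))


def read_checkpoint_seq(raw, field="source_last_seq"):
    """Parse a checkpoint document and return one sequence field as an int."""
    doc = json.loads(raw)
    return normalize_seq(doc[field])


plain = '{"source_last_seq": 1234567}'
scientific = '{"source_last_seq": 1.234567e+06}'

# Both encodings describe the same number, so both yield the same sequence.
print(read_checkpoint_seq(plain) == read_checkpoint_seq(scientific))
```

A reader that normalizes numbers this way does not depend on how the remote server chose to serialize them.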
[jira] [Commented] (COUCHDB-1670) Replicator crashes if numbers in checkpoint docs are expressed in scientific notation
[ https://issues.apache.org/jira/browse/COUCHDB-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13572655#comment-13572655 ] Jens Alfke commented on COUCHDB-1670:

> since is opaque. You have to pass back exactly what you got from couchdb.

I don't think that's a reasonable expectation. The JSON is going to be transformed anyway (to insert the _rev), so at some point it's going to be translated into an internal format and then regenerated. The output has to be an equivalent JSON document, but that doesn't mean byte-for-byte equivalence. For instance, object keys could be in a different order, Unicode escapes could be turned into literals or vice versa, and numbers might be represented differently, like changing to/from scientific notation or suppressing trailing zeros after the decimal point.
[jira] [Commented] (COUCHDB-1368) multipart/related document body doesn't identify which part is which attachment
[ https://issues.apache.org/jira/browse/COUCHDB-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13497240#comment-13497240 ] Jens Alfke commented on COUCHDB-1368: - Where is the branch? I don't see it in the github UI at https://github.com/apache/couchdb . Also, could you post a sample of what the MIME headers look like for an attachment part? multipart/related document body doesn't identify which part is which attachment --- Key: COUCHDB-1368 URL: https://issues.apache.org/jira/browse/COUCHDB-1368 Project: CouchDB Issue Type: Bug Components: HTTP Interface Reporter: Jens Alfke Priority: Minor If you GET a document with attachments in multipart/related format (by adding ?attachments=true and setting Accept:multipart/related), the MIME bodies for the attachments have no headers. This makes it difficult to tell which one is which. Damien says they're in the same order that they appear in the document's _attachments object ... which is fine if you're Erlang, because Erlang preserves the order of keys in a JSON object, but no other JSON implementation I know of does that (because they use hashtables instead of linked lists.) The upshot is that any non-Erlang code trying to parse such a response will have to do some by-hand parsing of the JSON data to get the _attachment keys in order. This can be fixed by adding a Content-ID header to each attachment body, whose value is the filename. It would be nice if other standard headers were added too, like Content-Type, Content-Length, Content-Encoding, as this would make it work better with existing MIME multipart libraries. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (COUCHDB-1368) multipart/related document body doesn't identify which part is which attachment
[ https://issues.apache.org/jira/browse/COUCHDB-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497418#comment-13497418 ] Jens Alfke commented on COUCHDB-1368:

It turns out Content-ID is not the correct header to use for the filename, because according to RFC2045 sec.7, Content-ID values must be generated to be world-unique. (I didn't know this when writing up this issue, but discovered it later on while implementing MIME support for TouchDB. I should have updated this issue too; sorry!) The most appropriate header to use seems to be Content-Disposition (RFC1806):

Content-Disposition: attachment; filename=test.txt

This is what TouchDB generates, and what it will recognize in incoming MIME documents.
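As a rough illustration of why a per-part Content-Disposition header helps, the sketch below builds a small multipart/related message and then recovers each attachment by filename, with no reliance on part order. It uses Python's standard email package; the message is built locally and is not an actual CouchDB or TouchDB response, and the helper names build_multipart and attachments_by_name are hypothetical.

```python
from email import message_from_bytes
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart


def build_multipart(doc_json, attachments):
    """Build a multipart/related body: the JSON document, then one part per
    attachment, each labelled with a Content-Disposition filename."""
    root = MIMEMultipart("related")
    root.attach(MIMEApplication(doc_json.encode("utf-8"), "json"))
    for name, data in attachments.items():
        part = MIMEApplication(data, "octet-stream")
        part.add_header("Content-Disposition", "attachment", filename=name)
        root.attach(part)
    return root.as_bytes()


def attachments_by_name(raw):
    """Map filename -> bytes for every part marked as an attachment."""
    found = {}
    for part in message_from_bytes(raw).walk():
        if part.get_content_disposition() == "attachment":
            found[part.get_filename()] = part.get_payload(decode=True)
    return found


raw = build_multipart(
    '{"_id": "doc1"}',
    {"test.txt": b"hello", "logo.png": b"\x89PNG"},
)
print(sorted(attachments_by_name(raw)))
```

Because each part names itself, a client never has to recover the key order of the _attachments object from the JSON text.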
[jira] [Commented] (COUCHDB-1259) Replication ID is not stable if local server has a dynamic port number
[ https://issues.apache.org/jira/browse/COUCHDB-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13490886#comment-13490886 ] Jens Alfke commented on COUCHDB-1259: - I have looked at the patch but I don't really understand what it's doing, both because my Erlang is really weak and because I don't know the internals of CouchDB. So I can't really comment on the code. It does sound like what's being suggested goes beyond what I asked for. This bug is about the *local* server (the one running the replication) having a different IP address or port than the last time. The suggested patches seem to also cover changes to the *remote* server's URL. That's an interesting issue but IMHO not the same thing. The point of this bug is that the URL of the *local* server running the replication is irrelevant to the replication. If I'm opening connections to another server to replicate with it, it doesn't matter what port or IP address I am listening on, because there aren't any incoming connections happening. They don't affect the replication at all. As for Benoit's security issues: Replication has no security. Security applies at a more fundamental level of identifying who is connecting and authenticating that principal. You absolutely cannot make security tests based on IP addresses or port numbers. Replication ID is not stable if local server has a dynamic port number -- Key: COUCHDB-1259 URL: https://issues.apache.org/jira/browse/COUCHDB-1259 Project: CouchDB Issue Type: Bug Components: Replication Affects Versions: 1.1 Reporter: Jens Alfke Assignee: Robert Newson Priority: Blocker Fix For: 1.3 Attachments: couchdb-1259.patch, couchdb-1259.patch I noticed that when Couchbase Mobile running on iOS replicates to/from a remote server (on iriscouch in this case), the replication has to fetch the full _changes feed every time it starts. Filipe helped me track down the problem -- the replication ID is coming out different every time. 
The reason for this is that the local port number, which is one of the inputs to the hash that generates the replication ID, is randomly assigned by the OS. (I.e. it uses a port number of 0 when opening its listener socket.) This is because there could be multiple apps using Couchbase Mobile running on the same device and we can't have their ports colliding. The underlying problem is that CouchDB is attempting to generate a unique ID for a particular pair of {source, destination} databases, but it's basing it on attributes that aren't fundamental to the database and can change, like the hostname or port number. One solution, proposed by Filipe and me, is to assign each database (or each server?) a random UUID when it's created, and use that to generate replication IDs. Another solution, proposed by Damien, is to have CouchDB let the client work out the replication ID on its own, and set it as a property in the replication document (or the JSON body of a _replicate request.) This is even more flexible and will handle tricky scenarios like full P2P replication where there may be no low-level way to uniquely identify the remote database being synced with. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
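The first proposal above (a per-database UUID feeding the replication ID) can be sketched as follows. This is not CouchDB's actual ID algorithm: the function names, the choice of SHA-256, and the set of inputs are all assumptions made for illustration. The point is only that an ID derived from a stored UUID stays the same across launches, while one that mixes in the listener port does not.

```python
import hashlib
import uuid


def new_database_uuid():
    """Generate the random UUID a database would be given once, at creation."""
    return uuid.uuid4().hex


def replication_id(local_db_uuid, remote_url, direction):
    """Hash only stable inputs: the local database UUID, the remote URL and
    the direction. No local hostname or port is involved."""
    if direction not in ("push", "pull"):
        raise ValueError("direction must be 'push' or 'pull'")
    material = "\n".join([local_db_uuid, remote_url, direction])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def port_based_id(local_host, local_port, remote_url, direction):
    """For contrast only: an ID that mixes in the local listener address."""
    material = "\n".join([local_host, str(local_port), remote_url, direction])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


db_uuid = new_database_uuid()               # stored with the database
remote = "http://example.iriscouch.com/db"  # hypothetical remote URL

# Two "launches" with the same stored UUID give the same replication ID.
first_launch = replication_id(db_uuid, remote, "pull")
second_launch = replication_id(db_uuid, remote, "pull")
print(first_launch == second_launch)

# A port-based ID changes whenever the OS hands out a different port.
print(port_based_id("127.0.0.1", 49152, remote, "pull")
      == port_based_id("127.0.0.1", 49731, remote, "pull"))
```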
[jira] [Commented] (COUCHDB-1259) Replication ID is not stable if local server has a dynamic port number
[ https://issues.apache.org/jira/browse/COUCHDB-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490352#comment-13490352 ] Jens Alfke commented on COUCHDB-1259: Benoit: I think you're misunderstanding the issue. This isn't something about P2P. It's just that if the local CouchDB is not *listening* on a fixed port number, then replications made by that server to/from another server aren't handled efficiently ... even though the local server's port number has nothing at all to do with the replication (since it's the one making the connections.) In a real P2P case, this change makes even more sense, because the addresses of the servers are unimportant -- as you said, it's the databases and their data that are the important thing. A UUID helps identify those.
[jira] [Commented] (COUCHDB-1259) Replication ID is not stable if local server has a dynamic port number
[ https://issues.apache.org/jira/browse/COUCHDB-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490427#comment-13490427 ] Jens Alfke commented on COUCHDB-1259:

> It is the responsibility of the application to know that. As a protocol the replication shouldn't force this way imo.

Then how does the application do this? I haven't seen any API for it. Also, I don't see how this has anything to do with the case of a leaf node running a server that happens to have a dynamic port assignment. The port this node is running on has absolutely nothing to do with the replication. In the (now obsolete) case of Couchbase Mobile, the server doesn't even accept external requests, so its port number is purely an internal affair. I still have a feeling that we're talking about completely different things. But I can't really figure out what your point is...
[jira] [Commented] (COUCHDB-1259) Replication ID is not stable if local server has a dynamic port number
[ https://issues.apache.org/jira/browse/COUCHDB-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490441#comment-13490441 ] Jens Alfke commented on COUCHDB-1259:

> btw your example with couchbase mobile is generally solved by using the replication in pull mode only. So here it is relying on a fixed address to replicate.

*sigh* No, that is exactly the situation I was describing. The mobile client is the only one initiating replication; it pulls from the central (fixed-address) server, and pushes changes to it. So the mobile device's IP address and port are irrelevant, right? Except that the replication state document stored in _local has an ID based on several things _including_ the local server's address and port number. So the effect is that, every time the app launches, all the replication state gets lost/invalidated, and it has to start over again the next time it replicates. TouchDB doesn't have this problem because I didn't write it with this design flaw :) Instead every local database has a UUID as suggested here, and that's used as part of the key.
[jira] [Commented] (COUCHDB-1259) Replication ID is not stable if local server has a dynamic port number
[ https://issues.apache.org/jira/browse/COUCHDB-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490448#comment-13490448 ] Jens Alfke commented on COUCHDB-1259:

It seems like overkill to get the IANA to assign a fixed port number to an app that doesn't even listen on any external interfaces! The only use of that port is (was) over the loopback interface to let the application communicate with CouchDB. Passing zero for the port in the config file didn't make the problem go away. Apparently the replicator bases the ID on the actual random port number in use, not on the fixed 0 from the config.

> What will prevents an hostile node to connect back to your node with the same id?

Hello, are you listening at all to what I'm writing? I've already said several times that the app does not accept incoming connections at all. It only makes outgoing connections to replicate. And in general: obviously in any real P2P app there would be actual security measures in place to authenticate connections, most likely by using both server and client SSL certs and verifying their public keys. Once the connection is made, _then_ database IDs can be used to restore the state of a replication.
[jira] [Commented] (COUCHDB-1584) Allow passing of open_doc parameters to _all_docs
[ https://issues.apache.org/jira/browse/COUCHDB-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488911#comment-13488911 ] Jens Alfke commented on COUCHDB-1584: - This may actually not be sufficient to let the replicator fetch revisions in bulk. The problem is that _all_docs takes an array of docids, but not revids — so the caller has no control over which revision of a document to get; they'll always get the winning one. So (a) If a document is in conflict, the replicator will still have to use single-revision GETs to fetch the non-winning revision(s). (b) There can be race conditions where a document is updated after the _changes feed is sent, so the _all_docs request will return that new revision, not the one the replicator knows about. I don't think either of these cases will be all that common; it just means the replicator will have to be a bit careful to check the revids in the response from _all_docs, and possibly fetch some revisions one-by-one if it didn't get the right ones. Allow passing of open_doc parameters to _all_docs - Key: COUCHDB-1584 URL: https://issues.apache.org/jira/browse/COUCHDB-1584 Project: CouchDB Issue Type: New Feature Affects Versions: 1.2 Reporter: Jan Lehnardt Priority: Minor GET /_all_docs should take the same arguments as GET /db/doc /_all_docs?revisions=true /_all_docs?revs_info=true See http://wiki.apache.org/couchdb/HTTP_Document_API#GET for details -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
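The revid check described in this comment can be sketched as a small pure function. It assumes rows shaped like an _all_docs response fetched with include_docs=true (each row carrying "id" and "doc", with the revision in the document's "_rev"); the function name split_bulk_result and the sample data are hypothetical. Revisions that the bulk response did not return, whether losing conflict revisions or ones superseded by a later update, are set aside for one-by-one fetching.

```python
def split_bulk_result(wanted, rows):
    """Separate the revisions a bulk fetch satisfied from those it did not.

    wanted: list of (doc_id, rev_id) pairs taken from the _changes feed.
    rows:   list of row dicts from the bulk response.
    Returns (matched, refetch): matched maps (doc_id, rev_id) to the document
    body; refetch lists the pairs that still need a single-revision GET.
    """
    by_id = {row.get("id"): row for row in rows}
    matched = {}
    refetch = []
    for doc_id, rev_id in wanted:
        row = by_id.get(doc_id)
        doc = row.get("doc") if row else None
        if doc is not None and doc.get("_rev") == rev_id:
            matched[(doc_id, rev_id)] = doc
        else:
            refetch.append((doc_id, rev_id))
    return matched, refetch


wanted = [("a", "2-x"), ("b", "1-y"), ("b", "1-z"), ("c", "3-q")]
rows = [
    {"id": "a", "doc": {"_id": "a", "_rev": "2-x"}},
    {"id": "b", "doc": {"_id": "b", "_rev": "1-y"}},  # winner of a conflict
    {"id": "c", "doc": {"_id": "c", "_rev": "4-r"}},  # updated after _changes
]

matched, refetch = split_bulk_result(wanted, rows)
print(sorted(matched))
print(refetch)
```

Here the losing conflict revision of "b" and the stale revision of "c" end up in refetch, which corresponds to cases (a) and (b) in the comment.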
[jira] [Commented] (COUCHDB-1584) Allow passing of open_doc parameters to _all_docs
[ https://issues.apache.org/jira/browse/COUCHDB-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487944#comment-13487944 ] Jens Alfke commented on COUCHDB-1584: +1. A major benefit of this would be to enable significant speedups to 'pull' replication, since the replicator would be able to fetch new revisions in bulk. (Presently it has to fetch them individually, AFAIK, because that's the only way to get the rev history.)
[jira] [Commented] (COUCHDB-1584) Allow passing of open_doc parameters to _all_docs
[ https://issues.apache.org/jira/browse/COUCHDB-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487960#comment-13487960 ] Jens Alfke commented on COUCHDB-1584: Sure. I've worked a bit with the JS tests before; I haven't actually modified the code yet, but it looks straightforward enough. I can even do it in proper TDD fashion and write the tests before I have your implementation, then watch them fail :)
[jira] [Commented] (COUCHDB-1584) Allow passing of open_doc parameters to _all_docs
[ https://issues.apache.org/jira/browse/COUCHDB-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488332#comment-13488332 ] Jens Alfke commented on COUCHDB-1584: @Jan, I've emailed you a patch that adds the tests.
[jira] [Resolved] (COUCHDB-1570) Replicator writes corrupt remote checkpoint document on error; breaks replication
[ https://issues.apache.org/jira/browse/COUCHDB-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jens Alfke resolved COUCHDB-1570.

Resolution: Invalid

Replicator writes corrupt remote checkpoint document on error; breaks replication

Key: COUCHDB-1570
URL: https://issues.apache.org/jira/browse/COUCHDB-1570
Project: CouchDB
Issue Type: Bug
Components: Replication
Affects Versions: 1.2
Environment: Mac OS X 10.8.2
Reporter: Jens Alfke
Assignee: Randall Leeds
Priority: Blocker
Fix For: 1.3

If a 'push' replication receives error responses from the remote server while PUTting documents, it may write a corrupt checkpoint document to the remote server -- its _revisions.ids property is null. Any subsequent attempt to push to that server will cause the replicator to abort with an error when reading that checkpoint document, since it requires _revisions.ids to be an array.

The only workaround for this, short of deleting the remote database entirely, is to (a) identify the URL of the remote checkpoint document, and (b) delete it from the remote server. Otherwise you will never be able to push to that database again.

Here's a failed replication attempt:

$ curl -X POST --user snej :5984/_replicate --header Content-Type:application/json --data '{"source":"demo-shopping-attachments","target":"http://localhost:4984/demo-shopping-attachments","create_target":true}'
{"error":"doc_validation","reason":"_revisions.ids isn't a array."}

Here's the contents of the remote checkpoint document, once I identified its ID:

$ curl :4984/demo-shopping-attachments/_local/5b913befe682d7bd1fbc24b1ce31cbc5
{"_id":"_local/5b913befe682d7bd1fbc24b1ce31cbc5","_rev":"0-1","_revisions":{"ids":null,"start":0},"history":[{"doc_write_failures":2,"docs_read":2,"docs_written":0,"end_last_seq":207,"end_time":"Sun, 21 Oct 2012 19:40:04 GMT","missing_checked":17,"missing_found":2,"recorded_seq":207,"session_id":"9a24dc37885b5b4e507ea90702b2fecf","start_last_seq":0,"start_time":"Sun, 21 Oct 2012 19:40:04 GMT"}],"replication_id_version":2,"session_id":"9a24dc37885b5b4e507ea90702b2fecf","source_last_seq":207}
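
A client that has fetched a checkpoint document could test for the corruption described in this report before trying to replicate. The following is only a minimal sketch: the function name `checkpoint_revisions_ok` is hypothetical, and the sample document is abbreviated from the one shown in the report.

```python
def checkpoint_revisions_ok(doc: dict) -> bool:
    """Return True if the document has no _revisions property,
    or if _revisions.ids is an array (a Python list)."""
    revisions = doc.get("_revisions")
    if revisions is None:
        return True
    return isinstance(revisions, dict) and isinstance(revisions.get("ids"), list)


# Abbreviated copy of the corrupt checkpoint document from the report.
corrupt = {
    "_id": "_local/5b913befe682d7bd1fbc24b1ce31cbc5",
    "_rev": "0-1",
    "_revisions": {"ids": None, "start": 0},
}

if not checkpoint_revisions_ok(corrupt):
    print("checkpoint document is corrupt; delete it from the remote server")
```

Deleting the document remains a manual step here; the sketch only identifies the condition that makes the replicator abort.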
[jira] [Commented] (COUCHDB-1570) Replicator writes corrupt remote checkpoint document on error; breaks replication
[ https://issues.apache.org/jira/browse/COUCHDB-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486126#comment-13486126 ]

Jens Alfke commented on COUCHDB-1570:

Sorry, this was my bug: CouchDB sends that _revisions property as part of the body of the PUT to the _local document, but it shouldn't be saved in my database. I added code to BaseCouch to strip underscored properties other than _rev, and that fixed the problem.
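
The fix described in this thread, stripping underscored properties other than _rev before storing a _local document, might look roughly like the sketch below. It is not BaseCouch's actual code; the function name `strip_reserved_properties` and the sample body are illustrative assumptions.

```python
def strip_reserved_properties(body: dict, keep=("_rev",)) -> dict:
    """Return a copy of body without underscore-prefixed properties,
    except for those listed in keep."""
    return {
        key: value
        for key, value in body.items()
        if not key.startswith("_") or key in keep
    }


# Hypothetical PUT body resembling the checkpoint document in the report.
incoming = {
    "_rev": "0-1",
    "_revisions": {"ids": None, "start": 0},
    "session_id": "9a24dc37885b5b4e507ea90702b2fecf",
    "source_last_seq": 207,
}

stored = strip_reserved_properties(incoming)
print(sorted(stored))
```

With the `_revisions` property dropped at write time, a later GET can no longer echo a null `ids` value back to the replicator.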
[jira] [Commented] (COUCHDB-1570) Replicator writes corrupt remote checkpoint document on error; breaks replication
[ https://issues.apache.org/jira/browse/COUCHDB-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486230#comment-13486230 ]

Jens Alfke commented on COUCHDB-1570:

The problem was with BaseCouch's code for updating _local documents; it wasn't stripping out underscored property names like _revisions. I'm still not sure why CouchDB was sending a _revisions property in the PUT, but BaseCouch shouldn't have been storing it, or echoing it back on the next GET. Sorry for the confusion!
[jira] [Updated] (COUCHDB-1570) Replicator writes corrupt remote checkpoint document on error; breaks replication
[ https://issues.apache.org/jira/browse/COUCHDB-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jens Alfke updated COUCHDB-1570:

Summary: Replicator writes corrupt remote checkpoint document on error; breaks replication (was: Replicator writes corrupt remote checkpoint document on error; breaks revision forever)
[jira] [Created] (COUCHDB-1529) Add API to find the deleted branches of a doc's rev tree
Jens Alfke created COUCHDB-1529:

Summary: Add API to find the deleted branches of a doc's rev tree
Key: COUCHDB-1529
URL: https://issues.apache.org/jira/browse/COUCHDB-1529
Project: CouchDB
Issue Type: Improvement
Affects Versions: 1.2
Reporter: Jens Alfke
Priority: Minor

There's no efficient way to discover the complete revision tree of a document. API options like ?open_revs=all or ?conflicts only return active/undeleted revisions; they won't discover deleted branches resulting from already-resolved conflicts.

There should be an extra option, like ?include_deleted=true, to use along with open_revs to tell it to return even deleted leaf revisions.
[jira] [Commented] (COUCHDB-1529) Add API to find the deleted branches of a doc's rev tree
[ https://issues.apache.org/jira/browse/COUCHDB-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448249#comment-13448249 ]

Jens Alfke commented on COUCHDB-1529:

Note: It's possible to do this currently by iterating through the database's entire _changes feed, since it includes deletions. That's expensive, though.
[jira] [Created] (COUCHDB-1530) Add a mode to _all_docs to include deleted docs
Jens Alfke created COUCHDB-1530:

Summary: Add a mode to _all_docs to include deleted docs
Key: COUCHDB-1530
URL: https://issues.apache.org/jira/browse/COUCHDB-1530
Project: CouchDB
Issue Type: Improvement
Affects Versions: 1.2
Reporter: Jens Alfke
Priority: Minor

There's currently no efficient way to discover the deleted documents in a database, since _all_docs only returns active documents. It can be useful to show all documents, even deleted ones, for troubleshooting or forensic purposes.

I propose adding an option like ?include_deleted=true to _all_docs that will cause it to return deleted documents as well. To provide a way to tell deleted docs apart from active ones (if the include_docs option isn't on), the value object should include a deleted:true property for those docs.
[jira] [Commented] (COUCHDB-1530) Add a mode to _all_docs to include deleted docs
[ https://issues.apache.org/jira/browse/COUCHDB-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448271#comment-13448271 ]

Jens Alfke commented on COUCHDB-1530:

Note: It's possible to find the deleted documents by iterating over the database's entire _changes feed, but it's expensive.
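
The expensive workaround mentioned in this comment can be sketched as a scan over an already-fetched, already-parsed _changes response. This assumes the usual shape of that response, a "results" array whose rows carry "deleted": true for deletions; the function name `deleted_doc_ids` and the sample rows are made up for illustration.

```python
def deleted_doc_ids(changes: dict) -> list:
    """Collect the IDs of rows marked as deleted in a parsed _changes response.

    Every row has to be visited, which is why this is costly on a large database.
    """
    return [
        row["id"]
        for row in changes.get("results", [])
        if row.get("deleted") is True
    ]


# Made-up sample in the shape of a _changes response.
sample_changes = {
    "results": [
        {"seq": 1, "id": "doc-a", "changes": [{"rev": "1-aaa"}]},
        {"seq": 2, "id": "doc-b", "changes": [{"rev": "2-bbb"}], "deleted": True},
        {"seq": 3, "id": "doc-c", "changes": [{"rev": "3-ccc"}], "deleted": True},
    ],
    "last_seq": 3,
}

print(deleted_doc_ids(sample_changes))
```

The proposed ?include_deleted=true option would let the server return the same information from _all_docs without the client walking the whole feed.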
[jira] [Created] (COUCHDB-1521) multipart parser gets multiple attachments mixed up
Jens Alfke created COUCHDB-1521:

Summary: multipart parser gets multiple attachments mixed up
Key: COUCHDB-1521
URL: https://issues.apache.org/jira/browse/COUCHDB-1521
Project: CouchDB
Issue Type: Bug
Components: HTTP Interface
Affects Versions: 1.2
Reporter: Jens Alfke

When receiving a document PUT in multipart format, CouchDB gets the attachments and MIME parts mixed up. Instead of looking at the headers of a MIME part to identify which attachment it is (most likely by using the 'filename' property of the 'Content-Disposition:' header), it processes the attachments according to the order in which their metadata objects appear in the JSON body's '_attachments:' object.

The problem with this is that JSON objects (dictionaries) are _not_ ordered collections. I know that Erlang's implementation of them (as linked lists of key/value pairs) happens to be ordered, and I think some JavaScript implementations have the side effect of preserving order; but in many languages these are implemented as hash tables and genuinely unordered. This means that when a program written in such a language converts a native object to JSON, it has no control over (and probably no knowledge of) the order in which the keys of the JSON object are written out. This makes it impossible to then write the attachments in the same order.

The only workaround seems to be for the program to implement its own custom JSON encoder just so that it can write object keys in a known order (probably sorted), which then enables it to write the attachment bodies in the same order.

NOTE: This is the flip side of COUCHDB-1368 which I filed last year; that bug has to do with the same ordering issue when CouchDB _generates_ multipart responses (and presents similar problems for clients not written in Erlang.)
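
The workaround described in this report, writing the JSON object keys in a known sorted order and then emitting the attachment bodies in that same order, can be sketched as follows. Python's standard `json.dumps` accepts `sort_keys=True`, so no custom encoder is needed in this language. The function name `encode_doc_with_attachment_order` and the sample document are illustrative assumptions, and the sketch does not build the MIME parts themselves.

```python
import json


def encode_doc_with_attachment_order(doc: dict):
    """Serialize doc with sorted keys and return (json_text, attachment_names).

    attachment_names lists the _attachments keys in the order they appear in
    json_text, which is the order the MIME parts would have to follow.
    """
    json_text = json.dumps(doc, sort_keys=True)
    attachment_names = sorted(doc.get("_attachments", {}))
    return json_text, attachment_names


# Hypothetical document with two attachments inserted in non-sorted order.
doc = {
    "_id": "recipe",
    "_attachments": {
        "zebra.png": {"content_type": "image/png", "follows": True, "length": 4},
        "apple.txt": {"content_type": "text/plain", "follows": True, "length": 5},
    },
}

body, names = encode_doc_with_attachment_order(doc)
print(names)
```

This only sidesteps the problem on the client side; identifying each part by its headers, as the report suggests, would remove the dependence on key order altogether.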
[jira] [Created] (COUCHDB-1479) Futon config UI won't allow WWW-Authenticate option to be added (name is lowercased)
Jens Alfke created COUCHDB-1479:

Summary: Futon config UI won't allow WWW-Authenticate option to be added (name is lowercased)
Key: COUCHDB-1479
URL: https://issues.apache.org/jira/browse/COUCHDB-1479
Project: CouchDB
Issue Type: Bug
Components: Futon
Environment: Mac OS X
Reporter: Jens Alfke
Priority: Minor

When using the config UI in Futon to add a new option, via the "Add a new section..." link at the bottom of the page, the name of the option is lowercased when written to the .ini file. (For some reason the case is preserved when altering the runtime configuration, though, so the problem doesn't manifest itself until the next time the server is restarted.)

This causes trouble when attempting to enable HTTP basic auth by adding a "WWW-Authenticate" option (value "Basic") to the [httpd] section. The actual data written to the .ini file is:

[httpd]
www-authenticate = Basic

which is not recognized when the server loads its configuration on restart.
[jira] [Commented] (COUCHDB-1259) Replication ID is not stable if local server has a dynamic port number
[ https://issues.apache.org/jira/browse/COUCHDB-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090396#comment-13090396 ]

Jens Alfke commented on COUCHDB-1259:

@Jason: I see where you're coming from with "bad web citizen", but it refers only to a specific usage of CouchDB as a traditional web server. We have equally valid uses of it that do not act as traditional servers nor have stable URLs, for which we absolutely need this fix. I don't think arguments about whether URLs change or not are productive here in a specific bug report.

Replication ID is not stable if local server has a dynamic port number

Key: COUCHDB-1259
URL: https://issues.apache.org/jira/browse/COUCHDB-1259
Project: CouchDB
Issue Type: Bug
Components: Replication
Affects Versions: 1.1
Reporter: Jens Alfke

I noticed that when Couchbase Mobile running on iOS replicates to/from a remote server (on iriscouch in this case), the replication has to fetch the full _changes feed every time it starts. Filipe helped me track down the problem -- the replication ID is coming out different every time. The reason for this is that the local port number, which is one of the inputs to the hash that generates the replication ID, is randomly assigned by the OS. (I.e. it uses a port number of 0 when opening its listener socket.) This is because there could be multiple apps using Couchbase Mobile running on the same device and we can't have their ports colliding.

The underlying problem is that CouchDB is attempting to generate a unique ID for a particular pair of {source, destination} databases, but it's basing it on attributes that aren't fundamental to the database and can change, like the hostname or port number.

One solution, proposed by Filipe and me, is to assign each database (or each server?) a random UUID when it's created, and use that to generate replication IDs.

Another solution, proposed by Damien, is to have CouchDB let the client work out the replication ID on its own, and set it as a property in the replication document (or the JSON body of a _replicate request.) This is even more flexible and will handle tricky scenarios like full P2P replication where there may be no low-level way to uniquely identify the remote database being synced with.
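
To illustrate why the ID is unstable, and how the UUID proposal would help, here is a rough sketch. It is not CouchDB's actual replication ID algorithm: the hash inputs, the choice of SHA-256, and the function names are all assumptions made for the example.

```python
import hashlib
import uuid


def replication_id_from_endpoint(host: str, port: int, source: str, target: str) -> str:
    """Hypothetical ID derived from the local host and port plus the database pair."""
    raw = "|".join([host, str(port), source, target])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def replication_id_from_uuid(db_uuid: str, source: str, target: str) -> str:
    """Hypothetical ID derived from a UUID assigned once, when the database is created."""
    raw = "|".join([db_uuid, source, target])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


source, target = "demo", "http://example.invalid/demo"

# Two launches of the same app get different OS-assigned ports.
first_launch = replication_id_from_endpoint("127.0.0.1", 51234, source, target)
second_launch = replication_id_from_endpoint("127.0.0.1", 60871, source, target)
print(first_launch == second_launch)  # the IDs differ, so the checkpoint is never found

# A UUID stored with the database survives restarts, so the ID stays the same.
stored_uuid = str(uuid.uuid4())
print(
    replication_id_from_uuid(stored_uuid, source, target)
    == replication_id_from_uuid(stored_uuid, source, target)
)
```

Under Damien's proposal the client would skip the derivation entirely and supply whatever stable string it likes as the replication ID.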
[jira] [Created] (COUCHDB-1259) Replication ID is not stable if local server has a dynamic port number
Replication ID is not stable if local server has a dynamic port number

Key: COUCHDB-1259
URL: https://issues.apache.org/jira/browse/COUCHDB-1259
Project: CouchDB
Issue Type: Bug
Components: Replication
Affects Versions: 1.1
Reporter: Jens Alfke
[jira] [Commented] (COUCHDB-1259) Replication ID is not stable if local server has a dynamic port number
[ https://issues.apache.org/jira/browse/COUCHDB-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089976#comment-13089976 ]

Jens Alfke commented on COUCHDB-1259:

@Jason: The reason I filed this bug report is that the URL of a database in Couchbase Mobile *isn't* unique. It's barely meaningful at all; it's of the form "http://127.0.0.1:n" where n is an unpredictable port number assigned by the TCP stack at launch time. That URL isn't even exposed to the outside world because the CouchDB server is only listening on the loopback interface.

The state that's being represented by a replication ID is the contents of the database (at least as it was when it last synced.) So it seems the ID should be something that sticks to the database itself, not to any ephemeral manifestation like a URL.
[jira] [Created] (COUCHDB-1206) View ETags may be incorrect if ?include_docs=true is specified
View ETags may be incorrect if ?include_docs=true is specified

Key: COUCHDB-1206
URL: https://issues.apache.org/jira/browse/COUCHDB-1206
Project: CouchDB
Issue Type: Bug
Affects Versions: 1.1
Reporter: Jens Alfke
Priority: Minor

Change COUCHDB-799 altered the way ETags are assigned to views, by having the ETag change only when the view index changes, not when any document changes. Unfortunately this means that a view with the ?include_docs=true option can return an incorrect ETag. The reason is that if a document in the view is changed, but the change doesn't affect the view index, the result of the GET will change (it will contain the document's updated contents), but the ETag won't. This can result in stale data if the client uses a conditional GET, because it'll get a 304 even though the prior response is out of date.

Robert Newson's analysis on the user@ list is: "I think the sanest fix is to make view etags for include_docs=true use the original algorithm, so that they always change if the database changes."
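
The suggested fix can be pictured with a small sketch. This is not CouchDB's implementation: the function name `view_etag`, its parameters, and the use of SHA-256 are assumptions chosen to show the idea of folding the database's update sequence into the ETag only when include_docs=true.

```python
import hashlib


def view_etag(index_signature: str, db_update_seq: int, include_docs: bool) -> str:
    """Hypothetical ETag for a view response.

    Without include_docs, the ETag depends only on the view index state.
    With include_docs, it also depends on the database update sequence,
    so it changes whenever any document changes.
    """
    basis = index_signature
    if include_docs:
        basis = f"{index_signature}:{db_update_seq}"
    return '"' + hashlib.sha256(basis.encode("utf-8")).hexdigest()[:16] + '"'


# A document is edited (update_seq 10 -> 11) without affecting the view index.
before = view_etag("index-v1", 10, include_docs=True)
after = view_etag("index-v1", 11, include_docs=True)
print(before != after)  # a conditional GET now refetches instead of getting a 304
```

The cost is fewer cache hits for include_docs=true queries, which is the trade-off implied by returning to the original algorithm for that case.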
[jira] Commented: (COUCHDB-549) include_docs=true doesn't honour conflicts=true
[ https://issues.apache.org/jira/browse/COUCHDB-549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838614#action_12838614 ]

Jens Alfke commented on COUCHDB-549:

This actually goes back to 0.10.0. From some historical evidence (a unit test in the CouchObjC library that used to work but now breaks) it looks like this changed sometime after 0.8. Is this considered a bug to be fixed, or just a design limitation?

include_docs=true doesn't honour conflicts=true

Key: COUCHDB-549
URL: https://issues.apache.org/jira/browse/COUCHDB-549
Project: CouchDB
Issue Type: Improvement
Components: HTTP Interface
Affects Versions: 0.11
Reporter: Brian Candler
Priority: Minor

When you read a view and use the option 'include_docs=true' to get the source document in each result row, the option 'conflicts=true' is not honoured. You do not see a _conflicts member in the document, even if it is in a conflicting state.

This feature request could be expanded in a couple of directions:

1. Make include_docs=true honour *all* options which a straightforward GET would honour - e.g. revs, revs_info, open_revs. Maybe this would be straightforward if they shared the same code path and options processing.

2. It has been suggested that 'conflicts=true' could be the default anyway. That is, whenever you retrieve a document, you get a _conflicts member if it is in a conflicting state, without having to ask for it. This would be unlikely to break things, but would make it less likely that conflicts would go unnoticed, and it would simplify the API a little.

--
This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (COUCHDB-265) HEAD requests get a Content-Length header
[ https://issues.apache.org/jira/browse/COUCHDB-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676075#action_12676075 ]

Jens Alfke commented on COUCHDB-265:

By the way, curl has exactly the same problem with HEAD requests to other servers, so I'd say this is a bug in curl itself. (For example, try "curl -X HEAD http://www.google.com".)

HEAD requests get a Content-Length header

Key: COUCHDB-265
URL: https://issues.apache.org/jira/browse/COUCHDB-265
Project: CouchDB
Issue Type: Bug
Components: HTTP Interface
Affects Versions: 0.9
Environment: curl + trunk
Reporter: Paul Joseph Davis
Fix For: 0.9

Looks like HEAD requests are returning a bogus Content-Length header. If I remember my HTTP spec correctly, HEAD requests are supposed to return no Content-Length or a Content-Length of 0 but I could be wrong on that. Either way, it confuses the crap out of curl:

$ curl -X HEAD -i http://127.0.0.1:5984/
HTTP/1.1 200 OK
Server: CouchDB/0.9.0a (Erlang OTP/R12B)
Date: Mon, 23 Feb 2009 20:56:55 GMT
Content-Type: text/plain;charset=utf-8
Content-Length: 40
Cache-Control: must-revalidate

curl: (18) transfer closed with 40 bytes remaining to read

Also, I just happened to be reading couch_http.erl the other day and I remember seeing a note that said mochiweb automatically strips bodies so internally HEAD requests are treated like a GET and mochiweb I guess just doesn't send a body. That's probably important.
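
For comparison, a client that understands HEAD semantics reads the headers and does not wait for a body, even when Content-Length is non-zero; the HTTP specification allows a HEAD response to carry the Content-Length that the corresponding GET would have. The sketch below uses Python's standard `http.client`; the helper name `head` is an assumption, and the host and port in the usage comment are taken from the report.

```python
import http.client


def head(host: str, port: int, path: str = "/"):
    """Send a HEAD request and return (status, headers, body).

    http.client knows that a HEAD response has no body, so read()
    returns b"" immediately instead of waiting for Content-Length bytes.
    """
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        body = response.read()
        headers = {name.lower(): value for name, value in response.getheaders()}
        return response.status, headers, body
    finally:
        conn.close()


# Usage against a local CouchDB, as in the report:
# status, headers, body = head("127.0.0.1", 5984)
# print(status, headers.get("content-length"), len(body))
```

With curl itself, the -I (--head) option issues a HEAD request and stops after the headers, whereas -X HEAD only changes the method name and leaves curl expecting a body.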