[jira] [Created] (COUCHDB-2327) Add string/array prefix match option, for view queries

2014-09-11 Thread Jens Alfke (JIRA)
Jens Alfke created COUCHDB-2327:
---

 Summary: Add string/array prefix match option, for view queries
 Key: COUCHDB-2327
 URL: https://issues.apache.org/jira/browse/COUCHDB-2327
 Project: CouchDB
  Issue Type: Improvement
  Security Level: public (Regular issues)
  Components: HTTP Interface
Reporter: Jens Alfke


View querying provides no clean way to match a string prefix. The only advice 
I've seen is to set startkey to the prefix, and endkey to the prefix with some 
really high Unicode character appended, which is a total kludge*.

There's a similar issue with matching an array prefix, e.g. all keys that 
start with [2014, ...]. Here the solution is less kludgy (append a {} to the 
endkey) but it's still very unintuitive to people learning CouchDB. I've had to 
explain it to newbies many times.
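
For reference, the two workarounds described above can be sketched roughly as follows. This is only an illustration: the helper names are made up for this example, and the choice of "\ufff0" as the "really high" character is an assumption (picking a safe one is exactly the problem).

```python
import json
from urllib.parse import urlencode

# Assumption: one of the "really high" characters people suggest appending.
HIGH_CHAR = "\ufff0"

def string_prefix_range(prefix):
    """Return (startkey, endkey) for string keys that start with prefix."""
    return prefix, prefix + HIGH_CHAR

def array_prefix_range(prefix):
    """Return (startkey, endkey) for array keys that start with prefix."""
    # An empty object collates after every non-object value in view
    # collation, so [2014, {}] closes the range opened by [2014].
    return list(prefix), list(prefix) + [{}]

def view_query(startkey, endkey):
    """Encode a key range as view query parameters (values are JSON)."""
    return urlencode({"startkey": json.dumps(startkey),
                      "endkey": json.dumps(endkey)})

print(view_query(*array_prefix_range([2014])))
print(view_query(*string_prefix_range("foo")))
```

Both helpers only build the query string; neither says anything about how the server compares the appended character.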

I suggest adding an explicit query option to enable prefix matching. This 
doesn't need to mess with the actual query engine — all it has to do is modify 
the endkey by appending an appropriate Unicode character (in the string case) 
or empty object (in the array case.) If no `endkey` is given it will be based 
on the `startkey`.

I've already implemented a comparable feature for Couchbase Lite:
https://github.com/couchbase/couchbase-lite-ios/wiki/Query-Enhancements#prefix-matching

Note that I made the `prefix_match` parameter an integer, not a boolean. This 
is to support cases where you want to match a prefix of a _nested component_ of 
the key, for example all keys in 2014 whose product name starts with 'f', 
where the startkey would be [2014, "f"] and the prefix_match would be 2 to 
indicate that it's the nested string that should be prefix-matched not the 
array. But in the common case you'd just set the value to 1 to indicate that 
the top level key should be prefix-matched.
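
A minimal sketch of how an endkey might be derived from a startkey and an integer prefix_match value, under my reading of the description above. The function name, the use of "\uffff" as the appended character, and the rule that deeper levels descend into the last array element are all assumptions for illustration, not a statement of how any server implements it.

```python
# Assumption: the high character appended in the string case.
HIGH_CHAR = "\uffff"

def endkey_for_prefix_match(key, depth):
    """Derive an endkey so that [key, endkey] covers keys prefixed by key.

    depth 1 prefix-matches the top-level key; depth 2 prefix-matches the
    last element nested inside it, and so on.
    """
    if depth < 1:
        return key
    if isinstance(key, str):
        return key + HIGH_CHAR
    if isinstance(key, list):
        if depth == 1:
            return key + [{}]
        if not key:
            return key
        # Descend into the last component and prefix-match that instead.
        return key[:-1] + [endkey_for_prefix_match(key[-1], depth - 1)]
    return key

print(endkey_for_prefix_match([2014], 1))
print(endkey_for_prefix_match([2014, "f"], 2))
```

With depth 1 the array itself is extended with an empty object; with depth 2 only the nested string gets the high character appended.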

* Why is adding some high Unicode character a kludge? Because Unicode is so 
complicated and so inconsistently implemented. Doing this immediately opens the 
possibility of weird Unicode issues in your development language's string type, 
in its HTTP client library, and in Erlang's equivalents on the server side. Not 
to mention the swamp that is the Unicode specification itself — for instance, 
I've seen advice to use a character like \uFFFE, which was correct until 
Unicode went 32-bit, and tended to work alright for a while after that, but 
will now fail with emoji characters (which are both very commonly used and well 
outside the 16-bit range.) Actually whether it fails depends on whether your 
string implementation operates on UTF-16 (very common) or true Unicode code 
points. Like I said, it's a kludge.
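
To make the UTF-16 versus code point difference concrete, here is a small sketch. It compares raw code points and raw UTF-16 code units only; it does not model the collation the server actually applies, and the sample strings are made up.

```python
def utf16_units(s):
    """Big-endian UTF-16 bytes; comparing them bytewise compares code units."""
    return s.encode("utf-16-be")

prefix = "caf"
key = prefix + "\U0001F600"   # a key whose next character is an emoji
endkey = prefix + "\ufffe"    # the "high character" endkey

# Python compares str values by code point, so the emoji (U+1F600)
# sorts above U+FFFE and the key falls outside the range.
in_range_by_code_point = prefix <= key <= endkey

# In UTF-16 the emoji starts with the surrogate 0xD83D, which is below
# 0xFFFE, so the same key falls inside the range.
in_range_by_utf16 = utf16_units(prefix) <= utf16_units(key) <= utf16_units(endkey)

print(in_range_by_code_point, in_range_by_utf16)  # prints: False True
```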



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (COUCHDB-2052) Add API for discovering feature availability

2014-02-07 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894620#comment-13894620
 ] 

Jens Alfke commented on COUCHDB-2052:
-

I don't think I understand your example, Benoit.

 couchbase-lite can use _bulk_get on the couchbase sync gateway or _changes 
 and other things on couchdb

Couchbase Lite always uses _changes and other things; those are core parts of 
the protocol. _bulk_get is simply an optimization to avoid lots of GET requests 
for individual documents. So it makes sense to ask whether the server supports 
_bulk_get, because the choice is to make one _bulk_get call or a series of GET 
/db/doc calls.
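
The choice can be sketched like this. The function name is hypothetical, and the shape of the _bulk_get request body ({"docs": [{"id": ...}]}) is an assumption made for the example.

```python
import json
from urllib.parse import quote

def plan_doc_requests(db_url, doc_ids, has_bulk_get):
    """Return the (method, url, body) requests needed to fetch doc_ids."""
    if has_bulk_get:
        # One round trip for all documents (assumed body shape).
        body = json.dumps({"docs": [{"id": doc_id} for doc_id in doc_ids]})
        return [("POST", db_url + "/_bulk_get", body)]
    # Otherwise one GET /db/doc per document.
    return [("GET", db_url + "/" + quote(doc_id, safe=""), None)
            for doc_id in doc_ids]

ids = ["doc1", "doc2", "doc3"]
print(len(plan_doc_requests("http://example.com/db", ids, True)))
print(len(plan_doc_requests("http://example.com/db", ids, False)))
```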

 2 capabilities corresponding to 2 well defined api/protocols. Here why not 
 something like REPCOUCHDB01 and REPCOUCHBASE01

There aren't two APIs or protocols. There's one, and there are simply some 
optional capabilities that can optimize it.

 Describing an expected behaviour is a way easier in my opinion than expecting 
 that all applications are able to parse a message in time etc.

I don't know what this means.

 why not getting them by issuing an OPTIONS method to / ?

It's apparently not recommended to use OPTIONS (see the mailing list thread.) 
In RFC2616 the OPTIONS method is really vaguely defined; it seems it's really 
only useful for returning Allow: headers to show what methods are supported. 
I'd be wary of pushing it any further than that.

 Add API for discovering feature availability
 

 Key: COUCHDB-2052
 URL: https://issues.apache.org/jira/browse/COUCHDB-2052
 Project: CouchDB
  Issue Type: Improvement
  Security Level: public(Regular issues) 
  Components: HTTP Interface
Reporter: Jens Alfke

 I propose adding to the response of GET / a property called "features" or 
 "extensions" whose value is an array of strings, each string being an 
 agreed-upon identifier of a specific optional feature. For example:
   {"couchdb": "welcome", "features": ["_bulk_get", "persona"], "vendor": …}
 Rationale:
 Features are being added to CouchDB over time, plug-ins may add features, and 
 there are compatible servers that may have nonstandard features (like 
 _bulk_get). But there isn't a clear way for a client (which might be another 
 server's replicator) to determine what features a server has. Currently a 
 client looking at the response of a GET / has to figure out what server and 
 version thereof it's talking to, and then has to consult hardcoded knowledge 
 that version X of server Y supports feature Z.
 (True, you can often get away without needing to check, by assuming a feature 
 exists but falling back to standard behavior if you get an error. But not all 
 features may be so easy to detect — the behavior of an unaware server might 
 be to ignore the feature and do the wrong thing, rather than returning an 
 error — and anyway this adds extra round-trips that slow down the operation.)





[jira] [Commented] (COUCHDB-2052) Add API for discovering feature availability

2014-02-07 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894643#comment-13894643
 ] 

Jens Alfke commented on COUCHDB-2052:
-

This discussion is starting to smell like bike-shedding to me (and it's not the 
first time that's happened with CouchDB.) 

I raised a fairly straightforward issue -- given different versions and 
implementations that might not all support the same functionality, how does a 
client know whether or not it can use a particular feature/function? -- and 
proposed a straightforward solution.

My solution might not be perfect, but it's clearly specified and very easy to 
implement and to use. The responses here seem to be digressing, and no one is 
proposing anything concrete. The other ideas here also sound like they'd be 
significantly more complex.

Basically, if someone else has an alternative proposal for how to do this, then 
specify it clearly and post it here.



[jira] [Commented] (COUCHDB-2052) Add API for discovering feature availability

2014-02-07 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894677#comment-13894677
 ] 

Jens Alfke commented on COUCHDB-2052:
-

 The replicator attempts to use optimized endpoints of source and target nodes 
 and, on a 404, falls back to less optimal endpoints

This is definitely elegant and Web-like, but not optimal over slow mobile 
connections with high latency. If multiple resources need to be preflighted 
this way, it can start to add noticeable delay to the replication. I'm not 
saying that this approach is unacceptable; just be aware that it has a cost 
(one that CouchDB developers may not be keeping in mind since they're used to 
thinking of replication as being between two _servers_ with fast Ethernet 
connections :)
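
A back-of-the-envelope sketch of that cost, assuming each preflighted resource costs one failed round trip before the fallback request is made. The round-trip figures are hypothetical and chosen only to show the scale of the difference.

```python
def extra_delay_ms(fallback_probes, round_trip_ms):
    """Delay added when every probe costs one wasted round trip."""
    return fallback_probes * round_trip_ms

# Hypothetical latencies: a fast server-to-server link and a slow mobile one.
for label, rtt_ms in (("server to server", 1), ("high-latency mobile", 400)):
    print(label, extra_delay_ms(3, rtt_ms), "ms")
```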



[jira] [Commented] (COUCHDB-2052) Add API for discovering feature availability

2014-02-07 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894697#comment-13894697
 ] 

Jens Alfke commented on COUCHDB-2052:
-

OK, here are examples from the Couchbase Sync Gateway:

(1) WebSocket transport for the changes feed. The query param ?feed=websocket 
is recognized and triggers a change-over to the WebSocket protocol, with the 
server sending a message for every change. If you send such a request to 
CouchDB it will ignore the unrecognized feed type and send back a 'normal' 
changes feed instead; this can trigger a lot of wasted data transfer.

(2) The body of a PUT/POST request sent to the server may be gzip-encoded 
(Content-Encoding: gzip). As far as I can tell the HTTP spec doesn't provide 
any way for the client to discover whether the server supports this. If the 
server doesn't, and also doesn't check the Content-Encoding request header, 
it'll end up trying to read the raw data and either barfing on it or (worse) 
storing it without any indication that it's zipped.

(3) When the Gateway parses an incoming document in multipart/related format, 
it looks at the attachment bodies' Content-Disposition headers to discover 
which named attachments they correspond to, instead of (as CouchDB does or used 
to) assuming that they appear in the same order as the objects in the JSON 
body's `_attachments` dictionary. A client might not be able to generate the 
MIME bodies in exactly the same order as in the dictionary, so it might need to 
check for this capability and abort the replication if it's not there, since 
the alternative is getting the attachments mixed up.
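
As an illustration of example (3), matching attachment bodies by name rather than by position could look something like this. It is a sketch, not the Gateway's code: the function names are invented, and the parts are assumed to have already been split into (headers, body) pairs.

```python
from email.message import Message

def attachment_name(content_disposition):
    """Extract the filename parameter from a Content-Disposition value."""
    msg = Message()
    msg["Content-Disposition"] = content_disposition
    return msg.get_filename()

def match_attachments(parts):
    """Map attachment name to body, ignoring the order of the parts."""
    by_name = {}
    for headers, body in parts:
        value = headers.get("Content-Disposition")
        name = attachment_name(value) if value else None
        if name is None:
            raise ValueError("MIME part does not name its attachment")
        by_name[name] = body
    return by_name

# Parts arrive in a different order than the _attachments dictionary.
parts = [
    ({"Content-Disposition": 'attachment; filename="b.png"'}, b"B"),
    ({"Content-Disposition": 'attachment; filename="a.txt"'}, b"A"),
]
print(match_attachments(parts))
```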



[jira] [Commented] (COUCHDB-2052) Add API for discovering feature availability

2014-02-07 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894701#comment-13894701
 ] 

Jens Alfke commented on COUCHDB-2052:
-

Benoit:
I am thinking that the faster answer to your problem would be listing all the 
URI we have on a node associated to the vendor id and version.

This is not just about knowing what URIs are supported. See my previous post 
for three issues that can't be resolved this way.



[jira] [Commented] (COUCHDB-2052) Add API for discovering feature availability

2014-02-07 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894880#comment-13894880
 ] 

Jens Alfke commented on COUCHDB-2052:
-

Well, CouchDB doesn't implement HTTP 1.1 correctly then...

(1) I tried sending a GET request for the changes feed with an Upgrade header 
and feed=websocket; it ignored the header and sent back the entire changes 
feed in normal format.
(2) I tried sending it a gzip-encoded PUT request body, and it failed with a 
400 status with message invalid_json. Apparently it ignored the 
Content-Encoding header. (I'm still on version 1.4, though.)
(3) You're right that one could check the version, but part of the reason for 
this proposal was to avoid having to have hardcoded knowledge of versions. It's 
not just one version check either -- IrisCouch and Cloudant (and BigCouch?) 
have independent version numbers, so you'd have to know what versions they 
incorporated the fix into.

I don't mean to pick on CouchDB. I'm sure most web server engines, aside from 
the big ones like Apache, don't pay attention to all the more obscure edges of 
the HTTP spec, like honoring Upgrade and Content-Encoding headers in requests.



[jira] [Created] (COUCHDB-2054) Content-Encoding on requests is ignored; should decode or return 415

2014-02-07 Thread Jens Alfke (JIRA)
Jens Alfke created COUCHDB-2054:
---

 Summary: Content-Encoding on requests is ignored; should decode or 
return 415
 Key: COUCHDB-2054
 URL: https://issues.apache.org/jira/browse/COUCHDB-2054
 Project: CouchDB
  Issue Type: Bug
  Security Level: public (Regular issues)
  Components: HTTP Interface
Reporter: Jens Alfke


CouchDB (as of 1.4) seems to ignore the Content-Encoding header on requests, 
and just parses the request body with no decoding. This causes incorrect 
behavior — most often it tries to parse the encoded data as JSON and will fail 
and return 400.

CouchDB should either decode the request body (e.g. unzip it), or else return a 
415 status. Decoding would be quite useful: requests with large JSON bodies 
(like revs_diff or the POST form of all_docs) can have their size cut in half 
by gzip encoding.
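
The requested behavior (decode the body, or else return 415) can be sketched in a few lines. This is a generic illustration, not CouchDB code: the function name and the (status, value) return shape are made up for the example.

```python
import gzip
import json
import zlib

def read_json_body(headers, raw_body):
    """Return (status, parsed_json_or_None) for a request body."""
    encoding = headers.get("Content-Encoding", "identity").strip().lower()
    if encoding == "gzip":
        try:
            raw_body = gzip.decompress(raw_body)
        except (OSError, EOFError, zlib.error):
            return 400, None          # claimed gzip but was not valid gzip
    elif encoding != "identity":
        return 415, None              # encoding not supported: say so
    try:
        return 200, json.loads(raw_body)
    except ValueError:
        return 400, None              # body is not valid JSON

body = gzip.compress(b'{"keys": ["a", "b"]}')
print(read_json_body({"Content-Encoding": "gzip"}, body))
print(read_json_body({"Content-Encoding": "br"}, b"..."))
```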

HTTP 1.1 spec for Content-Encoding:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.11





[jira] [Commented] (COUCHDB-2052) Add API for discovering feature availability

2014-02-07 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894971#comment-13894971
 ] 

Jens Alfke commented on COUCHDB-2052:
-

Filed COUCHDB-2054 for the Content-Encoding handling.



[jira] [Commented] (COUCHDB-2052) Add API for discovering feature availability

2014-02-07 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894944#comment-13894944
 ] 

Jens Alfke commented on COUCHDB-2052:
-

 An HTTP server is free to ignore an Upgrade request for a protocol it doesn't 
 support.

It still seems ugly to me that the server will start generating the changes 
feed and sending out changes until the client sees the non-101 response code 
and closes the socket.

 That's a bug to fix.

Yeah, but it tells me that it's perhaps not realistic to rely on subtleties of 
HTTP negotiation for detecting features.

 I don't see much difference between checking version  foo vs 
 features.contains

In this example there are at least three different vendor-and-version tests 
involved. Possibly more; for instance I don't know if rcouch has an independent 
version numbering scheme, and if so which version merged in the patch that 
fixed multipart parsing.



[jira] [Created] (COUCHDB-2052) Add API for discovering feature availability

2014-02-06 Thread Jens Alfke (JIRA)
Jens Alfke created COUCHDB-2052:
---

 Summary: Add API for discovering feature availability
 Key: COUCHDB-2052
 URL: https://issues.apache.org/jira/browse/COUCHDB-2052
 Project: CouchDB
  Issue Type: Improvement
  Security Level: public (Regular issues)
  Components: HTTP Interface
Reporter: Jens Alfke


I propose adding to the response of GET / a property called "features" or 
"extensions" whose value is an array of strings, each string being an 
agreed-upon identifier of a specific optional feature. For example:
{"couchdb": "welcome", "features": ["_bulk_get", "persona"], "vendor": …}

Rationale:
Features are being added to CouchDB over time, plug-ins may add features, and 
there are compatible servers that may have nonstandard features (like 
_bulk_get). But there isn't a clear way for a client (which might be another 
server's replicator) to determine what features a server has. Currently a 
client looking at the response of a GET / has to figure out what server and 
version thereof it's talking to, and then has to consult hardcoded knowledge 
that version X of server Y supports feature Z.

(True, you can often get away without needing to check, by assuming a feature 
exists but falling back to standard behavior if you get an error. But not all 
features may be so easy to detect — the behavior of an unaware server might be 
to ignore the feature and do the wrong thing, rather than returning an error — 
and anyway this adds extra round-trips that slow down the operation.)
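
On the client side, using such a property might look like the sketch below. The function name is hypothetical, and treating a missing or malformed property as "no optional features" is an assumption about how a client would want to degrade against servers that predate the proposal.

```python
import json

def server_features(welcome_body):
    """Return the set of feature identifiers advertised by GET /."""
    info = json.loads(welcome_body)
    features = info.get("features", [])
    # Servers that do not know about the property advertise nothing.
    return set(features) if isinstance(features, list) else set()

welcome = '{"couchdb": "welcome", "features": ["_bulk_get", "persona"]}'
print("_bulk_get" in server_features(welcome))                      # prints: True
print("_bulk_get" in server_features('{"couchdb": "welcome"}'))     # prints: False
```

One GET / per server replaces both the vendor-and-version lookup table and the per-feature probing.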





[jira] [Commented] (COUCHDB-2052) Add API for discovering feature availability

2014-02-06 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893472#comment-13893472
 ] 

Jens Alfke commented on COUCHDB-2052:
-

This isn't just about expressing what URL endpoints exist. There are features 
that don't just add URLs; they might add query options to an existing URL, or 
they might manifest in other ways entirely, like the 'channels' feature of the 
Couchbase Sync Gateway.



[jira] [Created] (COUCHDB-1938) API doc for _replicate doesn't describe dictionary form of source/target

2013-11-21 Thread Jens Alfke (JIRA)
Jens Alfke created COUCHDB-1938:
---

 Summary: API doc for _replicate doesn't describe dictionary form 
of source/target
 Key: COUCHDB-1938
 URL: https://issues.apache.org/jira/browse/COUCHDB-1938
 Project: CouchDB
  Issue Type: Bug
  Components: Documentation
Reporter: Jens Alfke


The API documentation for replication only describes the source and target 
properties as being strings containing database names or URLs. It doesn't 
describe the enhanced form where these are dictionaries/objects that can 
contain additional parameters like authentication settings or extra HTTP 
headers.
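
As a rough illustration of the enhanced form, a request body with a dictionary-valued source might be built like this. The "url" and "headers" property names follow my recollection of the wiki page linked below and should be treated as assumptions; the function name, URLs and credentials are placeholders.

```python
import base64
import json

def replication_request(source_url, target, username, password):
    """Build a _replicate body whose source is an object, not a string."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return json.dumps({
        "source": {
            "url": source_url,
            # Extra HTTP headers to send to the source (assumed property name).
            "headers": {"Authorization": "Basic " + token.decode("ascii")},
        },
        "target": target,   # the plain string form is still allowed
    })

print(replication_request("http://example.com/db", "localdb", "user", "pass"))
```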

The current docs that are missing this info:
http://docs.couchdb.org/en/latest/api/server/common.html#replicate

Unofficial docs from the wiki that describe the enhanced form:
http://wiki.apache.org/couchdb/Replication#Authentication





[jira] [Commented] (COUCHDB-1479) Futon config UI won't allow WWW-Authenticate option to be added (name is lowercased)

2013-10-09 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790609#comment-13790609
 ] 

Jens Alfke commented on COUCHDB-1479:
-

This seems to have gotten worse in CouchDB 1.4. Now, even if I edit the config 
file by hand, the key gets lowercased as soon as CouchDB starts. So the file is 
rewritten with the lowercased key, and the config setting doesn't take effect 
at all.

So it appears that in CouchDB 1.4 the only way to configure HTTP auth is to go 
through Futon (or the REST API) to set this key -- it can't be set in the 
config file, and it won't persist across launches.
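
For anyone wanting to try the REST route, the request might be constructed roughly as follows. This assumes the /_config/{section}/{key} endpoint taking a JSON string as its body; the host is a placeholder, the function name is made up, and the request is only built here, not sent.

```python
import json
import urllib.request

def config_put_request(server_url, section, key, value):
    """Build (but do not send) a PUT that sets one configuration option."""
    url = f"{server_url}/_config/{section}/{key}"
    return urllib.request.Request(
        url,
        data=json.dumps(value).encode("utf-8"),   # body is a JSON string
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

req = config_put_request("http://localhost:5984", "httpd",
                         "WWW-Authenticate", "Basic")
print(req.get_method(), req.full_url)
```

Note that the key keeps its capitalization in the URL; whether it survives a restart is the bug described above.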

 Futon config UI won't allow WWW-Authenticate option to be added (name is 
 lowercased)
 --

 Key: COUCHDB-1479
 URL: https://issues.apache.org/jira/browse/COUCHDB-1479
 Project: CouchDB
  Issue Type: Bug
  Components: Futon
 Environment: Mac OS X
Reporter: Jens Alfke
Priority: Minor

 When using the config UI in futon to add a new option, via the "Add a new 
 section..." link at the bottom of the page, the name of the option is 
 lowercased when written to the .ini file. (For some reason the case is 
 preserved when altering the runtime configuration, though, so the problem 
 doesn't manifest itself until the next time the server is restarted.)
 This causes trouble when attempting to enable HTTP basic auth by adding a 
 WWW-Authenticate option (value Basic) to the [httpd] section. The actual 
 data written to the .ini file is:
 [httpd]
 www-authenticate = Basic
 which is not recognized when the server loads its configuration on restart.





[jira] [Commented] (COUCHDB-1824) Official documentation of replication algorithm?

2013-09-28 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780921#comment-13780921
 ] 

Jens Alfke commented on COUCHDB-1824:
-

Then do I need to apply for push access to the repo? Or is there a procedure 
where I submit a patch or something like a GitHub pull request?

 Official documentation of replication algorithm?
 

 Key: COUCHDB-1824
 URL: https://issues.apache.org/jira/browse/COUCHDB-1824
 Project: CouchDB
  Issue Type: Documentation
  Components: Documentation
Reporter: Nathan Vander Wilt
Assignee: Alexander Shorin
 Fix For: 1.5.0


 Though it's in some ways an internal detail, it might be nice to provide a 
 canonical description of CouchDB's replication protocol (algorithm, really) 
 in the documentation. See links at: 
 http://wiki.apache.org/couchdb/Replication#Protocol_Documentation

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (COUCHDB-1824) Official documentation of replication algorithm?

2013-06-21 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13690407#comment-13690407
 ] 

Jens Alfke commented on COUCHDB-1824:
-

They're both needed for interoperability. The HTTP calls used by the replicator 
are documented now, but only on the unofficial wiki, and only because I added 
documentation along the way while I wrote the TouchDB replicator. At the time a 
lot of the info existed only in the source code and in the heads of people like 
Filipe and Damien.

My unstated assumption here is that replication interoperability is a good 
thing, and that other database implementations should support the same protocol 
and algorithms so they can freely replicate with CouchDB and each other. That's 
very powerful, and I don't know of any other open protocols for replication.



[jira] [Commented] (COUCHDB-1824) Official documentation of replication algorithm?

2013-06-20 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689440#comment-13689440
 ] 

Jens Alfke commented on COUCHDB-1824:
-

BTW, this is not at all an internal detail, it's important for 
interoperability. Knowing how the replication protocol/algorithm works is 
crucial for any 3rd party software (like PouchDB / TouchDB / Couchbase Lite) 
that wants to be able to replicate with CouchDB.



[jira] [Commented] (COUCHDB-1824) Official documentation of replication algorithm?

2013-06-20 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689497#comment-13689497
 ] 

Jens Alfke commented on COUCHDB-1824:
-

I'm just pointing out that this is more significant than the original 
description implies.



[jira] [Commented] (COUCHDB-1824) Official documentation of replication algorithm?

2013-06-18 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13686861#comment-13686861
 ] 

Jens Alfke commented on COUCHDB-1824:
-

My latest docs are at 
https://github.com/couchbase/couchbase-lite-ios/wiki/Replication-Algorithm
Feel free to reuse any of that text.

 Official documentation of replication algorithm?
 

 Key: COUCHDB-1824
 URL: https://issues.apache.org/jira/browse/COUCHDB-1824
 Project: CouchDB
  Issue Type: Bug
  Components: Documentation
Reporter: Nathan Vander Wilt

 Though it's in some ways an internal detail, it might be nice to provide a 
 canonical description of CouchDB's replication protocol (algorithm, really) 
 in the documentation. See links at: 
 http://wiki.apache.org/couchdb/Replication#Protocol_Documentation



[jira] [Created] (COUCHDB-1670) Replicator crashes if numbers in checkpoint docs are expressed in scientific notation

2013-02-06 Thread Jens Alfke (JIRA)
Jens Alfke created COUCHDB-1670:
---

 Summary: Replicator crashes if numbers in checkpoint docs are 
expressed in scientific notation
 Key: COUCHDB-1670
 URL: https://issues.apache.org/jira/browse/COUCHDB-1670
 Project: CouchDB
  Issue Type: Bug
  Components: Replication
Reporter: Jens Alfke


The CouchDB 1.2 replicator process crashes with an Erlang exception when 
parsing a checkpoint document read back from a remote database, if numbers in 
the document were JSON-encoded in scientific notation instead of as integers. 
This includes the properties source_last_seq, end_last_seq, start_last_seq.

That is, the following encoding works fine:
..., "source_last_seq": 1234567, ...
whereas this completely-equivalent encoding causes an exception:
..., "source_last_seq": 1.234567e+06, ...

This issue surfaced in a CouchDB-compatible engine I'm 
writing (the Couchbase Sync Gateway), which can serve as a passive replication 
endpoint. It's implemented in Go, and the Go JSON package has the side effect 
of (a) parsing all JSON numbers into type 'double' (float64 in Go), and (b) 
encoding all doubles into JSON using scientific notation if they're more than 
six digits long. The net effect is that when CouchDB stores a checkpoint into 
the Sync Gateway's database and then later reads it back, it barfs due to the scientific 
notation.
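
The two encodings above denote the same JSON number, so a tolerant reader can normalize either form to an integer sequence. A minimal Python sketch, assuming integer sequence numbers as in the examples above; the helper name `parse_seq` is hypothetical and is not part of CouchDB or the Sync Gateway:

```python
import json

def parse_seq(value):
    """Normalize a decoded JSON number to an int sequence number."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"sequence must be a JSON number, got {value!r}")
    if isinstance(value, float):
        if not value.is_integer():
            raise ValueError(f"sequence is not a whole number: {value!r}")
        return int(value)
    return value

plain = json.loads('{"source_last_seq": 1234567}')
scientific = json.loads('{"source_last_seq": 1.234567e+06}')

seq_plain = parse_seq(plain["source_last_seq"])
seq_scientific = parse_seq(scientific["source_last_seq"])
print(seq_plain, seq_scientific)
```

Rejecting non-integral floats keeps the helper from silently truncating a value that was never a valid sequence number.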



[jira] [Commented] (COUCHDB-1670) Replicator crashes if numbers in checkpoint docs are expressed in scientific notation

2013-02-06 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13572655#comment-13572655
 ] 

Jens Alfke commented on COUCHDB-1670:
-

 since is opaque. You have to pass back exactly what you got from couchdb.

I don't think that's a reasonable expectation. The JSON is going to be 
transformed anyway (to insert the _rev), so at some point it's going to be 
translated into an internal format and then regenerated. The output has to be 
an equivalent JSON document, but that doesn't mean byte-for-byte equivalence. 
For instance, object keys could be in a different order, Unicode escapes could 
be turned into literals or vice versa, and numbers might be represented 
differently, like changing to/from scientific notation or suppressing trailing 
zeros after the decimal point.
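
As an illustration of equivalence without byte-for-byte identity, the following Python sketch compares two JSON texts that differ in key order, Unicode escaping, and number notation. The document contents are made up for the example.

```python
import json

original = '{"_id": "doc1", "name": "caf\\u00e9", "seq": 1234567, "ratio": 2.50}'
regenerated = '{"seq": 1.234567e+06, "ratio": 2.5, "name": "café", "_id": "doc1"}'

# The raw texts differ...
same_bytes = original == regenerated
# ...but the decoded values compare equal.
same_value = json.loads(original) == json.loads(regenerated)

print(same_bytes, same_value)
```

A consumer that compares decoded values rather than raw text is unaffected by any of these re-encodings.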



[jira] [Commented] (COUCHDB-1368) multipart/related document body doesn't identify which part is which attachment

2012-11-14 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13497240#comment-13497240
 ] 

Jens Alfke commented on COUCHDB-1368:
-

Where is the branch? I don't see it in the GitHub UI at 
https://github.com/apache/couchdb .

Also, could you post a sample of what the MIME headers look like for an 
attachment part?

 multipart/related document body doesn't identify which part is which 
 attachment
 ---

 Key: COUCHDB-1368
 URL: https://issues.apache.org/jira/browse/COUCHDB-1368
 Project: CouchDB
  Issue Type: Bug
  Components: HTTP Interface
Reporter: Jens Alfke
Priority: Minor

 If you GET a document with attachments in multipart/related format (by adding 
 ?attachments=true and setting Accept:multipart/related), the MIME bodies for 
 the attachments have no headers. This makes it difficult to tell which one is 
 which. Damien says they're in the same order that they appear in the 
 document's _attachments object ... which is fine if you're Erlang, because 
 Erlang preserves the order of keys in a JSON object, but no other JSON 
 implementation I know of does that (because they use hashtables instead of 
 linked lists.)
 The upshot is that any non-Erlang code trying to parse such a response will 
 have to do some by-hand parsing of the JSON data to get the _attachment keys 
 in order.
 This can be fixed by adding a Content-ID header to each attachment body, 
 whose value is the filename. It would be nice if other standard headers were 
 added too, like Content-Type, Content-Length, Content-Encoding, as this 
 would make it work better with existing MIME multipart libraries.



[jira] [Commented] (COUCHDB-1368) multipart/related document body doesn't identify which part is which attachment

2012-11-14 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13497418#comment-13497418
 ] 

Jens Alfke commented on COUCHDB-1368:
-

It turns out Content-ID is not the correct header to use for the filename, 
because according to RFC2045 sec.7, Content-ID values must be generated to be 
world-unique. (I didn't know this when writing up this issue, but discovered 
it later on while implementing MIME support for TouchDB. I should have updated 
this issue too; sorry!)

The most appropriate header to use seems to be Content-Disposition (RFC1806):

Content-Disposition: attachment; filename=test.txt

This is what TouchDB generates, and what it will recognize in incoming MIME 
documents.
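
To show how a client could use such a header, here is a small Python sketch using the standard `email` package. It builds a multipart/related body with one named attachment and then recovers the attachment by filename. The document and attachment contents are invented for the example, and this is not how CouchDB or TouchDB generate their responses internally.

```python
from email import message_from_bytes
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart

# Build a multipart/related body: JSON document first, then one attachment.
body = MIMEMultipart("related")
body.attach(MIMEApplication(b'{"_id": "doc1"}', "json"))

attachment = MIMEApplication(b"hello", "octet-stream")
attachment.add_header("Content-Disposition", "attachment", filename="test.txt")
body.attach(attachment)

# Parse it back and map each named part to its decoded payload.
parsed = message_from_bytes(body.as_bytes())
attachments = {
    part.get_filename(): part.get_payload(decode=True)
    for part in parsed.get_payload()
    if part.get_filename() is not None
}
print(attachments)
```

With the filename carried on each part, the receiver no longer depends on the order of keys in the `_attachments` object.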



[jira] [Commented] (COUCHDB-1259) Replication ID is not stable if local server has a dynamic port number

2012-11-05 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13490886#comment-13490886
 ] 

Jens Alfke commented on COUCHDB-1259:
-

I have looked at the patch but I don't really understand what it's doing, both 
because my Erlang is really weak and because I don't know the internals of 
CouchDB. So I can't really comment on the code.

It does sound like what's being suggested goes beyond what I asked for. This 
bug is about the *local* server (the one running the replication) having a 
different IP address or port than the last time. The suggested patches seem to 
also cover changes to the *remote* server's URL. That's an interesting issue 
but IMHO not the same thing.

The point of this bug is that the URL of the *local* server running the 
replication is irrelevant to the replication. If I'm opening connections to 
another server to replicate with it, it doesn't matter what port or IP address 
I am listening on, because there aren't any incoming connections happening. 
They don't affect the replication at all.

As for Benoit's security issues: Replication has no security. Security applies 
at a more fundamental level of identifying who is connecting and authenticating 
that principal. You absolutely cannot make security tests based on IP addresses 
or port numbers.

 Replication ID is not stable if local server has a dynamic port number
 --

 Key: COUCHDB-1259
 URL: https://issues.apache.org/jira/browse/COUCHDB-1259
 Project: CouchDB
  Issue Type: Bug
  Components: Replication
Affects Versions: 1.1
Reporter: Jens Alfke
Assignee: Robert Newson
Priority: Blocker
 Fix For: 1.3

 Attachments: couchdb-1259.patch, couchdb-1259.patch


 I noticed that when Couchbase Mobile running on iOS replicates to/from a 
 remote server (on iriscouch in this case), the replication has to fetch the 
 full _changes feed every time it starts. Filipe helped me track down the 
 problem -- the replication ID is coming out different every time. The reason 
 for this is that the local port number, which is one of the inputs to the 
 hash that generates the replication ID, is randomly assigned by the OS. (I.e. 
 it uses a port number of 0 when opening its listener socket.) This is because 
 there could be multiple apps using Couchbase Mobile running on the same 
 device and we can't have their ports colliding.
 The underlying problem is that CouchDB is attempting to generate a unique ID 
 for a particular pair of {source, destination} databases, but it's basing it 
 on attributes that aren't fundamental to the database and can change, like 
 the hostname or port number.
 One solution, proposed by Filipe and me, is to assign each database (or each 
 server?) a random UUID when it's created, and use that to generate 
 replication IDs.
 Another solution, proposed by Damien, is to have CouchDB let the client work 
 out the replication ID on its own, and set it as a property in the 
 replication document (or the JSON body of a _replicate request.) This is even 
 more flexible and will handle tricky scenarios like full P2P replication 
 where there may be no low-level way to uniquely identify the remote database 
 being synced with.
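
The difference between the two approaches can be sketched as follows. This is an illustration in Python only: the function `replication_id`, its inputs, the choice of SHA-256, and the URL are all assumptions for the example, and the exact inputs and hash used by CouchDB are not reproduced here.

```python
import hashlib
import uuid

def replication_id(*parts):
    """Hash the given parts into a hex replication ID (illustrative only)."""
    joined = "\x00".join(str(p) for p in parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

source = "localdb"
target = "https://example.iriscouch.com/remotedb"  # made-up URL

# Address-based: the OS-assigned port changes on every launch, so the ID does too.
id_launch_1 = replication_id("localhost", 51234, source, target)
id_launch_2 = replication_id("localhost", 49876, source, target)

# UUID-based: the UUID is created once with the database and then persisted.
db_uuid = uuid.uuid4().hex
id_uuid_1 = replication_id(db_uuid, source, target)
id_uuid_2 = replication_id(db_uuid, source, target)

print(id_launch_1 == id_launch_2, id_uuid_1 == id_uuid_2)
```

Only the UUID-based ID survives a relaunch, which is what lets the replicator find its previous checkpoint instead of re-reading the whole _changes feed.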



[jira] [Commented] (COUCHDB-1259) Replication ID is not stable if local server has a dynamic port number

2012-11-04 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13490352#comment-13490352
 ] 

Jens Alfke commented on COUCHDB-1259:
-

Benoit: I think you're misunderstanding the issue. This isn't something about 
P2P. It's just that if the local CouchDB is not *listening* on a fixed port 
number, then replications made by that server to/from another server aren't 
handled efficiently ... even though the local server's port number has nothing 
at all to do with the replication (since it's the one making the connections.)

In a real P2P case, this change makes even more sense, because the addresses of 
the servers are unimportant -- as you said, it's the databases and their data 
that are the important thing. A UUID helps identify those.



[jira] [Commented] (COUCHDB-1259) Replication ID is not stable if local server has a dynamic port number

2012-11-04 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13490427#comment-13490427
 ] 

Jens Alfke commented on COUCHDB-1259:
-

 It is the responsibility of the application to know that. As a protocol the 
 replication shouldn't force this way imo.

Then how does the application do this? I haven't seen any API for it.

Also, I don't see how this has anything to do with the case of a leaf node 
running a server that happens to have a dynamic port assignment. The port this 
node is running on has absolutely nothing to do with the replication. In the 
(now obsolete) case of Couchbase Mobile, the server doesn't even accept 
external requests, so its port number is purely an internal affair.

I still have a feeling that we're talking about completely different things. 
But I can't really figure out what your point is...



[jira] [Commented] (COUCHDB-1259) Replication ID is not stable if local server has a dynamic port number

2012-11-04 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13490441#comment-13490441
 ] 

Jens Alfke commented on COUCHDB-1259:
-

 btw your example with couchbase mobile is generally solved by using the 
 replication in pull mode only. So here it is relying on a fixed address to 
 replicate.

*sigh* No, that is exactly the situation I was describing. The mobile client is 
the only one initiating replication; it pulls from the central (fixed-address) 
server, and pushes changes to it. So the mobile device's IP address and port 
are irrelevant, right? Except that the replication state document stored in 
_local has an ID based on several things _including_ the local server's address 
and port number. So the effect is that, every time the app launches, all the 
replication state gets lost/invalidated, and it has to start over again the 
next time it replicates.

TouchDB doesn't have this problem because I didn't write it with this design 
flaw :) Instead every local database has a UUID as suggested here, and that's 
used as part of the key.



[jira] [Commented] (COUCHDB-1259) Replication ID is not stable if local server has a dynamic port number

2012-11-04 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13490448#comment-13490448
 ] 

Jens Alfke commented on COUCHDB-1259:
-

It seems like overkill to get the IANA to assign a fixed port number to an app 
that doesn't even listen on any external interfaces! The only use of that port 
is (was) over the loopback interface to let the application communicate with 
CouchDB.

Passing zero for the port in the config file didn't make the problem go away. 
Apparently the replicator bases the ID on the actual random port number in use, 
not on the fixed 0 from the config.
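
The gap between the configured port and the port actually in use can be seen with a plain socket. This is a generic Python sketch, not CouchDB code: binding to port 0 asks the OS for any free port, and the real number is only known after the bind.

```python
import socket

def listen_on_ephemeral_port():
    """Bind to port 0 on loopback and return (socket, actual_port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))
    sock.listen(1)
    return sock, sock.getsockname()[1]

configured_port = 0
sock, actual_port = listen_on_ephemeral_port()
print(configured_port, actual_port)
sock.close()
```

Any identifier derived from `actual_port` will vary between launches, while one derived from `configured_port` would not.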

 What will prevents an hostile node to connect back to your node with the same 
 id?

Hello, are you listening at all to what I'm writing? I've already said several 
times that the app does not accept incoming connections at all. It only makes 
outgoing connections to replicate.

And in general: obviously in any real P2P app there would be actual security 
measures in place to authenticate connections, most likely by using both server 
and client SSL certs and verifying their public keys. Once the connection is 
made, _then_ database IDs can be used to restore the state of a replication.



[jira] [Commented] (COUCHDB-1584) Allow passing of open_doc parameters to _all_docs

2012-11-01 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488911#comment-13488911
 ] 

Jens Alfke commented on COUCHDB-1584:
-

This may actually not be sufficient to let the replicator fetch revisions in 
bulk. The problem is that _all_docs takes an array of docids, but not revids — 
so the caller has no control over which revision of a document to get; they'll 
always get the winning one. So
(a) If a document is in conflict, the replicator will still have to use 
single-revision GETs to fetch the non-winning revision(s).
(b) There can be race conditions where a document is updated after the _changes 
feed is sent, so the _all_docs request will return that new revision, not the 
one the replicator knows about.

I don't think either of these cases will be all that common; it just means the 
replicator will have to be a bit careful to check the revids in the response 
from _all_docs, and possibly fetch some revisions one-by-one if it didn't get 
the right ones.
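
The revid check described above could look roughly like this. This is only a sketch in Python, not the replicator's actual code: the function name and the `expected` map (docid to the revid learned from the _changes feed) are made up for illustration, and it assumes each _all_docs row reports the winning revision in value.rev, with rows for missing docs carrying an "error" key instead.

```python
def revisions_to_refetch(expected, all_docs_rows):
    """Return {docid: revid} for revisions the bulk response did not deliver.

    expected      -- hypothetical map of docid -> revid taken from _changes
    all_docs_rows -- the "rows" array of an _all_docs response
    """
    got = {}
    for row in all_docs_rows:
        value = row.get("value")
        # Rows for unknown docs have an "error" key and no "value".
        if value is not None:
            got[row["id"]] = value["rev"]
    # Anything missing, or present with a different (newer or winning) rev,
    # still needs a single-revision GET.
    return {docid: rev for docid, rev in expected.items()
            if got.get(docid) != rev}
```

Anything returned by this function would then be fetched one-by-one, which covers both the conflict case (a) and the race case (b).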

 Allow passing of open_doc parameters to _all_docs
 -

 Key: COUCHDB-1584
 URL: https://issues.apache.org/jira/browse/COUCHDB-1584
 Project: CouchDB
  Issue Type: New Feature
Affects Versions: 1.2
Reporter: Jan Lehnardt
Priority: Minor

 GET /_all_docs should take the same arguments as GET /db/doc
 /_all_docs?revisions=true
 /_all_docs?revs_info=true
 See http://wiki.apache.org/couchdb/HTTP_Document_API#GET for details



[jira] [Commented] (COUCHDB-1584) Allow passing of open_doc parameters to _all_docs

2012-10-31 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487944#comment-13487944
 ] 

Jens Alfke commented on COUCHDB-1584:
-

+1. A major benefit of this would be to enable significant speedups to 'pull' 
replication, since the replicator would be able to fetch new revisions in bulk. 
(Presently it has to fetch them individually, AFAIK, because that's the only 
way to get the rev history.)



[jira] [Commented] (COUCHDB-1584) Allow passing of open_doc parameters to _all_docs

2012-10-31 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487960#comment-13487960
 ] 

Jens Alfke commented on COUCHDB-1584:
-

Sure. I've worked a bit with the JS tests before; I haven't actually modified 
the code yet, but it looks straightforward enough.

I can even do it in proper TDD fashion and write the tests before I have your 
implementation, then watch them fail :)



[jira] [Commented] (COUCHDB-1584) Allow passing of open_doc parameters to _all_docs

2012-10-31 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488332#comment-13488332
 ] 

Jens Alfke commented on COUCHDB-1584:
-

@Jan, I've emailed you a patch that adds the tests.



[jira] [Resolved] (COUCHDB-1570) Replicator writes corrupt remote checkpoint document on error; breaks replication

2012-10-29 Thread Jens Alfke (JIRA)

 [ 
https://issues.apache.org/jira/browse/COUCHDB-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jens Alfke resolved COUCHDB-1570.
-

Resolution: Invalid

 Replicator writes corrupt remote checkpoint document on error; breaks 
 replication
 -

 Key: COUCHDB-1570
 URL: https://issues.apache.org/jira/browse/COUCHDB-1570
 Project: CouchDB
  Issue Type: Bug
  Components: Replication
Affects Versions: 1.2
 Environment: Mac OS X 10.8.2
Reporter: Jens Alfke
Assignee: Randall Leeds
Priority: Blocker
 Fix For: 1.3


 If a 'push' replication receives error responses from the remote server while 
 PUTting documents, it may write a corrupt checkpoint document to the remote 
 server -- its _revisions.ids property is null. Any subsequent attempt to 
 push to that server will cause the replicator to abort with an error when 
 reading that replication document, since it requires _revisions.ids to be an 
 array.
 The only workaround for this, short of deleting the remote database entirely, 
 is to (a) identify the URL of the remote checkpoint document, and (b) delete 
 it from the remote server. Otherwise you will never be able to push to that 
 database again.
 Here's a failed replication attempt:
 $ curl -X POST --user snej :5984/_replicate --header 
 Content-Type:application/json --data 
 '{"source":"demo-shopping-attachments","target":"http://localhost:4984/demo-shopping-attachments","create_target":true}'
 {"error":"doc_validation","reason":"_revisions.ids isn't a array."}
 Here's the contents of the remote checkpoint document, once I identified its 
 ID:
 $ curl :4984/demo-shopping-attachments/_local/5b913befe682d7bd1fbc24b1ce31cbc5
 {"_id":"_local/5b913befe682d7bd1fbc24b1ce31cbc5","_rev":"0-1","_revisions":{"ids":null,"start":0},
  "history":[{"doc_write_failures":2,"docs_read":2,"docs_written":0,"end_last_seq":207,
  "end_time":"Sun, 21 Oct 2012 19:40:04 GMT","missing_checked":17,"missing_found":2,
  "recorded_seq":207,"session_id":"9a24dc37885b5b4e507ea90702b2fecf","start_last_seq":0,
  "start_time":"Sun, 21 Oct 2012 19:40:04 GMT"}],
  "replication_id_version":2,"session_id":"9a24dc37885b5b4e507ea90702b2fecf","source_last_seq":207}



[jira] [Commented] (COUCHDB-1570) Replicator writes corrupt remote checkpoint document on error; breaks replication

2012-10-29 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486126#comment-13486126
 ] 

Jens Alfke commented on COUCHDB-1570:
-

Sorry, this was my bug — CouchDB sends that _revisions property as part of the 
body of the PUT to the _local document, but it shouldn't be saved in my 
database. I added code to BaseCouch to strip underscored properties other than 
_rev, and that fixed the problem.
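
For illustration, the kind of filtering described above might look like the following. This is a hedged sketch in Python, not BaseCouch's actual code; the function name and the `keep` parameter are hypothetical.

```python
def strip_reserved(doc, keep=("_rev",)):
    """Drop underscore-prefixed properties from a _local document body,
    except those listed in `keep` (assumed here to be just _rev)."""
    return {key: value for key, value in doc.items()
            if not key.startswith("_") or key in keep}
```

Applied to the checkpoint body that CouchDB PUTs, this would discard _revisions before the document is stored, so it is never echoed back on the next GET.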



[jira] [Commented] (COUCHDB-1570) Replicator writes corrupt remote checkpoint document on error; breaks replication

2012-10-29 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486230#comment-13486230
 ] 

Jens Alfke commented on COUCHDB-1570:
-

The problem was with BaseCouch's code for updating _local documents; it wasn't 
stripping out underscored property names like _revisions. I'm still not sure 
why CouchDB was sending a _revisions property in the PUT, but BaseCouch 
shouldn't have been storing it, or echoing it back on the next GET. Sorry for 
the confusion!



[jira] [Updated] (COUCHDB-1570) Replicator writes corrupt remote checkpoint document on error; breaks replication

2012-10-23 Thread Jens Alfke (JIRA)

 [ 
https://issues.apache.org/jira/browse/COUCHDB-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jens Alfke updated COUCHDB-1570:


Summary: Replicator writes corrupt remote checkpoint document on error; 
breaks replication  (was: Replicator writes corrupt remote checkpoint document 
on error; breaks revision forever)



[jira] [Created] (COUCHDB-1529) Add API to find the deleted branches of a doc's rev tree

2012-09-04 Thread Jens Alfke (JIRA)
Jens Alfke created COUCHDB-1529:
---

 Summary: Add API to find the deleted branches of a doc's rev tree
 Key: COUCHDB-1529
 URL: https://issues.apache.org/jira/browse/COUCHDB-1529
 Project: CouchDB
  Issue Type: Improvement
Affects Versions: 1.2
Reporter: Jens Alfke
Priority: Minor


There's no efficient way to discover the complete revision tree of a document. 
API options like ?open_revs=all or ?conflicts will only return active/undeleted 
revisions; it won't discover deleted branches resulting from already-resolved 
conflicts.

There should be an extra option, like ?include_deleted=true, to use along 
with open_revs to tell it to return even deleted leaf revisions.



[jira] [Commented] (COUCHDB-1529) Add API to find the deleted branches of a doc's rev tree

2012-09-04 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448249#comment-13448249
 ] 

Jens Alfke commented on COUCHDB-1529:
-

Note: It's possible to do this currently by iterating through the database's 
entire _changes feed, since it includes deletions. That's expensive, though.



[jira] [Created] (COUCHDB-1530) Add a mode to _all_docs to include deleted docs

2012-09-04 Thread Jens Alfke (JIRA)
Jens Alfke created COUCHDB-1530:
---

 Summary: Add a mode to _all_docs to include deleted docs
 Key: COUCHDB-1530
 URL: https://issues.apache.org/jira/browse/COUCHDB-1530
 Project: CouchDB
  Issue Type: Improvement
Affects Versions: 1.2
Reporter: Jens Alfke
Priority: Minor


There's currently no efficient way to discover the deleted documents in a 
database, since _all_docs only returns active documents. It can be useful to 
show all documents, even deleted ones, for troubleshooting or forensic purposes.

I propose adding an option like ?include_deleted=true to _all_docs that will 
cause it to return deleted documents as well. To provide a way to tell deleted 
docs apart from active ones (if the include_docs option isn't on), the value 
object should include a "deleted":true property for those docs.



[jira] [Commented] (COUCHDB-1530) Add a mode to _all_docs to include deleted docs

2012-09-04 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448271#comment-13448271
 ] 

Jens Alfke commented on COUCHDB-1530:
-

Note: It's possible to find the deleted documents by iterating over the 
database's entire _changes feed, but it's expensive.
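
As a rough sketch of that workaround in Python: given an already-parsed _changes response, deleted documents are the rows flagged with "deleted": true. The function name is hypothetical, and fetching the feed itself is left out.

```python
def deleted_doc_ids(changes_response):
    """Return the ids of rows marked deleted in a parsed _changes response."""
    return [row["id"]
            for row in changes_response.get("results", [])
            if row.get("deleted")]
```

The cost is that every row of the feed has to be transferred and scanned, which is why a dedicated _all_docs option would be cheaper.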



[jira] [Created] (COUCHDB-1521) multipart parser gets multiple attachments mixed up

2012-08-08 Thread Jens Alfke (JIRA)
Jens Alfke created COUCHDB-1521:
---

 Summary: multipart parser gets multiple attachments mixed up
 Key: COUCHDB-1521
 URL: https://issues.apache.org/jira/browse/COUCHDB-1521
 Project: CouchDB
  Issue Type: Bug
  Components: HTTP Interface
Affects Versions: 1.2
Reporter: Jens Alfke


When receiving a document PUT in multipart format, CouchDB gets the attachments 
and MIME parts mixed up. Instead of looking at the headers of a MIME part to 
identify which attachment it is (most likely by using the 'filename' property 
of the 'Content-Disposition:' header), it processes the attachments according 
to the order in which their metadata objects appear in the JSON body's 
'_attachments:' object.

The problem with this is that JSON objects (dictionaries) are _not_ ordered 
collections. I know that Erlang's implementation of them (as linked lists of 
key/value pairs) happens to be ordered, and I think some JavaScript 
implementations have the side effect of preserving order; but in many languages 
these are implemented as hash tables and genuinely unordered.

This means that when a program written in such a language converts a native 
object to JSON, it has no control over (and probably no knowledge of) the order 
in which the keys of the JSON object are written out. This makes it impossible 
to then write the attachments in the same order.

The only workaround seems to be for the program to implement its own custom 
JSON encoder just so that it can write object keys in a known order (probably 
sorted), which then enables it to write the attachment bodies in the same order.
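
In a language whose JSON encoder can sort keys, the workaround is less painful. A minimal sketch in Python, assuming the client is free to choose sorted order; the function name is hypothetical, and building the actual multipart body is not shown.

```python
import json

def encode_with_attachment_order(doc):
    """Serialize doc with sorted keys and return (json_body, attachment_order).

    attachment_order lists the attachment names in the same order their
    metadata appears in json_body, i.e. the order in which the MIME parts
    would have to be written for CouchDB to match them up.
    """
    body = json.dumps(doc, sort_keys=True)
    order = sorted(doc.get("_attachments", {}))
    return body, order
```

This only helps where the encoder offers such an option; it does not remove the underlying dependence on key order.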

NOTE: This is the flip side of COUCHDB-1368 which I filed last year; that bug 
has to do with the same ordering issue when CouchDB _generates_ multipart 
responses (and presents similar problems for clients not written in Erlang.)





[jira] [Created] (COUCHDB-1479) Futon config UI won't allow WWW-Authenticate option to be added (name is lowercased)

2012-05-13 Thread Jens Alfke (JIRA)
Jens Alfke created COUCHDB-1479:
---

 Summary: Futon config UI won't allow WWW-Authenticate option to 
be added (name is lowercased)
 Key: COUCHDB-1479
 URL: https://issues.apache.org/jira/browse/COUCHDB-1479
 Project: CouchDB
  Issue Type: Bug
  Components: Futon
 Environment: Mac OS X
Reporter: Jens Alfke
Priority: Minor


When using the config UI in Futon to add a new option, via the "Add a new 
section..." link at the bottom of the page, the name of the option is 
lowercased when written to the .ini file. (For some reason the case is 
preserved when altering the runtime configuration, though, so the problem 
doesn't manifest itself until the next time the server is restarted.)

This causes trouble when attempting to enable HTTP basic auth by adding a 
WWW-Authenticate option (value "Basic") to the [httpd] section. The actual 
data written to the .ini file is:

[httpd]
www-authenticate = Basic

which is not recognized when the server loads its configuration on restart.
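
The same class of bug can be reproduced outside CouchDB. As an analogy only (this is Python's standard configparser, not Futon or CouchDB's config code), configparser lowercases option names by default when writing, and preserves them only if optionxform is overridden:

```python
import configparser
import io

def write_ini(option_name, value, preserve_case):
    """Write a one-option [httpd] section and return the .ini text."""
    parser = configparser.ConfigParser()
    if preserve_case:
        # Replace the default name transform (str.lower) with the identity.
        parser.optionxform = str
    parser.add_section("httpd")
    parser.set("httpd", option_name, value)
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()
```

With preserve_case=False the written file contains the lowercased name, which a case-sensitive reader would then fail to recognize.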





[jira] [Commented] (COUCHDB-1259) Replication ID is not stable if local server has a dynamic port number

2011-08-24 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090396#comment-13090396
 ] 

Jens Alfke commented on COUCHDB-1259:
-

@Jason, I see where you're coming from with "bad web citizen", but it refers 
only to a specific usage of CouchDB as a traditional web server. We have 
equally valid uses of it that do not act as traditional servers nor have stable 
URLs, for which we absolutely need this fix. I don't think arguments about 
whether URLs change or not are productive here in a specific bug report.





[jira] [Created] (COUCHDB-1259) Replication ID is not stable if local server has a dynamic port number

2011-08-23 Thread Jens Alfke (JIRA)
Replication ID is not stable if local server has a dynamic port number
--

 Key: COUCHDB-1259
 URL: https://issues.apache.org/jira/browse/COUCHDB-1259
 Project: CouchDB
  Issue Type: Bug
  Components: Replication
Affects Versions: 1.1
Reporter: Jens Alfke


I noticed that when Couchbase Mobile running on iOS replicates to/from a remote 
server (on iriscouch in this case), the replication has to fetch the full 
_changes feed every time it starts. Filipe helped me track down the problem -- 
the replication ID is coming out different every time. The reason for this is 
that the local port number, which is one of the inputs to the hash that 
generates the replication ID, is randomly assigned by the OS. (I.e. it uses a 
port number of 0 when opening its listener socket.) This is because there could 
be multiple apps using Couchbase Mobile running on the same device and we can't 
have their ports colliding.

The underlying problem is that CouchDB is attempting to generate a unique ID 
for a particular pair of {source, destination} databases, but it's basing it on 
attributes that aren't fundamental to the database and can change, like the 
hostname or port number.

One solution, proposed by Filipe and me, is to assign each database (or each 
server?) a random UUID when it's created, and use that to generate replication 
IDs.

Another solution, proposed by Damien, is to have CouchDB let the client work 
out the replication ID on its own, and set it as a property in the replication 
document (or the JSON body of a _replicate request.) This is even more flexible 
and will handle tricky scenarios like full P2P replication where there may be 
no low-level way to uniquely identify the remote database being synced with.
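
To make the first proposal concrete, here is a small sketch in Python. It is not CouchDB's actual replication-ID algorithm: the function names, the choice of SHA-256, and the newline separator are assumptions for illustration. The point is only that the inputs are per-database UUIDs rather than hostnames or ports.

```python
import hashlib
import uuid

def new_database_uuid():
    """Random identifier assigned once, when a database is created."""
    return uuid.uuid4().hex

def replication_id(source_uuid, target_uuid):
    """Derive a stable ID for a {source, destination} pair of databases."""
    material = "\n".join([source_uuid, target_uuid]).encode("utf-8")
    return hashlib.sha256(material).hexdigest()
```

Because neither input changes when the listener port does, the same pair of databases yields the same ID across launches, and the replicator can resume from its checkpoint.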





[jira] [Commented] (COUCHDB-1259) Replication ID is not stable if local server has a dynamic port number

2011-08-23 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089976#comment-13089976
 ] 

Jens Alfke commented on COUCHDB-1259:
-

@Jason: The reason I filed this bug report is that the URL of a database in 
Couchbase Mobile *isn't* unique. It's barely meaningful at all; it's of the 
form "http://127.0.0.1:n", where n is an unpredictable port number 
assigned by the TCP stack at launch time. That URL isn't even exposed to the 
outside world because the CouchDB server is only listening on the loopback 
interface.

The state that's being represented by a replication ID is the contents of the 
database (at least as it was when it last synced.) So it seems the ID should be 
something that sticks to the database itself, not to any ephemeral 
manifestation like a URL.





[jira] [Created] (COUCHDB-1206) View ETags may be incorrect if ?include_docs=true is specified

2011-06-29 Thread Jens Alfke (JIRA)
View ETags may be incorrect if ?include_docs=true is specified
--

 Key: COUCHDB-1206
 URL: https://issues.apache.org/jira/browse/COUCHDB-1206
 Project: CouchDB
  Issue Type: Bug
Affects Versions: 1.1
Reporter: Jens Alfke
Priority: Minor


Change COUCHDB-799 altered the way ETags are assigned to views, by having the 
ETag change only when the view index changes, not when any document changes. 
Unfortunately this means that a view with the ?include_docs=true option can 
return an incorrect ETag. The reason is that if a document in the view is 
changed, but the change doesn't affect the view index, the result of the GET 
will change (it will contain the document's updated contents), but the ETag 
won't. This can result in stale data if the client uses a conditional GET, 
because it'll get a 304 even though the prior response is out of date.

Robert Newson's analysis on the user@ list is: "I think the sanest fix is to 
make view etags for include_docs=true use the original algorithm, so that they 
always change if the database changes."
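
A minimal sketch of that fix, under stated assumptions: view_etag, index_sig, index_version and db_update_seq are hypothetical names, and SHA-1 is an arbitrary choice, not what CouchDB uses. The only point is that the ETag for an include_docs=true request also depends on something that changes whenever any document changes.

```python
import hashlib


def view_etag(index_sig: str, index_version: int, db_update_seq: int,
              include_docs: bool) -> str:
    """Return an ETag for a view response (hypothetical inputs, not CouchDB code)."""
    parts = [index_sig, str(index_version)]
    if include_docs:
        # Rows embed document bodies, so any database change can alter the response.
        parts.append(str(db_update_seq))
    return hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()


# A document is edited in a way that leaves the view index untouched:
# the database sequence advances from 10 to 11, the index version stays at 7.
before_plain = view_etag("sig", 7, 10, include_docs=False)
after_plain = view_etag("sig", 7, 11, include_docs=False)
before_docs = view_etag("sig", 7, 10, include_docs=True)
after_docs = view_etag("sig", 7, 11, include_docs=True)

print(before_plain == after_plain)  # plain view: ETag unchanged, 304 is correct
print(before_docs == after_docs)    # include_docs: ETag changes, no stale 304
```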





[jira] Commented: (COUCHDB-549) include_docs=true doesn't honour conflicts=true

2010-02-25 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838614#action_12838614
 ] 

Jens Alfke commented on COUCHDB-549:


This actually goes back to 0.10.0.
From some historical evidence (a unit test in the CouchObjC library that used 
to work but now breaks) it looks like this changed sometime after 0.8.

Is this considered a bug to be fixed, or just a design limitation?

 include_docs=true doesn't honour conflicts=true
 ---

 Key: COUCHDB-549
 URL: https://issues.apache.org/jira/browse/COUCHDB-549
 Project: CouchDB
  Issue Type: Improvement
  Components: HTTP Interface
Affects Versions: 0.11
Reporter: Brian Candler
Priority: Minor

 When you read a view and use the option 'include_docs=true' to get the source 
 document in each result row, the option 'conflicts=true' is not honoured. You 
 do not see a _conflicts member in the document, even if it is in a 
 conflicting state.
 This feature request could be expanded in a couple of directions:
 1. Make include_docs=true honour *all* options which a straightforward GET 
 would honour - e.g. revs, revs_info, open_revs. Maybe this would be 
 straightforward if they shared the same code path and options processing.
 2. It has been suggested that 'conflicts=true' could be the default anyway. 
 That is, whenever you retrieve a document, you get a _conflicts member if it 
 is in a conflicting state, without having to ask for it. This would be 
 unlikely to break things, but would make it less likely that conflicts would 
 go unnoticed, and it would simplify the API a little.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (COUCHDB-265) HEAD requests get a Content-Length header

2009-02-23 Thread Jens Alfke (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676075#action_12676075
 ] 

Jens Alfke commented on COUCHDB-265:


By the way, curl has exactly the same problem with HEAD requests to other 
servers, so I'd say this is a bug in curl itself. (For example, try "curl -X 
HEAD http://www.google.com".)

 HEAD requests get a Content-Length header
 -

 Key: COUCHDB-265
 URL: https://issues.apache.org/jira/browse/COUCHDB-265
 Project: CouchDB
  Issue Type: Bug
  Components: HTTP Interface
Affects Versions: 0.9
 Environment: curl + trunk
Reporter: Paul Joseph Davis
 Fix For: 0.9


 Looks like HEAD requests are returning a bogus Content-Length header. If I 
 remember my HTTP spec correctly, HEAD requests are supposed to return no 
 Content-Length or a Content-Length of 0 but I could be wrong on that. Either 
 way, it confuses the crap out of curl:
 $ curl -X HEAD -i http://127.0.0.1:5984/
 HTTP/1.1 200 OK
 Server: CouchDB/0.9.0a (Erlang OTP/R12B)
 Date: Mon, 23 Feb 2009 20:56:55 GMT
 Content-Type: text/plain;charset=utf-8
 Content-Length: 40
 Cache-Control: must-revalidate
 curl: (18) transfer closed with 40 bytes remaining to read
 Also, I just happened to be reading couch_http.erl the other day and I 
 remember seeing a note that said mochiweb automatically strips bodies so 
 internally HEAD requests are treated like a GET and mochiweb I guess just 
 doesn't send a body. That's probably important.
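
 For reference, the HTTP spec lets a HEAD response carry the same Content-Length that the matching GET would send; it describes the body that is not being transmitted. The following is a rough sketch using only the Python standard library, with a made-up 40-byte body and a throwaway local server (none of it is CouchDB or mochiweb code). It shows a client that knows it sent a HEAD reading the header without waiting for a body. As far as I know, curl's -X option only changes the method string, so curl still expects a body; its -I option is the one that issues a proper HEAD.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"x" * 40  # stand-in payload; 40 bytes to mirror the report above


class Handler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain;charset=utf-8")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)

    def do_HEAD(self):
        # Same headers as GET, but no body is written.
        self._send_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


# Port 0 asks the OS for any free port.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("HEAD", "/")
resp = conn.getresponse()
head_length = resp.getheader("Content-Length")
head_body = resp.read()  # returns at once: the client knows HEAD has no body
conn.close()
server.shutdown()
server.server_close()

print(head_length, len(head_body))
```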
