date:20090224

Devs,

Some of you have seen the work that's been going on over at the
CouchApp project. The project started as a thought experiment to see
what can be accomplished using only CouchDB as an app server. It turns
out to raise interesting questions for Couch, especially around REST,
hypermedia, and linking.

Some background: Futon is living proof that CouchDB can host pure Ajax
apps. On the other hand, JSON conforming to CouchDB's HTTP API is not
really RESTful, because a browser can't just pick it up and browse it
(without first loading a JS application that knows how to use the
API). I'm not interested in making Couch RESTful because I want
buzzword compliance; I'm interested because things that are RESTful
tend to win. I want Couch to win.

With _show and _list*, Couch can now host standalone apps that work
just fine without client-side JS. This makes it viable for a whole
additional set of deployment scenarios (like the ones where you want
search engines to index you).

Doing this has taught us some lessons, and motivates some potential
changes to the CouchDB API. I'm not ready to advocate for these
changes (because they are big, as in break-all-existing-clients big)
but I'm fairly certain that there is more REST intelligence on this ML
than there is on the IRC channel at any given time. Maybe together we
can cut the Gordian knot. So let me describe that knot.

== Begin actual technical problem ==

Currently, there is no way for an HTML attachment to a design document
to link to other resources provided by that design document, absent
client-side scripting or hardcoding the design document name in the
HTML (neither of which is acceptable).

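If you are the HTML hosted at /db/_design/foo/index.html and you want
to provide browsers a link to /db/_view/foo/bar?limit=10, you can't do
so without hardcoding the design document name, because the view lives
outside the design document's path. Linking to other attachments in
the same design document, by contrast, is easy.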

One way to fix this is to give the resources made available by a
design document a common root. This means we can use hrefs like
"_show/docid" to link to a show function from an attachment.  So we
get paths like this:

/db/_design/foo/_view/bar?limit=10
/db/_design/foo/_show/docid
/db/_design/foo/index.html

The downside is that the URLs are longer (and that the change would
break all clients); the upside is the ability to link from one to the
other (and thus be part of the web).
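
To illustrate why the common root matters, here is a minimal sketch
(Python, standard library only) of how a browser resolves relative
hrefs from the attachment. The host and port are placeholders, and the
/db/_design/foo/... layout is the proposal described above, not the
current API.

```python
from urllib.parse import urljoin

# Hypothetical base URL: the attachment under the proposed layout.
BASE = "http://localhost:5984/db/_design/foo/index.html"


def resolve(href, base=BASE):
    """Resolve a relative href against the page URL, as a browser would."""
    return urljoin(base, href)


if __name__ == "__main__":
    # None of these hrefs mention the design document name "foo".
    for href in ("_show/docid", "_view/bar?limit=10", "style.css"):
        print(href, "->", resolve(href))
```

With the current layout, the only relative href that reaches the view
is something like "../../_view/foo/bar?limit=10", which hardcodes the
design document name.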

== A related question ==

I checked a patch into Futon the other day (with a note here on dev@)
that links to any apps that are in any of your databases. This is not
meant as an end-user API. It is a step toward an end-user API. The key
similarity is the process for discovering apps. In my mind, an app is
a design document that provides a user interface.

Here's the screenshot of that feature that I linked from my earlier
dev post: http://img.skitch.com/20090225-ttb3gmd86unthjw9i6cqhjs9c9.png

Each app has a start page. Currently, an app's start page is defined
in the design_doc.couchapp.index field. (The details of that field are
subject to change based on the previous section of this mail.) If the
couchapp.index field does not exist, but the design doc has an
index.html attachment, then that is used as its start page. If a
design doc has neither the field nor an index.html attachment, it is
not considered to be an app, and is not linked to from Futon.
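
The lookup rules above can be sketched roughly as follows. This is not
Futon's actual code; the function name is made up, and returning the
value as a path relative to the design document is an assumption.

```python
def start_page(design_doc):
    """Return the start page for a design doc, or None if it is not an app.

    Order of preference, as described above:
      1. the couchapp.index field
      2. an index.html attachment
      3. otherwise the design doc is not considered an app
    """
    index = (design_doc.get("couchapp") or {}).get("index")
    if index:
        return index
    if "index.html" in (design_doc.get("_attachments") or {}):
        return "index.html"
    return None


if __name__ == "__main__":
    print(start_page({"couchapp": {"index": "_show/home"}}))
    print(start_page({"_attachments": {"index.html": {}}}))
    print(start_page({"views": {}}))
```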

The question raised by all of this is how closely do we want CouchDB
to be intertwined with CouchApp?

I've tried to keep the CouchApp project out of the way of CouchDB,
because I'm trying to be humble and not affect CouchDB with this
experiment. Certainly we don't want to give people the impression that
CouchApps are the only way to use CouchDB. I've gone out of my way
whenever possible to make that clear.

OTOH, the CouchApp project is basically designed around CouchDB, to
fit it like a glove. The guiding principle is that if it can't be
deployed to every unmodified CouchDB server, it's not a CouchApp. As
more people start to develop for CouchDB using CouchApp, things like
the index of available apps become more helpful. The question becomes
practical:

Currently, listing the available apps takes quite a few HTTP requests
(Futon has to load all the design documents in each DB). If CouchDB
wanted to support CouchApps more directly, it could provide a JSON
resource at /db/_design/ that lists all design docs, along with the
absolute path to their start page, if they have one.

I also want to be clear that there are more ways to write portable,
standalone CouchDB apps than by using the CouchApp project. However,
CouchApp tries to be the simplest thing that could possibly work, for
getting files from your text editor into a design document. So
hopefully it can be a basic helper that people find useful even if
they aren't interested in the higher level helpers we've been adding
to it.

The CouchApp code is released under the Apache 2.0 license, so if
there were community interest in bringing it into the CouchDB project,
it would not b

Re: Fail on a simple case on replication



On 25/02/2009, at 2:55 PM, Chris Anderson wrote:


> Reiterating: I think the clean solution is to remove the API for
> loading docs at a particular rev. Instead we allow only the loading of
> all conflicted revs (or of course the HEAD rev). I'll wait for people
> to say why this is a bad idea before I say why it's a good idea.


Also, without access to the common ancestor of a document and its
conflicts, you can't use three-way merging as a conflict resolution
strategy, because you only have instantaneous state. Or am I wrong to
think this is possible in any case? Can you chase down a conflict's
ancestry to find the divergent point?
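
For readers unfamiliar with the term, a field-level three-way merge can
be sketched as below. This is plain Python over flat dicts, not a
CouchDB API; the function name and the choice to keep "ours" on a true
conflict are assumptions of the sketch. It shows why the common
ancestor is needed: without it there is no way to tell which side
changed a field.

```python
_MISSING = object()  # marks a field that is absent from a revision


def three_way_merge(ancestor, ours, theirs):
    """Merge two revisions of a flat dict against their common ancestor.

    Returns (merged, conflicts). A field is a conflict only when both
    sides changed it relative to the ancestor and they disagree.
    """
    merged, conflicts = {}, []
    for key in sorted(set(ancestor) | set(ours) | set(theirs)):
        base = ancestor.get(key, _MISSING)
        a = ours.get(key, _MISSING)
        b = theirs.get(key, _MISSING)
        if a == b:
            value = a          # both sides agree (or both deleted it)
        elif a == base:
            value = b          # only theirs changed the field
        elif b == base:
            value = a          # only ours changed the field
        else:
            conflicts.append(key)
            value = a          # placeholder: the caller must resolve this
        if value is not _MISSING:
            merged[key] = value
    return merged, conflicts
```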


However, both this point and my previous point are moot, because the
replication model means that access to arbitrary previous revisions is
only likely on the node where the revisions were written. Access to
only the head and its conflicts (and the conflict chain, I presume) is
all that is consistent with replication.


I'm wondering therefore if lazy updating externals that respect the  
request's update_seq are not in fact possible given replication?


Antony Blakey
-
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Borrow money from pessimists - they don't expect it back.
  -- Steven Wright

Re: Fail on a simple case on replication

On Tue, Feb 24, 2009 at 8:45 PM, Antony Blakey  wrote:
>
> On 25/02/2009, at 2:55 PM, Chris Anderson wrote:
>
>> Reiterating: I think the clean solution is to remove the API for
>> loading docs at a particular rev. Instead we allow only the loading of
>> all conflicted revs (or of course the HEAD rev). I'll wait for people
>> to say why this is a bad idea before I say why it's a good idea.
>
> It might be a problem for externals that:
>
> a) want to use all_docs_by_seq as a lazy update mechanism without including
> the docs. I can't immediately think why you'd want to do that, but this
> would make it impossible.
>
> b) want to use the conflict data that is consistent with a given MVCC
> snapshot (e.g. the request's update_seq), for which they could theoretically
> need the data from a conflict that is no longer a head conflict.
>
> Edge cases admittedly, but disallowing access to previous revisions would
> force all queries to be dealing with the head, which isn't the case for lazy
> externals in particular.

These are good arguments for maintaining the functionality, but having
it "off" by default. If you need it for an external, you could turn it
on, for the nodes that use that external.

-- 
Chris Anderson
http://jchris.mfdz.com

Re: Fail on a simple case on replication



On 25/02/2009, at 2:55 PM, Chris Anderson wrote:


> Reiterating: I think the clean solution is to remove the API for
> loading docs at a particular rev. Instead we allow only the loading of
> all conflicted revs (or of course the HEAD rev). I'll wait for people
> to say why this is a bad idea before I say why it's a good idea.


It might be a problem for externals that:

a) want to use all_docs_by_seq as a lazy update mechanism without  
including the docs. I can't immediately think why you'd want to do  
that, but this would make it impossible.


b) want to use the conflict data that is consistent with a given MVCC  
snapshot (e.g. the request's update_seq), for which they could  
theoretically need the data from a conflict that is no longer a head  
conflict.


Edge cases admittedly, but disallowing access to previous revisions  
would force all queries to be dealing with the head, which isn't the  
case for lazy externals in particular.


Antony Blakey
-
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Did you hear about the Buddhist who refused Novocain during a root  
canal?

His goal: transcend dental medication.

Re: Fail on a simple case on replication

On Tue, Feb 24, 2009 at 9:52 AM, Chris Anderson  wrote:
> On another note, I was thinking about it some more, and I think that
> renaming _rev to _cc would be a huge pain in the ass for a lot of
> people (who don't go around abusing it) and it can probably be
> avoided.
>
> The only valid use case for requesting a particular _rev of a
> document, is in resolving conflicts introduced by replication. So if
> we restrict access to old revs (by default) to an endpoint which gives
> an array of documents (each conflicted rev) then it won't be usable as
> a revision control system, only as a conflict resolution system. If
> there's not an easy way to think you have implemented a version
> control system (eg no API endpoint for accessing non-conflicting revs)
> I bet we'll see misapprehension of _rev happen a lot less.
>

Trying to get this thread back on track about an actual small concrete
change to the code we could make that might keep people from trying to
use _rev as a versioning system.

Reiterating: I think the clean solution is to remove the API for
loading docs at a particular rev. Instead we allow only the loading of
all conflicted revs (or of course the HEAD rev). I'll wait for people
to say why this is a bad idea before I say why it's a good idea.

Cheers,
Chris

-- 
Chris Anderson
http://jchris.mfdz.com

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)



On 25/02/2009, at 1:40 AM, Jan Lehnardt wrote:

> Instead of asking how community votes would be factored into the final
> result, you constructed a hypothetical that frames the PMC as a
> dictatorship, doing as it pleases regardless of community feedback. You
> then use this hypothetical to draw the absurd conclusion that "community
> votes are irrelevant" and then seek an explicit refutation.


My framing of this question matches the framing used in the Apache Way
excerpt that Noah included; to wit:

> However, the basic rule is that only PMC members have binding votes,
> and all others are either discouraged from voting (to keep the noise
> down) or else have their votes considered of an indicative or
> advisory nature only.


This was not an attack on the PMC, and it was not what I was thinking
when asking that question. It seemed to me that on the surface of it,
the process didn't match some underlying reality, and this excerpt
shows that to be true. According to the Apache Way, community votes
aren't relevant to the outcome, in exactly the sense I meant. And
considering the purpose of voting, that's a good thing. Apache
discourages non-PMC members from voting.


I don't see this, or any other issue, as the PMC vs. anyone. This  
project is covered in Apache goodness now.


Antony Blakey
--
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Don't anthropomorphize computers. They hate that.

Re: Fail on a simple case on replication



On 25/02/2009, at 8:52 AM, Brian Candler wrote:


> On Tue, Feb 24, 2009 at 01:48:56PM +0100, Jan Lehnardt wrote:
>>> However, you must then be prepared for your database to be a single
>>> file which grows without bounds. If CouchDB wants to support this
>>> model, it would be helpful if the data were stored in chunks which
>>> can be backed up separately.
>>
>> rsync? :)
>
> Doesn't work especially well on huge files.


What about this incremental backup strategy:

1. Split off the MVCC header
2. Compare the previous and current file lengths and split off the new
   tail
3. Back up the header and the tail
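
The steps above can be sketched as follows, assuming an append-only
file whose header occupies a fixed number of bytes at the start. The
header size, the function name and the idea of remembering the previous
length between runs are all assumptions of this sketch, not facts about
the CouchDB file format.

```python
import os


def incremental_backup(db_path, prev_length, header_size=4096):
    """Return (header, tail, new_length) for an append-only database file.

    header_size is a hypothetical value. prev_length is the file length
    recorded by the previous backup run (0 for the first run).
    """
    new_length = os.path.getsize(db_path)
    with open(db_path, "rb") as f:
        # Step 1: split off the header, which may have been rewritten.
        header = f.read(min(header_size, new_length))
        # Step 2: everything appended since the last backup is the tail.
        start = max(prev_length, header_size)
        f.seek(start)
        tail = f.read(max(new_length - start, 0))
    # Step 3: the caller stores header and tail, and remembers new_length.
    return header, tail, new_length
```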

> I'm not sure about 'kitchen sink', but I've seen desires expressed for
> more pluggability; perhaps JSON could be a pluggable layer sitting on
> top of a raw document store?


I've been thinking about the layering in CouchDB along these lines:

MVCC Store

Replication | Map/Reduce

in order to allow different replication strategies. I think of CouchDB  
as a collection of features that can themselves be plugged together to  
build systems with different semantics.


Rather than making Couch more pluggable, we could make it more of a
construction kit. There's already a flavour of this when you look at
the .ini file that builds up a Couch server from different endpoints
and daemons.


Antony Blakey
--
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

The truth does not change according to our ability to stomach it.
  -- Flannery O'Connor

[jira] Assigned: (COUCHDB-183) No pagination in Futon for reduce views


 [ 
https://issues.apache.org/jira/browse/COUCHDB-183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Lenz reassigned COUCHDB-183:


Assignee: Christopher Lenz

> No pagination in Futon for reduce views
> ---
>
> Key: COUCHDB-183
> URL: https://issues.apache.org/jira/browse/COUCHDB-183
> Project: CouchDB
>  Issue Type: Bug
>  Components: Administration Console
>Affects Versions: 0.9
>Reporter: Jason Davies
>Assignee: Christopher Lenz
>Priority: Blocker
> Fix For: 0.9
>
> Attachments: futon_reduce_pagination.2.diff, 
> futon_reduce_pagination.diff
>
>
> Futon doesn't support paginating of reduce views at the moment, which can be 
> confusing for new users.  This is due to the difficulty of efficiently 
> working out the total number of rows available from a reduce view.
> I propose displaying something like "Showing x-y rows of unknown" at the 
> bottom, and showing a next/previous link if there are more results to be 
> displayed.  An efficient way to calculate whether there are next/previous 
> results would be to fetch 1 + rows_per_page + 1 (with appropriate offset 
> parameter etc.)
> I did start working on a patch - will post it here when it is done.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Re: Fail on a simple case on replication

2009-02-24 Thread Brian Candler

On Tue, Feb 24, 2009 at 01:48:56PM +0100, Jan Lehnardt wrote:
>> However, you must then be prepared for your database to be a single  
>> file
>> which grows without bounds. If CouchDB wants to support this model, it 
>> would
>> be helpful if the data were stored in chunks which can be backed up
>> separately.
>
> rsync? :)

Doesn't work especially well on huge files. Indeed, files >2GB aren't
handled well by many systems. If I had to migrate 200GB of data, I'd much
prefer 100 x 2GB than 1 x 200GB. It also has the advantage that 99 of those
files are not changing.

>> Just thinking out loud.
>
> This is quite interesting! :) I'd like to see such a system, but I'd  
> also like
> CouchDB not becoming an Apache-httpd style kitchen-sink for all things
> HTTP. Maybe Yaws is what you're looking for?

Yaws is just another Mochiweb, isn't it?

I'm not sure about 'kitchen sink', but I've seen desires expressed for more
pluggability; perhaps JSON could be a pluggable layer sitting on top of a
raw document store?

Regards,

Brian.

Re: Fail on a simple case on replication

2009-02-24 Thread Brian Candler

On Tue, Feb 24, 2009 at 11:17:20PM +1030, Antony Blakey wrote:
>
> On 24/02/2009, at 11:09 PM, Brian Candler wrote:
>
>> On a random tangent: has anyone considered a CouchDB-like system where
>> documents are raw blobs, rather than JSON? ISTM that:
>
> You'd need some way to attach/inject the metadata in both directions.

There's Content-Type (standard HTTP header in both directions), and there's
_rev (or previous _rev). The latter can be in the URL for a PUT, and perhaps
a header for a GET. If revisions were a document hash, there's the standard
Content-MD5 header, or the (less standard) Content-SHA1 header.
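
As a small illustration of the header idea, this sketch computes a
Content-MD5 value (base64 of the raw MD5 digest, per RFC 1864) for a
raw blob and pairs it with Content-Type. The helper names are made up,
and using the hash as the revision is only the hypothetical raised
above.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Content-MD5 header value: base64 of the 16-byte MD5 digest."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def blob_headers(body: bytes, content_type: str) -> dict:
    """Metadata carried in headers rather than inside the document."""
    return {
        "Content-Type": content_type,
        "Content-MD5": content_md5(body),
    }


if __name__ == "__main__":
    print(blob_headers(b"hello", "text/plain"))
```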

What else am I missing?

[jira] Resolved: (COUCHDB-255) Update MochiWeb


 [ 
https://issues.apache.org/jira/browse/COUCHDB-255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Lenz resolved COUCHDB-255.
--

Resolution: Fixed

I've upgraded the included MochiWeb to r97 in r747575.

> Update MochiWeb
> ---
>
> Key: COUCHDB-255
> URL: https://issues.apache.org/jira/browse/COUCHDB-255
> Project: CouchDB
>  Issue Type: Task
>  Components: HTTP Interface
>Affects Versions: 0.8, 0.8.1, 0.9
>Reporter: Jan Lehnardt
>Assignee: Christopher Lenz
>Priority: Blocker
> Fix For: 0.9
>
>
> http://code.google.com/p/mochiweb/source/detail?r=89

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)

2009-02-24 Thread Paul Davis

On Tue, Feb 24, 2009 at 4:05 PM, Damien Katz  wrote:
>
> On Feb 24, 2009, at 2:13 PM, Paul Davis wrote:
>>
>> I'm a fan of the no-metadata-in-documents concept, but there are some
>> issues both philosophical and practical. Philosophically speaking, as
>> pointed out by the HTTP headers thread, we may be abusing headers when
>> we consider some of the more CouchDB specific concepts, I doubt that
>> there's an existing header for everything we'd need.
>>
>> Secondly _attachments and _rev_info are unbounded. I know there are
>> limits to the number of headers in a request I can only assume that
>> some clients might have limits for responses.
>
> Good points.
>
>>
>> The only thought I had that would satisfy most of the interesting bits
>> I've come up against would be to have two response versions: the raw
>> document body as we have now (minus metadata obviously) that includes
>> the very basic _id and _rev in the headers (I'm assuming there are
>> appropriate headers for these). And a second version that is a
>> multipart mime message that has parts corresponding to the doc body,
>> the longer metadata like _revs_info and then one part per attachment.
>> Including the different parts could be optional. And so far that's
>> missing some stuff like listing attachment info without getting the
>> entire body.
>>
>
> Sounds interesting.
>
>> The real kicker is how do we support clients lacking HTTP-fu. For
>> instance, a quick google [1] suggests that XHR probably isn't capable
>> of dealing with multipart messages.
>
> Stupid reality!
>
>> There's an obvious middle ground
>> that could allow different versions to be returned via URL parameters
>> though, and then maybe provide the "all content as multipart mime" as
>> an option.
>
>
> There could be a format where all the metadata is returned in the top level
> json object, and the json body is returned as a body field.
>

That hadn't occurred to me. I kinda liked the attachments via
multipart mime so I was more forcing everything else into that format.
At the moment I can't decide which form I prefer. I definitely see
sticking to JSON being easier to parse all around, though.

On a side note that makes me chuckle, going with headers would probably
push the _rev attribute into an ETag header, thus sidestepping the
renaming issue for that part of the API :D

Anyway, I'll let that percolate a bit and read some more of the db and
replication APIs to try and get a better handle on all the different
bits that actually touch the _ metadata.

HTH,
Paul Davis

>>
>> Anyway, that's about as far as I've thought through the different issues.
>
> Right. I've alway thought it might be a good idea to do something like this,
> but there are lots of small issues to work through. I hope someone tries,
> I'd like to see if its workable in practice.
>
> -Damien
>
>

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)

2009-02-24 Thread Paul Davis

On Tue, Feb 24, 2009 at 4:31 PM, Dave Bordoley  wrote:
>> The real kicker is how do we support clients lacking HTTP-fu. For
>> instance, a quick google [1] suggests that XHR probably isn't capable
>> of dealing with multipart messages. There's an obvious middle ground
>> that could allow different versions to be returned via URL parameters
>> though, and then maybe provide the "all content as multipart mime" as
>> an option.
>
> You can generate multipart requests using XHR by manually coding
> together a multipart body and setting the content type header to
> multipart/form-data (why you would ever do this is beyond me, but for
> sake of argument). What you can't do is read a file from disk and
> submit it using XHR due to browser security restrictions, which is why
> ajax apps use hidden iframes to asynchronously upload files.
>
> Dave
>

Dave,

Yeah, but this is concerned with parsing a returned multipart response.

HTH,
Paul Davis

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)

2009-02-24 Thread Dave Bordoley

> The real kicker is how do we support clients lacking HTTP-fu. For
> instance, a quick google [1] suggests that XHR probably isn't capable
> of dealing with multipart messages. There's an obvious middle ground
> that could allow different versions to be returned via URL parameters
> though, and then maybe provide the "all content as multipart mime" as
> an option.

You can generate multipart requests using XHR by manually coding
together a multipart body and setting the content type header to
multipart/form-data (why you would ever do this is beyond me, but for
sake of argument). What you can't do is read a file from disk and
submit it using XHR due to browser security restrictions, which is why
ajax apps use hidden iframes to asynchronously upload files.

Dave

[jira] Commented: (COUCHDB-255) Update MochiWeb


[ 
https://issues.apache.org/jira/browse/COUCHDB-255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676419#action_12676419
 ] 

Christopher Lenz commented on COUCHDB-255:
--

I've updated the vendor branch in r747541.

mochijson2 merges cleanly, no problem there.

However, there's a newly introduced bug in MochiWeb that's breaking our 
redirects, and thus Futon:

  http://code.google.com/p/mochiweb/issues/detail?id=27#c2

I'll wait to see whether/how this is fixed upstream before I proceed with the 
updating.

> Update MochiWeb
> ---
>
> Key: COUCHDB-255
> URL: https://issues.apache.org/jira/browse/COUCHDB-255
> Project: CouchDB
>  Issue Type: Task
>  Components: HTTP Interface
>Affects Versions: 0.8, 0.8.1, 0.9
>Reporter: Jan Lehnardt
>Assignee: Christopher Lenz
>Priority: Blocker
> Fix For: 0.9
>
>
> http://code.google.com/p/mochiweb/source/detail?r=89

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)



On Feb 24, 2009, at 2:13 PM, Paul Davis wrote:


> I'm a fan of the no-metadata-in-documents concept, but there are some
> issues both philosophical and practical. Philosophically speaking, as
> pointed out by the HTTP headers thread, we may be abusing headers when
> we consider some of the more CouchDB specific concepts, I doubt that
> there's an existing header for everything we'd need.
>
> Secondly _attachments and _rev_info are unbounded. I know there are
> limits to the number of headers in a request I can only assume that
> some clients might have limits for responses.


Good points.



> The only thought I had that would satisfy most of the interesting bits
> I've come up against would be to have two response versions: the raw
> document body as we have now (minus metadata obviously) that includes
> the very basic _id and _rev in the headers (I'm assuming there are
> appropriate headers for these). And a second version that is a
> multipart mime message that has parts corresponding to the doc body,
> the longer metadata like _revs_info and then one part per attachment.
> Including the different parts could be optional. And so far that's
> missing some stuff like listing attachment info without getting the
> entire body.



Sounds interesting.


> The real kicker is how do we support clients lacking HTTP-fu. For
> instance, a quick google [1] suggests that XHR probably isn't capable
> of dealing with multipart messages.


Stupid reality!


> There's an obvious middle ground
> that could allow different versions to be returned via URL parameters
> though, and then maybe provide the "all content as multipart mime" as
> an option.



There could be a format where all the metadata is returned in the
top-level JSON object, and the JSON body is returned as a body field.
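
A rough sketch of that shape, assuming metadata means every top-level
field whose name starts with an underscore and that the wrapped
document sits under a field literally named "body" (both are
assumptions; the function names are made up):

```python
def to_envelope(doc):
    """Move user fields under "body"; keep underscore fields at top level."""
    meta = {k: v for k, v in doc.items() if k.startswith("_")}
    body = {k: v for k, v in doc.items() if not k.startswith("_")}
    return {**meta, "body": body}


def from_envelope(envelope):
    """Inverse of to_envelope: rebuild the flat document."""
    doc = dict(envelope["body"])
    doc.update({k: v for k, v in envelope.items() if k != "body"})
    return doc


if __name__ == "__main__":
    flat = {"_id": "abc", "_rev": "123", "title": "hello"}
    print(to_envelope(flat))
```

One side effect of this layout is that the user's document can itself
contain a field called "body", or fields starting with an underscore,
without colliding with the metadata.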




> Anyway, that's about as far as I've thought through the different
> issues.


Right. I've always thought it might be a good idea to do something like
this, but there are lots of small issues to work through. I hope
someone tries, I'd like to see if it's workable in practice.


-Damien

Re: Lounge clustering framework



On 24 Feb 2009, at 20:54, Shaun Lindsay wrote:


> Hey all,
> We've been discussing the best way to handle releasing the Lounge code
> and we have some questions that you, the couch devs, might be able to
> help out with:
>
> 1. What license is preferred?  Since Couch is an Apache project, the
> Apache license is probably appropriate, however, since the Lounge is
> more or less a separate entity, we can probably release under any
> license.  Any preferences?


If this or parts of this are ever fed back into CouchDB it'd be easiest
to have it under Apache License 2.0. Any other new-style BSD license or
MIT will do, too, but Apache License 2.0 is very much preferred.

*If* any of the code will go into CouchDB we'll also need a contributor
license agreement (CLA)* which is another thing your lawyer might want
to look at, but probably not right now.

* See way down on http://www.apache.org/licenses/


> 2. Project hosting?  Again, since this is separate from Couch, it
> probably doesn't make sense to have it in the Couch repo.  We were
> thinking google code (since we use svn),  but I'm open to whatever.
> Thoughts?


We can make it a sub-project of CouchDB eventually (just raising the
option for now), but to start off, any open source hoster will do.
CouchDB used to be on Google Code and it seems a reasonable choice. Of
course, GitHub is all the rage today and would give you more street-cred
but I'd understand if you don't want to switch version control systems.


> Once we settle on a license, we'll need to run it by our lawyer to make
> sure we're solid and, assuming that goes well, we should be good to
> give out the code.


This is so fricken' awesome! Thanks!

Cheers
Jan
--

Lounge clustering framework

2009-02-24 Thread Shaun Lindsay

Hey all,
We've been discussing the best way to handle releasing the Lounge code and
we have some questions that you, the couch devs, might be able to help out
with:

1. What license is preferred?  Since Couch is an Apache project, the Apache
license is probably appropriate, however, since the Lounge is more or less a
separate entity, we can probably release under any license.  Any
preferences?

2. Project hosting?  Again, since this is separate from Couch, it probably
doesn't make sense to have it in the Couch repo.  We were thinking google
code (since we use svn),  but I'm open to whatever.  Thoughts?

Once we settle on a license, we'll need to run it by our lawyer to make sure
we're solid and, assuming that goes well, we should be good to give out the
code.

Thanks,
Shaun Lindsay
Meebo.com

[jira] Updated: (COUCHDB-183) No pagination in Futon for reduce views

2009-02-24 Thread Jason Davies (JIRA)


 [ 
https://issues.apache.org/jira/browse/COUCHDB-183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Davies updated COUCHDB-183:
-

Attachment: futon_reduce_pagination.2.diff

Updated patch for latest trunk (r747465).  Also the next/prev links are now 
disabled appropriately when at beginning or end of the keyspace.

Still to do: get rid of empty pages at beginning/end of keyspace.  This is a 
nice-to-have feature, and will involve fetching rows_per_page + 1 so we know 
whether to enable next/prev buttons or not.
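
The "fetch one extra row" trick can be sketched like this. It is a
Python illustration, not the Futon patch; fetch_rows(skip, limit) is a
hypothetical callable standing in for a view request with skip and
limit parameters.

```python
def paginate(fetch_rows, skip, rows_per_page):
    """Return one page of rows plus flags for the prev/next buttons.

    Asking for rows_per_page + 1 rows tells us whether a next page
    exists without knowing the total number of rows.
    """
    rows = fetch_rows(skip, rows_per_page + 1)
    return {
        "rows": rows[:rows_per_page],          # the extra row is not shown
        "has_prev": skip > 0,
        "has_next": len(rows) > rows_per_page,
    }


if __name__ == "__main__":
    data = list(range(25))
    page = paginate(lambda skip, limit: data[skip:skip + limit], 20, 10)
    print(page)
```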

> No pagination in Futon for reduce views
> ---
>
> Key: COUCHDB-183
> URL: https://issues.apache.org/jira/browse/COUCHDB-183
> Project: CouchDB
>  Issue Type: Bug
>  Components: Administration Console
>Affects Versions: 0.9
>Reporter: Jason Davies
>Priority: Blocker
> Fix For: 0.9
>
> Attachments: futon_reduce_pagination.2.diff, 
> futon_reduce_pagination.diff
>
>
> Futon doesn't support paginating of reduce views at the moment, which can be 
> confusing for new users.  This is due to the difficulty of efficiently 
> working out the total number of rows available from a reduce view.
> I propose displaying something like "Showing x-y rows of unknown" at the 
> bottom, and showing a next/previous link if there are more results to be 
> displayed.  An efficient way to calculate whether there are next/previous 
> results would be to fetch 1 + rows_per_page + 1 (with appropriate offset 
> parameter etc.)
> I did start working on a patch - will post it here when it is done.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)

2009-02-24 Thread Paul Davis

On Tue, Feb 24, 2009 at 1:13 PM, Damien Katz  wrote:
> I'll once again state my objection to the newlines, which is actually kind
> of weak.
>
> If we compute the revids deterministically (hash the canonical doc
> contents), then when we return the document back to the client, we can send
> as an integrity hash the same revid, because it is already pre-computed and
> stored, etc. What it could save us is the CPU cycles of computing the hash.
> I think we also get some nice free caching benefits too, but I'm not sure.
> But if we do, it might even save us the disk reads to get the doc to compute
> the hash. The problem is any standardized canonical representation is
> unlikely to include a newline at the end.
>
> Now I'm not even sure this scheme is workable either way, or only workable
> in very special instances which are too rare to be worth it. But if the
> scheme works, then it can simplify the code and make things more efficient,
> which are 2 very good things. However these benefits may never come, and
> we'll not have the newlines anyway. That would suck.
>
> But the problem is that if we just add the newlines, then later remove
> them, production apps and scripts that rely on that will break and make
> the change very painful. Or impossible.
>
> So, those are the issues as I see them.
>
> Now the more I think about it, the more I think that unless we move all
> couchdb metadata to the http header, my ideas won't work. Moving everything
> meta to the header is a big change that has some supporters, but someone
> would need to do the work before it could even be considered.
>
> -Damien
>
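
To make the quoted idea concrete, here is a minimal sketch of a
deterministic revid: hash a canonical serialization of the document.
The canonical form (sorted keys, no insignificant whitespace), the use
of MD5 and the exclusion of _rev from the hashed content are all
assumptions of this sketch; no canonical JSON format has been agreed on
in this thread.

```python
import hashlib
import json


def canonical_json(doc):
    """One possible canonical form: sorted keys, compact separators, UTF-8."""
    return json.dumps(doc, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def deterministic_revid(doc):
    """Hash the canonical contents, ignoring any existing _rev field."""
    body = {k: v for k, v in doc.items() if k != "_rev"}
    return hashlib.md5(canonical_json(body)).hexdigest()


if __name__ == "__main__":
    doc = {"b": 2, "a": 1}
    print(deterministic_revid(doc))
    # A response body with a trailing newline hashes differently, so the
    # stored revid could no longer double as an integrity hash.
    print(hashlib.md5(canonical_json(doc) + b"\n").hexdigest())
```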

I'm a fan of the no-metadata-in-documents concept, but there are some
issues both philosophical and practical. Philosophically speaking, as
pointed out by the HTTP headers thread, we may be abusing headers when
we consider some of the more CouchDB specific concepts; I doubt that
there's an existing header for everything we'd need.

Secondly, _attachments and _revs_info are unbounded. I know there are
limits to the number of headers in a request; I can only assume that
some clients might have limits for responses.

The only thought I had that would satisfy most of the interesting bits
I've come up against would be to have two response versions: the raw
document body as we have now (minus metadata obviously) that includes
the very basic _id and _rev in the headers (I'm assuming there are
appropriate headers for these). And a second version that is a
multipart mime message that has parts corresponding to the doc body,
the longer metadata like _revs_info and then one part per attachment.
Including the different parts could be optional. And so far that's
missing some stuff like listing attachment info without getting the
entire body.
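
To make the multipart idea a bit more concrete, here is a minimal sketch
in Python using the standard `email` package. It only illustrates how the
parts might be laid out. The function name `build_multipart_doc`, the part
names, and the choice of `multipart/mixed` are assumptions made for
illustration, not anything CouchDB does today.

```python
import json
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart


def build_multipart_doc(doc, revs_info, attachments):
    """Lay out a doc as a multipart message (hypothetical layout).

    Part 1: the doc body with all underscore-prefixed metadata removed.
    Part 2: the longer metadata (here only revs_info).
    Parts 3..n: one part per attachment.
    """
    body = {k: v for k, v in doc.items() if not k.startswith("_")}
    message = MIMEMultipart("mixed")

    doc_part = MIMEApplication(json.dumps(body).encode("utf-8"), "json")
    doc_part.add_header("Content-Disposition", "inline", name="doc")
    message.attach(doc_part)

    revs_part = MIMEApplication(json.dumps(revs_info).encode("utf-8"), "json")
    revs_part.add_header("Content-Disposition", "inline", name="revs_info")
    message.attach(revs_part)

    # attachments maps a file name to its raw bytes
    for name, data in attachments.items():
        part = MIMEApplication(data)
        part.add_header("Content-Disposition", "attachment", filename=name)
        message.attach(part)

    return message
```

Making each part optional would then just mean skipping the corresponding
`attach` call; _id and _rev would travel in headers of the outer response.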

The real kicker is how do we support clients lacking HTTP-fu. For
instance, a quick google [1] suggests that XHR probably isn't capable
of dealing with multipart messages. There's an obvious middle ground
that could allow different versions to be returned via URL parameters
though, and then maybe provide the "all content as multipart mime" as
an option.

Anyway, that's about as far as I've thought through the different issues.

HTH,
Paul Davis

[1] http://groups.google.com/group/mozilla.dev.tech.xml/browse_thread/thread/e1599de6fc31f2e8

> On Feb 24, 2009, at 12:30 PM, Chris Anderson wrote:
>
>> I go to sleep for 8 hours, and this is the thanks I get! ;)
>>
>> But on a more serious note, I think we should pull a hedge fund move,
>> (or maybe quantum entanglement?) and add to the newline patch, some
>> lines that would change the color of the CouchDB logo from red to
>> blue.
>>
>> OK actually - I have a new opinion about the newlines stuff. Since I
>> really don't care all that much, and I don't see a canonical JSON
>> format happening anytime soon, I'm fine with returning newlines at the
>> end of our responses.
>>
>> Some implementation notes:
>>
>> I haven't looked at the patch lately, but I know that there are lots
>> of little places in the code that it will have to touch.
>>
>> Also, and we haven't discussed this nearly as much as we probably
>> should, the implementation of JSONP would be quite similar. To
>> implement JSONP we'll need to do something like this:
>>
>> USER_SPECIFIED_CALLBACK_NAME + "(" + CouchDB's JSON response + ");"
>>
>> So it's like the newline at the end patch, but also at the beginnings...
>>
>> That is all.
>>
>> Chris
>>
>> --
>> Chris Anderson
>> http://jchris.mfdz.com
>
>

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)

That's Ok Noah. Right now, all I've got are some vague ideas, no code.  
I've stated my case, unless someone else has stronger objections (or  
actual code) I'm fine to leave it as is.


-Damien


On Feb 24, 2009, at 1:19 PM, Noah Slater wrote:

> On Tue, Feb 24, 2009 at 01:13:31PM -0500, Damien Katz wrote:
>> I'll once again state my objection to the newlines, which is actually
>> kind of weak.
>
> Sorry for jumping the gun there Damien.
>
> If you would like to retroactively veto the change, I can back it out.
>
> --
> Noah Slater, http://tumbolia.org/nslater

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)

On Tue, Feb 24, 2009 at 10:25:40AM -0800, Chris Anderson wrote:
> What I mean is that there's nothing wrong with calculating revs-hashes
> in a Couch specific way. Bonus points if that way is easy to implement
> for client libs.

Aye, in the docs you simply put:

  Hashes are to be calculated [with|without] the trailing newline character.

Or something to that effect.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)

On Tue, Feb 24, 2009 at 10:24 AM, Chris Anderson  wrote:
> I think we have the freedom to get funny with the
> JSON responses.

Please pretend I was articulate and correct with this sentence. Thanks. ;)

What I mean is that there's nothing wrong with calculating revs-hashes
in a Couch specific way. Bonus points if that way is easy to implement
for client libs.

-- 
Chris Anderson
http://jchris.mfdz.com

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)

On Tue, Feb 24, 2009 at 10:13 AM, Damien Katz  wrote:
> I'll once again state my objection to the newlines, which is actually kind
> of weak.
>
> If we compute the revids deterministically (hash the canonical doc
> contents), then when we return the document back to the client, we can send
> as an integrity hash the same revid, because it is already pre-computed and
> stored, etc. What it could save us is the CPU cycles of computing the hash.
> I think we also get some nice free caching benefits too, but I'm not sure.
> But if we do, it might even save us the disk reads to get the doc to compute
> the hash. The problem is any standardized canonical representation is
> unlikely to include a newline at the end.
>
> Now I'm not even sure this scheme is workable either way, or only workable
> in very special instances which are too rare to be worth it. But if the
> scheme works, then it can simplify the code and make things more efficient,
> which are 2 very good things. However these benefits may never come, and
> we'll not have the newlines anyway. That would suck.
>
> But the problem is that if we just add the newlines and later remove them,
> production apps and scripts that rely on them will break, making the change
> very painful. Or impossible.
>

I don't see why we couldn't include the newlines in the input to the
hash function... For something as intimately related to CouchDB as the
calculation of revs, I think we have the freedom to get funny with the
JSON responses.

I think your point about getting the metadata out of the document
could also be accomplished by defining our function from a JSON object
to a hashable string, as one that ignores the value of _rev...

Having a special CouchDB hashable-doc function is not the prettiest
thing in the world, but we already have a bunch of other CouchDB -
only API stuff (like rereduce, etc) so I don't think it crosses an
important line.
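
A minimal sketch of what such a hashable-doc function could look like, in
Python. Everything here is an assumption for illustration: the names
`hashable_doc` and `rev_hash`, the choice of sorted keys with compact
separators as the "canonical" form, and SHA-1 as the hash. The only points
it is meant to show are that _rev is left out of the input and that the
trailing newline can be included or not, as long as the rule is documented.

```python
import hashlib
import json


def hashable_doc(doc, trailing_newline=False):
    """Turn a JSON object into a stable string, ignoring the value of _rev."""
    body = {k: v for k, v in doc.items() if k != "_rev"}
    # Sorted keys and fixed separators make the output independent of
    # key order and whitespace in the original document.
    text = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return text + "\n" if trailing_newline else text


def rev_hash(doc, trailing_newline=False):
    """Hash the hashable form of the doc (SHA-1 is an arbitrary choice)."""
    data = hashable_doc(doc, trailing_newline).encode("utf-8")
    return hashlib.sha1(data).hexdigest()
```

Two copies of a doc that differ only in _rev or in key order hash to the
same value, while toggling `trailing_newline` changes the hash, which is
why client libs would need to know which rule applies.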

Chris

-- 
Chris Anderson
http://jchris.mfdz.com

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)

On Tue, Feb 24, 2009 at 01:13:31PM -0500, Damien Katz wrote:
> I'll once again state my objection to the newlines, which is actually
> kind of weak.

Sorry for jumping the gun there Damien.

If you would like to retroactively veto the change, I can back it out.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)

I'll once again state my objection to the newlines, which is actually  
kind of weak.


If we compute the revids deterministically (hash the canonical doc
contents), then when we return the document back to the client, we can
send as an integrity hash the same revid, because it is already
pre-computed and stored, etc. What it could save us is the CPU cycles of
computing the hash. I think we also get some nice free caching
benefits too, but I'm not sure. But if we do, it might even save us
the disk reads to get the doc to compute the hash. The problem is any
standardized canonical representation is unlikely to include a
newline at the end.


Now I'm not even sure this scheme is workable either way, or only  
workable in very special instances which are too rare to be worth it.  
But if the scheme works, then it can simplify the code and make things  
more efficient, which are 2 very good things. However these benefits  
may never come, and we'll not have the newlines anyway. That would suck.


But the problem is that if we just add the newlines and later remove
them, production apps and scripts that rely on them will break, making
the change very painful. Or impossible.


So, those are the issues as I see them.

Now the more I think about it, the more I think that unless we move  
all couchdb metadata to the http header, my ideas won't work. Moving  
everything meta to the header is a big change that has some  
supporters, but someone would need to do the work before it could even  
be considered.


-Damien

On Feb 24, 2009, at 12:30 PM, Chris Anderson wrote:

> I go to sleep for 8 hours, and this is the thanks I get! ;)
>
> But on a more serious note, I think we should pull a hedge fund move,
> (or maybe quantum entanglement?) and add to the newline patch, some
> lines that would change the color of the CouchDB logo from red to
> blue.
>
> OK actually - I have a new opinion about the newlines stuff. Since I
> really don't care all that much, and I don't see a canonical JSON
> format happening anytime soon, I'm fine with returning newlines at the
> end of our responses.
>
> Some implementation notes:
>
> I haven't looked at the patch lately, but I know that there are lots
> of little places in the code that it will have to touch.
>
> Also, and we haven't discussed this nearly as much as we probably
> should, the implementation of JSONP would be quite similar. To
> implement JSONP we'll need to do something like this:
>
> USER_SPECIFIED_CALLBACK_NAME + "(" + CouchDB's JSON response + ");"
>
> So it's like the newline at the end patch, but also at the beginnings...
>
> That is all.
>
> Chris
>
> --
> Chris Anderson
> http://jchris.mfdz.com

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)

On Tue, Feb 24, 2009 at 06:39:32PM +0100, Jan Lehnardt wrote:
> or just commit the damn patch :)

Committed as r747465. Rock on.

-- 
Noah Slater, http://tumbolia.org/nslater

[jira] Closed: (COUCHDB-107) [PATCH] End JSON responses with newline char

2009-02-24 Thread Noah Slater (JIRA)


 [ 
https://issues.apache.org/jira/browse/COUCHDB-107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noah Slater closed COUCHDB-107.
---

Resolution: Fixed

Patch edited and committed as r747465.

> [PATCH] End JSON responses with newline char
> --------------------------------------------
>
>                 Key: COUCHDB-107
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-107
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: HTTP Interface
>    Affects Versions: 0.9
>            Reporter: Chris Anderson
>            Priority: Blocker
>             Fix For: 0.9
>
>         Attachments: newline.diff, newline2.patch
>
>
> This patch adds a newline character to all JSON responses.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (COUCHDB-255) Update MochiWeb

2009-02-24 Thread Chris Anderson (JIRA)


[ 
https://issues.apache.org/jira/browse/COUCHDB-255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676350#action_12676350
 ] 

Chris Anderson commented on COUCHDB-255:


One thing to take into account here is that our mochijson2 may not be the same 
as theirs. It's at least worth a look to make sure, before replacing ours.

> Update MochiWeb
> ---------------
>
>                 Key: COUCHDB-255
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-255
>             Project: CouchDB
>          Issue Type: Task
>          Components: HTTP Interface
>    Affects Versions: 0.8, 0.8.1, 0.9
>            Reporter: Jan Lehnardt
>            Assignee: Christopher Lenz
>            Priority: Blocker
>             Fix For: 0.9
>
>
> http://code.google.com/p/mochiweb/source/detail?r=89


[jira] Commented: (COUCHDB-266) PUTting json docs > 1MB causes Uncaught error in HTTP request: {exit,{body_too_large,content_length}}

2009-02-24 Thread Chris Anderson (JIRA)


[ 
https://issues.apache.org/jira/browse/COUCHDB-266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676348#action_12676348
 ] 

Chris Anderson commented on COUCHDB-266:


Thanks cmlenz!

> PUTting json docs > 1MB causes Uncaught error in HTTP request:
> {exit,{body_too_large,content_length}}
> --
>
>                 Key: COUCHDB-266
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-266
>             Project: CouchDB
>          Issue Type: Bug
>          Components: HTTP Interface
>    Affects Versions: 0.9
>         Environment:  Apache CouchDB 0.9.0a747258
>            Reporter: Jeff Hinrichs
>            Assignee: Christopher Lenz
>             Fix For: 0.9
>
>
> error displays itself when trying to PUT  a json document that is > 1MB.
> First noticed in the python interface, confirmed with curl
>
> [Tue, 24 Feb 2009 13:30:00 GMT] [error] [<0.1113.0>] Uncaught error in HTTP
> request: {exit,{body_too_large,content_length}}
>
> [Tue, 24 Feb 2009 13:30:00 GMT] [debug] [<0.1113.0>] Stacktrace:
> [{mochiweb_request,stream_body,5},
>  {mochiweb_request,recv_body,2},
>  {couch_httpd,json_body,1},
>  {couch_httpd_db,db_doc_req,3},
>  {couch_httpd_db,do_db_req,2},
>  {couch_httpd,handle_request,3},
>  {mochiweb_http,headers,4},
>  {proc_lib,init_p,5}]
>
> modifying -define(MAX_RECV_BODY, (1024*1024)) in
> src/mochiweb/mochiweb_request.erl
> to something bigger, say -define(MAX_RECV_BODY, (1024*1024*16))
> alleviates the problem temporarily.
>
> issue confirmed by cmlenz on irc, he believed it to be a regression


Re: [jira] Created: (COUCHDB-266) PUTting json docs > 1MB causes Uncaught error in HTTP request: {exit,{body_too_large,content_length}}

On Tue, Feb 24, 2009 at 5:52 AM, Jeff Hinrichs (JIRA)  wrote:
> PUTting json docs > 1MB causes Uncaught error in HTTP request: 
> {exit,{body_too_large,content_length}}
> --
>
>                 Key: COUCHDB-266
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-266
>             Project: CouchDB
>          Issue Type: Bug
>          Components: HTTP Interface
>    Affects Versions: 0.9
>         Environment:  Apache CouchDB 0.9.0a747258
>            Reporter: Jeff Hinrichs
>
>
> error displays itself when trying to PUT  a json document that is > 1MB.  
> First noticed in the python interface, confirmed with curl
>
> [Tue, 24 Feb 2009 13:30:00 GMT] [error] [<0.1113.0>] Uncaught error in HTTP
> request: {exit,{body_too_large,content_length}}
>
> [Tue, 24 Feb 2009 13:30:00 GMT] [debug] [<0.1113.0>] Stacktrace:
> [{mochiweb_request,stream_body,5},
>  {mochiweb_request,recv_body,2},
>  {couch_httpd,json_body,1},
>  {couch_httpd_db,db_doc_req,3},
>  {couch_httpd_db,do_db_req,2},
>  {couch_httpd,handle_request,3},
>  {mochiweb_http,headers,4},
>  {proc_lib,init_p,5}]
>
> modifying -define(MAX_RECV_BODY, (1024*1024)) in
> src/mochiweb/mochiweb_request.erl
> to something bigger, say -define(MAX_RECV_BODY, (1024*1024*16))
> alleviates the problem temporarily.
>
> issue confirmed by cmlenz on irc, he believed it to be a regression

This makes sense to me. It's probably due to my Mochiweb patch, which
created an alternate interface for streaming big docs. Using that
interface should fix things. I'll dig in here.

Chris

-- 
Chris Anderson
http://jchris.mfdz.com

Re: Stats

On Tue, Feb 24, 2009 at 9:50 AM, Jan Lehnardt  wrote:
>
> They are tracked in the HTTP layer right now. I think what you're asking for
> is collecting the same stats one layer down for potential clients that don't
> use the HTTP API?
>
> The same applies to the `document_*` keys. Maybe they should be the ones
> moved one layer down?
>

It'd be great to have the stats at the lowest level that makes sense.
As we get less volatile in the Erlang code, I expect a lot of people
will be linking directly.


-- 
Chris Anderson
http://jchris.mfdz.com

Re: Fail on a simple case on replication

On Tue, Feb 24, 2009 at 3:50 AM, Damien Katz  wrote:
>
> With Chris Anderson's "show" document and "list" view work, we have the
> beginnings of that.
>

I was just going to reply with this point. The only thing I see as
missing to make CouchDB fully "RESTful" is hypermedia. When the
representational states are linked together in a way that can be
browsed, then we've got "high REST". Show and list do that, but they
require developer work. I like to call CouchDB RESTy, and with a
little bit of dev work, it can be RESTful.

This is exactly the slide that comes before my explanation of show and
list, in my latest talk notes: "But that's not REST!"

On another note, I was thinking about it some more, and I think that
renaming _rev to _cc would be a huge pain in the ass for a lot of
people (who don't go around abusing it) and it can probably be
avoided.

The only valid use case for requesting a particular _rev of a
document, is in resolving conflicts introduced by replication. So if
we restrict access to old revs (by default) to an endpoint which gives
an array of documents (each conflicted rev) then it won't be usable as
a revision control system, only as a conflict resolution system. If
there's not an easy way to think you have implemented a version
control system (eg no API endpoint for accessing non-conflicting revs)
I bet we'll see misapprehension of _rev happen a lot less.

Chris

Stealing from Damien's blog header for my one-time special sig:

EVERYBODY KEEPS ON TALKING ABOUT IT
NOBODY'S GETTING IT DONE

(applies to myself as well)

-- 
Chris Anderson
http://jchris.mfdz.com

Re: Stats



On 24 Feb 2009, at 18:34, Damien Katz wrote:

>> I'd suggest
>> - to move the `document_*` keys from `httpd` to `couchdb`,
>
> Yes.
>
>> - to rename `httpd` to `http`.
>
> I think because replication also makes http requests, it should stay
> as is.
>
>> Is there anything else that you think should look different?
>
> MOVE and COPY should be httpd stats, and shouldn't be tracked at
> the db level, except as the underlying create and deletion stats.

They are tracked in the HTTP layer right now. I think what you're
asking for is collecting the same stats one layer down for potential
clients that don't use the HTTP API?

The same applies to the `document_*` keys. Maybe they should be the ones
moved one layer down?

Cheers
Jan
--

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)



On 24 Feb 2009, at 18:34, Noah Slater wrote:

> On Tue, Feb 24, 2009 at 09:30:40AM -0800, Chris Anderson wrote:
>> OK actually - I have a new opinion about the newlines stuff. Since I
>> really don't care all that much, and I don't see a canonical JSON
>> format happening anytime soon, I'm fine with returning newlines at the
>> end of our responses.
>
> So, that sounds like a withdrawal of your veto.
>
> ... if Jan is happy to withdraw his as well, I can call another vote.

or just commit the damn patch :)

Re: Stats



On Feb 23, 2009, at 9:51 AM, Jan Lehnardt wrote:



On 22 Feb 2009, at 15:06, Jan Lehnardt wrote:
I mentioned this in an earlier mail but I'd like to bring it up again,
since your input is needed here. Metrics are identified with a
tuple `{Module, Key}`. `Module` is the module that initiates the
counting of the metric and `Key` uniquely identifies a metric
within a module. Until now, Alex and I just made up names as
they came up without much consideration for a consistent and intuitive
naming scheme. It would be great if you could help out checking
if the names are any good and suggest alternatives.


So far we have:

{couchdb, open_databases}
{couchdb, request_time}

{httpd, bulk_requests}
{httpd, head_requests}
{httpd, get_requests}
{httpd, put_requests}
{httpd, post_requests}
{httpd, delete_requests}
{httpd, copy_requests}
{httpd, move_requests}

{httpd, document_copies}
{httpd, document_creates}
{httpd, document_deletes}
{httpd, document_moves}
{httpd, document_reads}
{httpd, document_updates}
{httpd, requests}
{httpd, temporary_view_reads}
{httpd, view_reads}

{http_status_codes, Code} (Code is one of 200, 201, 203 ... )


I'd suggest
- to move the `document_*` keys from `httpd` to `couchdb`,


Yes.



- to rename `httpd` to `http`.


I think because replication also makes http requests, it should stay  
as is.





Is there anything else that you think should look different?




MOVE and COPY should be httpd stats, and shouldn't be tracked at the   
db level, except as the underlying create and deletion stats.


-Damien

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)

On Tue, Feb 24, 2009 at 09:30:40AM -0800, Chris Anderson wrote:
> OK actually - I have a new opinion about the newlines stuff. Since I
> really don't care all that much, and I don't see a canonical JSON
> format happening anytime soon, I'm fine with returning newlines at the
> end of our responses.

So, that sounds like a withdrawal of your veto.

... if Jan is happy to withdraw his as well, I can call another vote.

Best,

-- 
Noah Slater, http://tumbolia.org/nslater

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)

I go to sleep for 8 hours, and this is the thanks I get! ;)

But on a more serious note, I think we should pull a hedge fund move,
(or maybe quantum entanglement?) and add to the newline patch, some
lines that would change the color of the CouchDB logo from red to
blue.

OK actually - I have a new opinion about the newlines stuff. Since I
really don't care all that much, and I don't see a canonical JSON
format happening anytime soon, I'm fine with returning newlines at the
end of our responses.

Some implementation notes:

I haven't looked at the patch lately, but I know that there are lots
of little places in the code that it will have to touch.

Also, and we haven't discussed this nearly as much as we probably
should, the implementation of JSONP would be quite similar. To
implement JSONP we'll need to do something like this:

USER_SPECIFIED_CALLBACK_NAME + "(" + CouchDB's JSON response + ");"

So it's like the newline at the end patch, but also at the beginnings...
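
As a rough sketch of that concatenation in Python (not the patch itself):
the function name `jsonp_wrap` and the rule for what counts as a valid
callback name are assumptions added here, the latter because echoing an
arbitrary user-supplied string in front of the response would let a caller
inject script.

```python
import json
import re

# Assumed rule: a callback must look like a plain JavaScript identifier.
_CALLBACK_RE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def jsonp_wrap(callback, payload):
    """Return callback + "(" + JSON + ");" followed by a newline."""
    if not _CALLBACK_RE.fullmatch(callback):
        raise ValueError("invalid JSONP callback name: %r" % (callback,))
    return callback + "(" + json.dumps(payload) + ");\n"
```

For example, `jsonp_wrap("cb", {"ok": True})` yields `cb({"ok": true});`
with the trailing newline appended after the closing semicolon.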

That is all.

Chris

-- 
Chris Anderson
http://jchris.mfdz.com

Re: ACID vs BASE

2009-02-24 Thread Zachary Zolton

@jan

And, it looks like they're doing both...?

"Because of the replication lag we mentioned earlier, however, you
might not see the change you just made! This experience is very
confusing for a user and also leads to double posting. We got around
this concern by setting a cookie in your browser with the current time
whenever you write something to our databases. The load balancer also
looks for that cookie and, if it notices that you wrote something
within 20 seconds, will unconditionally send you to California."

Though, instead of always choosing California over Virginia, I'd store
whether or not they should hit A or B, I guess.
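
Roughly what I have in mind, as a Python sketch. The cookie names, the
`pick_server` and `record_write` helpers, and the plain dict standing in
for the browser's cookies are all made up for illustration; only the
20 second window comes from the Facebook note.

```python
STICKY_WINDOW_SECONDS = 20  # window quoted in the Facebook note


def record_write(cookies, server, now):
    """After a write, remember when it happened and which server took it."""
    updated = dict(cookies)
    updated["last_write_at"] = str(now)
    updated["last_write_server"] = server
    return updated


def pick_server(cookies, now, default="B"):
    """Send the user back to the server they wrote to, while inside the window."""
    last_write = cookies.get("last_write_at")
    target = cookies.get("last_write_server")
    if last_write is not None and target:
        if now - float(last_write) < STICKY_WINDOW_SECONDS:
            return target
    # Outside the window (or no write yet): any server will do.
    return default
```

The load balancer would call `pick_server` on every request and the app
would call `record_write` after each successful write/update.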

Anyways, thanks for all the pointers!

-ZZ

On Tue, Feb 24, 2009 at 10:11 AM, Jan Lehnardt  wrote:
>
> On 24 Feb 2009, at 17:03, Zachary Zolton wrote:
>
>> Thanks for the reply! It looks like they go into the more advanced
>> Bayou consistency, and Byzantine failure modes, but I don't think I'll
>> need to cover that soon...
>>
>> But a more important question:
>>
>> If I have two couch servers: A and B
>>
>> And, I want to load-balance users between them, would it be the
>> responsibility of the web/app servers to ensure that a user session
>> "sticks" to either A or B, after performing a write/update? At least
>> until the data has had a chance to replicate between servers...?
>>
>> (I'm guessing this is what all the "monotonic updates" discussion is
>> about...?)
>
> That would be one solution*, yes. Another would be to employ a write-through
> memcache.
>
> * http://www.facebook.com/note.php?note_id=23844338919&id=9445547199&index=0
>
> Cheers
>
> Jan
> --
>
>
>>
>> On Tue, Feb 24, 2009 at 9:54 AM, Jan Lehnardt  wrote:
>>>
>>> On 24 Feb 2009, at 16:49, Zachary Zolton wrote:
>>>
>>>> As a developer (without an advanced degree :^P) trying to understand
>>>> Eventual Consistency, I happened upon these slides:
>>>>
>>>> http://www.cs.berkeley.edu/~istoica/classes/cs268/06/notes/20-BFTx2.pdf
>>>>
>>>> I know consistency models are a hot topic around here, so I thought
>>>> I'd ask if this would make a good introductory text for me to explain
>>>> the techniques to some colleagues of mine. Or does anyone take
>>>> theoretical issue with its contents?
>>>
>>> I skimmed the contents and it looks cool to me for an introduction.
>>>
>>> Cheers
>>> Jan
>>> --
>>>
>>>
>>
>
>

Re: ACID vs BASE



On 24 Feb 2009, at 17:03, Zachary Zolton wrote:

> Thanks for the reply! It looks like they go into the more advanced
> Bayou consistency, and Byzantine failure modes, but I don't think I'll
> need to cover that soon...
>
> But a more important question:
>
> If I have two couch servers: A and B
>
> And, I want to load-balance users between them, would it be the
> responsibility of the web/app servers to ensure that a user session
> "sticks" to either A or B, after performing a write/update? At least
> until the data has had a chance to replicate between servers...?
>
> (I'm guessing this is what all the "monotonic updates" discussion is
> about...?)

That would be one solution*, yes. Another would be to employ a
write-through memcache.

* http://www.facebook.com/note.php?note_id=23844338919&id=9445547199&index=0

Cheers

Jan
--

> On Tue, Feb 24, 2009 at 9:54 AM, Jan Lehnardt  wrote:
>>
>> On 24 Feb 2009, at 16:49, Zachary Zolton wrote:
>>
>>> As a developer (without an advanced degree :^P) trying to understand
>>> Eventual Consistency, I happened upon these slides:
>>>
>>> http://www.cs.berkeley.edu/~istoica/classes/cs268/06/notes/20-BFTx2.pdf
>>>
>>> I know consistency models are a hot topic around here, so I thought
>>> I'd ask if this would make a good introductory text for me to explain
>>> the techniques to some colleagues of mine. Or does anyone take
>>> theoretical issue with its contents?
>>
>> I skimmed the contents and it looks cool to me for an introduction.
>>
>> Cheers
>> Jan
>> --

Re: ACID vs BASE

2009-02-24 Thread Zachary Zolton

Thanks for the reply! It looks like they go into the more advanced
Bayou consistency, and Byzantine failure modes, but I don't think I'll
need to cover that soon...

But a more important question:

If I have two couch servers: A and B

And, I want to load-balance users between them, would it be the
responsibility of the web/app servers to ensure that a user session
"sticks" to either A or B, after performing a write/update? At least
until the data has had a chance to replicate between servers...?

(I'm guessing this is what all the "monotonic updates" discussion is about...?)

On Tue, Feb 24, 2009 at 9:54 AM, Jan Lehnardt  wrote:
>
> On 24 Feb 2009, at 16:49, Zachary Zolton wrote:
>
>> As a developer (without an advanced degree :^P) trying to understand
>> Eventual Consistency, I happened upon these slides:
>>
>> http://www.cs.berkeley.edu/~istoica/classes/cs268/06/notes/20-BFTx2.pdf
>>
>> I know consistency models are a hot topic around here, so I thought
>> I'd ask if this would make a good introductory text for me to explain
>> the techniques to some colleagues of mine. Or does anyone take
>> theoretical issue with its contents?
>
> I skimmed the contents and it looks cool to me for an introduction.
>
> Cheers
> Jan
> --
>
>

Re: ACID vs BASE



On 24 Feb 2009, at 16:49, Zachary Zolton wrote:

> As a developer (without an advanced degree :^P) trying to understand
> Eventual Consistency, I happened upon these slides:
>
> http://www.cs.berkeley.edu/~istoica/classes/cs268/06/notes/20-BFTx2.pdf
>
> I know consistency models are a hot topic around here, so I thought
> I'd ask if this would make a good introductory text for me to explain
> the techniques to some colleagues of mine. Or does anyone take
> theoretical issue with its contents?

I skimmed the contents and it looks cool to me for an introduction.

Cheers
Jan
--

ACID vs BASE

2009-02-24 Thread Zachary Zolton

As a developer (without an advanced degree :^P) trying to understand
Eventual Consistency, I happened upon these slides:

http://www.cs.berkeley.edu/~istoica/classes/cs268/06/notes/20-BFTx2.pdf

I know consistency models are a hot topic around here, so I thought
I'd ask if this would make a good introductory text for me to explain
the techniques to some colleagues of mine. Or does anyone take
theoretical issue with its contents?

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)



On 24 Feb 2009, at 14:02, Noah Slater wrote:

> On Tue, Feb 24, 2009 at 10:48:31PM +1030, Antony Blakey wrote:
>> On 24/02/2009, at 10:34 PM, Noah Slater wrote:
>>> On Tue, Feb 24, 2009 at 10:14:04PM +1030, Antony Blakey wrote:
>>>> I'm a bit confused about this. Excuse me while I tread carefully. It
>>>> seems that the community vote is clearly a majority to accept the
>>>> patch. If the end result of this vote is that we don't follow that
>>>> vote because it's only the PMC vote that counts, doesn't that mean
>>>> that community votes are irrelevant?
>>>
>>> Anthony, I am upset that you seem to be wanting to cause trouble
>>> again.
>>
>> It was a question Noah.
>>
>> Your characterization of my intent as being 'to cause trouble', and
>> furthermore 'again', is entirely erroneous.
>
> No, it most certainly was not just a question.
>
> Instead of asking how community votes would be factored into the final
> result, you constructed a hypothetical that frames the PMC as a
> dictatorship, doing as it pleases regardless of community feedback. You
> then use this hypothetical to draw the absurd conclusion that "community
> votes are irrelevant" and then seek an explicit refutation.
>
> You have consistently used this style of framing. It seems that you
> presume the worst in people and processes instead of giving us the
> courteous benefit of the doubt. It has been consistently pointed out to
> you that this causes offence and unnecessary friction, and you have been
> politely asked to think about other ways of approaching constructive
> discussion.

+1

Jan
--

[jira] Assigned: (COUCHDB-255) Update MochiWeb


 [ 
https://issues.apache.org/jira/browse/COUCHDB-255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Lenz reassigned COUCHDB-255:


Assignee: Christopher Lenz

> Update MochiWeb
> ---------------
>
>                 Key: COUCHDB-255
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-255
>             Project: CouchDB
>          Issue Type: Task
>          Components: HTTP Interface
>    Affects Versions: 0.8, 0.8.1, 0.9
>            Reporter: Jan Lehnardt
>            Assignee: Christopher Lenz
>            Priority: Blocker
>             Fix For: 0.9
>
>
> http://code.google.com/p/mochiweb/source/detail?r=89


Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)



On 25/02/2009, at 12:45 AM, Noah Slater wrote:

> On Wed, Feb 25, 2009 at 12:29:08AM +1030, Antony Blakey wrote:
>> My suggestion arising from this is that a community vote, where
>> everyone states their case and casts a vote, is followed by a PMC
>> decision. It seems (to me) confusing to have this multi-class voting
>> system which conflates two separate processes.
>>
>> The PMC decision would only need explanation if it were a veto of a
>> community vote.
>
> No, this jars strongly with my understanding of the Apache Way.
>
>   Who is permitted to vote is, to some extent, a community-specific
>   thing. However, the basic rule is that only PMC members have binding
>   votes, and all others are either discouraged from voting (to keep the
>   noise down) or else have their votes considered of an indicative or
>   advisory nature only.
>
>  -- http://www.apache.org/foundation/voting.html
>
> I think we have followed this official document to the letter, and in
> spirit.
>
> You are suggesting that Apache voting should be done by the community
> at large, and then the PMC should decide in private which option to
> take. This is not how things are meant to work. We had one vote which
> was opened up to everyone. We have properly taken advice from the
> community, but would like to have further discussion before we accept
> or reject the change.

OK, that's clear.

Antony Blakey
-
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

In anything at all, perfection is finally attained not when there is
no longer anything to add, but when there is no longer anything to
take away.

  -- Antoine de Saint-Exupery

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)

On Wed, Feb 25, 2009 at 12:29:08AM +1030, Antony Blakey wrote:
> My suggestion arising from this is that a community vote, where
> everyone states their case and casts a vote, is followed by a PMC
> decision. It seems (to me) confusing to have this multi-class voting
> system which conflates two separate processes.
>
> The PMC decision would only need explanation if it were a veto of a
> community vote.

No, this jars strongly with my understanding of the Apache Way.

  Who is permitted to vote is, to some extent, a community-specific thing.
  However, the basic rule is that only PMC members have binding votes, and all
  others are either discouraged from voting (to keep the noise down) or else
  have their votes considered of an indicative or advisory nature only.

 -- http://www.apache.org/foundation/voting.html

I think we have followed this official document to the letter, and in spirit.

You are suggesting that Apache voting should be done by the community at large,
and then the PMC should decide in private which option to take. This is not how
things are meant to work. We had one vote which was opened up to everyone. We
have properly taken advice from the community, but would like to have further
discussion before we accept or reject the change.

Thanks,

-- 
Noah Slater, http://tumbolia.org/nslater

[jira] Commented: (COUCHDB-266) PUTting json docs > 1MB causes Uncaught error in HTTP request: {exit,{body_too_large,content_length}}


[ 
https://issues.apache.org/jira/browse/COUCHDB-266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676285#action_12676285
 ] 

Christopher Lenz commented on COUCHDB-266:
--

For the record, here's the MochiWeb issue: 
http://code.google.com/p/mochiweb/issues/detail?id=30

> PUTting json docs > 1MB causes Uncaught error in HTTP request: 
> {exit,{body_too_large,content_length}} 
> --
>
> Key: COUCHDB-266
> URL: https://issues.apache.org/jira/browse/COUCHDB-266
> Project: CouchDB
>  Issue Type: Bug
>  Components: HTTP Interface
>Affects Versions: 0.9
> Environment:  Apache CouchDB 0.9.0a747258
>Reporter: Jeff Hinrichs
>Assignee: Christopher Lenz
> Fix For: 0.9
>
>
> The error appears when trying to PUT a JSON document that is > 1MB.
> First noticed in the Python interface, confirmed with curl:
>
> [Tue, 24 Feb 2009 13:30:00 GMT] [error] [<0.1113.0>] Uncaught error in HTTP
> request: {exit,{body_too_large,content_length}}
>
> [Tue, 24 Feb 2009 13:30:00 GMT] [debug] [<0.1113.0>] Stacktrace:
> [{mochiweb_request,stream_body,5},
>  {mochiweb_request,recv_body,2},
>  {couch_httpd,json_body,1},
>  {couch_httpd_db,db_doc_req,3},
>  {couch_httpd_db,do_db_req,2},
>  {couch_httpd,handle_request,3},
>  {mochiweb_http,headers,4},
>  {proc_lib,init_p,5}]
>
> Changing -define(MAX_RECV_BODY, (1024*1024)) in
> src/mochiweb/mochiweb_request.erl to something bigger, say
> -define(MAX_RECV_BODY, (1024*1024*16)), alleviates the problem temporarily.
>
> Issue confirmed by cmlenz on IRC; he believed it to be a regression.


[jira] Resolved: (COUCHDB-266) PUTting json docs > 1MB causes Uncaught error in HTTP request: {exit,{body_too_large,content_length}}


 [ 
https://issues.apache.org/jira/browse/COUCHDB-266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Lenz resolved COUCHDB-266.
--

   Resolution: Fixed
Fix Version/s: 0.9

Patch applied in r747381.

> PUTting json docs > 1MB causes Uncaught error in HTTP request: 
> {exit,{body_too_large,content_length}} 
> --
>
> Key: COUCHDB-266
> URL: https://issues.apache.org/jira/browse/COUCHDB-266
> Project: CouchDB
>  Issue Type: Bug
>  Components: HTTP Interface
>Affects Versions: 0.9
> Environment:  Apache CouchDB 0.9.0a747258
>Reporter: Jeff Hinrichs
>Assignee: Christopher Lenz
> Fix For: 0.9
>
>
> The error appears when trying to PUT a JSON document that is > 1MB.
> First noticed in the Python interface, confirmed with curl:
>
> [Tue, 24 Feb 2009 13:30:00 GMT] [error] [<0.1113.0>] Uncaught error in HTTP
> request: {exit,{body_too_large,content_length}}
>
> [Tue, 24 Feb 2009 13:30:00 GMT] [debug] [<0.1113.0>] Stacktrace:
> [{mochiweb_request,stream_body,5},
>  {mochiweb_request,recv_body,2},
>  {couch_httpd,json_body,1},
>  {couch_httpd_db,db_doc_req,3},
>  {couch_httpd_db,do_db_req,2},
>  {couch_httpd,handle_request,3},
>  {mochiweb_http,headers,4},
>  {proc_lib,init_p,5}]
>
> Changing -define(MAX_RECV_BODY, (1024*1024)) in
> src/mochiweb/mochiweb_request.erl to something bigger, say
> -define(MAX_RECV_BODY, (1024*1024*16)), alleviates the problem temporarily.
>
> Issue confirmed by cmlenz on IRC; he believed it to be a regression.


Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)



On 24/02/2009, at 10:35 PM, Jan Lehnardt wrote:


No, you're absolutely right on the "Accept the patch" branch. But
there are enough community -1s to keep this open. A single
community -1 should be addressed in an ASF vote.


My suggestion arising from this is that a community vote, where
everyone states their case and casts a vote, is followed by a PMC
decision. It seems (to me) confusing to have this multi-class voting
system which conflates two separate processes.


The PMC decision would only need explanation if it were a veto of a  
community vote.


Antony Blakey
-
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

A Man may make a Remark –
In itself – a quiet thing
That may furnish the Fuse unto a Spark
In dormant nature – lain –

Let us divide – with skill –
Let us discourse – with care –
Powder exists in Charcoal –
Before it exists in Fire –

  -– Emily Dickinson 913 (1865)

[jira] Commented: (COUCHDB-266) PUTting json docs > 1MB causes Uncaught error in HTTP request: {exit,{body_too_large,content_length}}


[ 
https://issues.apache.org/jira/browse/COUCHDB-266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676278#action_12676278
 ] 

Christopher Lenz commented on COUCHDB-266:
--

First, here's a simple way to reproduce this problem:

$ ls -lh big.json
-rw-r--r--  1 chris  staff   1,1M 24 Feb 14:36 big.json

$ curl -T big.json http://localhost:5984/testing/big
curl: (55) Send failure: Broken pipe
{"error":"body_too_large","reason":"content_length"}


The problem is caused by a mistake in the mochiweb_request:stream_body/4 that 
jchris submitted to MochiWeb. I've confirmed that the bug is also in the 
Mochiweb repository, and will create an issue for that project.

The patch is simple enough:

Index: src/mochiweb/mochiweb_request.erl
===================================================================
--- src/mochiweb/mochiweb_request.erl   (revision 747380)
+++ src/mochiweb/mochiweb_request.erl   (working copy)
@@ -182,7 +182,7 @@
                 true ->
                     {NewLength, [Bin | BinAcc]}
             end
-        end, {0, []}, ?MAX_RECV_BODY),
+        end, {0, []}, MaxBody),
     put(?SAVE_BODY, Body),
     Body.
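
To make the effect of that one-line change concrete, here is a minimal
JavaScript sketch. It is not the MochiWeb code: `accumulate`,
`recvBodyBuggy`, `recvBodyFixed`, `makeChunks` and `DEFAULT_MAX_RECV_BODY`
are hypothetical stand-ins. The caller passes a larger limit, but the buggy
variant still checks the accumulated length against the built-in default, so
any body over 1MB fails no matter what limit was requested.

```javascript
// Hypothetical stand-in for the ?MAX_RECV_BODY macro (1MB default).
const DEFAULT_MAX_RECV_BODY = 1024 * 1024;

// Accumulate chunks, failing once the running length exceeds `limit`.
function accumulate(chunks, limit) {
  let length = 0;
  const acc = [];
  for (const chunk of chunks) {
    length += chunk.length;
    if (length > limit) {
      throw new Error("body_too_large");
    }
    acc.push(chunk);
  }
  return Buffer.concat(acc);
}

// Buggy variant: ignores the caller's maxBody and always uses the default.
function recvBodyBuggy(chunks, maxBody) {
  return accumulate(chunks, DEFAULT_MAX_RECV_BODY);
}

// Fixed variant: honours the limit the caller passed in.
function recvBodyFixed(chunks, maxBody) {
  return accumulate(chunks, maxBody);
}

// Build `total` bytes of zeroed data split into chunks of at most `size`.
function makeChunks(total, size) {
  const chunks = [];
  for (let sent = 0; sent < total; sent += size) {
    chunks.push(Buffer.alloc(Math.min(size, total - sent)));
  }
  return chunks;
}

const body = makeChunks(1100 * 1024, 64 * 1024); // roughly 1.1MB
const limit = 16 * 1024 * 1024;                  // caller asks for 16MB

console.log(recvBodyFixed(body, limit).length);  // 1126400
try {
  recvBodyBuggy(body, limit);
} catch (err) {
  console.log(err.message);                      // body_too_large
}
```

Raising the default, as suggested in the issue description, hides the
symptom; passing the caller's limit through is what makes a per-request
limit meaningful.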



> PUTting json docs > 1MB causes Uncaught error in HTTP request: 
> {exit,{body_too_large,content_length}} 
> --
>
> Key: COUCHDB-266
> URL: https://issues.apache.org/jira/browse/COUCHDB-266
> Project: CouchDB
>  Issue Type: Bug
>  Components: HTTP Interface
>Affects Versions: 0.9
> Environment:  Apache CouchDB 0.9.0a747258
>Reporter: Jeff Hinrichs
>Assignee: Christopher Lenz
>
> The error appears when trying to PUT a JSON document that is > 1MB.
> First noticed in the Python interface, confirmed with curl:
>
> [Tue, 24 Feb 2009 13:30:00 GMT] [error] [<0.1113.0>] Uncaught error in HTTP
> request: {exit,{body_too_large,content_length}}
>
> [Tue, 24 Feb 2009 13:30:00 GMT] [debug] [<0.1113.0>] Stacktrace:
> [{mochiweb_request,stream_body,5},
>  {mochiweb_request,recv_body,2},
>  {couch_httpd,json_body,1},
>  {couch_httpd_db,db_doc_req,3},
>  {couch_httpd_db,do_db_req,2},
>  {couch_httpd,handle_request,3},
>  {mochiweb_http,headers,4},
>  {proc_lib,init_p,5}]
>
> Changing -define(MAX_RECV_BODY, (1024*1024)) in
> src/mochiweb/mochiweb_request.erl to something bigger, say
> -define(MAX_RECV_BODY, (1024*1024*16)), alleviates the problem temporarily.
>
> Issue confirmed by cmlenz on IRC; he believed it to be a regression.


[jira] Assigned: (COUCHDB-266) PUTting json docs > 1MB causes Uncaught error in HTTP request: {exit,{body_too_large,content_length}}


 [ 
https://issues.apache.org/jira/browse/COUCHDB-266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Lenz reassigned COUCHDB-266:


Assignee: Christopher Lenz

> PUTting json docs > 1MB causes Uncaught error in HTTP request: 
> {exit,{body_too_large,content_length}} 
> --
>
> Key: COUCHDB-266
> URL: https://issues.apache.org/jira/browse/COUCHDB-266
> Project: CouchDB
>  Issue Type: Bug
>  Components: HTTP Interface
>Affects Versions: 0.9
> Environment:  Apache CouchDB 0.9.0a747258
>Reporter: Jeff Hinrichs
>Assignee: Christopher Lenz
>
> The error appears when trying to PUT a JSON document that is > 1MB.
> First noticed in the Python interface, confirmed with curl:
>
> [Tue, 24 Feb 2009 13:30:00 GMT] [error] [<0.1113.0>] Uncaught error in HTTP
> request: {exit,{body_too_large,content_length}}
>
> [Tue, 24 Feb 2009 13:30:00 GMT] [debug] [<0.1113.0>] Stacktrace:
> [{mochiweb_request,stream_body,5},
>  {mochiweb_request,recv_body,2},
>  {couch_httpd,json_body,1},
>  {couch_httpd_db,db_doc_req,3},
>  {couch_httpd_db,do_db_req,2},
>  {couch_httpd,handle_request,3},
>  {mochiweb_http,headers,4},
>  {proc_lib,init_p,5}]
>
> Changing -define(MAX_RECV_BODY, (1024*1024)) in
> src/mochiweb/mochiweb_request.erl to something bigger, say
> -define(MAX_RECV_BODY, (1024*1024*16)), alleviates the problem temporarily.
>
> Issue confirmed by cmlenz on IRC; he believed it to be a regression.


[jira] Created: (COUCHDB-266) PUTting json docs > 1MB causes Uncaught error in HTTP request: {exit,{body_too_large,content_length}}

2009-02-24 Thread Jeff Hinrichs (JIRA)

PUTting json docs > 1MB causes Uncaught error in HTTP request: 
{exit,{body_too_large,content_length}} 
--

 Key: COUCHDB-266
 URL: https://issues.apache.org/jira/browse/COUCHDB-266
 Project: CouchDB
  Issue Type: Bug
  Components: HTTP Interface
Affects Versions: 0.9
 Environment:  Apache CouchDB 0.9.0a747258
Reporter: Jeff Hinrichs


The error appears when trying to PUT a JSON document that is > 1MB. First
noticed in the Python interface, confirmed with curl:

[Tue, 24 Feb 2009 13:30:00 GMT] [error] [<0.1113.0>] Uncaught error in HTTP
request: {exit,{body_too_large,content_length}}

[Tue, 24 Feb 2009 13:30:00 GMT] [debug] [<0.1113.0>] Stacktrace:
[{mochiweb_request,stream_body,5},
 {mochiweb_request,recv_body,2},
 {couch_httpd,json_body,1},
 {couch_httpd_db,db_doc_req,3},
 {couch_httpd_db,do_db_req,2},
 {couch_httpd,handle_request,3},
 {mochiweb_http,headers,4},
 {proc_lib,init_p,5}]

Changing -define(MAX_RECV_BODY, (1024*1024)) in
src/mochiweb/mochiweb_request.erl to something bigger, say
-define(MAX_RECV_BODY, (1024*1024*16)), alleviates the problem temporarily.

Issue confirmed by cmlenz on IRC; he believed it to be a regression.



Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)


I'm sure this is of no interest to anyone. I've replied privately.

On 24/02/2009, at 11:32 PM, Noah Slater wrote:


No, it most certainly was not just a question.

Instead of asking how community votes would be factored into the final result,
you constructed a hypothetical that frames the PMC as a dictatorship, doing as
it pleases regardless of community feedback. You then use this hypothetical to
draw the absurd conclusion that "community votes are irrelevant" and then seek
an explicit refutation.

You have consistently used this style of framing. It seems that you presume the
worst in people and processes instead of giving us the courteous benefit of the
doubt. It has been consistently pointed out to you that this causes offence and
unnecessary friction, and you have been politely asked to think about other ways
of approaching constructive discussion.


Antony Blakey
-
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

All that is required for evil to triumph is that good men do nothing.

Re: Fail on a simple case on replication

2009-02-24 Thread Patrick Antivackis

2009/2/24 Jan Lehnardt 

>
> On 24 Feb 2009, at 13:52, Patrick Antivackis wrote:
>
>>>> It's like all politically correct terminology where you use a stupid
>>>> expression in order to be as neutral as possible.


>>> You have a point here, it is about avoiding conflict. But I don't think
>>> we're looking for a neutral term here, but one with a better name.
>>> I'd go with _access_token if it weren't too long. _rev is nice and short
>>> and _token might as well be _wibble. API design is hard.
>>>
>>>
>> May be it's about conflict, but as it's also a previous release, it's by
>> definition a revision. The fact that the revision is no more there is not
>> changing the fact that it's a revision.
>>
>
> Haha, language ambiguity for the win :) I meant conflict between
> users applying prior understanding of the term "revision" to CouchDB
> revisions causing a conflict. I did not mean using _rev as a token to
> manage write conflicts for a document. I need to be more careful with
> these words :)
>

Don't worry, I'm not a native English speaker either.


>
>
>
>> That's why if the name is changed, the functionality to access a previous
>> revision should be removed.
>>
>
> I could see that being a valid conclusion and I think that would be
> covered with disabling the feature by default and make it an opt-in
> like Damien suggested. We also could just nuke it completely and
> wait for complaints before reconsidering making it an opt-in.
>
>
Great, so my vote becomes: -0

>
>
> Cheers
> Jan
> --
>
>
>>> --
>>>
>>>> IMO if you change this
>>>> attribute name it's even better to remove all possibilities to access a
>>>> previous rev if still there, and change its value to a timestamp
>>>>
>>>> Regards
>>>>
>>>> 2009/2/24 Antony Blakey
>>>>
>>>>> On 24/02/2009, at 12:51 PM, Antony Blakey wrote:
>>>>>
>>>>>> The project founder and the PMC, are all committed to that replication
>>>>>> model, which is derived from Notes.
>>>>>
>>>>> BTW I'm the only one in the community that has expressed any strong
>>>>> desire to change this - I'm not implying any community division, just
>>>>> pointing out that it's both an historical artifact, and accepted by the
>>>>> major contributors and committers.
>>>>>
>>>>> Antony Blakey
>>>>> --
>>>>> CTO, Linkuistics Pty Ltd
>>>>> Ph: 0438 840 787
>>>>>
>>>>> Plurality is not to be assumed without necessity
>>>>> -- William of Ockham (ca. 1285-1349)

Re: Fail on a simple case on replication



On 24 Feb 2009, at 13:52, Patrick Antivackis wrote:

It's like all politically correct terminology where you use a stupid
expression in order to be as neutral as possible.



You have a point here, it is about avoiding conflict. But I don't think
we're looking for a neutral term here, but one with a better name.
I'd go with _access_token if it weren't too long. _rev is nice and short
and _token might as well be _wibble. API design is hard.



May be it's about conflict, but as it's also a previous release, it's by
definition a revision. The fact that the revision is no more there is not
changing the fact that it's a revision.


Haha, language ambiguity for the win :) I meant conflict between
users applying prior understanding of the term "revision" to CouchDB
revisions causing a conflict. I did not mean using _rev as a token to
manage write conflicts for a document. I need to be more careful with
these words :)


That's why if the name is changed, the functionality to access a previous
revision should be removed.


I could see that being a valid conclusion and I think that would be
covered with disabling the feature by default and make it an opt-in
like Damien suggested. We also could just nuke it completely and
wait for complaints before reconsidering making it an opt-in.


Cheers
Jan
--



--




IMO if you change this
attribute name it's even better to remove all possibilities to access a
previous rev if still there, and change its value to a timestamp


Regards

2009/2/24 Antony Blakey 



On 24/02/2009, at 12:51 PM, Antony Blakey wrote:

The project founder and the PMC, are all committed to that replication
model, which is derived from Notes.


BTW I'm the only one in the community that has expressed any strong
desire to change this - I'm not implying any community division, just
pointing out that it's both an historical artifact, and accepted by the
major contributors and committers.

Antony Blakey
--
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Plurality is not to be assumed without necessity
-- William of Ockham (ca. 1285-1349)

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)

On Tue, Feb 24, 2009 at 10:48:31PM +1030, Antony Blakey wrote:
> On 24/02/2009, at 10:34 PM, Noah Slater wrote:
>> On Tue, Feb 24, 2009 at 10:14:04PM +1030, Antony Blakey wrote:
>>> I'm a bit confused about this. Excuse me while I tread carefully. It
>>> seems that the community vote is clearly a majority to accept the
>>> patch.
>>> If the end result of this vote is that we don't follow that vote
>>> because
>>> it's only the PMC vote that counts, doesn't that mean that community
>>> votes are irrelevant?
>>
>> Anthony, I am upset that you seem to be wanting to cause trouble
>> again.
>
> It was a question Noah.
>
> Your characterization of my intent as being 'to cause trouble', and
> furthermore 'again', is entirely erroneous.

No, it most certainly was not just a question.

Instead of asking how community votes would be factored into the final result,
you constructed a hypothetical that frames the PMC as a dictatorship, doing as
it pleases regardless of community feedback. You then use this hypothetical to
draw the absurd conclusion that "community votes are irrelevant" and then seek
an explicit refutation.

You have consistently used this style of framing. It seems that you presume the
worst in people and processes instead of giving us the courteous benefit of the
doubt. It has been consistently pointed out to you that this causes offence and
unnecessary friction, and you have been politely asked to think about other ways
of approaching constructive discussion.

Best,

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Fail on a simple case on replication

2009-02-24 Thread Patrick Antivackis

Hi Jan,


>
>  Oh and by the way, in a use case where there is only one database and you
>> don't use compaction because you want to keep everything, well _rev is a
>> revision that can be used to see the history of the document.
>>
>
> You still shouldn't and that's what's in the documentation :) Just because
> you can tie a skateboard to a car and drive on the highway would make
> one hell of a fun ride, you are not advised to do so. :)
>

Don't worry ;), I don't do this on my side: since I know when I will run
compaction, I run a program before compaction that takes care of
"archiving" previous revs.

>> I really don't
>> see the point of renaming an attribute to make it harder to understand its
>> role.
>>
>
> The suggestion here is to rename to make it _easier_ to understand
> because the connotations "revision" comes with are not entirely
> valid for CouchDB.
>
>
>> It's like all politically correct terminology where you use a stupid
>> expression in order to be as neutral as possible.
>>
>
> You have a point here, it is about avoiding conflict. But I don't think
> we're looking for a neutral term here, but one with a better name.
> I'd go with _access_token if it weren't too long. _rev is nice and short
> and _token might as well be _wibble. API design is hard.
>

Maybe it's about conflict, but since it's also a previous release, it is by
definition a revision. The fact that the revision is no longer there doesn't
change the fact that it's a revision.

That's why if the name is changed, the functionality to access a previous
revision should be removed.



>
>
> Cheers
> Jan
> --
>
>
>
>
>> IMO if you change this
>> attribute name it's even better to remove all possibilities to access a
>> previous rev if still there, and change its value to a timestamp
>>
>>
>> Regards
>>
>> 2009/2/24 Antony Blakey 
>>
>>
>>> On 24/02/2009, at 12:51 PM, Antony Blakey wrote:
>>>
>>>> The project founder and the PMC, are all committed to that replication
>>>> model, which is derived from Notes.
>>>
>>> BTW I'm the only one in the community that has expressed any strong
>>> desire
>>> to change this - I'm not implying any community division, just pointing
>>> out
>>> that it's both an historical artifact, and accepted by the major
>>> contributors and committers.
>>>
>>> Antony Blakey
>>> --
>>> CTO, Linkuistics Pty Ltd
>>> Ph: 0438 840 787
>>>
>>> Plurality is not to be assumed without necessity
>>> -- William of Ockham (ca. 1285-1349)
>>>
>>>
>>>
>>>
>

Re: Fail on a simple case on replication



On 24 Feb 2009, at 13:39, Brian Candler wrote:


On Tue, Feb 24, 2009 at 09:06:09AM +0100, Patrick Antivackis wrote:
Oh and by the way, in a use case where there is only one database and you
don't use compaction because you want to keep everything, well _rev is a
revision that can be used to see the history of the document.


This is a good point. If you follow "accountants don't use erasers" then you
will never compact (and maybe you want a flag which prevents compaction).


You'd not use revisions to keep records around but proper documents.


However, you must then be prepared for your database to be a single file
which grows without bounds. If CouchDB wants to support this model, it would
be helpful if the data were stored in chunks which can be backed up
separately.


rsync? :)


"Compaction" for saving space could be achieved by rewriting the database,
but keeping diffs for earlier revisions. At this point you would end up with
something roughly like git.

On a random tangent: has anyone considered a CouchDB-like system where
documents are raw blobs, rather than JSON? ISTM that:

- it would save a lot of conversion between Erlang terms and JSON
- it would remove the second-class nature of attachments
- it would allow structured data to be stored in arbitrary formats (e.g. XML)
- it would allow map/reduce to work on binary data (e.g. use a map function
  to make thumbnails of all your jpegs)
- you could still use JSON quite happily, e.g.

  function map(type, data) {
    if (type == "application/json") {
      doc = evalcx(data);
      ... continue as normal
    }
  }

I guess some of the APIs would become a bit more awkward though. For
example, bulk document insert would probably become MIME multipart.

In principle, I think you could get today's CouchDB as a thin layer on top
of this. However, "attachments" do have interesting special semantics (e.g.
deleting a document deletes all its attachments) which might need some
parent/child relationship between documents to maintain. Having that
relationship between documents in a more general form could also be useful.


Just thinking out loud.


This is quite interesting! :) I'd like to see such a system, but I'd also like
CouchDB not becoming an Apache-httpd style kitchen-sink for all things
HTTP. Maybe Yaws is what you're looking for?

Cheers
Jan
--

Re: Fail on a simple case on replication



On 24/02/2009, at 11:09 PM, Brian Candler wrote:


On a random tangent: has anyone considered a CouchDB-like system where
documents are raw blobs, rather than JSON? ISTM that:


You'd need some way to attach/inject the metadata in both directions.

Antony Blakey
-
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

A reasonable man adapts himself to suit his environment. An  
unreasonable man persists in attempting to adapt his environment to  
suit himself. Therefore, all progress depends on the unreasonable man.

  -- George Bernard Shaw

Re: Fail on a simple case on replication

2009-02-24 Thread Brian Candler

On Tue, Feb 24, 2009 at 09:06:09AM +0100, Patrick Antivackis wrote:
> Oh and by the way, in a use case where there is only one database and you
> don't use compaction because you want to keep everything, well _rev is a
> revision that can be used to see the history of the document.

This is a good point. If you follow "accountants don't use erasers" then you
will never compact (and maybe you want a flag which prevents compaction).

However, you must then be prepared for your database to be a single file
which grows without bounds. If CouchDB wants to support this model, it would
be helpful if the data were stored in chunks which can be backed up
separately.

"Compaction" for saving space could be achieved by rewriting the database,
but keeping diffs for earlier revisions. At this point you would end up with
something roughly like git.

On a random tangent: has anyone considered a CouchDB-like system where
documents are raw blobs, rather than JSON? ISTM that:

- it would save a lot of conversion between Erlang terms and JSON
- it would remove the second-class nature of attachments
- it would allow structured data to be stored in arbitrary formats (e.g. XML)
- it would allow map/reduce to work on binary data (e.g. use a map function
  to make thumbnails of all your jpegs)
- you could still use JSON quite happily, e.g.

  function map(type, data) {
    if (type == "application/json") {
      doc = evalcx(data);
      ... continue as normal
    }
  }
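
A slightly fuller sketch of the idea above, in plain JavaScript. This is not
an existing CouchDB API: `mapBlob`, `runMap`, the `emit` callback and the
`(type, data)` signature are assumptions made for illustration, and
`JSON.parse` stands in for `evalcx`.

```javascript
// Hypothetical map function for a blob store: `type` is the MIME type,
// `data` is the raw document body as a string.
function mapBlob(type, data, emit) {
  if (type === "application/json") {
    const doc = JSON.parse(data); // stands in for evalcx(data)
    emit(doc._id, doc);           // ... continue as normal
  }
  // Other types (e.g. image/jpeg) are skipped here; a real map function
  // could handle them in a branch of their own.
}

// Run the map function over a list of blobs and collect the emitted rows.
function runMap(blobs) {
  const rows = [];
  for (const blob of blobs) {
    mapBlob(blob.type, blob.data, (key, value) => rows.push({ key, value }));
  }
  return rows;
}

const rows = runMap([
  { type: "application/json", data: '{"_id":"a","title":"hello"}' },
  { type: "image/jpeg", data: "\u00ff\u00d8\u00ff" },
]);
console.log(rows.length); // 1
```

Dispatching on the type inside the map function keeps the store itself
format-agnostic, which is the point of the proposal.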

I guess some of the APIs would become a bit more awkward though. For
example, bulk document insert would probably become MIME multipart.

In principle, I think you could get today's CouchDB as a thin layer on top
of this. However, "attachments" do have interesting special semantics (e.g.
deleting a document deletes all its attachments) which might need some
parent/child relationship between documents to maintain. Having that
relationship between documents in a more general form could also be useful.

Just thinking out loud.

Regards,

Brian.

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)



On 24/02/2009, at 10:34 PM, Noah Slater wrote:


On Tue, Feb 24, 2009 at 10:14:04PM +1030, Antony Blakey wrote:

I'm a bit confused about this. Excuse me while I tread carefully. It
seems that the community vote is clearly a majority to accept the  
patch.
If the end result of this vote is that we don't follow that vote  
because

it's only the PMC vote that counts, doesn't that mean that community
votes are irrelevant?


Anthony, I am upset that you seem to be wanting to cause trouble  
again.


It was a question Noah.

Your characterization of my intent as being 'to cause trouble', and  
furthermore 'again', is entirely erroneous.


Antony Blakey
-
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Every task involves constraint,
Solve the thing without complaint;
There are magic links and chains
Forged to loose our rigid brains.
Structures, structures, though they bind,
Strangely liberate the mind.
  -- James Fallen

[jira] Commented: (COUCHDB-265) HEAD requests get a Content-Length header


[ 
https://issues.apache.org/jira/browse/COUCHDB-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676254#action_12676254
 ] 

Christopher Lenz commented on COUCHDB-265:
--

For the record, the proper way to issue HEAD requests with curl is
"curl -I http://localhost:5984/" (that's the uppercase i option).
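
As a rough illustration of what a client should expect (a sketch only;
`headResponseFor` and the response objects are hypothetical, not CouchDB or
MochiWeb code): HTTP/1.1 says the headers of a HEAD response should match
those of the corresponding GET, Content-Length included, while the body is
omitted. "curl -X HEAD" only changes the method name and still waits for a
body of the advertised length, which would explain the "bytes remaining"
error; "curl -I" knows no body is coming.

```javascript
// Derive the response to HEAD from the response a GET would produce:
// same status and headers (Content-Length included), empty body.
function headResponseFor(getResponse) {
  return {
    status: getResponse.status,
    headers: { ...getResponse.headers },
    body: "",
  };
}

const getResponse = {
  status: 200,
  headers: {
    "Content-Type": "text/plain;charset=utf-8",
    "Content-Length": "40",
  },
  body: "x".repeat(40), // placeholder 40-byte body
};

const head = headResponseFor(getResponse);
console.log(head.headers["Content-Length"], head.body.length); // 40 0
```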

> HEAD requests get a Content-Length header
> -
>
> Key: COUCHDB-265
> URL: https://issues.apache.org/jira/browse/COUCHDB-265
> Project: CouchDB
>  Issue Type: Bug
>  Components: HTTP Interface
>Affects Versions: 0.9
> Environment: curl + trunk
>Reporter: Paul Joseph Davis
> Fix For: 0.9
>
>
> Looks like HEAD requests are returning a bogus Content-Length header. If I 
> remember my HTTP spec correctly, HEAD requests are supposed to return no 
> Content-Length or a Content-Length of 0 but I could be wrong on that. Either 
> way, it confuses the crap out of curl:
> $ curl -X HEAD -i http://127.0.0.1:5984/
> HTTP/1.1 200 OK
> Server: CouchDB/0.9.0a (Erlang OTP/R12B)
> Date: Mon, 23 Feb 2009 20:56:55 GMT
> Content-Type: text/plain;charset=utf-8
> Content-Length: 40
> Cache-Control: must-revalidate
> curl: (18) transfer closed with 40 bytes remaining to read
> Also, I just happened to be reading couch_http.erl the other day and I 
> remember seeing a note that said mochiweb automatically strips bodies so 
> internally HEAD requests are treated like a GET and mochiweb I guess just 
> doesn't send a body. That's probably important.


Re: Fail on a simple case on replication



On 24 Feb 2009, at 12:54, Antony Blakey wrote:



On 24/02/2009, at 10:11 PM, Robert Dionne wrote:

I read this thesis ages ago, and technically you are correct, if
somewhat pedantic. I think CouchDB captures the gist of being
RESTful and certainly from a marketing perspective it's timely.


That's why I say it's a marketing issue. Surely we shouldn't copy  
Microsoft's marketing tactics and deliberately misuse a term for  
marketing reasons. The site should say 'HTTP API'.


When I mention to potential customers that CouchDB databases are
accessed with URIs they say "oh it uses this new REST stuff,  
cool".  Often we have little choice over how the world takes an  
idea and runs with it.


But we don't have to be complicit. And remember this isn't about the  
world taking an *idea*. It's about people wanting a cool label to  
stick on their project, even if the label doesn't fit.


What term would you suggest for a service that fulfills Fielding's  
definition? Certainly the benefits of being 'RESTful' according to  
his definition don't flow on to CouchDB, because it's NOT actually  
RESTful.


Yes, you were right, let's not fire up this argument.

Sorry for the noise.

Cheers
Jan
--

Re: Fail on a simple case on replication

2009-02-24 Thread Robert Dionne



Robert Dionne
Chief Programmer
dio...@dionne-associates.com
203.231.9961



On Feb 24, 2009, at 6:54 AM, Antony Blakey wrote:



On 24/02/2009, at 10:11 PM, Robert Dionne wrote:

I read this thesis ages ago, and technically you are correct, if
somewhat pedantic. I think CouchDB captures the gist of being
RESTful and certainly from a marketing perspective it's timely.


That's why I say it's a marketing issue. Surely we shouldn't copy  
Microsoft's marketing tactics and deliberately misuse a term for  
marketing reasons. The site should say 'HTTP API'.


It's an exaggeration to suggest that CouchDB's use of the term REST
is akin to Microsoft's marketing tactics. Nor is it a matter of being
complicit. Your argument that it is not RESTful is similar to saying
someone is not a good Catholic because they eat meat on Fridays and
subscribe to other reforms of recent Vatican councils. REST is an
interesting idea but let's face it, with all due respect to Roy
Fielding, it's merely a statement that this is how the web works and
what makes it work well. It generated excitement I think largely as a
contrast to the ugliness of SOAP. I'm happy it produced a readable
thesis.


In fact the fuzziness of the idea explains why there are so many  
arguments about what's RESTful or not.




When I mention to potential customers that CouchDB databases are
accessed with URIs they say "oh it uses this new REST stuff,
cool". Often we have little choice over how the world takes an
idea and runs with it.


But we don't have to be complicit. And remember this isn't about  
the world taking an *idea*. It's about people wanting a cool label  
to stick on their project, even if the label doesn't fit.


What term would you suggest for a service that fulfills Fielding's  
definition? Certainly the benefits of being 'RESTful' according to  
his definition don't flow on to CouchDB, because it's NOT actually  
RESTful.


Antony Blakey
-
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

A Man may make a Remark –
In itself – a quiet thing
That may furnish the Fuse unto a Spark
In dormant nature – lain –

Let us divide – with skill –
Let us discourse – with care –
Powder exists in Charcoal –
Before it exists in Fire –

  -– Emily Dickinson 913 (1865)

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)



On 24 Feb 2009, at 12:44, Antony Blakey wrote:



On 23/02/2009, at 3:17 AM, Jan Lehnardt wrote:


Collecting:

On 23 Jan 2009, at 23:42, Noah Slater wrote:

* Accept the patch (or a modified version) and add newline chars


+1: 7 (2 binding)
-1: 3 (2 binding)

* Reject the patch (and any modified version) and do not add  
newlines chars


+1: 3 (2 binding)
-1: 4



* Further discussion, to be decided before we release 0.9


+1: 1 (1 binding)
-1: 1 (1 binding)


* Further discussion, to be decided after we release 0.9


+1:
-1: 2 (2 binding)

--

It looks like we have a draw with weigh-in from the community
on a +1 to accept the patch.

We need more discussion here.


I'm a bit confused about this. Excuse me while I tread carefully. It  
seems that the community vote is clearly a majority to accept the  
patch. If the end result of this vote is that we don't follow that  
vote because it's only the PMC vote that counts, doesn't that mean  
that community votes are irrelevant?


No, you're absolutely right on the "Accept the patch" branch. But
there are enough community -1s to keep this open. A single
community -1 should be addressed in an ASF vote.


Cheers
Jan
--

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)

On Tue, Feb 24, 2009 at 10:14:04PM +1030, Antony Blakey wrote:
> I'm a bit confused about this. Excuse me while I tread carefully. It
> seems that the community vote is clearly a majority to accept the patch.
> If the end result of this vote is that we don't follow that vote because
> it's only the PMC vote that counts, doesn't that mean that community
> votes are irrelevant?

Antony, I am upset that you seem to be wanting to cause trouble again.

If you look back at my original email, you will find:

  * The community and PMC have decided to open this issue back up for
discussion, with the proviso that we complete our final decision before
releasing 0.9 -- which means another vote in a week or so. Heh.

  * The community was strongly in favour of accepting the patch, but the PMC was
almost completely split down the middle, with a slight preference for not
accepting the patch.

Clearly the community votes are not irrelevant, otherwise I would not have
listed them or taken them into account. Apache is a meritocracy, meaning that
some votes carry more weight than others. When the PMC votes so heavily for
further discussion in light of strong community support, I hardly see that
you could suggest any other outcome than to open up the discussion.

Moreover, if you had actually bothered to read the rules for Apache voting on
code modification you would know that Chris Anderson and Jan Lehnardt had vetoed
the patch, meaning that it cannot possibly be accepted.

  http://www.apache.org/foundation/voting.html

Best.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)



On Feb 24, 2009, at 6:44 AM, Antony Blakey wrote:



On 23/02/2009, at 3:17 AM, Jan Lehnardt wrote:


Collecting:

On 23 Jan 2009, at 23:42, Noah Slater wrote:

* Accept the patch (or a modified version) and add newline chars


+1: 7 (2 binding)
-1: 3 (2 binding)

* Reject the patch (and any modified version) and do not add  
newlines chars


+1: 3 (2 binding)
-1: 4



* Further discussion, to be decided before we release 0.9


+1: 1 (1 binding)
-1: 1 (1 binding)


* Further discussion, to be decided after we release 0.9


+1:
-1: 2 (2 binding)

--

It looks like we have a draw with weigh-in from the community
on a +1 to accept the patch.

We need more discussion here.


I'm a bit confused about this. Excuse me while I tread carefully. It  
seems that the community vote is clearly a majority to accept the  
patch. If the end result of this vote is that we don't follow that  
vote because it's only the PMC vote that counts, doesn't that mean  
that community votes are irrelevant?


It means "We need more discussion here". Getting consensus is  
important, especially when the main development contributors have  
disagreements.


-Damien

Re: Fail on a simple case on replication



On 24/02/2009, at 10:11 PM, Robert Dionne wrote:

I read this thesis ages ago, and technically you are correct, if
somewhat pedantic. I think CouchDB captures the gist of being RESTful
and certainly from a marketing perspective it's timely.


That's why I say it's a marketing issue. Surely we shouldn't copy  
Microsoft's marketing tactics and deliberately misuse a term for  
marketing reasons. The site should say 'HTTP API'.


When I mention to potential customers that CouchDB databases are
accessed with URIs they say "oh it uses this new REST stuff, cool".
Often we have little choice over how the world takes an idea and
runs with it.


But we don't have to be complicit. And remember this isn't about the  
world taking an *idea*. It's about people wanting a cool label to  
stick on their project, even if the label doesn't fit.


What term would you suggest for a service that fulfills Fielding's  
definition? Certainly the benefits of being 'RESTful' according to his  
definition don't flow on to CouchDB, because it's NOT actually RESTful.


Antony Blakey
-
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

A Man may make a Remark –
In itself – a quiet thing
That may furnish the Fuse unto a Spark
In dormant nature – lain –

Let us divide – with skill –
Let us discourse – with care –
Powder exists in Charcoal –
Before it exists in Fire –

  -– Emily Dickinson 913 (1865)

Re: Fail on a simple case on replication

2009-02-24 Thread Robert Dionne



Robert Dionne
Chief Programmer
dio...@dionne-associates.com
203.231.9961



On Feb 24, 2009, at 5:52 AM, Jan Lehnardt wrote:


Hi Patrick,

On 24 Feb 2009, at 09:06, Patrick Antivackis wrote:

Oh and by the way, in a use case where there is only one database and
you don't use compaction because you want to keep everything, well,
_rev is a revision that can be used to see the history of the document.


You still shouldn't and that's what's in the documentation :) Just
because tying a skateboard to a car and driving on the highway would
make one hell of a fun ride, you are not advised to do so. :)



I really don't see the point of renaming an attribute to make it
harder to understand its role.


The suggestion here is to rename to make it _easier_ to understand
because the connotations "revision" comes with are not entirely
valid for CouchDB.


I agree that this is important to fix. It is too easy to assume  
CouchDB supports revision history. A lot of folks made this mistake,  
myself included. It's really internal state needed for concurrency  
control, yet it's exposed to users and required to be maintained in  
the document. So it needs to be called something that reflects this  
internal use, like "_int_bit" or "_token" or "_cc_uid"
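
To make the "internal state needed for concurrency control" point concrete, here is a minimal sketch of an optimistic-concurrency check. Everything in it is hypothetical (the `TinyStore` class, the `_token` field name, the `Conflict` exception); it is not CouchDB's implementation, only an illustration of a token that guards writes without implying any revision history.

```python
import uuid


class Conflict(Exception):
    """Raised when a write carries a stale concurrency token."""


class TinyStore:
    """Hypothetical in-memory store; only the latest version is kept."""

    def __init__(self):
        self._docs = {}  # doc id -> (token, body)

    def get(self, doc_id):
        token, body = self._docs[doc_id]
        return {"_id": doc_id, "_token": token, **body}

    def put(self, doc_id, body, token=None):
        current = self._docs.get(doc_id)
        current_token = current[0] if current else None
        # The write succeeds only if the caller saw the latest version.
        if token != current_token:
            raise Conflict(f"stale token for {doc_id!r}")
        new_token = uuid.uuid4().hex
        self._docs[doc_id] = (new_token, dict(body))  # old version is gone
        return new_token


store = TinyStore()
first = store.put("doc1", {"title": "draft"})
second = store.put("doc1", {"title": "final"}, token=first)

try:
    store.put("doc1", {"title": "late edit"}, token=first)  # stale token
    stale_write_rejected = False
except Conflict:
    stale_write_rejected = True
```

In this sketch the token tells a writer whether it is up to date and nothing more; the store never promises that an older version can be fetched again.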







It's like all politically correct terminology where you use a stupid
expression in order to be as neutral as possible.


You have a point here, it is about avoiding conflict. But I don't think
we're looking for a neutral term here, but one with a better name.
I'd go with _access_token if it weren't too long. _rev is nice and short
and _token might as well be _wibble. API design is hard.


Cheers
Jan
--




IMO if you change this attribute name it's even better to remove all
possibilities to access a previous rev if still there, and replace its
value with a timestamp


Regards

2009/2/24 Antony Blakey 



On 24/02/2009, at 12:51 PM, Antony Blakey wrote:

The project founder and the PMC, are all committed to that  
replication

model, which is derived from Notes.



BTW I'm the only one in the community that has expressed any strong
desire to change this - I'm not implying any community division, just
pointing out that it's both an historical artifact, and accepted by the
major contributors and committers.

Antony Blakey
--
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Plurality is not to be assumed without necessity
-- William of Ockham (ca. 1285-1349)

Re: Fail on a simple case on replication



On Feb 24, 2009, at 6:26 AM, Antony Blakey wrote:



On 24/02/2009, at 9:29 PM, Jan Lehnardt wrote:


CouchDB documents are limited to JSON (application/json) as the
content, that doesn't make the API less RESTful. If that's not the
right answer, I don't understand what you mean.


application/json doesn't define the semantics of the payload e.g.  
how to interact with the resource. To do that it would have to be  
application/json+couchdoc et al.



and it uses externally defined URL structures to effect operations.


Can you elaborate on that?


To be RESTful, the means of constructing URLs needs to be defined by  
the media type specification. For example, having ?rev= is a rule  
that is external to both the media type and the document.


A RESTful API would have a single entry point, with every other URL  
and service constructed/discovered by processing the content,  
applying the rules of the media type to the content to construct new  
URLS, just like HTML. The HTML web doesn't have a manual describing  
how to effect operations by constructing certain URLs beyond the  
interpretation of the content.


With Chris Anderson's "show" document and "list" view work, we have
the beginnings of that.


-Damien

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)



On 23/02/2009, at 3:17 AM, Jan Lehnardt wrote:


Collecting:

On 23 Jan 2009, at 23:42, Noah Slater wrote:

* Accept the patch (or a modified version) and add newline chars


+1: 7 (2 binding)
-1: 3 (2 binding)

* Reject the patch (and any modified version) and do not add  
newlines chars


+1: 3 (2 binding)
-1: 4



* Further discussion, to be decided before we release 0.9


+1: 1 (1 binding)
-1: 1 (1 binding)


 * Further discussion, to be decided after we release 0.9


+1:
-1: 2 (2 binding)

--

It looks like we have a draw with weigh-in from the community
on a +1 to accept the patch.

We need more discussion here.


I'm a bit confused about this. Excuse me while I tread carefully. It  
seems that the community vote is clearly a majority to accept the  
patch. If the end result of this vote is that we don't follow that  
vote because it's only the PMC vote that counts, doesn't that mean  
that community votes are irrelevant?


Antony Blakey
--
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

A priest, a minister and a rabbi walk into a bar. The bartender says  
"What is this, a joke?"

Re: Fail on a simple case on replication

2009-02-24 Thread Robert Dionne



Robert Dionne
Chief Programmer
dio...@dionne-associates.com
203.231.9961



On Feb 24, 2009, at 6:26 AM, Antony Blakey wrote:



On 24/02/2009, at 9:29 PM, Jan Lehnardt wrote:


CouchDB documents are limited to JSON (application/json) as the
content, that doesn't make the API less RESTful. If that's not the
right answer, I don't understand what you mean.


application/json doesn't define the semantics of the payload e.g.  
how to interact with the resource. To do that it would have to be  
application/json+couchdoc et al.



and it uses externally defined URL structures to effect operations.


Can you elaborate on that?


To be RESTful, the means of constructing URLs needs to be defined  
by the media type specification. For example, having ?rev= is a  
rule that is external to both the media type and the document.


A RESTful API would have a single entry point, with every other URL  
and service constructed/discovered by processing the content,  
applying the rules of the media type to the content to construct  
new URLS, just like HTML. The HTML web doesn't have a manual  
describing how to effect operations by constructing certain URLs  
beyond the interpretation of the content.



Why then have folks like Sam Ruby* or Tim Bray not objected yet?
Not trying to pick a fight here, I'm just wondering if you are
interpreting "the spec" a little too strictly?


The term is defined by Roy Fielding's thesis, and he has objected
to the misuse of the term:
http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven
And the next post is also good:
http://roy.gbiv.com/untangled/2008/specialization


Antony,

  I read this thesis ages ago, and technically you are correct, if
somewhat pedantic. I think CouchDB captures the gist of being RESTful
and certainly from a marketing perspective it's timely. When I
mention to potential customers that CouchDB databases are accessed
with URIs they say "oh it uses this new REST stuff, cool". Often we
have little choice over how the world takes an idea and runs with it.


Regards,

Bob





My argument in this context is pointless. I know it's not going  
to change.


How about not trying to subtly create a "them-and-us" situation? It
seems strange given that you clarified a statement about "the PMC"
earlier in this thread to avoid misinterpretation (thanks). Also, you
never brought this up, so how do you know it is not going to change?


I have brought this up before on couchdb-u...@incubator.apache.org  
- 15 November 2008, Subject: RESTful? (was: Re: Document Updates).  
Apache archives don't cover that time on that list.


Hence, my comment, - let's not fire up this argument. I meant that  
I wasn't going to waste m/l bandwidth rehashing an argument that  
has already been done and dusted in this context.


Antony Blakey
--
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

The project was so plagued by politics and ego that when the  
engineers requested technical oversight, our manager hired a  
psychologist instead.

 -- Ron Avitzur

Re: Fail on a simple case on replication

On 24/02/2009, at 9:29 PM, Jan Lehnardt wrote:

CouchDB documents are limited to JSON (application/json) as the
content, that doesn't make the API less RESTful. If that's not the
right answer, I don't understand what you mean.

application/json doesn't define the semantics of the payload e.g. how
to interact with the resource. To do that it would have to be
application/json+couchdoc et al.

and it uses externally defined URL structures to effect operations.

Can you elaborate on that?

To be RESTful, the means of constructing URLs needs to be defined by
the media type specification. For example, having ?rev= is a rule that
is external to both the media type and the document.

A RESTful API would have a single entry point, with every other URL
and service constructed/discovered by processing the content, applying
the rules of the media type to the content to construct new URLS, just
like HTML. The HTML web doesn't have a manual describing how to effect
operations by constructing certain URLs beyond the interpretation of
the content.
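
As a rough illustration of the "single entry point" idea described above, here is a small sketch of a client that only ever follows links found in the content. The resource layout, the `links` field, and the relation names are all hypothetical; no existing CouchDB media type is implied.

```python
# Hypothetical resources keyed by URL; "links" maps a relation name to a URL.
RESOURCES = {
    "/": {"links": {"databases": "/dbs"}},
    "/dbs": {"links": {"blog": "/dbs/blog"}},
    "/dbs/blog": {"links": {"latest": "/dbs/blog/post-2"}},
    "/dbs/blog/post-2": {"title": "Hello", "links": {}},
}


def fetch(url):
    """Stand-in for an HTTP GET against the hypothetical service."""
    return RESOURCES[url]


def follow(entry_point, relations):
    """Start at the entry point and follow named links.

    The client never assembles a URL itself; every URL it requests
    was handed to it inside a previously fetched representation.
    """
    resource = fetch(entry_point)
    for rel in relations:
        resource = fetch(resource["links"][rel])
    return resource


post = follow("/", ["databases", "blog", "latest"])
```

The only URL the client knows in advance is "/". By contrast, a rule such as appending ?rev= to a document URL lives in a manual rather than in the content, which is the distinction being drawn here.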

Why then have folks like Sam Ruby* or Tim Bray not objected yet?
Not trying to pick a fight here, I'm just wondering if you are
interpreting "the spec" a little too strictly?

The term is defined by Roy Fielding's thesis, and he has objected to
the misuse of the term:
http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven
And the next post is also good:
http://roy.gbiv.com/untangled/2008/specialization

My argument in this context is pointless. I know it's not going to
change.

How about not trying to subtly create a "them-and-us" situation? It
seems strange given that you clarified a statement about "the PMC"
earlier in this thread to avoid misinterpretation (thanks). Also, you
never brought this up, so how do you know it is not going to change?

I have brought this up before on couchdb-u...@incubator.apache.org -
15 November 2008, Subject: RESTful? (was: Re: Document Updates).
Apache archives don't cover that time on that list.

Hence, my comment, - let's not fire up this argument. I meant that I
wasn't going to waste m/l bandwidth rehashing an argument that has
already been done and dusted in this context.

Antony Blakey
--
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

The project was so plagued by politics and ego that when the engineers
requested technical oversight, our manager hired a psychologist instead.

-- Ron Avitzur

Re: Fail on a simple case on replication



On 24 Feb 2009, at 11:44, Antony Blakey wrote:



On 24/02/2009, at 9:02 PM, Jan Lehnardt wrote:


Hi Antony,

On 24 Feb 2009, at 00:34, Antony Blakey wrote:


OTOH, one should use the correct term and not redefine existing  
terms to suit one's own purpose. In a tangentially related way,  
the use of the term RESTful wrt CouchDB is a marketing abomination.




I've heard that before. CouchDB's core document API is as
RESTful as it gets. But not all of CouchDB's API is RESTful
and it wouldn't even make sense. I don't see any abomination
going on here. Thanks.


Couch's core document API is not RESTful. It doesn't use a specific  
media type to define the interpretation of the content,


CouchDB documents are limited to JSON (application/json) as the
content, that doesn't make the API less RESTful. If that's not the
right answer, I don't understand what you mean.



and it uses externally defined URL structures to effect operations.


Can you elaborate on that?



That's not RESTful, and I don't think CouchDB should use the term.


Why then have folks like Sam Ruby* or Tim Bray not objected yet?
Not trying to pick a fight here, I'm just wondering if you are
interpreting "the spec" a little too strictly?

*Sam having co-written "RESTful Web Services" for O'Reilly and being
chiefly responsible for CouchDB's incubation at the ASF.


My argument in this context is pointless. I know it's not going to  
change.


How about not trying to subtly create a "them-and-us" situation? It seems
strange given that you clarified a statement about "the PMC" earlier in
this thread to avoid misinterpretation (thanks). Also, you never brought
this up, so how do you know it is not going to change?

Cheers
Jan
--

Re: Fail on a simple case on replication


Hi Patrick,

On 24 Feb 2009, at 09:06, Patrick Antivackis wrote:

Oh and by the way, in a use case where there is only one database and
you don't use compaction because you want to keep everything, well,
_rev is a revision that can be used to see the history of the document.


You still shouldn't and that's what's in the documentation :) Just
because tying a skateboard to a car and driving on the highway would
make one hell of a fun ride, you are not advised to do so. :)



I really don't see the point of renaming an attribute to make it
harder to understand its role.


The suggestion here is to rename to make it _easier_ to understand
because the connotations "revision" comes with are not entirely
valid for CouchDB.



It's like all politically correct terminology where you use a stupid
expression in order to be as neutral as possible.


You have a point here, it is about avoiding conflict. But I don't think
we're looking for a neutral term here, but one with a better name.
I'd go with _access_token if it weren't too long. _rev is nice and short
and _token might as well be _wibble. API design is hard.


Cheers
Jan
--




IMO if you change this attribute name it's even better to remove all
possibilities to access a previous rev if still there, and replace its
value with a timestamp


Regards

2009/2/24 Antony Blakey 



On 24/02/2009, at 12:51 PM, Antony Blakey wrote:

The project founder and the PMC, are all committed to that  
replication

model, which is derived from Notes.



BTW I'm the only one in the community that has expressed any strong
desire to change this - I'm not implying any community division, just
pointing out that it's both an historical artifact, and accepted by the
major contributors and committers.

Antony Blakey
--
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Plurality is not to be assumed without necessity
-- William of Ockham (ca. 1285-1349)

Re: Fail on a simple case on replication



On 24 Feb 2009, at 04:09, Jeff Hinrichs - DM&T wrote:

On Mon, Feb 23, 2009 at 8:43 PM, Chris Anderson   
wrote:
On Mon, Feb 23, 2009 at 6:30 PM, Damien Katz   
wrote:



Maybe we should change that use from ?rev... to ?conflict=


If we follow your _cc idea, we could change from ?rev= to ?cc=



I think if we change from _rev to something else, _cc for  
concurrency

control is good. I'm not sure this is necessary.


yes, if we make the change _cc is the best so far. I can already
imagine office workers thinking it stands for "conflict catcher".



Maybe we should only allow getting old revisions (?disk_rev=...) with
a setting in the ini, defaulting it off. That discourages its use as
a general purpose mechanism, but is easy to turn on if you really need it.



Not a bad idea. The idea that you can't depend on it being available
would discourage apps from attempting to use _cc as an easy way to
provide undo functionality for users. Undo is a good feature, but undo
that sometimes randomly has been compacted away is worse than no undo.

I would point out that compaction is not a random event.  It is
controlled by the admin, correct?  To my knowledge, couch does not
spontaneously compact nor even currently support the idea of automated
compaction.


You're right, but that's one use-case. I'm used to thinking in the
mindset of a shared hosting provider where users do all sorts of crazy
things and the admins need to be able to control sensible operation.
I'm not saying this is the only other use-case but I'm seeing the
general case of CouchDB users not being admins and thus not controlling
compaction. Hence, old revisions could go away at any time and undo
should not rely on the revision system to provide this functionality.
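
One way an application could keep undo independent of the revision system is to archive each superseded version as an ordinary document of its own. The sketch below is only an assumption about how that might look; the `HistoryStore` class, the document ids, and the `type`/`of`/`seq` fields are all invented for illustration and are not a CouchDB feature.

```python
import itertools


class HistoryStore:
    """Hypothetical store where old versions are regular documents."""

    def __init__(self):
        self._docs = {}
        self._counter = itertools.count(1)

    def save(self, doc_id, body):
        previous = self._docs.get(doc_id)
        if previous is not None:
            # Copy the outgoing version into its own history document,
            # so it does not depend on any internal revision storage.
            seq = next(self._counter)
            self._docs[f"{doc_id}-history-{seq}"] = {
                "type": "history",
                "of": doc_id,
                "seq": seq,
                "body": previous["body"],
            }
        self._docs[doc_id] = {"type": "current", "body": dict(body)}

    def current(self, doc_id):
        return self._docs[doc_id]["body"]

    def history(self, doc_id):
        entries = [
            doc for doc in self._docs.values()
            if doc["type"] == "history" and doc["of"] == doc_id
        ]
        entries.sort(key=lambda doc: doc["seq"])
        return [doc["body"] for doc in entries]


store = HistoryStore()
store.save("note", {"text": "v1"})
store.save("note", {"text": "v2"})
store.save("note", {"text": "v3"})
older_versions = store.history("note")
```

Because the archived versions are plain documents in this sketch, they would be kept or deleted by the application itself rather than disappearing whenever an admin compacts the database.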



Also, earlier in the thread, Dean L, suggested allowing unlimited rev
history.  I think that his idea has merit in light of a talked about
patch that would limit revs history to length N.  If the ability to
control the size(N) of rev history is in the cards, why not allow N to
be infinity?  Before you just dismiss the idea, I would state that I
could see usefulness for this in special cases and remind you of the
old saw, "Accountants don't use erasers." And in the new age of
security and compliance, Auditors don't like erasers.


We're not dismissing the usefulness. By no means, but the revision
system is the wrong place to put this from CouchDB's design standpoint.

A little history.

I've been bugging Damien to implement this ever since I started playing
with CouchDB in October '06*. Our final discussion (that was about a
year ago) concluded that I should come up with an RFC that covers all
angles of distributed editing and replication and makes this easy to
implement.

He said he couldn't come up with an easy way yet and there are
edge-cases lurking that are really hard. Based on his experience with
distributed databases and my lack of it, I stopped the bugging. When I
ever have the time to think this all through I might propose an RFC
covering all angles and maybe a patch, but until then I keep quiet and
work on providing alternatives for users like the chapter in the Couch
Book that Chris hinted at.

Not saying it can't be done, but it is harder than it may sound, however
useful it is.

Lastly, there are so many other areas where the current CouchDB needs
improvement and where the vision and implementation details are much
clearer. Let's tackle them first.


Cheers
Jan
--
* I'm not trying to be intimidating or showing off my cool, I'm just
pointing out that Damien has been saying "no" (well, "not yet") to this
feature for quite some time and for good reasons.

Re: Fail on a simple case on replication



On 24/02/2009, at 9:02 PM, Jan Lehnardt wrote:


Hi Antony,

On 24 Feb 2009, at 00:34, Antony Blakey wrote:


OTOH, one should use the correct term and not redefine existing  
terms to suit one's own purpose. In a tangentially related way, the  
use of the term RESTful wrt CouchDB is a marketing abomination.




I've heard that before. CouchDB's core document API is as
RESTful as it gets. But not all of CouchDB's API is RESTful
and it wouldn't even make sense. I don't see any abomination
going on here. Thanks.


Couch's core document API is not RESTful. It doesn't use a specific  
media type to define the interpretation of the content, and it uses  
externally defined URL structures to effect operations. That's not  
RESTful, and I don't think CouchDB should use the term.


My argument in this context is pointless. I know it's not going to  
change.


Antony Blakey
--
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

The project was so plagued by politics and ego that when the engineers  
requested technical oversight, our manager hired a psychologist instead.

  -- Ron Avitzur

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)



On 24 Feb 2009, at 04:08, Chris Anderson wrote:

On Mon, Feb 23, 2009 at 1:31 AM, Christopher Lenz   
wrote:




Providing a reason for your -1 on accepting the patch would be a  
good start

;)

Personally, I don't think this whole thing is very important, but I  
don't

see any harm in adding the trailing newline right now.



I'm basically of the same mind. It's not very important to me either
way. I don't use command line clients much (oh and my bash-prompt has
a newline in it anyway...) so I lean slightly toward the "be like
Google and others" side of the fence.

If I understand correctly, Damien's objections have to do with the
interaction between a trailing newline and a potential future
canonical JSON format. He may have a point but we'd likely have to
change other things in the future if we wanted to use that canonical
format anyway. (Or maybe I'm mischaracterizing his old argument...)


Quick straw-poll whether this should be removed from the 0.9 (0.10)
blocking list?

+1

Cheers
Jan
--

Re: Fail on a simple case on replication

Hi Antony,

On 24 Feb 2009, at 00:34, Antony Blakey wrote:

OTOH, one should use the correct term and not redefine existing
terms to suit one's own purpose. In a tangentially related way, the
use of the term RESTful wrt CouchDB is a marketing abomination.

I've heard that before. CouchDB's core document API is as
RESTful as it gets. But not all of CouchDB's API is RESTful
and it wouldn't even make sense. I don't see any abomination
going on here. Thanks.

Your point behind the flame, not redefining existing terms:
The existing notion of a revision is that it is something you
can go back to. This is not what CouchDB revisions are, so
we are, right now, repurposing existing terminology.

I'm not saying revision is wrong, because it isn't. It's just not
a good choice for the API from a learning perspective. I understand
that an API has more perspectives than learning it, so
we need to find out where to make the trade-off.

We're violating rules 1, 5, 6, 13, 14, and 15, probably more of
Rusty's rules of hard to use interfaces,
http://www.pointy-stick.com/blog/2008/01/09/api-design-rusty-levels/

The documentation about replication, the role of revisions, the lack
of inter-document consistency guarantees (including, crucially to
the operation model, the lack of Monotonic Write guarantees), really
needs to be expanded.

The consequences of CouchDB's underlying model aren't immediately
obvious, and should be spelled out, as I started to do here: http://mail-archives.apache.org/mod_mbox/couchdb-dev/200902.mbox/%3c0fddc57c-db78-4241-86de-549fecc8b...@gmail.com%3e
- which was obviously in the context of changing that mechanism,
but still the explanation and references are useful.

The wiki is open for all and everybody here welcomes useful additions.

Cheers
Jan
--

Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)