Re: Transactional _bulk_docs
I'm not keen on prolonging this agony, but I am going to respond to these points.

On 06/02/2009, at 3:43 PM, Paul Davis wrote:

> A brief history:
>
> 1. The mythical IRC conversation on 'removing' the feature: (roughly quoted)

It wasn't mythical - Damien has stated that is what happened. Why do you use the word 'mythical'?

> Damien: I don't think we can support transactional commits in the face of multiple nodes. We can do ACID writes to disk so that updates aren't lost, but checking with an unbounded number of servers that a commit doesn't conflict isn't feasible.

Not unbounded. Look at Scalaris. And in any case, what exactly is this multi-node mode? Why compromise the API for something that is so ephemeral that conflict management isn't feasible? What IS feasible? View consistency? MVCC semantics. If I write a document and then read it, do I maybe see an earlier document? What about in the view? Because if views are going to have a consistency guarantee wrt. writes, then that looks to me like distributed MVCC. Is this not 2PC? And if views aren't consistent, then why bother? Why not have a client layer that does content-directed load balancing?

Regardless, this discussion is also about whether supporting a single-node style of operation is useful, because CouchDB had an ACID _bulk_docs. IMO it's also about the danger/cost of abstracting over operational models with considerably different operational characteristics - c.f. transparent distribution in object models.

> Everyone else: That's pretty reasonable.

Everyone == ? Damien told me explicitly that this change was decided, and that decision was made on IRC. What's with this revisionist history Paul?

> 2. A patch was applied to trunk that made commits to CouchDB optionally ACID compliant (which gives users the traditional speed/safety choice) as well as removing the atomic 'all or none' semantics.

If it's not all or none then it's not ACID with respect to the user data model. Conflicts are a metadata feature.

> Near as I can tell Damien has been nose to the grindstone for quite some time on this very specific part of the api. Would I like more status updates and ideas on where he's heading? Of course. Do I trust him? Yes. Is the community as a whole going to blindly accept some asinine patch that has no value that removes a crap load of functionality? No.

Is the PMC going to accept some massive patch that has significant benefit, but as a side effect removes a key feature, for no good technical reason? That's what is happening. Damien's patches are neither asinine, nor of no value.

On IRC Chris Anderson noted in response to a question that Damien has a heap of changes coming, but that we (the community) have to wait and see what they are.

> I tend to think that the 'discussion' that everyone keeps referring to hasn't even occurred yet. I look at the patch that was applied that caused this as an unfortunate early release.

And if commits don't represent some sort of decision, what are they? I saw the patch, thought WTF?, asked about it, and was eventually told that yes, a decision had been made that the ACID API was being removed.

> Admissions first: I have no money riding on this issue. Whether or not CouchDB has transactional _bulk_docs worries me not at all. Though, I can't say that I have that much sympathy for a business model that relies on an open source project's trunk to remain compatible with required assumptions.

Having an ACID guarantee explicitly stated, and then removed with no replacement, is not a 'required assumption' on my part. ACID is a big deal. And in any case, my 'business model' response is to fork CouchDB, which is the appropriate response. But still, do you want people to use this project or not? Promote it or not? What message does that send?

> Reductio ad absurdum:

That's about right.

Antony Blakey
--
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Don't anthropomorphize computers. They hate that.
Re: Transactional _bulk_docs
On 06/02/2009, at 12:04 AM, Damien Katz wrote: He mailed us privately. Now he's mailed us publicly. BTW: Noah took me to task for emailing you privately, so I forwarded the email to the list. I was trying to get a resolution without fanning the flames. Antony Blakey - CTO, Linkuistics Pty Ltd Ph: 0438 840 787 Every task involves constraint, Solve the thing without complaint; There are magic links and chains Forged to loose our rigid brains. Structures, structures, though they bind, Strangely liberate the mind. -- James Fallen
Re: Transactional _bulk_docs
On Thu, Feb 5, 2009 at 10:02 PM, Antony Blakey wrote: > > On 06/02/2009, at 6:20 AM, Chris Anderson wrote: > >> Antony, maybe it would help for you to explain just exactly what you >> wouldn't be able to do, without the bulk docs API. It will help to >> inform people about the technical issue. > > > My original email included this: > > --- > > For example, I have documents that can be cloned. The cloned document > contains a reference to the originating document. Then I delete the original > document, the clone history needs to be updated to remove the reference to > the original document and replace it with an original-deleted history item. > There is a business case that requires this consistency. > > With a transactional API this is easy. Without it, I can't see a way to > maintain consistency in the face of concurrent application access and/or > failure. > > --- > > However, I don't think this is really about a specific example. > > The problem is that if you get one side of the relationship written and > visible, but the other side not, then other concurrent accessors will see a > partially successful update. > > One response is "but you'll see this problem during replication", but I > think this is making a big assumption about how replication is > managed/interleaved with local application behaviour. > > Replication, and dealing with conflicts, is in no way automatic. As others > have stated, there is no domain-independent way of resolving conflicts. > Surely if it were possible to build a transactional API on top of a > conflict-based system, then this statement would not be true? > > I am deploying CouchDB like a Notes CLIENT. Not as a high-performance > database server. Replication is an explicit operation, that halts normal > activity. 
For my first delivery, replicas are read-only, so replication > conflict isn't possible, but when I move to a distributed writers scenario, > resolving replication conflicts will involve a specialized UI, that allows > all conflicts to be resolved before normal operation resumes. Thus the > editing application always sees a conflict-free database. > > The use-case of someone doing a local operation e.g. submitting a web form, > is very different than resolving replication conflicts. Conflict during a > local operation is a matter of application concurrency, whereas conflict > during replication is driven by the overall system model. It has different > temporal, administrative and UI boundaries. > > In short, I think it is a mistake to try and hide the different > characteristics of local (even clustered) operations, and replication. You > may disagree, but if the system distinguishes between these two > fundamentally different things (distinguished by their partition-tolerance), > you can code as though every operation leads to conflict if you wish, but I > can't take advantage of the difference. > >> I know that the long-standing vision of Couch doesn't include special >> API exceptions for when you are running on a single node. And I'm a >> little afraid that the transactional doc commits Antony wants us to >> keep, are only a mirage, which would lead to trouble anyway, when >> distributed systems are involved. > > I don't understand why this needs to be the case. You can do transactions in > distributed systems. Do you have a model that isn't amenable to a Scalaris > treatment? Especially given that we're only talking about transactions over > a set of processes that are providing an illusion of a single system. Such a > cluster already requires some degree of partion-tolerance, right? And if > not, then what distinguishes a cluster from a partition-tolerant p2p mesh? 
> > Antony Blakey > - > CTO, Linkuistics Pty Ltd > Ph: 0438 840 787 > > The fact that an opinion has been widely held is no evidence whatever that > it is not utterly absurd. > -- Bertrand Russell > > > I'm upset that CouchDB doesn't make me coffee in the morning. But the thing is, CouchDB is totally willing to make you coffee *and* bacon. It loves you *that* much. Enough with the silly. I've watched this drama avalanche for awhile and I finally think it's time for me to put out a few words on what I've seen. A brief history: 1. The mythical IRC conversation on 'removing' the feature: (roughly quoted) Damien: I don't think we can support transactional commits in the face of multiple nodes. We can do ACID writes to disk so that updates aren't lost, but checking with an unbounded number of servers that a commit doesn't conflict isn't feasible. Everyone else: That's pretty reasonable. 2. A patch was applied to trunk that made commits to CouchDB optionally ACID compliant (which gives users the traditional speed/safety choice) as well as removing the atomic 'all or none' semantics. 3. Huge ML threads. History complete. Current status (through my eyes): Near as I can tell Dam
[jira] Commented: (COUCHDB-209) Refine httpd_db_handlers API to map the externals from paths directly to scripts
[ https://issues.apache.org/jira/browse/COUCHDB-209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670998#action_12670998 ] Paul Joseph Davis commented on COUCHDB-209: --- I've been thinking about this ticket for a bit. I understand the desire to minimize the number of INI edits etc, but something keeps bothering me about removing the [external] section entirely. Also, is this really 0.9 blocking? I would vote that this isn't an API incompatibility. > Refine httpd_db_handlers API to map the externals from paths directly to > scripts > > > Key: COUCHDB-209 > URL: https://issues.apache.org/jira/browse/COUCHDB-209 > Project: CouchDB > Issue Type: Improvement > Components: HTTP Interface >Affects Versions: 0.9 > Environment: all >Reporter: Jeff Hinrichs >Assignee: Chris Anderson >Priority: Blocker > Fix For: 0.9 > > > We could change the API to map the externals from paths directly to > scripts, like > [httpd_db_handlers] > _mypath = {couch_httpd_external, handle_external_req, "/path/to/my/script"} > which would be fine by me. > The current code is like it is because the original implementation was > designed to have multiple scripts mounted at the /db/_external path. > Do you mind opening a ticket about this? - I'm happy to write the code > but I'm supposed to be working on the book right now, so it'll have to > wait. > link to mail list thread: > http://mail-archives.apache.org/mod_mbox/couchdb-user/200901.mbox/%3c5aaed53f0901120631v112916eewcc50e96c44728...@mail.gmail.com%3e > It would appear to be a good solution to allow the flexibility desired while > narrowing the number of local.ini edits to accomplish. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (COUCHDB-128) couchdb is not starting properly from init.d script in trunk
[ https://issues.apache.org/jira/browse/COUCHDB-128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670996#action_12670996 ] Paul Joseph Davis commented on COUCHDB-128: --- Is there no more information on this ticket? I routinely use /usr/local/etc/init.d/couchdb to start and stop couchdb on Linux. > couchdb is not starting properly from init.d script in trunk > > > Key: COUCHDB-128 > URL: https://issues.apache.org/jira/browse/COUCHDB-128 > Project: CouchDB > Issue Type: Bug > Components: Build System >Reporter: Noah Slater >Assignee: Noah Slater >Priority: Blocker > Fix For: 0.9 > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: [jira] Commented: (COUCHDB-194) [startkey, endkey[: provide a right-open range selection method
On 06/02/2009, at 8:36 AM, Paul Davis wrote: I've been pondering this issue of the weird _design/ doc hack. IMO it's not possible to get this hack right because it doesn't acknowledge the reality of Unicode. I'd either agree with Zach on having separately named keys for open or right on *both* ends, or specific to the string and array types, a startswith parameter. I don't much like the startswith idea though as it's not generally applicable. I posted a description of prefixkey with an interpretation over all JSON values. http://mail-archives.apache.org/mod_mbox/couchdb-dev/200901.mbox/%3c67c42c78-4f52-409a-847b-f545f664d...@gmail.com%3e Antony Blakey -- CTO, Linkuistics Pty Ltd Ph: 0438 840 787 If at first you don’t succeed, try, try again. Then quit. No use being a damn fool about it -- W.C. Fields
Re: Transactional _bulk_docs
Ooops... On 06/02/2009, at 1:32 PM, Antony Blakey wrote: In short, I think it is a mistake to try and hide the different characteristics of local (even clustered) operations, and replication. You may disagree, but if the system distinguishes between these two fundamentally different things (distinguished by their partition-tolerance), you can code as though every operation leads to conflict if you wish, but I can't take advantage of the difference. ... if the system doesn't distinguish between those two cases. Distinguishing between the two cases allows for a wider range of uses and application models. Antony Blakey - CTO, Linkuistics Pty Ltd Ph: 0438 840 787 All that is required for evil to triumph is that good men do nothing.
Re: Transactional _bulk_docs
On 06/02/2009, at 6:20 AM, Chris Anderson wrote:

> Antony, maybe it would help for you to explain just exactly what you wouldn't be able to do, without the bulk docs API. It will help to inform people about the technical issue.

My original email included this:

---

For example, I have documents that can be cloned. The cloned document contains a reference to the originating document. When I delete the original document, the clone history needs to be updated to remove the reference to the original document and replace it with an original-deleted history item. There is a business case that requires this consistency.

With a transactional API this is easy. Without it, I can't see a way to maintain consistency in the face of concurrent application access and/or failure.

---

However, I don't think this is really about a specific example.

The problem is that if you get one side of the relationship written and visible, but the other side not, then other concurrent accessors will see a partially successful update.

One response is "but you'll see this problem during replication", but I think this is making a big assumption about how replication is managed/interleaved with local application behaviour.

Replication, and dealing with conflicts, is in no way automatic. As others have stated, there is no domain-independent way of resolving conflicts. Surely if it were possible to build a transactional API on top of a conflict-based system, then this statement would not be true?

I am deploying CouchDB like a Notes CLIENT. Not as a high-performance database server. Replication is an explicit operation, that halts normal activity. For my first delivery, replicas are read-only, so replication conflict isn't possible, but when I move to a distributed writers scenario, resolving replication conflicts will involve a specialized UI, that allows all conflicts to be resolved before normal operation resumes. Thus the editing application always sees a conflict-free database.

The use-case of someone doing a local operation e.g. submitting a web form, is very different than resolving replication conflicts. Conflict during a local operation is a matter of application concurrency, whereas conflict during replication is driven by the overall system model. It has different temporal, administrative and UI boundaries.

In short, I think it is a mistake to try and hide the different characteristics of local (even clustered) operations, and replication. You may disagree, but if the system distinguishes between these two fundamentally different things (distinguished by their partition-tolerance), you can code as though every operation leads to conflict if you wish, but I can't take advantage of the difference.

> I know that the long-standing vision of Couch doesn't include special API exceptions for when you are running on a single node. And I'm a little afraid that the transactional doc commits Antony wants us to keep, are only a mirage, which would lead to trouble anyway, when distributed systems are involved.

I don't understand why this needs to be the case. You can do transactions in distributed systems. Do you have a model that isn't amenable to a Scalaris treatment? Especially given that we're only talking about transactions over a set of processes that are providing an illusion of a single system. Such a cluster already requires some degree of partition-tolerance, right? And if not, then what distinguishes a cluster from a partition-tolerant p2p mesh?

Antony Blakey
-
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

The fact that an opinion has been widely held is no evidence whatever that it is not utterly absurd.
-- Bertrand Russell
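To make the clone/delete example concrete, here is a minimal sketch of the single request body involved. It relies only on the `_bulk_docs` body being a JSON object with a `docs` array and on `_deleted: true` marking a deletion. The document IDs, revision strings, the `history` field and the `cloned-from` / `original-deleted` item types are hypothetical names chosen for illustration, not taken from Antony's application. Whether the two writes are applied all-or-nothing is exactly what this thread is arguing about, so the sketch only builds the payload and does not assume either behaviour.

```python
import json


def clone_delete_update(original, clone):
    """Build one _bulk_docs body that deletes `original` and rewrites the
    clone's history so it no longer references the deleted document.

    The `history` layout and item types are hypothetical.
    """
    # A deletion is an update that carries _deleted: true.
    deleted_original = {
        "_id": original["_id"],
        "_rev": original["_rev"],
        "_deleted": True,
    }
    # Replace the reference to the original with an original-deleted marker.
    new_history = [
        {"type": "original-deleted"}
        if item.get("type") == "cloned-from" and item.get("ref") == original["_id"]
        else item
        for item in clone["history"]
    ]
    updated_clone = {**clone, "history": new_history}  # input is not mutated
    return {"docs": [deleted_original, updated_clone]}


original = {"_id": "doc-a", "_rev": "1"}
clone = {
    "_id": "doc-b",
    "_rev": "1",
    "history": [{"type": "cloned-from", "ref": "doc-a"}],
}
body = clone_delete_update(original, clone)
print(json.dumps(body, indent=2))
```

If the server applies only one of the two documents, a concurrent reader can see a clone that still points at a deleted original, which is the partially successful update described above.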
Re: [jira] Commented: (COUCHDB-194) [startkey, endkey[: provide a right-open range selection method
I've been pondering this issue of the weird _design/ doc hack. I'd either agree with Zach on having separately named keys for open or right on *both* ends, or specific to the string and array types, a startswith parameter. I don't much like the startswith idea though as it's not generally applicable.

Also, did I miss what you'd pass in the _design doc scenario as end key assuming right open semantics?

On Thu, Feb 5, 2009 at 4:57 PM, Zachary Zolton wrote:
> Maximillian,
>
> I'd think both _could_ be useful.
>
> I mean in Ruby we have both for the right-hand boundary of ranges:
> irb(main):005:0> (1..5).max
> => 5
> irb(main):006:0> (1...5).max
> => 4
>
> IMHO, it would be better to use a different pair of parameter names,
> such that we could easily distinguish between open and closed bounds.
>
> Cheers,
>
> Zach
>
> PS. Is it "Maximillian" or "Max"? :^D
Re: [jira] Commented: (COUCHDB-194) [startkey, endkey[: provide a right-open range selection method
Maximillian,

I'd think both _could_ be useful.

I mean in Ruby we have both for the right-hand boundary of ranges:
irb(main):005:0> (1..5).max
=> 5
irb(main):006:0> (1...5).max
=> 4

IMHO, it would be better to use a different pair of parameter names, such that we could easily distinguish between open and closed bounds.

Cheers,

Zach

PS. Is it "Maximillian" or "Max"? :^D

On Thu, Feb 5, 2009 at 3:32 PM, Maximillian Dornseif (JIRA) wrote:
> [ https://issues.apache.org/jira/browse/COUCHDB-194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670911#action_12670911 ]
>
> Maximillian Dornseif commented on COUCHDB-194:
> --
>
> So far nobody seems against it.
>
> The downside is that it MIGHT break some existing code.
>
>> [startkey, endkey[: provide a right-open range selection method
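The suggestion of separately named parameters for closed and open bounds can be sketched as follows. This is an illustration only: `select_range` and the parameter name `endkey_open` are hypothetical, not CouchDB API, and the keys are plain Python values compared with Python's ordering rather than CouchDB's view collation.

```python
from bisect import bisect_left, bisect_right


def select_range(sorted_keys, startkey, endkey=None, endkey_open=None):
    """Return the keys k with startkey <= k <= endkey (closed bound)
    or startkey <= k < endkey_open (right-open bound).

    Exactly one of `endkey` / `endkey_open` must be given; both names
    are hypothetical and only illustrate the two-parameter idea.
    """
    if (endkey is None) == (endkey_open is None):
        raise ValueError("pass exactly one of endkey or endkey_open")
    lo = bisect_left(sorted_keys, startkey)
    if endkey is not None:
        # Closed bound: keep keys equal to endkey.
        hi = bisect_right(sorted_keys, endkey)
    else:
        # Right-open bound: stop before the first key >= endkey_open.
        hi = bisect_left(sorted_keys, endkey_open)
    return sorted_keys[lo:hi]


keys = [1, 2, 3, 4, 5]
closed = select_range(keys, 1, endkey=5)           # like Ruby's (1..5)
right_open = select_range(keys, 1, endkey_open=5)  # like Ruby's (1...5)
print(closed)      # [1, 2, 3, 4, 5]
print(right_open)  # [1, 2, 3, 4]
```

Keeping the existing name for the closed bound means existing callers see no change in behaviour, which addresses the "MIGHT break some existing code" concern.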
[jira] Created: (COUCHDB-240) Replication breaks with large Attachments.
Replication breaks with large Attachments. -- Key: COUCHDB-240 URL: https://issues.apache.org/jira/browse/COUCHDB-240 Project: CouchDB Issue Type: Bug Components: Database Core Affects Versions: 0.9 Environment: r 741265. Debian Linux unknown revision, FreeBSD 7.0. GBit Network connection between the hosts. Reporter: Maximillian Dornseif I use the code in http://code.google.com/p/couchdb-python/issues/detail?id=54 to do replication between two machines. I'm running 741265 on both machines. I have a Database with big attachments (high-res images, 31.1 GB, 34026 Docs). "Pull" replication breaks with following message sent via http: couchdb.client.ServerError: (500, ('function_clause', "[{lists,map,[#Fun,ok]},\n {couch_rep,open_doc_revs,4},\n {couch_rep,'-enum_docs_parallel/3-fun-1-',3},\n {couch_rep,'-spawn_worker/3-fun-0-',3}]")) With "push" replication the server just drops the connection (httplib2/__init__.py", line 715, in connect socket.error: (61, 'Connection refused') - why "refused" instead of "closed"?). I have only been able to replicate the first 100 documents. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (COUCHDB-194) [startkey, endkey[: provide a right-open range selection method
[ https://issues.apache.org/jira/browse/COUCHDB-194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670911#action_12670911 ] Maximillian Dornseif commented on COUCHDB-194: -- So far nobody seems against it. The downside is that it MIGHT break some existing code. > [startkey, endkey[: provide a right-open range selection method > --- > > Key: COUCHDB-194 > URL: https://issues.apache.org/jira/browse/COUCHDB-194 > Project: CouchDB > Issue Type: Improvement > Components: HTTP Interface >Affects Versions: 0.9 >Reporter: Maximillian Dornseif >Priority: Blocker > Fix For: 1.0 > > > While writing something about using CouchDB I came across the issue of "slice > indexes" (called startkey and endkey in CouchDB lingo). > I found no exact definition of startkey and endkey anywhere in the > documentation. Testing reveals that access on _all_docs and on views > documents are retuned in the interval > [startkey, endkey] = (startkey <= k <= endkey). > I don't know if this was a conscious design decision. But I like to promote a > slightly different interpretation (and thus API change): > [startkey, endkey[ = (startkey <= k < endkey). > Both approaches are valid and used in the real world. Ruby uses the inclusive > ("right-closed" in math speak) first approach: > >> l = [1,2,3,4] > >> l.slice(1,2) > => [2, 3] > Python uses the exclusive ("right-open" in math speak) second approach: > >>> l = [1,2,3,4] > >>> l[1,2] > [2] > For array indices both work fine and which one to prefer is mostly an issue > of habit. In spoken language both approaches are used: "Have the Software > done until saturday" probably means right-open to the client and right-closed > to the coder. > But if you are working with keys that are more than array indexes, then > right-open is much easier to handle. That is because you have to *guess* the > biggest value you want to get. 
The Wiki at > http://wiki.apache.org/couchdb/View_collation contains an example of that > problem: > It is suggested that you use > startkey="_design/"&endkey="_design/Z" > or > startkey="_design/"&endkey="_design/\u″ > to get a list of all design documents - also the replication system in the db > core uses the same hack. > This breaks if a design document is named "ZTop" or > "\Iñtërnâtiônàlizætiøn". Such names might be unlikely but we are computer > scientists; "unlikely" is a bad approach to software engineering. > The think what we really want to ask CouchDB is to "get all documents with > keys starting with '_design/'". > This is basically impossible to do with right-closed intervals. We could use > startkey="_design/"&endkey="_design0″ ('0′ is the ASCII character after '/') > and this will work fine ... until there is actually a document with the key > "_design0″ in the system. Unlikely, but ... > To make selection by intervals reliable currently clients have to guess the > last key (the approach) or use the fist key not to include (the _design0 > approach) and then post process the result to remove the last element > returned if it exactly matches the given endkey value. > If couchdb would change to a right-open interval approach post processing > would go away in most cases. See > http://blogs.23.nu/c0re/2008/12/building-a-track-and-trace-application-with-couchdb/ > for two real world examples. > At least for string keys and float keys changing the meaning to [startkey, > endkey[ would allow selections like > * "all strings starting with 'abc'" > * all numbers between 10.5 and 11 > It also would hopefully break not to much existing code. Since the notion of > endkey seems to be already considered "fishy" (see the Z approach) most > code seems to try to avoid that issue. For example > 'startkey="_design/"&endkey="_design/Z"' still would work unless you > have a design document being named exactly "Z". -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
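
The prefix-selection argument in COUCHDB-194 can be illustrated with a small sketch. It is only a toy model under stated assumptions: keys are compared by Python's code-point string ordering rather than CouchDB's view collation, and the helper names (prefix_range, select_right_closed, select_right_open) are hypothetical, not part of any CouchDB API.

```python
def prefix_range(prefix):
    """Return (startkey, endkey) such that the right-open interval
    startkey <= k < endkey covers every string key starting with prefix.

    Assumption: plain code-point ordering, not CouchDB collation.
    """
    if not prefix:
        raise ValueError("prefix must be non-empty")
    last = ord(prefix[-1])
    if last == 0x10FFFF:
        raise ValueError("last character has no successor")
    # Bump the last character: "_design/" becomes "_design0".
    return prefix, prefix[:-1] + chr(last + 1)


def select_right_closed(keys, startkey, endkey):
    """Current behaviour described in the issue: startkey <= k <= endkey."""
    return [k for k in sorted(keys) if startkey <= k <= endkey]


def select_right_open(keys, startkey, endkey):
    """Proposed behaviour: startkey <= k < endkey."""
    return [k for k in sorted(keys) if startkey <= k < endkey]


keys = ["_design/Z", "_design/ZTop", "_design/a", "_design0", "doc1"]
start, end = prefix_range("_design/")

# Guessing the biggest key ("_design/Z") misses "_design/ZTop" and "_design/a".
closed_guess = select_right_closed(keys, "_design/", "_design/Z")
# Using the successor key with a right-closed interval leaks "_design0".
closed_successor = select_right_closed(keys, start, end)
# The right-open interval needs neither a guess nor post-processing.
open_successor = select_right_open(keys, start, end)
```

With a right-open interval the client derives the end key mechanically from the prefix, which is the property the issue asks for.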
[jira] Commented: (COUCHDB-135) Offset regression between 0.8.0 and trunk
[ https://issues.apache.org/jira/browse/COUCHDB-135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670893#action_12670893 ]

Paul Carey commented on COUCHDB-135:
------------------------------------

The new patch nails this issue. I've run all the tests from my lib a few hundred times. No failures. Happy days!

> Offset regression between 0.8.0 and trunk
> -----------------------------------------
>
>                 Key: COUCHDB-135
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-135
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Database Core
>    Affects Versions: 0.9
>         Environment: OSX 10.5
>            Reporter: Paul Carey
>            Priority: Blocker
>             Fix For: 0.9
>
>         Attachments: COUCHDB-135.patch, COUCHDB-135.patch, view_offsets.js, view_offsets2.js
>
>
> The offset returned for certain map queries differs between 0.8.0 and 0.9.0r702929.
> The attached test can be pasted into couch_tests.js. It passes in 0.8.0 and fails in 0.9.
> I believe the skip query param must be passed for this bug to be exhibited.
[jira] Updated: (COUCHDB-135) Offset regression between 0.8.0 and trunk
[ https://issues.apache.org/jira/browse/COUCHDB-135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Joseph Davis updated COUCHDB-135:
-------------------------------------

    Attachment: COUCHDB-135.patch

I'm pretty sure that COUCHDB-135 was actually (at least) two different bugs. One has to do with propagating row count reductions; the other occurs when a skip is specified and you skip out of the first KV node. Hopefully this fix works.
Re: Transactional _bulk_docs
On Thu, Feb 5, 2009 at 5:34 AM, Damien Katz wrote:
> We are going to discuss this on the ML. I was waiting until I got the patch work to talk about all the implications and how we'd set the flags and modes of operation and all the implications. The code is going to get more powerful, the plan is for the feature to go away, not the capability. If we decided the feature was too important, we'll put it back. But as it stands, the changes to the code that I'm making now all need to be made regardless if we change the feature or not.

I agree that we should discuss this on the mailing list, and take a formal vote when we're ready to. I'm glad we're talking about the patch, but I can see why Damien would rather finish writing it before we take it apart on here.

I opened my response to this thread by asking for interested parties to discuss how one would implement the bulk_docs feature on top of the capabilities that CouchDB will make available. I'm still coming to understand those new capabilities. I think I'll need to see Damien's patch before I can have any considered opinion of it. For instance, I am not comfortable holding a vote until we've had time to understand the code.

On Thu, Feb 5, 2009 at 5:46 AM, Jan Lehnardt wrote:
> Hi,
>
> *pouring water over the fire*
>
> The progression of this is very unfortunate. There was no formal discussion, neither on IRC or a mailing list. We are all aware of the ASF ways of running a project and we didn't handle that one well.
>
> Apologies.

I agree that we as a PMC should strive to be more transparent in the future. Making the transactional _bulk_docs API available in the first place was a hard decision, and it's not clear that it was the right decision (although it did make testing ACID transactions easier).

The CouchDB project came into the Incubator with a lot of momentum and direction, and I consider part of my role with the project to be helping insulate Damien from the mailing-list chatter, especially when he's deep in code. I acknowledge that could be a mistake as well, if it leads to community misapprehension.

> The whole thing started because I closed a bug with a comment that there must be an _upcoming discussion_.

I sympathize with Antony's predicament. He's been using bulk doc transactions in a high-pressure environment, and it works for him. It's understandable that he'd be upset, first hearing about the patch like this.

I know that the long-standing vision of Couch doesn't include special API exceptions for when you are running on a single node. And I'm a little afraid that the transactional doc commits Antony wants us to keep are only a mirage, which would lead to trouble anyway when distributed systems are involved.

I think a consensus algorithm client library could provide the same semantics as the current feature, even on a cluster. An implementation would let Antony keep his feature, even on larger clusters. It could easily be included as an Erlang plugin.

Couch has a way of forcing developers to rethink their applications in order to make them fit into its mode of operation. I think if we approach the problem from a technical angle, it will help everyone to have an informed opinion about the patch. I'd been hoping to hold this discussion until Damien makes his code available, as I think that's when it'd be most appropriate.

Antony, maybe it would help for you to explain just exactly what you wouldn't be able to do without the bulk docs API. It will help to inform people about the technical issue.

Sincerely,
Chris

--
Chris Anderson
http://jchris.mfdz.com
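
For readers following the technical question in this thread, the difference between an all-or-nothing bulk update and a per-document bulk update can be sketched with a toy in-memory store. This is not CouchDB's implementation or API: the ToyStore class, its method names, and the integer revision counters are all hypothetical, and the sketch assumes each document id appears at most once per batch.

```python
class Conflict(Exception):
    """Raised when an all-or-nothing batch contains a stale revision."""


class ToyStore:
    """Hypothetical in-memory store: doc id -> (revision number, body)."""

    def __init__(self):
        self.docs = {}

    def _is_current(self, doc_id, expected_rev):
        # A missing document is treated as revision 0.
        return self.docs.get(doc_id, (0, None))[0] == expected_rev

    def bulk_all_or_nothing(self, updates):
        """Apply every (doc_id, expected_rev, body) update, or none of them."""
        stale = [d for d, rev, _ in updates if not self._is_current(d, rev)]
        if stale:
            raise Conflict(stale)
        for doc_id, rev, body in updates:
            self.docs[doc_id] = (rev + 1, body)

    def bulk_per_document(self, updates):
        """Apply each update independently and report a status per document."""
        results = {}
        for doc_id, rev, body in updates:
            if self._is_current(doc_id, rev):
                self.docs[doc_id] = (rev + 1, body)
                results[doc_id] = "ok"
            else:
                results[doc_id] = "conflict"
        return results


store = ToyStore()
store.bulk_per_document([("a", 0, {"v": 1}), ("b", 0, {"v": 1})])

# "b" is submitted with a stale revision (0 instead of 1).
batch = [("a", 1, {"v": 2}), ("b", 0, {"v": 2})]

try:
    store.bulk_all_or_nothing(batch)
    atomic_outcome = "applied"
except Conflict:
    atomic_outcome = "rejected"
after_atomic = dict(store.docs)

per_doc_outcome = store.bulk_per_document(batch)
```

In the all-or-nothing case the stale update to "b" leaves "a" untouched as well; in the per-document case "a" is written and the application has to deal with the reported conflict on "b" itself.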
[jira] Updated: (COUCHDB-190) _uuid should respond to GET, not POST
[ https://issues.apache.org/jira/browse/COUCHDB-190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zachary Zolton updated COUCHDB-190:
----------------------------------

    Attachment: COUCH-190.diff

Patch to fix COUCH-190:
* Changed /_uuids action to GET instead of POST
* Added broadly-compatible cache-busting headers to response
* Added unit tests to JavaScript suite

> _uuid should respond to GET, not POST
> -------------------------------------
>
>                 Key: COUCHDB-190
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-190
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>    Affects Versions: 0.9
>            Reporter: Matt Goodall
>            Priority: Blocker
>             Fix For: 0.9
>
>         Attachments: COUCH-190.diff
>
>
> The /_uuid resource can happily return a response to a GET without being unresty. In fact, supporting POST is probably incorrect as it implies it would change server state.
> Quick summary:
> * _uuid never changes server state
> * calling _uuid multiple times does not impact other clients
> * that the resource returns something different each time it is requested does not mean it cannot be a GET
> * GET with proper cache control (i.e. don't cache it ever) will work equally well
> Full discussion can be found on the user m.l., http://mail-archives.apache.org/mod_mbox/couchdb-user/200901.mbox/%3c21939021.1440421230910477169.javamail.serv...@perfora%3e.
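
The idea behind the COUCHDB-190 patch (a GET that must never be cached) can be sketched as a plain function that builds the response. This is an illustration only: the patch itself is not shown in this digest, so the specific header values, the JSON body shape, and the function name uuids_response are assumptions rather than what the attached diff contains.

```python
import json
import uuid
from email.utils import formatdate


def uuids_response(count=1):
    """Build (status, headers, body) for a hypothetical GET /_uuids handler."""
    body = json.dumps({"uuids": [uuid.uuid4().hex for _ in range(count)]})
    headers = {
        "Content-Type": "application/json",
        # Ask caches to revalidate and not to serve a stored copy.
        "Cache-Control": "must-revalidate, no-cache",
        # Older HTTP/1.0 caches look at Pragma instead of Cache-Control.
        "Pragma": "no-cache",
        # An expiry date in the past (the Unix epoch) marks the response stale.
        "Expires": formatdate(0, usegmt=True),
    }
    return 200, headers, body


status, headers, body = uuids_response(3)
first_batch = json.loads(body)["uuids"]
second_batch = json.loads(uuids_response(3)[2])["uuids"]
```

Sending several overlapping headers is one common way to be "broadly compatible": different generations of caches honour different ones.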
[jira] Updated: (COUCHDB-190) _uuid should respond to GET, not POST
[ https://issues.apache.org/jira/browse/COUCHDB-190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zachary Zolton updated COUCHDB-190:
----------------------------------

    Attachment:     (was: COUCHDB-190.diff)
Re: Transactional _bulk_docs
On Feb 5, 2009, at 3:14 AM, Geir Magnusson Jr. wrote:

> And unlike Ted, I don't agree that a pointer to an IRC log is sufficient to represent a "done decision", and he may not have meant that anyway. Sure, I can see a chat starting on IRC about a topic, but I'd hope that one person would force the move from IRC to the mail list - and at that point, maybe posting a pointer to the *initial* discussion log would be useful. And after that, discussion is on the mail list.

Ok, I see that I was unclear. What I said was that the act of making the decision must happen on the list. So the pointer to the IRC discussion is the background to a mailing list discussion to actually make the decision. During that discussion I'd expect other voices to be heard, issues to be raised, etc. I didn't mean that all you had to do was send the pointer to indicate that a decision had been made.

Ted
Re: Transactional _bulk_docs
Sure, ideally.

But you can't have "everyone" together at the same time on IRC, where at the ASF, we define "everyone" to be, well, "everyone", not you and the 4 others on the PMC.

I see 579 people on the user list. I see 294 people on the dev list. Just focusing on the dev list, that's 290 people, or 98.6% of people supposedly interested in CouchDB development, that had zero opportunity to see, review and participate in the discussion. Further, there's now zero chance that any future project participant can look back to understand design decision and philosophy. No institutional memory. Your goal, besides building a great software project, should be to get the community to the point where you can step back and do other things w/o material effect on the community, and that requires information like this to be somewhere accessible.

And unlike Ted, I don't agree that a pointer to an IRC log is sufficient to represent a "done decision", and he may not have meant that anyway. Sure, I can see a chat starting on IRC about a topic, but I'd hope that one person would force the move from IRC to the mail list - and at that point, maybe posting a pointer to the *initial* discussion log would be useful. And after that, discussion is on the mail list.

I think IRC logs are a very poor substitute for mail traffic (and yes, I grok the downside of async communications). A primary reason is that they are very "in the moment" - if you are in the conversation, it's easy to stay in, but after, when things cool and the context of the moment isn't there, it's nigh impossible. You also can't hit reply and quote a piece for others to see and discuss, further broadening the discussion.

What got me engaged on this wasn't the decision itself (only because it was a secret decision), but - like Ted - the mode of operation. It seemed that a very dedicated, engaged and interested community member had to privately petition the PMC for redress on a technical decision that none of us had any awareness of, nor a chance to review. And IMO, from a guy that probably should be a committer and PMC member to boot!

(By the way - from my count, not all PMC members are even on the PMC's private@ list, so I have *no clue* where project private discussions - like new committer candidates - are even held.)

geir

On Feb 5, 2009, at 2:11 AM, Damien Katz wrote:
> Ideally yes, but real time communication with everyone together is damn useful.
>
> -Damien
>
> On Feb 5, 2009, at 2:07 AM, Ted Leung wrote:
>> Uh, project decisions are supposed to be made in the public mailing lists...
>>
>> Ted
>>
>> On Feb 4, 2009, at 6:51 PM, Damien Katz wrote:
>>> This decision was discussed and made on IRC.
>>>
>>> -Damien
>>>
>>> On Feb 4, 2009, at 9:26 PM, Geir Magnusson Jr. wrote:
>>>> can you point me to a reference to where the PMC made this decision?
>>>>
>>>> I'm interested in the subject for its own sake, and I'm also interested in figuring out where decisions are made in this project, since I didn't see this one go by on a mail list.
>>>>
>>>> geir
>>>>
>>>> On Feb 4, 2009, at 9:13 PM, Damien Katz wrote:
>>>>> Geir, there was a decision made by the PMCs to change the transaction model to support partitioned databases. It is a change I am currently working on.
>>>>>
>>>>> -Damien
>>>>>
>>>>> On Feb 4, 2009, at 8:46 PM, Geir Magnusson Jr. wrote:
>>>>>> and original question #2?
>>>>>>
>>>>>> geir
>>>>>>
>>>>>> On Feb 4, 2009, at 8:38 PM, Antony Blakey wrote:
>>>>>>> On 05/02/2009, at 12:02 PM, Geir Magnusson Jr. wrote:
>>>>>>>> 1) where is this being forwarded from ?
>>>>>>>
>>>>>>> I sent it to the PMC.
>>>>>>>
>>>>>>> Antony Blakey
>>>>>>> -
>>>>>>> CTO, Linkuistics Pty Ltd
>>>>>>> Ph: 0438 840 787
>>>>>>>
>>>>>>> A Buddhist walks up to a hot-dog stand and says, "Make me one with everything". He then pays the vendor and asks for change. The vendor says, "Change comes from within".
[jira] Updated: (COUCHDB-135) Offset regression between 0.8.0 and trunk
[ https://issues.apache.org/jira/browse/COUCHDB-135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Carey updated COUCHDB-135:
------------------------------

    Attachment: view_offsets2.js

This imaginatively titled test case does trigger what I believe is a race condition. The wrong offset is returned roughly one time in ten (on my machine). Potentially of interest is that once the wrong offset is returned for a query, it's always returned for that same query - i.e. it doesn't appear that the query is being satisfied before the index has been fully built.
Re: Transactional _bulk_docs
On 5 Feb 2009, at 14:05, Robert Dionne wrote:

> I'm not very familiar with the ASF "process", excuse my ignorance, but I find the IRC enormously useful and find mailing list threads can be too unwieldy.

Check out http://apache.org/foundation/how-it-works.html for more about The ASF Way.

Cheers
Jan
--

> I guess it's because I'm not a fan of top down design. I see the code itself as the design, and the debugging, reworking, and documenting of the code as the construction phase.
>
> Best regards,
>
> Bob
>
> Robert Dionne
> Chief Bittwiddler
> dio...@dionne-associates.com
> 203.231.9961
>
> On Feb 5, 2009, at 6:14 AM, Geir Magnusson Jr. wrote:
>> [sending second time, as I see my first is stuck in moderation, and I want to reply in a timely manner]
>>
>> Sure, ideally.
>>
>> But you can't have "everyone" together at the same time on IRC, where at the ASF, we define "everyone" to be, well, "everyone", not you and the 4 others on the PMC.
>>
>> I see 579 people on the user list. I see 294 people on the dev list. Just focusing on the dev list, that's 290 people, or 98.6% of people supposedly interested in CouchDB development, that had zero opportunity to see, review and participate in the discussion. Further, there's now zero chance that any future project participant can look back to understand design decision and philosophy. No institutional memory. Your goal, besides building a great software project, should be to get the community to the point where you can step back and do other things w/o material effect on the community, and that requires information like this to be somewhere accessible.
>>
>> And unlike Ted, I don't agree that a pointer to an IRC log is sufficient to represent a "done decision", and he may not have meant that anyway. Sure, I can see a chat starting on IRC about a topic, but I'd hope that one person would force the move from IRC to the mail list - and at that point, maybe posting a pointer to the *initial* discussion log would be useful. And after that, discussion is on the mail list.
Re: Transactional _bulk_docs
On Thu, Feb 05, 2009 at 06:14:26AM -0500, Geir Magnusson Jr. wrote:
> What got me engaged on this wasn't the decision itself (only because it was a secret decision), but -like Ted - the mode of operation. It seemed that a very dedicated, engaged and interested community member had to privately petition the PMC for redress on a technical decision that none of us had any awareness of, nor a chance to review. And IMO, from a guy that probably should be a committer and PMC member to boot!

I think we dropped the ball with this one. I certainly don't remember being involved in this discussion, though I'm sure someone has logs to prove otherwise. This in itself should be indicative of a larger problem here.

I think it was fine that this was discussed on IRC, but the moment it came to the point of needing to do anything about it, or make any decisions based upon it, it should have been written up as a formal proposal and sent to the public mailing list for discussion. I hope that the community reaction to this event will be enough to remind us all to do this in the future.

> (By the way - from my count, not all PMC members are even on the PMC's private@ list, so I have *no clue* where project private discussion - like new committer candidates - are even discussed)

Can you email the persons not on the private list as a reminder to join?

--
Noah Slater, http://tumbolia.org/nslater
Re: Transactional _bulk_docs
Hi,

*pouring water over the fire*

The progression of this is very unfortunate. There was no formal discussion, neither on IRC nor on a mailing list. We are all aware of the ASF ways of running a project and we didn't handle that one well.

Apologies.

Now: Damien discussed the bulk docs feature on IRC and noted that for multi-node CouchDB and a consistent interface it has to go, and we all agreed that this is a good thing. This is effectively a PMC decision. But that's not the Apache way of doing things. We deferred discussing all details until Damien finished the patch.

Multi-node CouchDB was a day one design goal and well communicated everywhere. We were also very vocal about breaking the API before 0.9. Everybody investing in the API has been warned and has been doing so at their own risk.

Now, the new behaviour is currently being worked on and has not been discussed since Damien is heads down with the code and, as usual, I think, planned to introduce the code with the patch. Again, this is code that has been planned from day one.

The discussion of keeping the current (in-flux-API) bulk feature is a separate one and I think the voices here are loud enough that we should look at a way to support them.

The whole thing started because I closed a bug with a comment that there must be an _upcoming discussion_. This got taken up as THE PMC IS DOING EVERYTHING BEHIND THE SCENES. Which we don't.

Damien's latest mail is a little unfortunate. He gets the Apache way and the ASF understands the virtues of IRC, and the middle ground is that major discussions must be held on the mailing lists.

The PMC is simply waiting for the patch to land, so there's no need to get nervous.

Thanks.

(Aside, this came up on user@ last week and I hoped that this would have been the end of that until the patch lands.)

Cheers
Jan
--

On 5 Feb 2009, at 12:14, Geir Magnusson Jr. wrote:
> [sending second time, as I see my first is stuck in moderation, and I want to reply in a timely manner]
>
> Sure, ideally.
Re: Transactional _bulk_docs
On Feb 5, 2009, at 6:14 AM, Geir Magnusson Jr. wrote:

> [sending second time, as I see my first is stuck in moderation, and I want to reply in a timely manner]
>
> Sure, ideally.
>
> But you can't have "everyone" together at the same time on IRC, where at the ASF, we define "everyone" to be, well, "everyone", not you and the 4 others on the PMC.
>
> I see 579 people on the user list. I see 294 people on the dev list. Just focusing on the dev list, that's 290 people, or 98.6% of people supposedly interested in CouchDB development, that had zero opportunity to see, review and participate in the discussion. Further, there's now zero chance that any future project participant can look back to understand design decision and philosophy. No institutional memory. Your goal, besides building a great software project, should be to get the community to the point where you can step back and do other things w/o material effect on the community, and that requires information like this to be somewhere accessible.
>
> And unlike Ted, I don't agree that a pointer to an IRC log is sufficient to represent a "done decision", and he may not have meant that anyway. Sure, I can see a chat starting on IRC about a topic, but I'd hope that one person would force the move from IRC to the mail list - and at that point, maybe posting a pointer to the *initial* discussion log would be useful. And after that, discussion is on the mail list.
>
> I think IRC logs are a very poor substitute for mail traffic (and yes, I grok the downside of async communications). A primary reason is that they are very "in the moment" - if you are in the conversation, it's easy to stay in, but after, when things cool and the context of the moment isn't there, it's nigh impossible. You also can't hit reply and quote a piece for others to see and discuss, further broadening the discussion.

We get a lot of value out of IRC.

We are going to discuss this on the ML. I was waiting until I got the patch to work before talking about how we'd set the flags and modes of operation and all the implications. The code is going to get more powerful; the plan is for the feature to go away, not the capability. If we decide the feature is too important, we'll put it back. But as it stands, the changes to the code that I'm making now all need to be made regardless of whether we change the feature or not.

> What got me engaged on this wasn't the decision itself (only because it was a secret decision), but - like Ted - the mode of operation. It seemed that a very dedicated, engaged and interested community member had to privately petition the PMC for redress on a technical decision that none of us had any awareness of, nor a chance to review. And IMO, from a guy that probably should be a committer and PMC member to boot!

He mailed us privately. Now he's mailed us publicly. Any discussion about Antony being involved with the project should probably be private.

-Damien

> (By the way - from my count, not all PMC members are even on the PMC's private@ list, so I have *no clue* where project private discussion - like new committer candidates - are even discussed)
>
> geir
>
> On Feb 5, 2009, at 2:11 AM, Damien Katz wrote:
>> Ideally yes, but real time communication with everyone together is damn useful.
>>
>> -Damien
>>
>> On Feb 5, 2009, at 2:07 AM, Ted Leung wrote:
>>> Uh, project decisions are supposed to be made in the public mailing lists...
>>>
>>> Ted
>>>
>>> On Feb 4, 2009, at 6:51 PM, Damien Katz wrote:
>>>> This decision was discussed and made on IRC.
>>>>
>>>> -Damien
>>>>
>>>> On Feb 4, 2009, at 9:26 PM, Geir Magnusson Jr. wrote:
>>>>> can you point me to a reference to where the PMC made this decision?
>>>>>
>>>>> I'm interested in the subject for its own sake, and I'm also interested in figuring out where decisions are made in this project, since I didn't see this one go by on a mail list.
Re: Transactional _bulk_docs
fwiw, I'd like to see these decisions proposed, discussed and resolved on the mailing list. I appreciate it's slower than IRC, though.

I thought using mailing lists was the mandated "Apache way" of doing these things; it certainly appears to be on other projects I follow (Lucene, for example). To restate, I didn't think it was a permitted option to use IRC to make important project decisions.

Is there at least a transcript of the IRC decision(s)?

B.

On Thu, Feb 5, 2009 at 8:05 AM, Robert Dionne wrote:
> My sense is that the approach to design in CouchDB is very bottom-up. I applaud that and encourage it and wholeheartedly agree with Alan Perlis about building software top down *except* the first time. We all know that very little great software was ever built top down designed by boxologists armed with UML diagrams. I think CouchDB is at a key point where it needs to continue to be driven by a small core group of dedicated passionate programmers.
>
> Please note that I'm in no way commenting on the make up of that group.
>
> I'm not very familiar with the ASF "process", excuse my ignorance, but I find the IRC enormously useful and find mailing list threads can be too unwieldy.
>
> I guess it's because I'm not a fan of top down design. I see the code itself as the design, and the debugging, reworking, and documenting of the code as the construction phase.
>
> Best regards,
>
> Bob
>
> Robert Dionne
> Chief Bittwiddler
> dio...@dionne-associates.com
> 203.231.9961
>
> On Feb 5, 2009, at 6:14 AM, Geir Magnusson Jr. wrote:
>
>> [sending second time, as I see my first is stuck in moderation, and I want to reply in a timely manner]
>>
>> Sure, ideally.
>>
>> But you can't have "everyone" together at the same time on IRC, where at the ASF, we define "everyone" to be, well, "everyone", not you and the 4 others on the PMC.
>>
>> I see 579 people on the user list. I see 294 people on the dev list.
>> Just focusing on the dev list, that's 290 people, or 98.6% of people >> supposedly interested in CouchDB development, that had zero opportunity to >> see, review and participate in the discussion. Further, there's now zero >> chance that any future project participant can look back to understand >> design decision and philosophy. No institutional memory. Your goal, >> besides building a great software project, should be to get the community to >> the point where you can step back and do other things w/o material effect on >> the community, and that requires information like this to be somewhere >> accessible. >> >> And unlike Ted, I don't agree that a pointer to an IRC log is sufficient >> to represent a "done decision", and he may not have meant that anyway. >> Sure, I can see a chat starting on IRC about a topic, but I'd hope that one >> person would force the move from IRC to the mail list - and at that point, >> maybe posting a pointer to the *initial* discussion log would be useful. >> And after that, discussion is on the mail list. >> >> I think IRC logs are a very poor substitute to mail traffic (and yes, I >> grok the downside of async communications). A primary one reason that they >> are very "in the moment" - if you are in the conversation, it's easy to stay >> in, but after, when things cool and the context of the moment isn't there, >> it's neigh impossible. You also can't hit reply and quote a piece for >> others to see and discuss, further broadening the discussion. >> >> What got me engaged on this wasn't the decision itself (only because it >> was a secret decision), but -like Ted - the mode of operation. It seemed >> that a very dedicated, engaged and interested community member had to >> privately petition the PMC for redress on a technical decision that none of >> us had any awareness of, nor a chance to review. And IMO, from a guy that >> probably should be a committer and PMC member to boot! 
>> >> (By the way - from my count, not all PMC members are even on the PMC's >> private@ list, so I have *no clue* where project private discussion - like >> new committer candidates - are even discussed) >> >> geir >> >> On Feb 5, 2009, at 2:11 AM, Damien Katz wrote: >> >>> Ideally yes, but real time communication with everyone together is damn >>> useful. >>> >>> -Damien >>> >>> On Feb 5, 2009, at 2:07 AM, Ted Leung wrote: >>> Uh, project decisions are supposed to be made in the public mailing lists... Ted On Feb 4, 2009, at 6:51 PM, Damien Katz wrote: > This decision was discussed and made on IRC. > > -Damien > > On Feb 4, 2009, at 9:26 PM, Geir Magnusson Jr. wrote: > >> can you point me to a reference to where the PMC made this decision? >> >> I'm interested in the subject for it's own sake, and I'm also >> interested in figuring out where decisions are made in this project, >> since I >> didn't see this one go by on a mail list. >> >> geir >> >> On Feb 4, 2
Re: Transactional _bulk_docs
My sense is that the approach to design in CouchDB is very bottom-up. I applaud that and encourage it, and wholeheartedly agree with Alan Perlis about building software top-down *except* the first time. We all know that very little great software was ever built top-down, designed by boxologists armed with UML diagrams. I think CouchDB is at a key point where it needs to continue to be driven by a small core group of dedicated, passionate programmers.

Please note that I'm in no way commenting on the makeup of that group.

I'm not very familiar with the ASF "process" (excuse my ignorance), but I find IRC enormously useful and find that mailing list threads can be too unwieldy.

I guess it's because I'm not a fan of top-down design. I see the code itself as the design, and the debugging, reworking, and documenting of the code as the construction phase.

Best regards,

Bob

Robert Dionne
Chief Bittwiddler
dio...@dionne-associates.com
203.231.9961

On Feb 5, 2009, at 6:14 AM, Geir Magnusson Jr. wrote:

[sending second time, as I see my first is stuck in moderation, and I want to reply in a timely manner]

Sure, ideally.

But you can't have "everyone" together at the same time on IRC, and at the ASF we define "everyone" to be, well, "everyone", not you and the 4 others on the PMC.

I see 579 people on the user list. I see 294 people on the dev list. Just focusing on the dev list, that's 290 people, or 98.6% of the people supposedly interested in CouchDB development, who had zero opportunity to see, review and participate in the discussion. Further, there's now zero chance that any future project participant can look back to understand design decisions and philosophy. No institutional memory. Your goal, besides building a great software project, should be to get the community to the point where you can step back and do other things w/o material effect on the community, and that requires information like this to be somewhere accessible.

And unlike Ted, I don't agree that a pointer to an IRC log is sufficient to represent a "done decision", and he may not have meant that anyway. Sure, I can see a chat starting on IRC about a topic, but I'd hope that one person would force the move from IRC to the mail list - and at that point, maybe posting a pointer to the *initial* discussion log would be useful. And after that, discussion is on the mail list.

I think IRC logs are a very poor substitute for mail traffic (and yes, I grok the downside of async communications). A primary reason is that they are very "in the moment" - if you are in the conversation, it's easy to stay in, but after, when things cool and the context of the moment isn't there, it's nigh impossible. You also can't hit reply and quote a piece for others to see and discuss, further broadening the discussion.

What got me engaged on this wasn't the decision itself (only because it was a secret decision), but - like Ted - the mode of operation. It seemed that a very dedicated, engaged and interested community member had to privately petition the PMC for redress on a technical decision that none of us had any awareness of, nor a chance to review. And IMO, from a guy that probably should be a committer and PMC member to boot!

(By the way - from my count, not all PMC members are even on the PMC's private@ list, so I have *no clue* where private project matters - like new committer candidates - are even discussed)

geir

On Feb 5, 2009, at 2:11 AM, Damien Katz wrote:

Ideally yes, but real-time communication with everyone together is damn useful.

-Damien

On Feb 5, 2009, at 2:07 AM, Ted Leung wrote:

Uh, project decisions are supposed to be made in the public mailing lists...

Ted

On Feb 4, 2009, at 6:51 PM, Damien Katz wrote:

This decision was discussed and made on IRC.

-Damien

On Feb 4, 2009, at 9:26 PM, Geir Magnusson Jr. wrote:

can you point me to a reference to where the PMC made this decision?

I'm interested in the subject for its own sake, and I'm also interested in figuring out where decisions are made in this project, since I didn't see this one go by on a mail list.

geir

On Feb 4, 2009, at 9:13 PM, Damien Katz wrote:

Geir, there was a decision made by the PMCs to change the transaction model to support partitioned databases. It is a change I am currently working on.

-Damien

On Feb 4, 2009, at 8:46 PM, Geir Magnusson Jr. wrote:

and original question #2?

geir

On Feb 4, 2009, at 8:38 PM, Antony Blakey wrote:

On 05/02/2009, at 12:02 PM, Geir Magnusson Jr. wrote:

1) where is this being forwarded from?

I sent it to the PMC.

Antony Blakey
-
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

A Buddhist walks up to a hot-dog stand and says, "Make me one with everything". He then pays the vendor and asks for change. The vendor says, "Change comes from within".
[jira] Commented: (COUCHDB-135) Offset regression between 0.8.0 and trunk
[ https://issues.apache.org/jira/browse/COUCHDB-135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670736#action_12670736 ]

Paul Carey commented on COUCHDB-135:

Applying this patch put a big smile on my face - it does indeed fix the main offset calculation error. However, running my test suite, I now see very sporadic errors, which makes me think there's still a race condition lurking somewhere.

Running my pagination test suite 100 times results in about 82k requests to CouchDB and 8k test assertions. I had 8 assertions fail over the 100 test runs. I'm fairly sure the issue doesn't lie with my test suite or lib.

I'll have a go at creating a test that reproduces the failure.

> Offset regression between 0.8.0 and trunk
> -
>
>                 Key: COUCHDB-135
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-135
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Database Core
>    Affects Versions: 0.9
>         Environment: OSX 10.5
>            Reporter: Paul Carey
>            Priority: Blocker
>             Fix For: 0.9
>
>         Attachments: COUCHDB-135.patch, view_offsets.js
>
>
> The offset returned for certain map queries differs between 0.8.0 and 0.9.0r702929.
> The attached test can be pasted into couch_tests.js. It passes in 0.8.0 and fails in 0.9.
> I believe the skip query param must be passed for this bug to be exhibited.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Re: Transactional _bulk_docs
[sending second time, as I see my first is stuck in moderation, and I want to reply in a timely manner]

Sure, ideally.

But you can't have "everyone" together at the same time on IRC, and at the ASF we define "everyone" to be, well, "everyone", not you and the 4 others on the PMC.

I see 579 people on the user list. I see 294 people on the dev list. Just focusing on the dev list, that's 290 people, or 98.6% of the people supposedly interested in CouchDB development, who had zero opportunity to see, review and participate in the discussion. Further, there's now zero chance that any future project participant can look back to understand design decisions and philosophy. No institutional memory. Your goal, besides building a great software project, should be to get the community to the point where you can step back and do other things w/o material effect on the community, and that requires information like this to be somewhere accessible.

And unlike Ted, I don't agree that a pointer to an IRC log is sufficient to represent a "done decision", and he may not have meant that anyway. Sure, I can see a chat starting on IRC about a topic, but I'd hope that one person would force the move from IRC to the mail list - and at that point, maybe posting a pointer to the *initial* discussion log would be useful. And after that, discussion is on the mail list.

I think IRC logs are a very poor substitute for mail traffic (and yes, I grok the downside of async communications). A primary reason is that they are very "in the moment" - if you are in the conversation, it's easy to stay in, but after, when things cool and the context of the moment isn't there, it's nigh impossible. You also can't hit reply and quote a piece for others to see and discuss, further broadening the discussion.

What got me engaged on this wasn't the decision itself (only because it was a secret decision), but - like Ted - the mode of operation. It seemed that a very dedicated, engaged and interested community member had to privately petition the PMC for redress on a technical decision that none of us had any awareness of, nor a chance to review. And IMO, from a guy that probably should be a committer and PMC member to boot!

(By the way - from my count, not all PMC members are even on the PMC's private@ list, so I have *no clue* where private project matters - like new committer candidates - are even discussed)

geir

On Feb 5, 2009, at 2:11 AM, Damien Katz wrote:

Ideally yes, but real-time communication with everyone together is damn useful.

-Damien

On Feb 5, 2009, at 2:07 AM, Ted Leung wrote:

Uh, project decisions are supposed to be made in the public mailing lists...

Ted

On Feb 4, 2009, at 6:51 PM, Damien Katz wrote:

This decision was discussed and made on IRC.

-Damien

On Feb 4, 2009, at 9:26 PM, Geir Magnusson Jr. wrote:

can you point me to a reference to where the PMC made this decision?

I'm interested in the subject for its own sake, and I'm also interested in figuring out where decisions are made in this project, since I didn't see this one go by on a mail list.

geir

On Feb 4, 2009, at 9:13 PM, Damien Katz wrote:

Geir, there was a decision made by the PMCs to change the transaction model to support partitioned databases. It is a change I am currently working on.

-Damien

On Feb 4, 2009, at 8:46 PM, Geir Magnusson Jr. wrote:

and original question #2?

geir

On Feb 4, 2009, at 8:38 PM, Antony Blakey wrote:

On 05/02/2009, at 12:02 PM, Geir Magnusson Jr. wrote:

1) where is this being forwarded from?

I sent it to the PMC.

Antony Blakey
-
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

A Buddhist walks up to a hot-dog stand and says, "Make me one with everything". He then pays the vendor and asks for change. The vendor says, "Change comes from within".