[jira] Commented: (COUCHDB-514) Redirect from _list using view rows
[ https://issues.apache.org/jira/browse/COUCHDB-514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804920#action_12804920 ]

Chris Anderson commented on COUCHDB-514:
----------------------------------------

Joscha,

If you can provide just the bug fix (not stylistic changes) I'll be glad to help you finish it. I do think this will require an Erlang fix. I won't let it slip past 1.0, but I don't have time to write it before 0.11.

Chris

> Redirect from _list using view rows
> -----------------------------------
>
>                 Key: COUCHDB-514
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-514
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: JavaScript View Server
>    Affects Versions: 0.10
>            Reporter: Zachary Zolton
>        Attachments: list-redir.diff, list_views.diff, render.diff
>
>
> There is no way to redirect from a _list function after calling the getRow()
> API function.
> Here's a link to the discussion on the dev mailing list:
> http://is.gd/3KZRg

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
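The limitation behind this issue comes from how list functions stream: once getRow() has been called, the response status and headers have effectively been committed, so a later start({code: 302, ...}) has nothing to act on. A minimal sketch of that control flow, with CouchDB's list API (start/getRow/send) replaced by hypothetical in-memory stubs so it can run standalone:

```javascript
// Hypothetical stand-ins for CouchDB's _list API (start, getRow, send),
// stubbed so the control flow can be exercised outside CouchDB.
var output = { started: false, code: null, chunks: [] };
var rows = [{ id: "a" }, { id: "b" }];

function start(resp) {           // sends status + headers; only honored once
  if (!output.started) { output.started = true; output.code = resp.code; }
}
function getRow() {              // fetching a row commits the response
  if (!output.started) start({ code: 200 });
  return rows.shift() || null;
}
function send(chunk) { output.chunks.push(chunk); }

// A list function that decides to redirect *after* inspecting the first row:
function listWithRedirect(head, req) {
  var row = getRow();            // headers are now committed (code 200)
  if (row && row.id === "a") {
    // too late: ignored, because start() already ran implicitly
    start({ code: 302, headers: { Location: "/somewhere" } });
    return;
  }
  while (row) { send(row.id); row = getRow(); }
}

listWithRedirect({}, {});
console.log(output.code);  // 200, not 302 -- the redirect is silently lost
```

The stub names mirror the real list API but the internals are invented; the point is only that the 302 issued after getRow() cannot win.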
[jira] Assigned: (COUCHDB-514) Redirect from _list using view rows
[ https://issues.apache.org/jira/browse/COUCHDB-514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Anderson reassigned COUCHDB-514:
--------------------------------------

    Assignee: Chris Anderson
[jira] Commented: (COUCHDB-583) storing attachments in compressed form and serving them in compressed form if accepted by the client
[ https://issues.apache.org/jira/browse/COUCHDB-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804919#action_12804919 ]

Chris Anderson commented on COUCHDB-583:
----------------------------------------

I haven't really been tuned into the full discussion of this patch -- I think the biggest questions for something that digs this deep into the file format are:

How does it impact stability? (looks fine at my cursory glance, aside from cross compatibility with older versions of the file format, which I'd have to look more closely at)

What is the payoff? How much space does this save in practice? (say, with email messages as attachments, vs with pngs or minified js)

I'm not asking you to do all that work, just think that real numbers are a selling point. If it's a big payoff then this becomes a priority.

We might also want to add options for compressing the views.

> storing attachments in compressed form and serving them in compressed form if
> accepted by the client
> -----------------------------------------------------------------------------
>
>                 Key: COUCHDB-583
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-583
>             Project: CouchDB
>          Issue Type: New Feature
>          Components: Database Core, HTTP Interface
>        Environment: CouchDB trunk
>           Reporter: Filipe Manana
>        Attachments: couchdb-583-trunk-10th-try.patch,
> couchdb-583-trunk-11th-try.patch, couchdb-583-trunk-12th-try.patch,
> couchdb-583-trunk-13th-try.patch, couchdb-583-trunk-14th-try-git.patch,
> couchdb-583-trunk-15th-try-git.patch, couchdb-583-trunk-3rd-try.patch,
> couchdb-583-trunk-4th-try-trunk.patch, couchdb-583-trunk-5th-try.patch,
> couchdb-583-trunk-6th-try.patch, couchdb-583-trunk-7th-try.patch,
> couchdb-583-trunk-8th-try.patch, couchdb-583-trunk-9th-try.patch,
> jira-couchdb-583-1st-try-trunk.patch, jira-couchdb-583-2nd-try-trunk.patch
>
>
> This feature allows Couch to gzip compress attachments as they are being
> received and store them in compressed form.
> When a client asks for downloading an attachment (e.g. GET
> somedb/somedoc/attachment.txt), the attachment is sent in compressed form if
> the client's http request has gzip specified as a valid transfer encoding for
> the response (using the http header "Accept-Encoding"). Otherwise couch
> decompresses the attachment before sending it back to the client.
> Attachments are compressed only if their MIME type matches one of those
> listed in a separate config file. Compression level is also configurable in
> the default.ini file.
> This follows Damien's suggestion from 30 November:
> "Perhaps we need a separate user editable ini file to specify compressable or
> non-compressable files (would probably be too big for the regular ini file).
> What do other web servers do?
> Also, a potential optimization is to compress the file while writing to disk,
> and serve the compressed bytes directly to clients that can handle it, and
> decompressed for those that can't. For compressable types, it's a win for
> both disk IO for reads and writes, and CPU on read."
> Patch attached.
[jira] Commented: (COUCHDB-583) storing attachments in compressed form and serving them in compressed form if accepted by the client
[ https://issues.apache.org/jira/browse/COUCHDB-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804917#action_12804917 ]

Paul Joseph Davis commented on COUCHDB-583:
-------------------------------------------

Filipe,

Sorry, got distracted by a weekend project. I'll try and do a thorough review tomorrow before the big news day on Wednesday.
[jira] Commented: (COUCHDB-583) storing attachments in compressed form and serving them in compressed form if accepted by the client
[ https://issues.apache.org/jira/browse/COUCHDB-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804915#action_12804915 ]

Filipe Manana commented on COUCHDB-583:
---------------------------------------

@Paul any news on this?
Re: svn commit: r903023 - in /couchdb/trunk/share: Makefile.am server/json2.js server/util.js www/script/json2.js www/script/test/show_documents.js www/script/test/update_documents.js www/script/tes
On Mon, Jan 25, 2010 at 8:04 PM, Jan Lehnardt wrote:
> Hey Chris,
>
> great work, thanks. Can you update
> http://wiki.apache.org/couchdb/Breaking_changes? :)
>

I wouldn't mind. But with that old wiki it'll take me 15 minutes to
figure out what username I'm shooting for.

... signing up for a new account:

http://wiki.apache.org/couchdb/FrontPage?action=newaccount

gives a 500 error. try again, and OK I seem to have fixed it. Thanks
for the prodding. The wiki wasn't that bad but it could be more fun.

http://wiki.apache.org/couchdb/Breaking_changes

Chris

> Cheers
> Jan
> --
>
> On 25 Jan 2010, at 16:12, jch...@apache.org wrote:
>
>> Author: jchris
>> Date: Tue Jan 26 00:11:59 2010
>> New Revision: 903023
>>
>> URL: http://svn.apache.org/viewvc?rev=903023&view=rev
>> Log:
>> Replace the old JavaScript query server JSON library with json2.js
>>
>> This change makes us interoperate better with other JSON
>> implementations. It also means we can use the native JSON handlers in
>> JavaScript runtimes that support them. Should be faster right away on
>> new Spidermonkeys.
>>
>> There are some potential breaking changes for apps that depend on
>> Couch blowing up on 'undefined'. json2.js serializes undefined as
>> 'null' instead of crashing.
>>
>> This change will also affect people using E4X, as you can't just
>> return an XML object and have it serialized to a string for you.
>> Calling .toXMLString() on these is all you need to do.
>>
>> Added:
>>     couchdb/trunk/share/server/json2.js
>>       - copied, changed from r902422, couchdb/trunk/share/www/script/json2.js
>> Modified:
>>     couchdb/trunk/share/Makefile.am
>>     couchdb/trunk/share/server/util.js
>>     couchdb/trunk/share/www/script/json2.js
>>     couchdb/trunk/share/www/script/test/show_documents.js
>>     couchdb/trunk/share/www/script/test/update_documents.js
>>     couchdb/trunk/share/www/script/test/view_errors.js
>>
>> Modified: couchdb/trunk/share/Makefile.am
>> URL: http://svn.apache.org/viewvc/couchdb/trunk/share/Makefile.am?rev=903023&r1=903022&r2=903023&view=diff
>> ==============================================================================
>> --- couchdb/trunk/share/Makefile.am (original)
>> +++ couchdb/trunk/share/Makefile.am Tue Jan 26 00:11:59 2010
>> @@ -13,6 +13,7 @@
>>  JS_FILE = server/main.js
>>
>>  JS_FILE_COMPONENTS = \
>> +    server/json2.js \
>>      server/filter.js \
>>      server/mimeparse.js \
>>      server/render.js \
>>
>> Copied: couchdb/trunk/share/server/json2.js (from r902422, couchdb/trunk/share/www/script/json2.js)
>> URL: http://svn.apache.org/viewvc/couchdb/trunk/share/server/json2.js?p2=couchdb/trunk/share/server/json2.js&p1=couchdb/trunk/share/www/script/json2.js&r1=902422&r2=903023&rev=903023&view=diff
>> ==============================================================================
>> --- couchdb/trunk/share/www/script/json2.js [utf-8] (original)
>> +++ couchdb/trunk/share/server/json2.js [utf-8] Tue Jan 26 00:11:59 2010
>> @@ -1,6 +1,6 @@
>>  /*
>>      http://www.JSON.org/json2.js
>> -    2009-08-17
>> +    2009-09-29
>>
>>      Public Domain.
>>
>> @@ -8,6 +8,14 @@
>>
>>      See http://www.JSON.org/js.html
>>
>> +
>> +    This code should be minified before deployment.
>> +    See http://javascript.crockford.com/jsmin.html
>> +
>> +    USE YOUR OWN COPY. IT IS EXTREMELY UNWISE TO LOAD CODE FROM SERVERS YOU DO
>> +    NOT CONTROL.
>> +
>> +
>>      This file creates a global JSON object containing two methods: stringify
>>      and parse.
>>
>> @@ -136,15 +144,9 @@
>>
>>      This is a reference implementation. You are free to copy, modify, or
>>      redistribute.
>> -
>> -    This code should be minified before deployment.
>> -    See http://javascript.crockford.com/jsmin.html
>> -
>> -    USE YOUR OWN COPY. IT IS EXTREMELY UNWISE TO LOAD CODE FROM SERVERS YOU DO
>> -    NOT CONTROL.
>>  */
>>
>> -/*jslint evil: true */
>> +/*jslint evil: true, strict: false */
>>
>>  /*members "", "\b", "\t", "\n", "\f", "\r", "\"", JSON, "\\", apply,
>>      call, charCodeAt, getUTCDate, getUTCFullYear, getUTCHours,
>> @@ -153,7 +155,6 @@
>>      test, toJSON, toString, valueOf
>>  */
>>
>> -"use strict";
>>
>>  // Create a JSON object only if one does not already exist. We create the
>>  // methods in a closure to avoid creating global variables.
>>
>> Modified: couchdb/trunk/share/server/util.js
>> URL: http://svn.apache.org/viewvc/couchdb/trunk/share/server/util.js?rev=903023&r1=903022&r2=903023&view=diff
>> ==============================================================================
>> --- couchdb/trunk/share/server/util.js (original)
>> +++ couchdb/trunk/share/server/util.js Tue Jan 26 00:11:59 2010
>> @@ -13,14 +13,7 @@
>>  var Couch = {
>>    // moving this away from global so we can move to json2.js later
>>    toJSON : function (val) {
>> -    if (typeof(val) == "undefined") {
>> -      throw "Cannot encode 'undefined' value as JSON";
>> -    }
>> -    if (typeof(val) == "xml") { // E4X support
JavaScript bcrypt (was Re: authentication cleanup)
On Tue, Jan 5, 2010 at 10:21 PM, Benoit Chesneau wrote:
> There is a blowfish encryption implementation available in javascript.
> Doesn't bcrypt stand for "blowfish crypt"?
>
> http://www.openbsd.org/cgi-bin/man.cgi?query=bcrypt&apropos=0&sektion=0&manpath=OpenBSD+Current&arch=i386&format=html
>
> from where it has been created.
>
> - benoît
>

Is anyone up to replace our salted hashes with a JS bcrypt
implementation? If we can start supporting bcrypt for 0.11 we're less
likely to have salted hash passwords hanging around *forever* from
people who create user docs before 1.0.

If no one else picks this up soon I'll look at it again for 1.0.

Thanks,
Chris

--
Chris Anderson
http://jchrisa.net
http://couch.io
Re: upgrading to json2.js
On Mon, Jan 25, 2010 at 11:08 PM, Chris Anderson wrote:
> On Tue, Dec 22, 2009 at 10:26 AM, Chris Anderson wrote:
>> On Sat, Dec 19, 2009 at 5:07 PM, Chris Anderson wrote:
>>> It's well known that in order to take advantage of native JSON
>>> libraries in the newest Mozilla JavaScript VMs, we'll need to change
>>> our handling of 'undefined' in the toJSON() routine.
>>>
>>> I propose we make this change now, by replacing our current JSON
>>> handling with json2.js, the current reference implementation.
>>>
>>> I've started the work here:
>>>
>>> http://github.com/jchris/couchdb/tree/json2
>>
>> I've updated my json2 branch to reflect my latest commits to trunk.
>>
>
> I've committed this change to CouchDB. It will appear in the 0.11
> release. From the commit message:
>
> Replace the old JavaScript query server JSON library with json2.js
>
> This change makes us interoperate better with other JSON
> implementations. It also means we can use the native JSON handlers in
> JavaScript runtimes that support them.
>
> There are some potential breaking changes for apps that depend on
> Couch blowing up on 'undefined'. json2.js serializes undefined as
> 'null' instead of crashing.

The change is that undefined in an array gets serialized as null. Thus:

    $ JSON.stringify([undefined])
    -> "[null]"

plus the XML stuff. No idea how JSON.stringify(undefined) behaves but
we wrap all results in an array before passing to Erlang so it
shouldn't be a huge deal.

HTH,
Paul Davis

> This change will also affect people using E4X, as you can't just
> return an XML object and have it serialized to a string for you.
> Calling .toXMLString() on these is what you need to do here.
>
> Best,
> Chris
>
>> Benoit has fixed the E4X issues. There are a few other test failures
>> which I believe have to do with the changed behavior. If anyone wants
>> to take a look at these and consider changing the tests where
>> appropriate, that'd be super helpful.
>>
>> Chris
>>
>>>
>>> Everything works except E4X. When I run the view_xml tests, I see this
>>> error in the logs:
>>>
>>> OS Process :: function raised exception (TypeError:
>>> String.prototype.toJSON called on incompatible XML) with doc._id
>>> 43840f81289e03fec4e9f620b2c03799
>>>
>>> In our old implementation of toJSON, we run value.toXMLString() to
>>> convert XML to strings. json2.js takes a callback parameter to allow
>>> modification of results, but the TypeError is triggered before the
>>> callback, it seems.
>>>
>>> If any of you JavaScript ninjas wanna give this a shot, please help me
>>> finish it.
>>>
>>> Chris
>>>
>>> --
>>> Chris Anderson
>>> http://jchrisa.net
>>> http://couch.io
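Paul's examples are easy to check against any ES5-style JSON implementation (json2.js follows the same semantics, which is the point of the switch). All three cases of undefined behave differently:

```javascript
// How json2.js / native JSON treats undefined, versus the old query
// server which threw "Cannot encode 'undefined' value as JSON":
console.log(JSON.stringify([undefined]));      // "[null]"  -- Paul's example
console.log(JSON.stringify({ a: undefined })); // "{}"      -- the key is dropped
console.log(JSON.stringify(undefined));        // undefined -- not a string at all
```

The last case answers Paul's open question: top-level undefined yields no string, which is why wrapping results in an array before handing them to Erlang sidesteps the problem.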
Re: buildbot failure in ASF Buildbot on couchdb-trunk
No more emails. license.skip is gonna rely on paths in the ignore
patterns though, so in copying it I would expect things to get
triggered again.

Paul Davis

On Mon, Jan 25, 2010 at 11:06 PM, Chris Anderson wrote:
> On Mon, Jan 25, 2010 at 4:42 PM, Paul Davis wrote:
>> Looks like you forgot to add json2.js to license.skip
>>
>
> Thanks.
>
> Since json2.js has been in _utils for a long time I figured licenses
> would be taken care of.
>
> Fixed. (I think)
>
>> On Mon, Jan 25, 2010 at 7:57 PM, wrote:
>>> The Buildbot has detected a new failure of couchdb-trunk on ASF Buildbot.
>>> Full details are available at:
>>> http://ci.apache.org/builders/couchdb-trunk/builds/170
>>>
>>> Buildbot URL: http://ci.apache.org/
>>>
>>> Buildslave for this Build: bb-vm_ubuntu
>>>
>>> Build Reason:
>>> Build Source Stamp: [branch couchdb/trunk] 903023
>>> Blamelist: jchris
>>>
>>> BUILD FAILED: failed compile_5
>>>
>>> sincerely,
>>> -The ASF Buildbot
>
> --
> Chris Anderson
> http://jchrisa.net
> http://couch.io
Re: upgrading to json2.js
On Tue, Dec 22, 2009 at 10:26 AM, Chris Anderson wrote:
> On Sat, Dec 19, 2009 at 5:07 PM, Chris Anderson wrote:
>> It's well known that in order to take advantage of native JSON
>> libraries in the newest Mozilla JavaScript VMs, we'll need to change
>> our handling of 'undefined' in the toJSON() routine.
>>
>> I propose we make this change now, by replacing our current JSON
>> handling with json2.js, the current reference implementation.
>>
>> I've started the work here:
>>
>> http://github.com/jchris/couchdb/tree/json2
>
> I've updated my json2 branch to reflect my latest commits to trunk.
>

I've committed this change to CouchDB. It will appear in the 0.11
release. From the commit message:

Replace the old JavaScript query server JSON library with json2.js

This change makes us interoperate better with other JSON
implementations. It also means we can use the native JSON handlers in
JavaScript runtimes that support them.

There are some potential breaking changes for apps that depend on
Couch blowing up on 'undefined'. json2.js serializes undefined as
'null' instead of crashing.

This change will also affect people using E4X, as you can't just
return an XML object and have it serialized to a string for you.
Calling .toXMLString() on these is what you need to do here.

Best,
Chris

> Benoit has fixed the E4X issues. There are a few other test failures
> which I believe have to do with the changed behavior. If anyone wants
> to take a look at these and consider changing the tests where
> appropriate, that'd be super helpful.
>
> Chris
>
>>
>> Everything works except E4X. When I run the view_xml tests, I see this
>> error in the logs:
>>
>> OS Process :: function raised exception (TypeError:
>> String.prototype.toJSON called on incompatible XML) with doc._id
>> 43840f81289e03fec4e9f620b2c03799
>>
>> In our old implementation of toJSON, we run value.toXMLString() to
>> convert XML to strings. json2.js takes a callback parameter to allow
>> modification of results, but the TypeError is triggered before the
>> callback, it seems.
>>
>> If any of you JavaScript ninjas wanna give this a shot, please help me
>> finish it.
>>
>> Chris

--
Chris Anderson
http://jchrisa.net
http://couch.io
Re: svn commit: r903023 - in /couchdb/trunk/share: Makefile.am server/json2.js server/util.js www/script/json2.js www/script/test/show_documents.js www/script/test/update_documents.js www/script/test/
Hey Chris,

great work, thanks. Can you update
http://wiki.apache.org/couchdb/Breaking_changes? :)

Cheers
Jan
--

On 25 Jan 2010, at 16:12, jch...@apache.org wrote:

> Author: jchris
> Date: Tue Jan 26 00:11:59 2010
> New Revision: 903023
>
> URL: http://svn.apache.org/viewvc?rev=903023&view=rev
> Log:
> Replace the old JavaScript query server JSON library with json2.js
>
> This change makes us interoperate better with other JSON implementations. It
> also means we can use the native JSON handlers in JavaScript runtimes that
> support them. Should be faster right away on new Spidermonkeys.
>
> There are some potential breaking changes for apps that depend on Couch
> blowing up on 'undefined'. json2.js serializes undefined as 'null' instead of
> crashing.
>
> This change will also affect people using E4X, as you can't just return an
> XML object and have it serialized to a string for you. Calling .toXMLString()
> on these is all you need to do.
>
> Added:
>     couchdb/trunk/share/server/json2.js
>       - copied, changed from r902422, couchdb/trunk/share/www/script/json2.js
> Modified:
>     couchdb/trunk/share/Makefile.am
>     couchdb/trunk/share/server/util.js
>     couchdb/trunk/share/www/script/json2.js
>     couchdb/trunk/share/www/script/test/show_documents.js
>     couchdb/trunk/share/www/script/test/update_documents.js
>     couchdb/trunk/share/www/script/test/view_errors.js
>
> Modified: couchdb/trunk/share/Makefile.am
> URL: http://svn.apache.org/viewvc/couchdb/trunk/share/Makefile.am?rev=903023&r1=903022&r2=903023&view=diff
> ==============================================================================
> --- couchdb/trunk/share/Makefile.am (original)
> +++ couchdb/trunk/share/Makefile.am Tue Jan 26 00:11:59 2010
> @@ -13,6 +13,7 @@
>  JS_FILE = server/main.js
>
>  JS_FILE_COMPONENTS = \
> +    server/json2.js \
>      server/filter.js \
>      server/mimeparse.js \
>      server/render.js \
>
> Copied: couchdb/trunk/share/server/json2.js (from r902422, couchdb/trunk/share/www/script/json2.js)
> URL: http://svn.apache.org/viewvc/couchdb/trunk/share/server/json2.js?p2=couchdb/trunk/share/server/json2.js&p1=couchdb/trunk/share/www/script/json2.js&r1=902422&r2=903023&rev=903023&view=diff
> ==============================================================================
> --- couchdb/trunk/share/www/script/json2.js [utf-8] (original)
> +++ couchdb/trunk/share/server/json2.js [utf-8] Tue Jan 26 00:11:59 2010
> @@ -1,6 +1,6 @@
>  /*
>      http://www.JSON.org/json2.js
> -    2009-08-17
> +    2009-09-29
>
>      Public Domain.
>
> @@ -8,6 +8,14 @@
>
>      See http://www.JSON.org/js.html
>
> +
> +    This code should be minified before deployment.
> +    See http://javascript.crockford.com/jsmin.html
> +
> +    USE YOUR OWN COPY. IT IS EXTREMELY UNWISE TO LOAD CODE FROM SERVERS YOU DO
> +    NOT CONTROL.
> +
> +
>      This file creates a global JSON object containing two methods: stringify
>      and parse.
>
> @@ -136,15 +144,9 @@
>
>      This is a reference implementation. You are free to copy, modify, or
>      redistribute.
> -
> -    This code should be minified before deployment.
> -    See http://javascript.crockford.com/jsmin.html
> -
> -    USE YOUR OWN COPY. IT IS EXTREMELY UNWISE TO LOAD CODE FROM SERVERS YOU DO
> -    NOT CONTROL.
>  */
>
> -/*jslint evil: true */
> +/*jslint evil: true, strict: false */
>
>  /*members "", "\b", "\t", "\n", "\f", "\r", "\"", JSON, "\\", apply,
>      call, charCodeAt, getUTCDate, getUTCFullYear, getUTCHours,
> @@ -153,7 +155,6 @@
>      test, toJSON, toString, valueOf
>  */
>
> -"use strict";
>
>  // Create a JSON object only if one does not already exist. We create the
>  // methods in a closure to avoid creating global variables.
>
> Modified: couchdb/trunk/share/server/util.js
> URL: http://svn.apache.org/viewvc/couchdb/trunk/share/server/util.js?rev=903023&r1=903022&r2=903023&view=diff
> ==============================================================================
> --- couchdb/trunk/share/server/util.js (original)
> +++ couchdb/trunk/share/server/util.js Tue Jan 26 00:11:59 2010
> @@ -13,14 +13,7 @@
>  var Couch = {
>    // moving this away from global so we can move to json2.js later
>    toJSON : function (val) {
> -    if (typeof(val) == "undefined") {
> -      throw "Cannot encode 'undefined' value as JSON";
> -    }
> -    if (typeof(val) == "xml") { // E4X support
> -      val = val.toXMLString();
> -    }
> -    if (val === null) { return "null"; }
> -    return (Couch.toJSON.dispatcher[val.constructor.name])(val);
> +    return JSON.stringify(val);
>    },
>    compileFunction : function(source) {
>      if (!source) throw(["error","not_found","missing function"]);
> @@ -47,55 +40,6 @@
>      }
>    }
>
> -Couch.toJSON.subs = {'\b': '\\b', '\t': '\\t', '\n': '\\n', '\f': '\\f',
> -    '\r': '\\r', '"' : '\\"', '\\': '\\\\'};
> -Couch.toJSON.dispatcher = {
> -    "Array": function(v) {
> -      var buf = [];
> -      for (var
Re: buildbot failure in ASF Buildbot on couchdb-trunk
On Mon, Jan 25, 2010 at 4:42 PM, Paul Davis wrote:
> Looks like you forgot to add json2.js to license.skip
>

Thanks.

Since json2.js has been in _utils for a long time I figured licenses
would be taken care of.

Fixed. (I think)

> On Mon, Jan 25, 2010 at 7:57 PM, wrote:
>> The Buildbot has detected a new failure of couchdb-trunk on ASF Buildbot.
>> Full details are available at:
>> http://ci.apache.org/builders/couchdb-trunk/builds/170
>>
>> Buildbot URL: http://ci.apache.org/
>>
>> Buildslave for this Build: bb-vm_ubuntu
>>
>> Build Reason:
>> Build Source Stamp: [branch couchdb/trunk] 903023
>> Blamelist: jchris
>>
>> BUILD FAILED: failed compile_5
>>
>> sincerely,
>> -The ASF Buildbot

--
Chris Anderson
http://jchrisa.net
http://couch.io
Re: buildbot failure in ASF Buildbot on couchdb-trunk
And 15M to team Gavin for giving us buildbot + notifications.

On Mon, Jan 25, 2010 at 7:48 PM, Noah Slater wrote:
> Score 1 for team Noah.
>
> On 26 Jan 2010, at 00:42, Paul Davis wrote:
>
>> Looks like you forgot to add json2.js to license.skip
>>
>> On Mon, Jan 25, 2010 at 7:57 PM, wrote:
>>> The Buildbot has detected a new failure of couchdb-trunk on ASF Buildbot.
>>> Full details are available at:
>>> http://ci.apache.org/builders/couchdb-trunk/builds/170
>>>
>>> Buildbot URL: http://ci.apache.org/
>>>
>>> Buildslave for this Build: bb-vm_ubuntu
>>>
>>> Build Reason:
>>> Build Source Stamp: [branch couchdb/trunk] 903023
>>> Blamelist: jchris
>>>
>>> BUILD FAILED: failed compile_5
>>>
>>> sincerely,
>>> -The ASF Buildbot
Re: buildbot failure in ASF Buildbot on couchdb-trunk
Score 1 for team Noah.

On 26 Jan 2010, at 00:42, Paul Davis wrote:
> Looks like you forgot to add json2.js to license.skip
>
> On Mon, Jan 25, 2010 at 7:57 PM, wrote:
>> The Buildbot has detected a new failure of couchdb-trunk on ASF Buildbot.
>> Full details are available at:
>> http://ci.apache.org/builders/couchdb-trunk/builds/170
>>
>> Buildbot URL: http://ci.apache.org/
>>
>> Buildslave for this Build: bb-vm_ubuntu
>>
>> Build Reason:
>> Build Source Stamp: [branch couchdb/trunk] 903023
>> Blamelist: jchris
>>
>> BUILD FAILED: failed compile_5
>>
>> sincerely,
>> -The ASF Buildbot
Re: buildbot failure in ASF Buildbot on couchdb-trunk
Looks like you forgot to add json2.js to license.skip

On Mon, Jan 25, 2010 at 7:57 PM, wrote:
> The Buildbot has detected a new failure of couchdb-trunk on ASF Buildbot.
> Full details are available at:
> http://ci.apache.org/builders/couchdb-trunk/builds/170
>
> Buildbot URL: http://ci.apache.org/
>
> Buildslave for this Build: bb-vm_ubuntu
>
> Build Reason:
> Build Source Stamp: [branch couchdb/trunk] 903023
> Blamelist: jchris
>
> BUILD FAILED: failed compile_5
>
> sincerely,
> -The ASF Buildbot
[jira] Commented: (COUCHDB-632) Generic _changes listener added to jquery.couch.js
[ https://issues.apache.org/jira/browse/COUCHDB-632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804819#action_12804819 ]

Benoit Chesneau commented on COUCHDB-632:
-----------------------------------------

If couchdb stops sending changes, we receive undefined results and the oldest id:

    Got 79ca040d0a6d784619c61b28e5ff
    Got undefined
    Got 79ca040d0a6d784619c61b28e5ff

Here is a quick test.html to reproduce:

    var db = $.couch.db("test");
    var changes = db.changes({seq:15});
    changes.addListener(function(data) {
      console.log(data);
      $("#lines").append("Got " + data.id + " ");
    });
    changes.start();

> Generic _changes listener added to jquery.couch.js
> --------------------------------------------------
>
>                 Key: COUCHDB-632
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-632
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Futon
>        Environment: the Browser!
>           Reporter: mikeal
>           Priority: Minor
>        Attachments: changes.diff, changes1.diff, jquery.couch.js
>
>  Original Estimate: 0.02h
>  Remaining Estimate: 0.02h
>
> I've written a Generic _changes listener and added it to jquery.couch.js
> taken from Futon.
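Not part of the patch under review, but one hypothetical way a caller could guard against the "Got undefined" rows Benoit reproduces is to wrap the listener and drop malformed or re-delivered results (makeSafeListener is an invented helper, sketched here so it runs standalone):

```javascript
// Hypothetical defensive wrapper, not jquery.couch.js code: skip rows
// with no id and rows whose id was already delivered.
function makeSafeListener(listener) {
  var seen = {};
  return function (data) {
    if (!data || data.id === undefined) return; // drop undefined results
    if (seen[data.id]) return;                  // drop re-delivered ids
    seen[data.id] = true;
    listener(data);
  };
}

var got = [];
var onChange = makeSafeListener(function (d) { got.push(d.id); });

// Replay the sequence from the bug report:
onChange({ id: "79ca040d0a6d784619c61b28e5ff" });
onChange(undefined);
onChange({ id: "79ca040d0a6d784619c61b28e5ff" });

console.log(got);  // [ '79ca040d0a6d784619c61b28e5ff' ]
```

This only masks the symptom on the client side; the actual fix belongs in the feed handling, as the attached diffs address.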
[jira] Updated: (COUCHDB-632) Generic _changes listener added to jquery.couch.js
[ https://issues.apache.org/jira/browse/COUCHDB-632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benoit Chesneau updated COUCHDB-632:
------------------------------------

    Attachment: changes1.diff

Updated diff. Fixes an unclosed try/catch:
http://github.com/benoitc/couchdb/commit/41eafc56799b1516c9a8d2207fa53366787be0bf
[jira] Updated: (COUCHDB-632) Generic _changes listener added to jquery.couch.js
[ https://issues.apache.org/jira/browse/COUCHDB-632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Anderson updated COUCHDB-632: --- Attachment: changes.diff here's a diff merged to work with latest trunk > Generic _changes listener added to jquery.couch.js > -- > > Key: COUCHDB-632 > URL: https://issues.apache.org/jira/browse/COUCHDB-632 > Project: CouchDB > Issue Type: Improvement > Components: Futon > Environment: the Browser! >Reporter: mikeal >Priority: Minor > Attachments: changes.diff, jquery.couch.js > > Original Estimate: 0.02h > Remaining Estimate: 0.02h > > I've written a Generic _changes listener and added it to jquery.couch.js > taken from Futon. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (COUCHDB-632) Generic _changes listener added to jquery.couch.js
[ https://issues.apache.org/jira/browse/COUCHDB-632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mikeal updated COUCHDB-632:

Attachment: jquery.couch.js

Modified jquery.couch.js; jchris said I should just attach the whole file instead of a diff.

> Generic _changes listener added to jquery.couch.js
> --
>
> Key: COUCHDB-632
> URL: https://issues.apache.org/jira/browse/COUCHDB-632
> Project: CouchDB
> Issue Type: Improvement
> Components: Futon
> Environment: the Browser!
> Reporter: mikeal
> Priority: Minor
> Attachments: jquery.couch.js
>
> Original Estimate: 0.02h
> Remaining Estimate: 0.02h
>
> I've written a generic _changes listener and added it to jquery.couch.js, taken from Futon.
[jira] Created: (COUCHDB-632) Generic _changes listener added to jquery.couch.js
Generic _changes listener added to jquery.couch.js -- Key: COUCHDB-632 URL: https://issues.apache.org/jira/browse/COUCHDB-632 Project: CouchDB Issue Type: Improvement Components: Futon Environment: the Browser! Reporter: mikeal Priority: Minor I've written a Generic _changes listener and added it to jquery.couch.js taken from Futon. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
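The API exercised in Benoit's test snippet earlier in this thread (db.changes({seq: 15}), changes.addListener(...), changes.start()) suggests a small listener object. The sketch below shows only that control flow and is not the attached jquery.couch.js code: fetchChanges is a hypothetical stand-in for the jQuery ajax request to /db/_changes, so the pattern can be shown without a browser.

```javascript
// Minimal sketch of the _changes listener pattern discussed in this
// ticket. `fetchChanges(since, cb)` stands in for the real ajax request
// to /db/_changes?since=N; the actual jquery.couch.js code differs.
function makeChangesFeed(fetchChanges, since) {
  var listeners = [];
  return {
    // Register a callback that receives each change row.
    addListener: function (cb) { listeners.push(cb); },
    // Ask for changes after `since`, dispatch every row to every
    // listener, then remember the new sequence number.
    start: function () {
      fetchChanges(since, function (resp) {
        resp.results.forEach(function (row) {
          listeners.forEach(function (cb) { cb(row); });
        });
        since = resp.last_seq;
      });
    }
  };
}
```

Resuming from last_seq on each request is what keeps a poller from re-reading rows it has already dispatched.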
[jira] Updated: (COUCHDB-631) Replication by doc Ids
[ https://issues.apache.org/jira/browse/COUCHDB-631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Filipe Manana updated COUCHDB-631: -- Attachment: replication-by-doc-ids_trunk.patch The following patch adds support for the optional "doc_ids" attribute (array of strings) of a JSON replication object. The idea was suggested recently by Chris Anderson in the dev mailing list. > Replication by doc Ids > -- > > Key: COUCHDB-631 > URL: https://issues.apache.org/jira/browse/COUCHDB-631 > Project: CouchDB > Issue Type: New Feature > Components: Replication > Environment: trunk >Reporter: Filipe Manana >Priority: Minor > Attachments: replication-by-doc-ids_trunk.patch > > > The following patch adds support for the optional "doc_ids" attribute (array > of strings) of a JSON replication object. > The idea was suggested recently by Chris Anderson in the dev mailing list. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (COUCHDB-631) Replication by doc Ids
Replication by doc Ids -- Key: COUCHDB-631 URL: https://issues.apache.org/jira/browse/COUCHDB-631 Project: CouchDB Issue Type: New Feature Components: Replication Environment: trunk Reporter: Filipe Manana Priority: Minor Attachments: replication-by-doc-ids_trunk.patch The following patch adds support for the optional "doc_ids" attribute (array of strings) of a JSON replication object. The idea was suggested recently by Chris Anderson in the dev mailing list. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
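Going by the issue description, a replication request carrying the proposed attribute would be an ordinary JSON replication object plus a "doc_ids" array of strings. The database names and ids below are made up for illustration; the exact behaviour is defined by the attached patch:

```json
{
  "source": "http://127.0.0.1:5984/source_db",
  "target": "target_db",
  "doc_ids": ["doc1", "doc42"]
}
```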
[jira] Updated: (COUCHDB-514) Redirect from _list using view rows
[ https://issues.apache.org/jira/browse/COUCHDB-514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joscha Feth updated COUCHDB-514:

Attachment: render.diff

Added render.diff, which enables the list render function to disable/enable automatic flushing and fixes the problem of start() being sent automatically when getRow() is called. This depends on readline() being fixed first, so that it also returns when no header has been sent yet.

> Redirect from _list using view rows
> ---
>
> Key: COUCHDB-514
> URL: https://issues.apache.org/jira/browse/COUCHDB-514
> Project: CouchDB
> Issue Type: Improvement
> Components: JavaScript View Server
> Affects Versions: 0.10
> Reporter: Zachary Zolton
> Attachments: list-redir.diff, list_views.diff, render.diff
>
> There is no way to redirect from a _list function after calling the getRow() API function.
> Here's a link to the discussion on the dev mailing list: http://is.gd/3KZRg
Re: Updating the CouchDB roadmap
> Thanks for reminding me that I should set _all_dbs to hide dbs the
> current user can't read if that doesn't incur much additional
> overhead.

I think it will incur a huge overhead if there are a large number of databases and the reader rights are stored within the databases themselves. CouchDB would have to open and read every single database file on disk, even if the user only has access to one. Storing the rights within the user record avoids this problem completely.
Re: Pinning revs
On Mon, Jan 25, 2010 at 02:38:22PM +, Robert Newson wrote:
> when you PUT a new document over an existing one you are implicitly
> removing the document that was previously there. A concurrent read
> operation might see the b+tree before or after that change; either
> answer is consistent with some historical version of the database and
> no locking is required.
>
> If, instead, you really wanted to make a new version (from your
> applications point of view) you should insert a brand new document and
> add a view (or a naming convention) that lets you find the version
> history.
>
> A simple idea would be to append the version to the _id.
> (i.e, to 'update' doc1-v1, you would PUT doc1-v2).

That's what I thought of first. Given 1000 revisions of one document, stored as 1000 separate documents, you can (as you say) make a view to find the most recent one. However, you can't apply a view to a view, so it's then impossible to write a view which makes use of only the most recent version of a document. It becomes a bit of a mess.

So I think I need to store all the revisions within a single document. Options might be:

1. Store all the revisions nested within the JSON document, or store the previous revisions as attachments. Unfortunately, I need to version the binary attachments too.

2. Store each attachment with a special naming convention, e.g. blob:r1, blob:r2, etc.

3. Store each rev's attachments in a single .zip file attachment.

4. Store each attachment with a name equal to its sha1, and the revisions as nested JSON, each containing an "attachments" member that points to the sha1s. Probably the cleanest, and it also saves duplicating identical content, but still something of a PITA.

I guess that, as you say, it could be layered on top of couchdb as some sort of middleware, or else the client would have to take responsibility for doing the versioning properly.
(An _update handler could update the JSON part of a multi-rev document, but I don't think it can do clever stuff with attachments) Regards, Brian.
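The doc1-v1 / doc1-v2 naming convention suggested in the quoted message can be surfaced through a view along these lines. This is only a sketch of that convention (the -vN suffix and the function name are the thread's example, not a CouchDB feature); keying rows by [base id, version number] makes versions of the same document sort together, newest last:

```javascript
// Map function for documents versioned by id suffix ("doc1-v1",
// "doc1-v2", ...). Emits [base id, numeric version] as the key.
// `emit` is supplied by CouchDB's view server when this runs inside
// a design document.
function mapLatestVersion(doc) {
  var m = /^(.*)-v(\d+)$/.exec(doc._id);
  if (m) {
    emit([m[1], parseInt(m[2], 10)], null);
  }
}
```

Querying with startkey=["doc1",{}]&descending=true&limit=1 would then return doc1's highest version first; Brian's objection still stands, though, since no further view can be built on top of this one.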
Re: replicator options
On Mon, Jan 25, 2010 at 8:28 AM, Zachary Zolton wrote: > Having the replicator handle chaining views would really help people > who are already hacking this together with scripts. So, I'd definitely > +1 the idea. Isn't view size and indexing time a separate problem from > designing this replicator API? Yes. The big missing piece in this view-copy API is: What to do if the "replication" dies in the middle. Currently with real replication, you just pick it up where you left off, with the sequence index. For something like a group reduce query, I guess you'd just have to pick up where you left off in the key range. The problem is that someone may have made updates to the db since you started, and you get an inconsistent copy of the view. To properly support this, we'd need an API that allows you to specify a db-update sequence in your view request. As long as the view haven't been compacted (and that seq # actually exists as a snapshot point in the index) then you could pick up with the same index and avoid inconsistencies. Chris > > On Sun, Jan 24, 2010 at 9:47 PM, Chris Anderson wrote: >> On Sun, Jan 24, 2010 at 5:16 PM, Glenn Rempe wrote: >>> On Sun, Jan 24, 2010 at 2:11 PM, Chris Anderson wrote: >>> On Sun, Jan 24, 2010 at 2:04 PM, Glenn Rempe wrote: > On Sun, Jan 24, 2010 at 12:09 AM, Chris Anderson wrote: > >> Devs, >> >> I've been thinking there are a few simple options that would magnify >> the power of the replicator a lot. >> >> ... >> The fun one is chained map reduce. It occurred to me the other night >> that simplest way to present a chainable map reduce abstraction to >> users is through the replicator. The action "copy these view rows to a >> new db" is a natural fit for the replicator. I imagine this would be >> super useful to people doing big messy data munging, and it wouldn't >> be too hard for the replicator to handle. >> >> > I like this idea as well, as chainable map/reduce has been something I think > a lot of people would like to use. 
The thing I am concerned about, and > which is related to another ongoing thread, is the size of views on disk and > the slowness of generating them. I fear that we would end up ballooning > views on disk to a size that is unmanageable if we chained them. I have an > app in production with 50m rows, whose DB has grown to >100GB, and the views > take up approx 800GB (!). I don't think I could afford the disk space to > even consider using this especially when you consider that in order to > compact a DB or view you need roughly 2x the disk space of the files on > disk. > > I also worry about the time to generate chained views, when the time needed > for generating views currently is already a major weak point of CouchDB > (Generating my views took more than a week). > > In practice, I think only those with relatively small DB's would be able to > take advantage of this feature. > For large data, you'll want a cluster. The same holds true for other Map Reduce frameworks like Hadoop or Google's stuff. >>> >>> That would not resolve the issue I mentioned where views can be a multiple >>> in size of the original data DB. I have about 9 views in a design doc, and >>> my resultant view files on disk are about 9x the size of the original DB >>> data. >>> >>> How would sharding this across multiple DBs in a cluster resolve this? You >>> would still end up with views that are some multiple in size of their >>> original sharded DB. Compounded by how many replicas you have of that view >>> data for chained M/R. >>> >>> I'd be interested if anyone with partitioned CouchDB query experience (Lounger or otherwise) can comment on view generation time when parallelized across multiple machines. >>> I would also be interested in seeing any architectures that make use of this >>> to parallelize view generation. I'm not sure your example of Hadoop or >>> Google M/R are really valid because they provide file system abstractions >>> (e.g. 
Hadoop FS) for automatically streaming a single copy of the data to >>> where it is needed to be Mapped/Reduced and CouchDB has nothing similar. >>> >>> http://hadoop.apache.org/common/docs/current/hdfs_design.html >>> >>> Don't get me wrong, I would love to see these things happen, I just wonder >>> if there are other issues that need to be resolved first before this is >>> practical for anything but a small dataset. >>> >> >> I know Hadoop and Couch are dissimilar, but the way to parallelize >> CouchDB view generation is with a partitioned cluster like >> CouchDB-Lounge or the Cloudant stuff. >> >> It doesn't help much with the size inefficiencies but will help with >> generation time. >> >> Chris >> >> >> -- >> Chris Anderson >> http://jchrisa.net >> http://couch.i
Re: Parallel view generation (was Re: replicator options)
On Mon, Jan 25, 2010 at 3:10 AM, Simon Metson wrote: > Hi, > This is OT for the original discussion imho > > On 25 Jan 2010, at 01:16, Glenn Rempe wrote: > >>> I'd be interested if anyone with partitioned CouchDB query experience >>> (Lounger or otherwise) can comment on view generation time when >>> parallelized across multiple machines. >>> >>> >> I would also be interested in seeing any architectures that make use of >> this >> to parallelize view generation. I'm not sure your example of Hadoop or >> Google M/R are really valid because they provide file system abstractions >> (e.g. Hadoop FS) for automatically streaming a single copy of the data to >> where it is needed to be Mapped/Reduced and CouchDB has nothing similar. > > IMHO something like HDFS isn't needed, since there's already a simple, > scalable way of getting at the data. What I'd like (to have time to work > on...) is the following: > > 1. be able to configure a pipeline of documents that are sent to the view > server > 1a. be able to set the size of that pipeline to 0, which just sends a > sane header (there are N documents in the database) > 2. view server spawns off child processes (I'm thinking Disco, but Hadoop > would be able to do the same) on the various worker nodes > 3. each worker is given a range of documents to process, pulls these in from > _all_docs > 4. worker processes its portion of the database > 5. worker returns its results to the view server which aggregates them up > into the final view > > The main issue here is how good your view server is; can it take getting > 1000's of responses at once? An HTTP view response would be nice... I'm > pretty sure that CouchDB could handle getting all the requests from workers. > I think this could also allow for view of view processing, without going > through/maintaining an intermediate database. 
The reason we haven't implemented something like this yet is that it assumes your bottleneck is CPU time, and that it's worth it to move docs across a cluster to be processed, then return the rows to the original Couch for indexing. This might help a little in cases where your map function is very CPU intensive, but you aren't going to get 8x faster by using 8 boxes, because the bottleneck will quickly become updating the view index file on the original Couch. Partitioning (a cluster of, say, 8 Couches where each Couch has 1/8th of the data) will speed your view generation up by roughly 8x (at the expense of slightly higher HTTP-query overhead). In this approach, the map reduce model on any individual one of those Couches isn't any different than it is today. The functions are run close to the data (no map reduce network overhead), and the rows are stored in one index per Couch, which is what allows the 8x speedup.

Does this make sense?

Chris

--
Chris Anderson
http://jchrisa.net
http://couch.io
Re: Updating the CouchDB roadmap
On Mon, Jan 25, 2010 at 1:14 AM, Brian Candler wrote:
> On Sun, Jan 24, 2010 at 09:33:02PM -0800, Chris Anderson wrote:
>> To round out this list, I think
>>
>> * Reader ACLs
> ...
>>
>> look like they will make it into 0.11.
>
> That's the jchris/readeracl branch presumably?
>
> I was hoping to turn my counter-proposal(*) into code, but I've not had any
> time to do so unfortunately.
>
> Regards,
>
> Brian.
>
> (*) which was, in summary:
>
> 1. user record has roles like "foo:_reader" or ["foo","_reader"]
>
> 2. _anon user has roles of ":_reader" for all public databases
>
> 3. you can read database foo only if you have one of
> "foo:_reader", "foo:_admin", "_reader" or "_admin" roles
>
> 4. /_all_dbs lists only those databases to which you or _anon have read access
> (but shows every database if you have _reader or _admin roles)

Thanks for reminding me that I should set _all_dbs to hide dbs the current user can't read, if that doesn't incur much additional overhead.

Also, I plan to put a Futon interface on the reader and admin lists. And the security object still needs work, to round out the capability set to be something like what you describe here.

> 5. userdb validate_doc_update allows someone with "foo:_admin" to add and
> remove roles foo:*. Also "foo:_manager" to add and remove roles foo:*
> apart from foo:_admin

--
Chris Anderson
http://jchrisa.net
http://couch.io
Re: replicator options
Having the replicator handle chaining views would really help people who are already hacking this together with scripts. So, I'd definitely +1 the idea. Isn't view size and indexing time a separate problem from designing this replicator API? On Sun, Jan 24, 2010 at 9:47 PM, Chris Anderson wrote: > On Sun, Jan 24, 2010 at 5:16 PM, Glenn Rempe wrote: >> On Sun, Jan 24, 2010 at 2:11 PM, Chris Anderson wrote: >> >>> On Sun, Jan 24, 2010 at 2:04 PM, Glenn Rempe wrote: >>> > On Sun, Jan 24, 2010 at 12:09 AM, Chris Anderson >>> wrote: >>> > >>> >> Devs, >>> >> >>> >> I've been thinking there are a few simple options that would magnify >>> >> the power of the replicator a lot. >>> >> >>> >> ... >>> >> The fun one is chained map reduce. It occurred to me the other night >>> >> that simplest way to present a chainable map reduce abstraction to >>> >> users is through the replicator. The action "copy these view rows to a >>> >> new db" is a natural fit for the replicator. I imagine this would be >>> >> super useful to people doing big messy data munging, and it wouldn't >>> >> be too hard for the replicator to handle. >>> >> >>> >> >>> > I like this idea as well, as chainable map/reduce has been something I >>> think >>> > a lot of people would like to use. The thing I am concerned about, and >>> > which is related to another ongoing thread, is the size of views on disk >>> and >>> > the slowness of generating them. I fear that we would end up ballooning >>> > views on disk to a size that is unmanageable if we chained them. I have >>> an >>> > app in production with 50m rows, whose DB has grown to >100GB, and the >>> views >>> > take up approx 800GB (!). I don't think I could afford the disk space to >>> > even consider using this especially when you consider that in order to >>> > compact a DB or view you need roughly 2x the disk space of the files on >>> > disk. 
>>> > >>> > I also worry about the time to generate chained views, when the time >>> needed >>> > for generating views currently is already a major weak point of CouchDB >>> > (Generating my views took more than a week). >>> > >>> > In practice, I think only those with relatively small DB's would be able >>> to >>> > take advantage of this feature. >>> > >>> >>> For large data, you'll want a cluster. The same holds true for other >>> Map Reduce frameworks like Hadoop or Google's stuff. >>> >>> >> >> That would not resolve the issue I mentioned where views can be a multiple >> in size of the original data DB. I have about 9 views in a design doc, and >> my resultant view files on disk are about 9x the size of the original DB >> data. >> >> How would sharding this across multiple DBs in a cluster resolve this? You >> would still end up with views that are some multiple in size of their >> original sharded DB. Compounded by how many replicas you have of that view >> data for chained M/R. >> >> >>> I'd be interested if anyone with partitioned CouchDB query experience >>> (Lounger or otherwise) can comment on view generation time when >>> parallelized across multiple machines. >>> >>> >> I would also be interested in seeing any architectures that make use of this >> to parallelize view generation. I'm not sure your example of Hadoop or >> Google M/R are really valid because they provide file system abstractions >> (e.g. Hadoop FS) for automatically streaming a single copy of the data to >> where it is needed to be Mapped/Reduced and CouchDB has nothing similar. >> >> http://hadoop.apache.org/common/docs/current/hdfs_design.html >> >> Don't get me wrong, I would love to see these things happen, I just wonder >> if there are other issues that need to be resolved first before this is >> practical for anything but a small dataset. 
>> > > I know Hadoop and Couch are dissimilar, but the way to parallelize > CouchDB view generation is with a partitioned cluster like > CouchDB-Lounge or the Cloudant stuff. > > It doesn't help much with the size inefficiencies but will help with > generation time. > > Chris > > > -- > Chris Anderson > http://jchrisa.net > http://couch.io >
Re: Pinning revs
Robert Newson wrote:
> It's not clear if any of that belongs inside couchdb but clearly
> something like it would be useful to a lot of folks. Perhaps it's
> another tool outside of couchdb that, like couchapp, adds some finesse
> over a fundamental concept?

But isn't there the chance that an external program might miss some revisions, if old revisions are discarded before being picked up by the external process fetching them? Let's say we have a scenario like this, with a program fetching revisions from a cronjob which runs at a predefined interval:

A has rev-1
fetching revisions
Update on A. Revision gets bumped to rev-2
Compaction run on database
Update on A. Revision gets bumped to rev-3
fetching revisions <-- no way to fetch rev-2, as it got deleted already

So an external program cannot do the revision work at a predefined interval, unless compaction can be suppressed until the program runs again -- and as far as I understand, the current design might not guarantee this. The only other option would be a program which listens on _changes and uses the ?since parameter. What happens to the scenario above then? Are events on _changes preserved even over compaction?

regards,
Joscha
--
Re: Pinning revs
On Mon, Jan 25, 2010 at 9:38 AM, Robert Newson wrote: > fwiw, I have the same hinky feeling about this proposal. If > implemented, it would be the case that revisions are a history > mechanism under user control, when couch has always, and rightly, said > that it is not. > > when you PUT a new document over an existing one you are implicitly > removing the document that was previously there. A concurrent read > operation might see the b+tree before or after that change; either > answer is consistent with some historical version of the database and > no locking is required. > > If, instead, you really wanted to make a new version (from your > applications point of view) you should insert a brand new document and > add a view (or a naming convention) that lets you find the version > history. A simple idea would be to append the version to the _id. > (i.e, to 'update' doc1-v1, you would PUT doc1-v2). Purging some or all > history would then be a sequence of DELETE's up to, and exclusive of, > the latest version. This approach will work correctly through all > compaction, replication, multi-master and offline scenarios. > > It's not clear if any of that belongs inside couchdb but clearly > something like it would be useful to a lot of folks. Perhaps it's > another tool outside of couchdb that, like couchapp, adds some finesse > over a fundamental concept? > > B. > > > > On Mon, Jan 25, 2010 at 1:08 PM, Robert Dionne > wrote: >> I gave this some more thought over the weekend and don't think it's a good >> idea. Admittedly I'm not as deep in the code as the core devs but this >> strikes me as non-trivial to get right. There has also been a lot of effort >> put into telling folks to not think of _rev as a history mechanism. 
It seems >> very doable but as you point out it needs a strategy for retention, which >> would likely need to be configurable for different scenarios and there would >> need to be a strategy for replication also, how much history to carry along >> and so forth. >> >> If this is not best done by clients I think something in the server that was >> entirely orthogonal, .eg. based on some changes notification and using a log >> or different store, would be better. This would keep the design simpler and >> enable users to leave it out if not needed. >> >> Just my two cents but I'd be -0 on it >> >> Regards, >> >> Bob >> >> >> >> On Jan 24, 2010, at 5:20 AM, Brian Candler wrote: >> >>> Have there been any more thoughts about being able to use _rev as a history >>> mechanism? >>> >>> I think this just means that certain older _revs can survive compaction, and >>> ISTM that the simplest way to achieve this would be to have a bit which >>> marks a particular revision as "pinned" (cannot be discareded). This would >>> be very flexible. For example, you could prune this bit so that you keep >>> one revision per day for the last week, one revision per month before that, >>> and so on. When making an update in a wiki, you could pin the previous >>> revision only if it's more than 1 hour old, allowing multiple updates within >>> this window to be coalesced. >>> >>> I think this would be a very convenient mechanism, and much moreso than >>> building a document with all the previous versions of interest within the >>> document itself, or as attachments. >>> >>> I've even considered introducing artificial conflicts into the database >>> purely as a way to retain previous revs, but that's pretty messy. >>> >>> Regards, >>> >>> Brian. >> >> > As Bob and Rob point out, it may seem easy at first blush, but replication ends up getting fairly complicated. What happens when you're trying to replicate between two servers that have different sets of revisions pinned? HTH, Paul Davis
Re: Pinning revs
fwiw, I have the same hinky feeling about this proposal. If implemented, it would be the case that revisions are a history mechanism under user control, when couch has always, and rightly, said that it is not. when you PUT a new document over an existing one you are implicitly removing the document that was previously there. A concurrent read operation might see the b+tree before or after that change; either answer is consistent with some historical version of the database and no locking is required. If, instead, you really wanted to make a new version (from your applications point of view) you should insert a brand new document and add a view (or a naming convention) that lets you find the version history. A simple idea would be to append the version to the _id. (i.e, to 'update' doc1-v1, you would PUT doc1-v2). Purging some or all history would then be a sequence of DELETE's up to, and exclusive of, the latest version. This approach will work correctly through all compaction, replication, multi-master and offline scenarios. It's not clear if any of that belongs inside couchdb but clearly something like it would be useful to a lot of folks. Perhaps it's another tool outside of couchdb that, like couchapp, adds some finesse over a fundamental concept? B. On Mon, Jan 25, 2010 at 1:08 PM, Robert Dionne wrote: > I gave this some more thought over the weekend and don't think it's a good > idea. Admittedly I'm not as deep in the code as the core devs but this > strikes me as non-trivial to get right. There has also been a lot of effort > put into telling folks to not think of _rev as a history mechanism. It seems > very doable but as you point out it needs a strategy for retention, which > would likely need to be configurable for different scenarios and there would > need to be a strategy for replication also, how much history to carry along > and so forth. > > If this is not best done by clients I think something in the server that was > entirely orthogonal, .eg. 
based on some changes notification and using a log > or different store, would be better. This would keep the design simpler and > enable users to leave it out if not needed. > > Just my two cents but I'd be -0 on it > > Regards, > > Bob > > > > On Jan 24, 2010, at 5:20 AM, Brian Candler wrote: > >> Have there been any more thoughts about being able to use _rev as a history >> mechanism? >> >> I think this just means that certain older _revs can survive compaction, and >> ISTM that the simplest way to achieve this would be to have a bit which >> marks a particular revision as "pinned" (cannot be discareded). This would >> be very flexible. For example, you could prune this bit so that you keep >> one revision per day for the last week, one revision per month before that, >> and so on. When making an update in a wiki, you could pin the previous >> revision only if it's more than 1 hour old, allowing multiple updates within >> this window to be coalesced. >> >> I think this would be a very convenient mechanism, and much moreso than >> building a document with all the previous versions of interest within the >> document itself, or as attachments. >> >> I've even considered introducing artificial conflicts into the database >> purely as a way to retain previous revs, but that's pretty messy. >> >> Regards, >> >> Brian. > >
Re: Nightly and binary builds
Lincoln Stoll wrote: > FWIW I've setup a nightly builder for CouchDBX - it can be found here: > > http://couch.lstoll.net/nightly/ great, I added a link to this site in the wiki! regards, Joscha --
Re: Pinning revs
I gave this some more thought over the weekend and don't think it's a good idea. Admittedly I'm not as deep in the code as the core devs, but this strikes me as non-trivial to get right. There has also been a lot of effort put into telling folks not to think of _rev as a history mechanism. It seems very doable, but as you point out it needs a strategy for retention, which would likely need to be configurable for different scenarios, and there would need to be a strategy for replication also: how much history to carry along and so forth.

If this is not best done by clients, I think something in the server that was entirely orthogonal, e.g. based on some changes notification and using a log or a different store, would be better. This would keep the design simpler and enable users to leave it out if not needed.

Just my two cents, but I'd be -0 on it

Regards,

Bob

On Jan 24, 2010, at 5:20 AM, Brian Candler wrote:

> Have there been any more thoughts about being able to use _rev as a history
> mechanism?
>
> I think this just means that certain older _revs can survive compaction, and
> ISTM that the simplest way to achieve this would be to have a bit which
> marks a particular revision as "pinned" (cannot be discarded). This would
> be very flexible. For example, you could prune this bit so that you keep
> one revision per day for the last week, one revision per month before that,
> and so on. When making an update in a wiki, you could pin the previous
> revision only if it's more than 1 hour old, allowing multiple updates within
> this window to be coalesced.
>
> I think this would be a very convenient mechanism, and much more so than
> building a document with all the previous versions of interest within the
> document itself, or as attachments.
>
> I've even considered introducing artificial conflicts into the database
> purely as a way to retain previous revs, but that's pretty messy.
>
> Regards,
>
> Brian.
Re: Pinning revs
Brian Candler wrote: > Have there been any more thoughts about being able to use _rev as a > history mechanism? +1 from here - this would make the revision scenario for my current project incredibly easy! regards, Joscha --
Parallel view generation (was Re: replicator options)
Hi,

This is OT for the original discussion imho

On 25 Jan 2010, at 01:16, Glenn Rempe wrote:

> I'd be interested if anyone with partitioned CouchDB query experience
> (Lounger or otherwise) can comment on view generation time when
> parallelized across multiple machines.
>
> I would also be interested in seeing any architectures that make use of this
> to parallelize view generation. I'm not sure your example of Hadoop or
> Google M/R are really valid because they provide file system abstractions
> (e.g. Hadoop FS) for automatically streaming a single copy of the data to
> where it is needed to be Mapped/Reduced and CouchDB has nothing similar.

IMHO something like HDFS isn't needed, since there's already a simple, scalable way of getting at the data. What I'd like (to have time to work on...) is the following:

1. be able to configure a pipeline of documents that are sent to the view server
1a. be able to set the size of that pipeline to 0, which just sends a sane header (there are N documents in the database)
2. view server spawns off child processes (I'm thinking Disco, but Hadoop would be able to do the same) on the various worker nodes
3. each worker is given a range of documents to process, pulls these in from _all_docs
4. worker processes its portion of the database
5. worker returns its results to the view server which aggregates them up into the final view

The main issue here is how good your view server is; can it take getting 1000's of responses at once? An HTTP view response would be nice... I'm pretty sure that CouchDB could handle getting all the requests from workers. I think this could also allow for view of view processing, without going through/maintaining an intermediate database.

Cheers
Simon
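Step 3 of Simon's pipeline hands each worker a contiguous slice of _all_docs. A toy sketch of that partitioning step (the function name and the [start, end) range shape are illustrative, not part of any proposed API):

```javascript
// Split `totalDocs` documents among `workers` nodes as contiguous
// [start, end) ranges over _all_docs, spreading the remainder so no
// worker gets more than one extra document.
function docRanges(totalDocs, workers) {
  var ranges = [];
  var base = Math.floor(totalDocs / workers);
  var extra = totalDocs % workers;
  var start = 0;
  for (var i = 0; i < workers; i++) {
    var size = base + (i < extra ? 1 : 0);
    ranges.push([start, start + size]);
    start += size;
  }
  return ranges;
}
```

Each worker would then fetch its slice with something like _all_docs?skip=start&limit=size (or, more efficiently, startkey/endkey on doc ids).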
Re: couchdb rewrite handler
On Mon, Jan 25, 2010 at 6:06 AM, Chris Anderson wrote:
> On Sun, Jan 24, 2010 at 11:00 AM, Benoit Chesneau wrote:
>> Hi,
>>
>> Following @jchris's suggestion, I revisited my rewrite handler. This
>> time, instead of using a JavaScript function to handle rewriting, it
>> uses pattern matching in Erlang. The rewriting root is the design doc:
>>
>> /yourdb/_design/ddocname/_rewrite/
>>
>> ddocname should contain a "rewrites" member, which is a list of
>> rewriting rules. If not, it will return a 404.
>>
>> e.g.:
>>
>> {
>>   "rewrites": [
>>     {
>>       "from": "",
>>       "to": "index.html",
>>       "method": "GET",
>>       "query": {}
>>     }
>>   ]
>> }
>>
>> URLs are relative to the db if they start with "/", otherwise to the
>> current path.
>>
>> Rewriting can use variables. Variables in the path are prefixed by ":".
>> For example, the following rule:
>>
>> { "from": "show/:id", "to": "_show/mydoc/:id" }
>>
>> will rewrite
>> "/mydb/_design/test/_rewrite/show/someid" to
>> "/mydb/_design/test/_rewrite/_show/someid".
>
> do you mean?
>
> "/mydb/_design/test/_show/someid"

Yes, sorry.

>> or { "from": "view/:type", "to": "_list/types/by_types", "query": { "key": "type" } }
>> will rewrite
>> "/mydb/_design/test/_rewrite/view/sometype" to
>> "/mydb/_design/test/_rewrite/_list/types/by_types?key=sometype".
>
> do you mean?
>
> "/mydb/_design/test/_list/types/by_types?key=sometype"

And yes. Lack of sleep, I guess.
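The proposed matching and ":var" substitution can be sketched as follows, assuming rules shaped like the examples above. The actual handler is Erlang; this is only an in-process JavaScript model of the behaviour the emails describe, and the `rewrite` function name is illustrative.

```javascript
// Match a request path against a list of { from, to, query } rules.
// ":name" segments in "from" bind path segments; bound names may then
// appear in "to" segments or as values in "query".
function rewrite(rules, path) {
  const parts = path.split("/").filter(Boolean);
  for (const rule of rules) {
    const pattern = rule.from.split("/").filter(Boolean);
    if (pattern.length !== parts.length) continue;
    const bindings = {};
    let matched = true;
    for (let i = 0; i < pattern.length; i++) {
      if (pattern[i].startsWith(":")) {
        bindings[pattern[i].slice(1)] = parts[i]; // bind the variable
      } else if (pattern[i] !== parts[i]) {
        matched = false; // literal segment mismatch
        break;
      }
    }
    if (!matched) continue;
    const to = rule.to
      .split("/")
      .map(seg => (seg.startsWith(":") ? bindings[seg.slice(1)] : seg))
      .join("/");
    const query = {};
    for (const [k, v] of Object.entries(rule.query || {})) {
      query[k] = bindings[v] !== undefined ? bindings[v] : v;
    }
    return { to, query };
  }
  return null; // no rule matched: 404
}
```

With the two rules from the thread, `"show/someid"` rewrites to `_show/mydoc/someid`, and `"view/sometype"` rewrites to `_list/types/by_types` with `key=sometype` in the query.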
Re: Updating the CouchDB roadmap
On Sun, Jan 24, 2010 at 09:33:02PM -0800, Chris Anderson wrote:
> To round out this list, I think
>
> * Reader ACLs
> ...
>
> look like they will make it into 0.11.

That's the jchris/readeracl branch, presumably? I was hoping to turn my counter-proposal (*) into code, but unfortunately I've not had any time to do so.

Regards,
Brian.

(*) which was, in summary:

1. a user record has roles like "foo:_reader" or ["foo", "_reader"]
2. the _anon user has ":_reader" roles for all public databases
3. you can read database foo only if you have one of the "foo:_reader", "foo:_admin", "_reader" or "_admin" roles
4. /_all_dbs lists only those databases to which you or _anon have read access (but shows every database if you have the _reader or _admin role)
5. the userdb's validate_doc_update allows someone with "foo:_admin" to add and remove foo:* roles; it also allows "foo:_manager" to add and remove foo:* roles apart from foo:_admin
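Rules 3 and 4 of the counter-proposal can be sketched like this; role names follow the "db:_role" convention from the list above, and the function names are illustrative rather than anything in CouchDB.

```javascript
// Rule 3: reading database `db` requires a matching scoped role
// ("db:_reader" / "db:_admin") or a global one ("_reader" / "_admin").
function canRead(db, roles) {
  const allowed = [db + ":_reader", db + ":_admin", "_reader", "_admin"];
  return roles.some(r => allowed.includes(r));
}

// Rule 4: _all_dbs shows only databases readable by the user or by _anon.
function visibleDbs(allDbs, userRoles, anonRoles) {
  return allDbs.filter(db => canRead(db, userRoles) || canRead(db, anonRoles));
}
```

Note how a global "_reader" or "_admin" role short-circuits the per-database check, which is what makes every database visible to admins under rule 4.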
Re: Document validation involving other documents
On Sun, Jan 24, 2010 at 12:21:25PM -0800, Chris Anderson wrote:
> The problem with this approach is that validation is run during
> replication as well, so any multi-doc data dependencies become
> problematic in ad-hoc clusters.

But not every application makes sense as an ad-hoc cluster. In a tight-knit cluster the databases trust each other and you want the data to be as coherent as possible, so you'd run replication as a user that has permit-everything rights in validate_doc_update. In these models you're more interested in validating the data once, at its point of entry, not at every point of replication.

It would be horrendous to have a document in instance 1 but not in instance 2 just because it was accepted initially according to some set of rules, but failed to replicate because the rules had changed in the meantime. This is especially true if the rules themselves are documents, and hence may be a bit stale. At worst you may accept an update which would be invalid if you had the most up-to-date rules, or reject one which would be valid; but the fact that you *did* accept or reject it should be consistent throughout the cluster.
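The "permit-everything replication user" idea could look like this in a validate_doc_update function. This is only a hedged sketch: the "replicator" role name and the author check are illustrative assumptions, not anything from the thread or from CouchDB itself.

```javascript
// Waive all checks for the trusted replication user, so documents that
// were valid at their original point of entry always replicate cleanly.
function validate_doc_update(newDoc, oldDoc, userCtx) {
  if (userCtx.roles.indexOf("replicator") !== -1) {
    return; // trusted cluster peer: accept whatever the source accepted
  }
  // Normal rules apply only at the original point of entry.
  if (!newDoc.author) {
    throw { forbidden: "documents must name an author" };
  }
}
```

Run intra-cluster replication as a user carrying the trusted role and the stale-rules problem described above disappears, because the rules are only ever evaluated once per document.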
Re: Pinning revs
Absolutely! I'm voting on this one. That would ease my work immensely.

On Sun, Jan 24, 2010 at 12:20 PM, Brian Candler wrote:
> Have there been any more thoughts about being able to use _rev as a history
> mechanism?
>
> I think this just means that certain older _revs can survive compaction, and
> ISTM that the simplest way to achieve this would be to have a bit which
> marks a particular revision as "pinned" (cannot be discarded). This would
> be very flexible. For example, you could set this bit so that you keep
> one revision per day for the last week, one revision per month before that,
> and so on. When making an update in a wiki, you could pin the previous
> revision only if it's more than 1 hour old, allowing multiple updates within
> this window to be coalesced.
>
> I think this would be a very convenient mechanism, and much more so than
> building a document with all the previous versions of interest within the
> document itself, or as attachments.
>
> I've even considered introducing artificial conflicts into the database
> purely as a way to retain previous revs, but that's pretty messy.
>
> Regards,
>
> Brian.
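The day/month pruning policy Brian describes can be sketched as follows, assuming the application attaches a timestamp to each revision (CouchDB revs themselves carry none); the `pinnedRevs` name and the 30-day "month" bucket are illustrative assumptions.

```javascript
// Keep one pinned rev per day for the last week, and one per ~month
// before that, by bucketing revs by age and pinning the first in each bucket.
function pinnedRevs(revs, now) {
  // revs: [{rev, time}] newest-first; times in milliseconds.
  const DAY = 24 * 60 * 60 * 1000;
  const seen = new Set();
  const pinned = [];
  for (const r of revs) {
    const age = now - r.time;
    const bucket = age < 7 * DAY
      ? "d" + Math.floor(age / DAY)          // one per day for the last week
      : "m" + Math.floor(age / (30 * DAY));  // one per ~month before that
    if (!seen.has(bucket)) {
      seen.add(bucket);
      pinned.push(r.rev);
    }
  }
  return pinned;
}
```

Compaction would then discard any revision whose rev is not in the returned list, which is exactly the coalescing behaviour the email asks for: several updates within one bucket collapse to a single surviving revision.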