Re: Addition of modify-on-document-write hooks
On Mon, Sep 20, 2010 at 12:34 AM, Randall Leeds wrote:
> On Thu, Sep 9, 2010 at 12:19, James Jackson wrote:
>> Hi all,
>>
>> Moving this from the users forum, as it appears what I'm after isn't
>> currently available. For the security model I wish to implement in a
>> production CouchDB cluster, I would like to be able to force a field to be
>> written to all docs based on the user context. The _update functionality is
>> not what I am after as it requires the user to actually call it when writing
>> a document (means security could be got around by not calling this, and
>> setting the required field in the passed document to something arbitrary,
>> which would then not get caught by a validation function), and can't modify
>> a document which is passed to it (as far as I can tell it can only modify
>> existing documents, or create new ones).
>
> Is the rewrite handler powerful enough to force normal PUT operations
> to go through an _update function? Would this break replication? Just
> a quick, off-the-cuff thought.
>

A _rewrite rule can have a `method` property, so you can redirect differently based on the request method (GET, POST, PUT, ...). So yes, it is possible in principle to mimic the CouchDB API behind a _rewrite/ path.

- benoit
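A rough sketch of what such a method-based rule could look like, for illustration only. The `from`, `to` and `method` rule fields are the ones discussed above; the design doc name `_design/guard`, the handler name `stamp_owner`, the `owner` field, and the use of `req.id` for the requested document id are assumptions, not something taken from this thread.

```javascript
// Hypothetical design document fragment (names are assumptions).
var ddoc = {
  _id: "_design/guard",
  rewrites: [
    // PUT requests are routed through an update handler ...
    { from: "/:id", to: "_update/stamp_owner/:id", method: "PUT" },
    // ... while GET requests go straight to the document.
    { from: "/:id", to: "../../:id", method: "GET" }
  ]
};

// Hypothetical update handler: takes the document passed in the request
// body and overwrites its `owner` field from the user context.
function stampOwner(doc, req) {
  var incoming = JSON.parse(req.body);
  incoming._id = req.id;                 // assumed: id from the request path
  if (doc) { incoming._rev = doc._rev; } // keep the current revision on update
  incoming.owner = req.userCtx.name;     // client-supplied value is ignored
  return [incoming, JSON.stringify({ ok: true })];
}
```

Whether replication would bypass or break such a setup is exactly the open question raised above; the sketch does not answer it.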
[jira] Commented: (COUCHDB-889) improved docs for windows compile from source in INSTALL.Windows
[ https://issues.apache.org/jira/browse/COUCHDB-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912304#action_12912304 ] Mark Hammond commented on COUCHDB-889: -- Many of the changes look good, but have the following comments: * I wonder if including today's current versions is a good thing - it seems it will just increase the bitrot in the future. Can we just default to "latest available" except in the cases where we know it is not? If not, then we probably can't justify the existing "latest available" references either. * Where is nsis used? * The inclusion of your entire PATH doesn't seem to add much value either - if the instructions are correct the path will be correct - so something is redundant. Less is more when it comes to busy people trying to get a build up, and the specified "perfect path" will be incorrect if the retail version of MSVC is used, for example. * Finally, using seamonkey instead of spidermonkey is a fair bit more effort wrt compilation - it might be reasonable to note that spidermonkey can be used if the reader can decypher the build instructions, or at least indicate something like "almost all mozilla products will build the spidermonkey we need (and spidermonkey can, with some difficulty, even be built stand-alone) - below are instructions for seamonkey, but get a spidermonkey using whatever technique you like" > improved docs for windows compile from source in INSTALL.Windows > > > Key: COUCHDB-889 > URL: https://issues.apache.org/jira/browse/COUCHDB-889 > Project: CouchDB > Issue Type: Improvement > Components: Build System, Documentation >Affects Versions: 0.11.2, 1.0.1 > Environment: Windows only. >Reporter: Dave Cottlehuber > Fix For: 0.11.3, 1.0.2 > > Attachments: windows_build_from_source_docs.patch > > Original Estimate: 0h > Remaining Estimate: 0h > > ./INSTALL.Windows does not have enough detail to compile from source, due to > internet bit rot. 
> Updates include - > - clarification on versions for 32-bit and 64-bit compile setup > - using free Microsoft Visual Studio 2008 Express C++ compiler instead of > full commercial release > - improved details on building javascript, libcurl from source -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: Addition of modify-on-document-write hooks
On Thu, Sep 9, 2010 at 12:19, James Jackson wrote:
> Hi all,
>
> Moving this from the users forum, as it appears what I'm after isn't
> currently available. For the security model I wish to implement in a
> production CouchDB cluster, I would like to be able to force a field to be
> written to all docs based on the user context. The _update functionality is
> not what I am after as it requires the user to actually call it when writing
> a document (means security could be got around by not calling this, and
> setting the required field in the passed document to something arbitrary,
> which would then not get caught by a validation function), and can't modify a
> document which is passed to it (as far as I can tell it can only modify
> existing documents, or create new ones).

Is the rewrite handler powerful enough to force normal PUT operations to go through an _update function? Would this break replication? Just a quick, off-the-cuff thought.
Re: CouchDb not releasing files
If the bug is confirmed it should be on JIRA if it is not already. If you have a test case that reproduces it, that would be fantastic (bonus points for a JS test in Futon). It's my opinion that something this serious should block 1.1, but ultimately that is up to the committers to determine, yes?

On Sun, Sep 19, 2010 at 22:09, [mRg] wrote:
> Hi all,
>
> This is just a cross-post to highlight a thread on the user list. (with the
> same name as this one - all details etc are on there, happy to repeat here
> if needed). It seems this problem was discussed by some devs at CouchCamp
> and seems others are suffering this issue. I was just wondering if there was
> a JIRA issue related to this that I/we can track and if a fix for this will
> be included in any upcoming release (1.1?).
>
> Regards
>
> Stephen
>
CouchDb not releasing files
Hi all,

This is just a cross-post to highlight a thread on the user list (with the same name as this one; all details are there, and I'm happy to repeat them here if needed). It seems this problem was discussed by some devs at CouchCamp, and it seems others are suffering from this issue. I was just wondering if there is a JIRA issue related to this that I/we can track, and if a fix will be included in any upcoming release (1.1?).

Regards

Stephen
Re: View HTTP API extension proposal
On Sun, Sep 19, 2010 at 5:47 AM, Jan Lehnardt wrote: > Hi Tomas, > > this sounds like a valuable addition. Back in the day I remember skip allowed > for negative values to skip backwards, I'm not sure what happened to that. > I was just about to write in, why not use skip=-1 and limit =3? if negative skip is no longer supported, is this intentional or an accident? if it is intentional are the reasons good? if not we should fix it because negative skip seems quite useful. Chris > `diameter` or how we want to call it would come with the same caveat that > `skip` comes with as in it should only be used with "small" values as it's > access is unindexed. Other than that, it sounds useful to me. > > I'm sure you know this, but currently the way to get prev-next links is being > smart with all the view options: > > http://guide.couchdb.org/draft/recipes.html#pagination > > with the caveat that "jump to page" isn't really possible, but see the > chapter for details. > > Cheers > Jan > -- > > > On 6 Sep 2010, at 17:04, Tomas Sedovic wrote: > >> Hey all, >> >> I'd like to propose a small addition to the HTTP View API. I can open >> a ticket later and maybe even submit a patch, but I want to discuss it >> with y'all first. >> >> This new View extension would get you a specified document (by the >> View key) plus a few documents before and after it (with regards to >> that View's sort order). 
>> >> Say that calling this View: >> >> http://southparkelementary.edu/database/_design/students/_view/names >> >> gives the following response (sorted by the key): >> >> {"total_rows":7,"offset":0,"rows":[ >> {"id":"624","key":"broflovski","value":"Kyle Broflovski"}, >> {"id":"928","key":"cartman","value":"Eric Cartman"}, >> {"id":"848","key":"marsh","value":"Stan Marsh"}, >> {"id":"433","key":"mccormick","value":"Kenny McCormic"}, >> {"id":"855","key":"stotch","value":"Butters Stotch"}, >> {"id":"489","key":"testaburger","value":"Wendy Testaburger"}, >> {"id":"292","key":"vulmer","value":"Jimmy Vulmer"} >> ]} >> >> Now, what I'd like to have is this: >> >> >> http://southparkelementary.edu/database/_design/students/_view/names?key="mccormick"&diameter=1 >> >> which would return: >> >> {"total_rows":7,"offset":2,"rows":[ >> {"id":"848","key":"marsh","value":"Stan Marsh"}, >> {"id":"433","key":"mccormick","value":"Kenny McCormic"}, >> {"id":"855","key":"stotch","value":"Butters Stotch"} >> ]} >> >> (I'm not sure what's the best word for this query argument. Here are >> some other suggestions: vicinity, surroundings, neighborhood, nearby) >> >> Essentially, this combines two Views you can get by the clever use of >> `startkey/endkey`, `limit` and `descending` arguments. The advantage >> of this API addition is that it can be used in the CouchDB Lists. >> >> The obvious use case are the previous/next links between document >> pages. Following the example, if I had a web interface where the South >> Park elementary teachers would view the pages of the students, it >> would be nice if every student's page had a link to the previous and >> next student along with their names and small photos. This means that >> the List generating the student's page must have the access to the >> previous and next documents in the given View. 
>> >> For example getting a student's page: >> >> >> http://southparkelementary.edu/database/_design/students/_list/student_page/names?key="mccormick"&diameter=1 >> >> would generate a similar HTML structure: >> >> ... >> Student: Kenny McCormic >> ... (additional data) >> >> >> > href="/database/_design/students/_list/student_page/names?key="marsh"&diameter=1">previous: >> Stan Marsh >> >> >> > href="/database/_design/students/_list/student_page/names?key="stotch"&diameter=1">next: >> Butters Stotch >> >> >> As far as I can tell, there are two ways of doing that today: >> >> a) client-side >> b) add the linking logic directly to the documents in the database >> >> The first option is not always feasible/desirable and I dislike the >> second option because of its inflexibility. To maintain a >> doubly-linked structure across the database means that changing a >> single document could lead up to three separate document changes. And >> the complexity raises if we want to have multiple ways of sorting. >> >> However, this extension directly plays into the strength of Views, >> which is that you can have the same set of standalone documents sorted >> by several different rules. You can use this in Lists to generate >> linked pages by different sorting rules. >> >> It would also play nicely with the URL rewriting mechanisms, because >> if a list can access the previous/next documents, it can use their >> contents to generate the pretty URLs that are the rave these days. >> >> I have limited knowledge of the CouchDB internals, but from what I >> know, it doesn't look like a big problem. As the views are B+trees, >> the leafs for
Re: Addition of modify-on-document-write hooks
On Sun, Sep 19, 2010 at 5:37 AM, Jan Lehnardt wrote: > > On 14 Sep 2010, at 03:26, J Chris Anderson wrote: > >> >> On Sep 13, 2010, at 6:23 PM, Simon Metson wrote: >> >>> Hi James. >>> I think the thing to do is require that a document has a user field, >>> and that the value of that field matches the userCtx in the >>> validate_doc_update function. This then pushes the issue client side, and >>> makes the servers life easier. It could also be added by the front end >>> apache in the case of our deployment, I think. I can see this sort of >>> trigger thing being a good way of giving people a loaded gun aimed at their >>> foot, they certainly are in Oracle if you're not careful. >>> Cheers >>> Simon >> >> The big issue is that any code which runs on normal document updates, will >> also run during replication, as replication is just a normal client. So this >> means that adding a field will happen not just on the original client PUT >> but also when replication happens. >> >> This is why _update is a separate handler. > > Adding a required field should be idempotent (correct me if I > am wrong) so it doesn't matter that replication is an agent of > the user. people want to add timestamps. they will call it doc.created_at and then be surprised when it behaves like doc.replicated_at the alternative of not setting doc.created_at unless it is null leaves open the potential for clients to spoof timestamps (just like they can now). so I don't see a real use case for something like this, except for trivial things like forcing that doc.foo == "foo" always, but what's the point in that? Chris > > In the past we talked about "blessing" a ddoc / update function > that would magically invoke the update function on every write. > (analog for _show and _list) and people seem to like to explore > that idea. 
That said, the "magic" bit worries me a little :) > > Cheers > Jan > -- > > > >> >> Chris >> >>> >>> On 9 Sep 2010, at 05:19, James Jackson wrote: >>> Hi all, Moving this from the users forum, as it appears what I'm after isn't currently available. For the security model I with to implement in a production CouchDB cluster, I would like to be able to force a field to be written to all docs based on the user context. The _update functionality is not what I am after as it requires the user to actually call it when writing a document (means security could be got-around by not calling this, and setting the required field in the passed document to something arbitrary, which would then not get caught by a validation function), and can't modify a document which is passed to it (as far as I can tell it can only modify existing documents, or create new ones). I see this ticket: https://issues.apache.org/jira/browse/COUCHDB-441 which talks about the functionality I am after, but appears to have morphed into what is now there. I am willing to implement such functionality, if it already doesn't exist, but wonder if this would be welcome in the trunk, or if there are killer pitfalls which stop this being possible. I note that in the discussion on that ticket there is talk of how to deal with multiple such modify-on-write functions, perhaps this is one area that needs discussion? In any case, I'll probably implement this for our CouchDB installation, but it would be good to make it generic and globally useful such that I can contribute it back. I know of a number of people who would like this functionality... Regards, James. >>> >> > > -- Chris Anderson http://jchrisa.net http://couch.io
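For reference, Simon's suggestion quoted above (require a user field and compare it with the userCtx in validate_doc_update) might look roughly like the sketch below. The field name `user` comes from his mail; the function name and the error message are illustrative assumptions. As Chris points out, replication is just a normal client, so the same check would also run against the replicating user's context.

```javascript
// Sketch of a validation function body; in a design doc it would be stored
// as the string value of "validate_doc_update".
function validateOwner(newDoc, oldDoc, userCtx) {
  // Let deletions through; they carry no user field to compare.
  if (newDoc._deleted) { return; }
  // Reject any write whose user field does not match the authenticated user.
  if (newDoc.user !== userCtx.name) {
    throw({ forbidden: "doc.user must equal the authenticated user name" });
  }
}
```

This only rejects mismatching writes; it does not add or modify the field, which is the part James is asking for.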
Re: multiview on github
Bob, it is just checking that a given id participates in a view, if it makes it around the ring then it wins and gets streamed to the client, adding disjoints would be fairly simple. Currently the only way I can check if an id is in a view is to loop over the results of each view, hence each node in the ring is in its own process to keep things moving. A use case is two views, one that emits datetime (numeric) and another view that emits values, e.g. A, B, C ..., the query would then be to find the all documents with value A between start time and end time. Norman On Sun, Sep 19, 2010 at 5:21 AM, Robert Dionne wrote: > I took another peek at this and I'm curious as to what it's doing. Is it just > checking that a given id participates in a view? So if it makes it around the > ring it wins? Or is it actually computing the result of passing the doc thru > all the views? > > If the answer is the former then would disjunction also be something one > might want? I'm just curious, I don't have a use case and I forget the > original discussion around this. I sort of think of views as a functional > mapping from the database to some subset. That's not entirely accurate given > there's this reduce phase also. So I could imagine composing views in a > functional way, but the same thing can be had with just a different map > function that is the composition. > > Anyway if you have a brief description of this, with a use case, it would > help. > > Cheers, > > Bob > > > > > On Sep 17, 2010, at 11:32 PM, Norman Barker wrote: > >> Chris, James >> >> thanks for bumping this, we are using this internally at 'scale' >> (million+ keys). I want this to work for couchdb as we want to give >> back for such a great product and support this going forward, so any >> suggestions welcomed and we will test and add them to the local github >> account with the aim of getting this into trunk. >> >> Norman >> >> On Fri, Sep 17, 2010 at 7:00 PM, James Hayton >> wrote: >>> I want to use it! 
I just haven't gotten around to it. I was going to try >>> and test it out this weekend and if I am able, I will certainly report back >>> what I find. >>> >>> James >>> >>> On Fri, Sep 17, 2010 at 5:55 PM, Chris Anderson wrote: >>> On Mon, Aug 30, 2010 at 10:58 AM, Norman Barker wrote: > Bob, > > I can and have been testing the multiview at this scale, it is ok > (fast enough), but I think being able to test inclusion of a document > id in a view without having to loop would be a considerable speed > improvement. If you have any ideas let me know. > I just want to bump this thread, as I think this is a useful feature. I don't expect to be able to test it in the coming weeks, but if I did I would. Is anyone besides Norman using this? Has anyone used it at scale? Cheers, Chris > thanks, > > Norman > > On Mon, Aug 30, 2010 at 10:49 AM, Robert Newson wrote: >> I'm sorry, I've had no time to play with this at scale. >> >> On Mon, Aug 30, 2010 at 5:35 PM, Norman Barker wrote: >>> Hi, >>> >>> are there any more comments on this, if not can you describe the >>> process (in particular how to obtain a wiki and jira account for >>> couchdb which I have been unable to do) and I will start documenting >>> this so we can put this into the trunk. >>> >>> Bob, were you able to do any more testing with large views, are there >>> any suggestions on how to speed up the document id inclusion test as >>> described below? >>> >>> thanks, >>> >>> Norman >>> >>> On Mon, Aug 23, 2010 at 9:22 AM, Norman Barker < norman.bar...@gmail.com> wrote: Bob, thanks for the feedback and for taking a look at the code. Guidelines on when to use a supervisor within couchdb with a gen_server would be appreciated, currently I have a supervisor and a gen_server, but if couchdb has a supervision process I could remove that layer. 
I think plugins is a great idea, however intersection of views is such as common request, perhaps there needs to plugin system and if a plugin is rated enough it goes into trunk as a core feature. the four (or slightly more) summary is here http://github.com/normanb/couchdb/raw/trunk/src/couchdb/couch_query_ring.erl % % send an id from the start list to the next node in the ring, if the id is in adjacent node then the this node sends to the next ring node % if the id gets all round the ring and back to the start node then is has intersected all queries and should be included. The nodes in the ring % should be sorted in size from small to large for this to be effective % % In addition send the initial id list round in par
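To make the ring idea above concrete, here is a small in-memory sketch. It is not the Erlang code from couch_query_ring.erl; it only mirrors the described logic (order the nodes from smallest to largest, then pass each id from the start list around the ring and keep it only if every node contains it). The function name and the use of plain id arrays are assumptions for illustration.

```javascript
// Each element of idLists stands for the ids emitted by one view query.
function ringIntersect(idLists) {
  if (idLists.length === 0) { return []; }
  // Sort from small to large so the start list is the smallest one.
  var ring = idLists.slice().sort(function (a, b) {
    return a.length - b.length;
  });
  // Every other node gets a set for constant-time membership tests.
  var rest = ring.slice(1).map(function (ids) { return new Set(ids); });
  // An id "makes it around the ring" only if every node contains it.
  return ring[0].filter(function (id) {
    return rest.every(function (node) { return node.has(id); });
  });
}
```

The set lookup stands in for the "is this id in the view" test that, per the thread, currently requires looping over each view's results.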
Re: Rep. bug in R...... 1.0.1?
Hi Jan,

I have had a difficult time getting messages past the spam filter, so I simply opened a ticket:

https://issues.apache.org/jira/browse/COUCHDB-885

The ticket also includes a script that reproduces this behavior. After a short discussion with Klaus, I am still not sure whether this is a bug or not, but please take another look to be sure. Furthermore, if you try to repeat the steps manually from Futon, it behaves differently.

Cheers
Nikolai

On 19.09.2010, at 14:34, Jan Lehnardt wrote:
> Hi Nikolai,
>
> sorry to be terse, but can you provide a short script that
> exercises the behaviour? Ideally with placeholders for
> the two CouchDB URLs so we can fill in values for our
> testing environment.
>
> Cheers
> Jan
> --
>
> On 11 Sep 2010, at 20:16, Nikolai Teofilov wrote:
>
>> Hi Adam,
>>
>> The words "pull" in step 4 and "push" in step 6 are correct. I exchanged the
>> places of the curl commands ...
>>
>> The idea is a common scenario ... to have a master db where each slave server gets
>> a local copy of the master, makes local changes ... (attaches new files) and sends
>> the modified copy back to the master. The problem appears only if the
>> documents have been updated with new attachments and only between databases
>> on two different servers. It looks like sending back a document updated
>> with a new attachment affects the _rev number and a kind of side effect
>> appears, so if you try to delete that document on the remote db, the last
>> revision of the document before the update will still be in the database. It
>> could be that this is correct, but I think the delete operation of a document
>> should remove all its revisions as well, correct?
>>
>>
>> 1. - make remote_db (on different machine!)
>> 2. - create a doc on the remote_db
>> 3. - make local_db (on different machine from the remote couchdb!)
>> 4. - (trigger from the local couchdb!) remote_db->local_db
>> 5. - put an attachment on local_db/doc
>> 6. - trigger from local couchdb! local_db -> remote_db
>> 7. - try to delete the remote_db/doc
>> the result should be the last _rev is deleted but a copy of the doc is
>> still in the remote_db with the initial _rev number.
>>
>> I am almost sure it is a bug because if you try this on a single couchdb server
>> there is no such problem. If you try with a document without an attachment
>> there is no problem either, and the documents in both of those cases are deleted
>> completely.
>>
>> Cheers
>> Nikolai
>>
>>
>> On 10.09.2010, at 01:44, Adam Kocoloski wrote:
>>
>>> Hi Nikolai, I'm not sure I understand. In step 4 you said "pull ..."
>>> but what you actually did was push the local (empty?) test database to the
>>> remote server. After that the subsequent steps don't make sense. Can you
>>> try describing the steps again? Best,
>>>
>>> Adam
>>>
>>
>
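The seven steps above can be written down as a request plan with placeholders for the two CouchDB URLs, which is roughly the shape Jan asked for. This is not the script attached to COUCHDB-885; it is a sketch that only builds the list of HTTP requests without sending them. The document id `doc`, the attachment name `file.txt` and the `REV_FROM_PREVIOUS_STEP` placeholder are assumptions.

```javascript
// Build the list of requests for the reported scenario; nothing is sent.
function reproPlan(localUrl, remoteUrl) {
  var remoteDb = remoteUrl + "/remote_db";
  var rev = "REV_FROM_PREVIOUS_STEP"; // fill in from the prior response
  return [
    { step: 1, method: "PUT", url: remoteDb },
    { step: 2, method: "PUT", url: remoteDb + "/doc",
      body: { note: "created on remote" } },
    { step: 3, method: "PUT", url: localUrl + "/local_db" },
    // pull, triggered on the local server
    { step: 4, method: "POST", url: localUrl + "/_replicate",
      body: { source: remoteDb, target: "local_db" } },
    { step: 5, method: "PUT",
      url: localUrl + "/local_db/doc/file.txt?rev=" + rev,
      body: "attachment bytes" },
    // push, also triggered on the local server
    { step: 6, method: "POST", url: localUrl + "/_replicate",
      body: { source: "local_db", target: remoteDb } },
    { step: 7, method: "DELETE", url: remoteDb + "/doc?rev=" + rev }
  ];
}
```

Calling `reproPlan("http://localhost:5984", "http://remote.example:5984")` yields the requests in order; both URLs are placeholders for a real test environment.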
Re: replication bug
On 24 Aug 2010, at 18:26, Nathan Stott wrote:
> I tried it, didn't fix my issue.

Can you open a new JIRA issue for this so we won't forget about it?

Cheers
Jan
--

>
> On Tue, Aug 24, 2010 at 9:38 AM, Adam Kocoloski wrote:
>> Hi Nathan, did you get a chance to see if
>> https://issues.apache.org/jira/browse/COUCHDB-868 fixed this issue?
>>
>> Adam
>>
>> On Aug 23, 2010, at 3:57 PM, Nathan Stott wrote:
>>
>>> I've identified a bug in replication in couchdb.
>>>
>>> Here are the steps to reproduce:
>>>
>>> Create a user named "bubba"
>>> Create a database with a design document that has attachments.
>>> Make this database have "bubba" as an admin and set a reader role of
>>> "readme"
>>>
>>> Try to replicate this DB on another machine with credentials for bubba
>>> in the URL (http://bubba:passw...@remotemachine:port/mydb)
>>>
>>> You will receive 401s in the log for the attachments. It does not matter
>>> whether you give bubba the "readme" role or not, the results are the
>>> same. Remove the attachment and the design doc will replicate fine.
>>> Remove the "readers" from the security object of the DB and the design
>>> doc will replicate fine.
>>>
>>> This is tested and reproduced on 1.0.1
>>
>>
Re: splitting the code in different apps or rewrite httpd layer
On 23 Aug 2010, at 13:46, Benoit Chesneau wrote:
> On Mon, Aug 23, 2010 at 1:07 PM, Robert Dionne wrote:
>>
>> On Aug 22, 2010, at 4:58 PM, Mikeal Rogers wrote:
>>
>>> One idea that was floated at least once was to replace all the code
>>> we currently have on top of mochiweb directly with webmachine.
>>
>> If I recall, Paul Davis did some prototyping work on this at one point
>>
>
> Yes, some parts are in his repo, some others in mine. But it's 6-month-old
> work now.

Does that mean you consider it a failed experiment? If yes, why? If not, should we get some effort going to finish the code and get it into trunk?

Cheers
Jan
--
Re: Rekindle discussion: `reduce=false` fails unpredictably
On Mon, Aug 30, 2010 at 10:01, Jason Smith wrote:
> I propose a minor change to validation: a simple check is made to
> determine if the extra parameter would result in a no-op. If so, no
> exception is thrown. Therefore:
>
> map view, reduce=false -> Allowed
> map view, reduce=true -> query_parse_error
> map view, group or group_level -> no change to today's behavior
> map/reduce view -> no change to today's behavior
>
> (It can't be known at query time whether group and group_level no-op.
> In general they do not. Therefore the client must explicitly get it
> right.)
>
> If this is acceptable, I will submit the patch to JIRA. Thank you.

Sounds awesome. I filed https://issues.apache.org/jira/browse/COUCHDB-845 about similar issues.

Cheers,

Dirkjan
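The proposed rule for the `reduce` parameter can be expressed as a small check. This is only a sketch of the table above, not the actual CouchDB validation code; the function name, the shape of the `view` object and the returned result objects are assumptions, and `group`/`group_level` are deliberately left out because the proposal does not change them.

```javascript
// view:  { map: "...", reduce: "..." } as found in a design doc
// query: parsed query parameters, e.g. { reduce: false }
function checkReduceParam(view, query) {
  var isMapReduce = typeof view.reduce !== "undefined";
  // map/reduce view -> no change to today's behavior
  if (isMapReduce) { return { ok: true }; }
  // map view, reduce=true -> still an error
  if (query.reduce === true) {
    return { ok: false, error: "query_parse_error" };
  }
  // map view, reduce=false (or absent) -> a no-op, so it is allowed
  return { ok: true };
}
```

With this, a client can always send `reduce=false` without first finding out whether the view has a reduce function.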
Re: couchdb memory issues/leaks with validators and 20MB+ json docs
On 30 Aug 2010, at 08:34, sgoto wrote:
> Hey everyone,
>
> I'm using couchdb to store docs that are somewhat large (20MB+), but
> within the configured max size.
>
> Storing the docs isn't a problem, couchdb seems to handle it fine. I am
> having problems when using function validators and couchdb hanging my
> machine after all the memory resources are consumed on PUTs.
>
> Below is a quick explanation of the issue I'm seeing.
>
> Ideas ?
>
> sam
>
>
> how to reproduce:
>
> 1) create a db called testdb
>
> 2) create an empty javascript validator function
>
> function(newDoc, oldDoc, user) {}
>
> 3) create a fake 20MB doc
>
> dd if=/dev/zero of=test.mp3 bs=1024 count=20480
> echo "{\"hello\":\"" > test.json; echo `base64 test.mp3` >> test.json; echo "\"}" >> test.json;
>
> 4) send it to couchdb
>
> curl -X PUT http://127.0.0.1:5984/testdb/foobar21 -d @test.json
>
> 5) open a memory/swap monitor and watch couchdb's binary consume all the memory
> (stopping when the swap memory ends)
>
> kubuntu's system monitor (memory tab) ||
> top ||
> watch free ||
>
> 6) remove the javascript validator
>
> 7) repeat (5) and see how everything is fine
>
> expected results:
>
> (5) shouldn't happen. couchdb shouldn't leak memory or consume more memory
> than the size of the doc (20MB).

How much total memory do you have? CouchDB will consume more than the doc size in memory (I've seen 2-3x), and using a validation function can blow this up more, but unless you are on a really space-constrained VPS, you shouldn't run into swap.

Cheers
Jan
--

>
> --
> f u cn rd ths u cn b a gd prgmr !
Re: Rekindle discussion: `reduce=false` fails unpredictably
On 30 Aug 2010, at 10:01, Jason Smith wrote:
> This was discussed before here:
>
> http://mail-archives.apache.org/mod_mbox/couchdb-user/200912.mbox/%3c04c82f94-cf83-45d9-b599-47a8dd7c0...@gmail.com%3e
>
> This is complicating my own client code. I went out of my way to make
> views that are valuable both in map and map/reduce form. But I
> discovered that client code must never send reduce=false to map views
> even though that is a no-op.
>
> In the cited discussion, people debated how strictly CouchDB should
> validate query parameters.
>
> You could download the ddoc and inspect it. (But CouchApps are
> approaching 1MB with all the vendor/ libraries). A _show function
> could inform the client which views are map vs. map/reduce. Finally,
> an unused reduce function (such as _count) could be used. None of
> these seem relaxing.
>
> I propose a minor change to validation: a simple check is made to
> determine if the extra parameter would result in a no-op. If so, no
> exception is thrown. Therefore:
>
> map view, reduce=false -> Allowed
> map view, reduce=true -> query_parse_error
> map view, group or group_level -> no change to today's behavior
> map/reduce view -> no change to today's behavior
>
> (It can't be known at query time whether group and group_level no-op.
> In general they do not. Therefore the client must explicitly get it
> right.)
>
> If this is acceptable, I will submit the patch to JIRA. Thank you.

yes please :)

Cheers
Jan
--
Re: View HTTP API extension proposal
Hi Tomas, this sounds like a valuable addition. Back in the day I remember skip allowed for negative values to skip backwards, I'm not sure what happened to that. `diameter` or how we want to call it would come with the same caveat that `skip` comes with as in it should only be used with "small" values as it's access is unindexed. Other than that, it sounds useful to me. I'm sure you know this, but currently the way to get prev-next links is being smart with all the view options: http://guide.couchdb.org/draft/recipes.html#pagination with the caveat that "jump to page" isn't really possible, but see the chapter for details. Cheers Jan -- On 6 Sep 2010, at 17:04, Tomas Sedovic wrote: > Hey all, > > I'd like to propose a small addition to the HTTP View API. I can open > a ticket later and maybe even submit a patch, but I want to discuss it > with y'all first. > > This new View extension would get you a specified document (by the > View key) plus a few documents before and after it (with regards to > that View's sort order). 
> > Say that calling this View: > >http://southparkelementary.edu/database/_design/students/_view/names > > gives the following response (sorted by the key): > >{"total_rows":7,"offset":0,"rows":[ >{"id":"624","key":"broflovski","value":"Kyle Broflovski"}, >{"id":"928","key":"cartman","value":"Eric Cartman"}, >{"id":"848","key":"marsh","value":"Stan Marsh"}, >{"id":"433","key":"mccormick","value":"Kenny McCormic"}, >{"id":"855","key":"stotch","value":"Butters Stotch"}, >{"id":"489","key":"testaburger","value":"Wendy Testaburger"}, >{"id":"292","key":"vulmer","value":"Jimmy Vulmer"} >]} > > Now, what I'd like to have is this: > > > http://southparkelementary.edu/database/_design/students/_view/names?key="mccormick"&diameter=1 > > which would return: > >{"total_rows":7,"offset":2,"rows":[ >{"id":"848","key":"marsh","value":"Stan Marsh"}, >{"id":"433","key":"mccormick","value":"Kenny McCormic"}, >{"id":"855","key":"stotch","value":"Butters Stotch"} >]} > > (I'm not sure what's the best word for this query argument. Here are > some other suggestions: vicinity, surroundings, neighborhood, nearby) > > Essentially, this combines two Views you can get by the clever use of > `startkey/endkey`, `limit` and `descending` arguments. The advantage > of this API addition is that it can be used in the CouchDB Lists. > > The obvious use case are the previous/next links between document > pages. Following the example, if I had a web interface where the South > Park elementary teachers would view the pages of the students, it > would be nice if every student's page had a link to the previous and > next student along with their names and small photos. This means that > the List generating the student's page must have the access to the > previous and next documents in the given View. 
> > For example getting a student's page: > > > http://southparkelementary.edu/database/_design/students/_list/student_page/names?key="mccormick"&diameter=1 > > would generate a similar HTML structure: > >... >Student: Kenny McCormic >... (additional data) > > > href="/database/_design/students/_list/student_page/names?key="marsh"&diameter=1">previous: > Stan Marsh > > > href="/database/_design/students/_list/student_page/names?key="stotch"&diameter=1">next: > Butters Stotch > > > As far as I can tell, there are two ways of doing that today: > > a) client-side > b) add the linking logic directly to the documents in the database > > The first option is not always feasible/desirable and I dislike the > second option because of its inflexibility. To maintain a > doubly-linked structure across the database means that changing a > single document could lead up to three separate document changes. And > the complexity raises if we want to have multiple ways of sorting. > > However, this extension directly plays into the strength of Views, > which is that you can have the same set of standalone documents sorted > by several different rules. You can use this in Lists to generate > linked pages by different sorting rules. > > It would also play nicely with the URL rewriting mechanisms, because > if a list can access the previous/next documents, it can use their > contents to generate the pretty URLs that are the rave these days. > > I have limited knowledge of the CouchDB internals, but from what I > know, it doesn't look like a big problem. As the views are B+trees, > the leafs form a linked list already. I'm also guessing that the list > is in fact doubly-linked (the presence of the `descending` View > argument suggests so). Therefore, this change could be just a matter > of finding the document requested by the key and traversing the list > in both directions. > > Please let me know what you think. Suggestions about the naming and > behaviour of the API call are welcome. 
In the meantime, I'll div
Re: Addition of modify-on-document-write hooks
On 14 Sep 2010, at 03:26, J Chris Anderson wrote:

>
> On Sep 13, 2010, at 6:23 PM, Simon Metson wrote:
>
>> Hi James.
>> I think the thing to do is require that a document has a user field,
>> and that the value of that field matches the userCtx in the
>> validate_doc_update function. This then pushes the issue client side, and
>> makes the server's life easier. It could also be added by the front end
>> apache in the case of our deployment, I think. I can see this sort of
>> trigger thing being a good way of giving people a loaded gun aimed at their
>> foot, they certainly are in Oracle if you're not careful.
>> Cheers
>> Simon
>
> The big issue is that any code which runs on normal document updates will
> also run during replication, as replication is just a normal client. So this
> means that adding a field will happen not just on the original client PUT but
> also when replication happens.
>
> This is why _update is a separate handler.

Adding a required field should be idempotent (correct me if I am wrong), so it doesn't matter that replication is an agent of the user.

In the past we talked about "blessing" a ddoc / update function that would magically invoke the update function on every write (analogous for _show and _list), and people seem to like exploring that idea. That said, the "magic" bit worries me a little :)

Cheers
Jan
--

>
> Chris
>
>>
>> On 9 Sep 2010, at 05:19, James Jackson wrote:
>>
>>> Hi all,
>>>
>>> Moving this from the users forum, as it appears what I'm after isn't
>>> currently available. For the security model I wish to implement in a
>>> production CouchDB cluster, I would like to be able to force a field to be
>>> written to all docs based on the user context. The _update functionality is
>>> not what I am after as it requires the user to actually call it when
>>> writing a document (means security could be got-around by not calling this,
>>> and setting the required field in the passed document to something
>>> arbitrary, which would then not get caught by a validation function), and
>>> can't modify a document which is passed to it (as far as I can tell it can
>>> only modify existing documents, or create new ones).
>>>
>>> I see this ticket:
>>>
>>> https://issues.apache.org/jira/browse/COUCHDB-441
>>>
>>> which talks about the functionality I am after, but appears to have morphed
>>> into what is now there.
>>>
>>> I am willing to implement such functionality, if it doesn't already exist,
>>> but wonder if this would be welcome in the trunk, or if there are killer
>>> pitfalls which stop this being possible. I note that in the discussion on
>>> that ticket there is talk of how to deal with multiple such modify-on-write
>>> functions, perhaps this is one area that needs discussion?
>>>
>>> In any case, I'll probably implement this for our CouchDB installation, but
>>> it would be good to make it generic and globally useful such that I can
>>> contribute it back. I know of a number of people who would like this
>>> functionality...
>>>
>>> Regards,
>>> James.
>>
>
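Simon's suggestion in this thread (require a user field and check it against userCtx in validate_doc_update) could look roughly like the sketch below. This is only an illustration under assumptions: the field name `user` is made up for the example, the function is given a name so it can be run standalone (in a design document it would be stored as an anonymous function string), and the admin bypass is one possible way to let replication run as a server admin, not something the thread settled on.

```javascript
// Sketch of a validate_doc_update function enforcing an owner field.
// Assumption: the owner field is called "user" (hypothetical name).
function validate_doc_update(newDoc, oldDoc, userCtx) {
  // Server admins bypass the check, e.g. a replication run with admin credentials.
  if (userCtx.roles.indexOf("_admin") !== -1) {
    return;
  }
  // Anonymous writers have no name to compare against.
  if (!userCtx.name) {
    throw { unauthorized: "You must be logged in to write documents." };
  }
  // An existing document may only be changed or deleted by its owner.
  if (oldDoc && oldDoc.user !== userCtx.name) {
    throw { forbidden: "Only the owner may change this document." };
  }
  // A new revision (other than a deletion stub) must carry the writer's name.
  if (!newDoc._deleted && newDoc.user !== userCtx.name) {
    throw { forbidden: "doc.user must equal the authenticated user name." };
  }
}
```

Note that this only rejects documents with a wrong or missing field; it does not write the field for the client, which is the part James is asking for.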
Re: Rep. bug in R...... 1.0.1?
Hi Nikolai,

sorry to be terse, but can you provide a short script that exercises the behaviour? Ideally with placeholders for the two CouchDB URLs so we can fill in values for our testing environment.

Cheers
Jan
--

On 11 Sep 2010, at 20:16, Nikolai Teofilov wrote:

> Hi Adam,
>
> The words "pull" in step 4 and "push" in step 6 are correct. I exchanged the
> places of the curl commands ...
>
> The idea is a common scenario ... to have a master db and each slave server get
> a local copy of the master, make local changes ... (attach new files) and send
> the modified copy back to the master. The problem appears only if the
> documents have been updated with new attachments and only between databases
> on two different servers. It looks like sending back a document updated
> with a new attachment affects the _rev number, and a kind of side effect
> appears: if you try to delete that document on the remote db, the last
> revision of the document before the update will still be in the database. It
> could be that this is correct, but I think the delete operation on a document
> should remove all its revisions as well, correct?
>
>
> 1. - make remote_db (on different machine!)
> 2. - create a doc on the remote_db
> 3. - make local_db (on different machine from the remote couchdb!)
> 4. - (trigger from the local couchdb!) remote_db->local_db
> 5. - put an attachment on local_db/doc
> 6. - trigger from local couchdb! local_db -> remote_db
> 7. - try to delete the remote_db/doc
> the result should be the last _rev is deleted but a copy of the doc is
> still in the remote_db with the initial _rev number.
>
> I am almost sure it is a bug because if you try this on a single couchdb server
> there is no such problem. If you try with a document without an attachment there
> is no problem either, and the documents in both of these cases are deleted
> completely.
>
> Cheers
> Nikolai
>
>
> On 10.09.2010, at 01:44, Adam Kocoloski wrote:
>
>> Hi Nikolai, I'm not sure I understand. In step 4 you said "pull ..."
>> but what you actually did was push the local (empty?) test database to the
>> remote server. After that the subsequent steps don't make sense. Can you
>> try describing the steps again? Best,
>>
>> Adam
>>
>
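As a starting point for the script Jan asks for, Nikolai's seven steps can be written down as a list of HTTP requests with placeholders for the two CouchDB URLs. The sketch below only builds the request plan and sends nothing. The database name `repl_test`, document id `doc1`, attachment name `note.txt`, and the `CURRENT_REV` marker are all hypothetical; a real run would have to read the current `_rev` before steps 5 and 7.

```javascript
// Build the request sequence for the replication/attachment scenario.
// localUrl and remoteUrl are the base URLs of two separate CouchDB servers.
function reproSteps(localUrl, remoteUrl) {
  var db = "repl_test";            // hypothetical database name
  var doc = "doc1";                // hypothetical document id
  var remoteDb = remoteUrl + "/" + db;
  var localDb = localUrl + "/" + db;
  return [
    { step: 1, method: "PUT", url: remoteDb },
    { step: 2, method: "PUT", url: remoteDb + "/" + doc, body: { note: "created on remote" } },
    { step: 3, method: "PUT", url: localDb },
    // Pull, triggered on the local server: remote_db -> local_db.
    { step: 4, method: "POST", url: localUrl + "/_replicate", body: { source: remoteDb, target: db } },
    // CURRENT_REV is a placeholder for the _rev read from the local document.
    { step: 5, method: "PUT", url: localDb + "/" + doc + "/note.txt?rev=CURRENT_REV", body: "hello" },
    // Push, triggered on the local server: local_db -> remote_db.
    { step: 6, method: "POST", url: localUrl + "/_replicate", body: { source: db, target: remoteDb } },
    // CURRENT_REV is a placeholder for the _rev read from the remote document.
    { step: 7, method: "DELETE", url: remoteDb + "/" + doc + "?rev=CURRENT_REV" }
  ];
}
```

Calling `reproSteps("http://local:5984", "http://remote:5984")` (both host names invented here) returns the seven requests in order, which can then be replayed with curl or any HTTP client.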
Re: Database-level statistics/on-disk file names
On 17 Sep 2010, at 10:10, Dirkjan Ochtman wrote:

> Hi there,
>
> I'd like to be able to determine how much disk space the index for a
> certain ddoc takes. Is there any easy way of doing that?

  curl $COUCH/db/_design/name/_info

  {"name":"db","view_index":{"signature":"431517a0e7decdaa8a97d0dc9ffd7412","language":"javascript","disk_size":51,"updater_running":true,"compact_running":false,"waiting_commit":false,"waiting_clients":0,"update_seq":0,"purge_seq":0}}

The `signature` field corresponds to the .db_design/`signature`.view file.

--

> Relatedly, would it be possible to rejigger the on-disk layout a
> little bit to make the filenames easier to understand? For example, we
> now have:
>
> /foo.couch
> /.foo_design/.view
> /.foo_temp
>
> It would seem nicer to have, e.g.
>
> /foo/data.couch
> /foo/users-.view
> /foo/.temp

I understand that it may seem nice to have everything related to a database in a single directory, say for moving things around, but practically, the current format is just as easy to script as the proposed version. What other reasons do you see that would improve with the proposed version?

In addition, creating and deleting databases would no longer be atomic (mkdir & fopen), but I'm not sure that's a hard requirement.

I know the current model includes some very finicky details about deleting and renaming files on Windows which would need to be taken into consideration when changing the filesystem structure.

> (Unless maybe you want to be able to rename design docs without having
> to re-index, but I think even that would be manageable.)

This is a required feature :)

> This would also make index size fairly transparent.

I don't see how that's any different with either or any filesystem layout.

> Semi-relatedly, has anyone taken a swing at reflecting /_stats into a
> Futon page?

Oh I'd *love* to see that :)

Cheers
Jan
--
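To make the signature-to-file mapping concrete, here is a small sketch that takes a parsed `_info` response and returns the index file name and its reported size. The helper name `viewIndexFile` is made up for this example, the database name is passed in separately rather than read from the response, and the path is assumed to be relative to whatever directory the server stores view indexes in.

```javascript
// Derive the view index file for a design document from its _info response.
// dbName: name of the database; info: parsed JSON body of GET /db/_design/name/_info.
function viewIndexFile(dbName, info) {
  var index = info.view_index;
  return {
    // Follows the .db_design/<signature>.view pattern described above.
    path: "." + dbName + "_design/" + index.signature + ".view",
    // disk_size as reported by the server.
    diskSize: index.disk_size
  };
}

// Sample response copied from the reply above, trimmed to the fields used here.
var sampleInfo = {
  name: "db",
  view_index: {
    signature: "431517a0e7decdaa8a97d0dc9ffd7412",
    language: "javascript",
    disk_size: 51
  }
};
var sampleResult = viewIndexFile("db", sampleInfo);
```

With the sample response, `sampleResult.path` is `.db_design/431517a0e7decdaa8a97d0dc9ffd7412.view` and `sampleResult.diskSize` is 51.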
Re: multiview on github
I took another peek at this and I'm curious as to what it's doing. Is it just checking that a given id participates in a view? So if it makes it around the ring it wins? Or is it actually computing the result of passing the doc thru all the views?

If the answer is the former then would disjunction also be something one might want? I'm just curious, I don't have a use case and I forget the original discussion around this.

I sort of think of views as a functional mapping from the database to some subset. That's not entirely accurate given there's this reduce phase also. So I could imagine composing views in a functional way, but the same thing can be had with just a different map function that is the composition.

Anyway if you have a brief description of this, with a use case, it would help.

Cheers,

Bob

On Sep 17, 2010, at 11:32 PM, Norman Barker wrote:

> Chris, James
>
> thanks for bumping this, we are using this internally at 'scale'
> (million+ keys). I want this to work for couchdb as we want to give
> back for such a great product and support this going forward, so any
> suggestions welcomed and we will test and add them to the local github
> account with the aim of getting this into trunk.
>
> Norman
>
> On Fri, Sep 17, 2010 at 7:00 PM, James Hayton wrote:
>> I want to use it! I just haven't gotten around to it. I was going to try
>> and test it out this weekend and if I am able, I will certainly report back
>> what I find.
>>
>> James
>>
>> On Fri, Sep 17, 2010 at 5:55 PM, Chris Anderson wrote:
>>
>>> On Mon, Aug 30, 2010 at 10:58 AM, Norman Barker wrote:
>>>> Bob,
>>>>
>>>> I can and have been testing the multiview at this scale, it is ok
>>>> (fast enough), but I think being able to test inclusion of a
>>>> document id in a view without having to loop would be a considerable
>>>> speed improvement. If you have any ideas let me know.
>>>
>>> I just want to bump this thread, as I think this is a useful feature.
>>> I don't expect to be able to test it in the coming weeks, but if I did
>>> I would. Is anyone besides Norman using this? Has anyone used it at
>>> scale?
>>>
>>> Cheers,
>>> Chris
>>>
>>>> thanks,
>>>>
>>>> Norman
>>>>
>>>> On Mon, Aug 30, 2010 at 10:49 AM, Robert Newson wrote:
>>>>> I'm sorry, I've had no time to play with this at scale.
>>>>>
>>>>> On Mon, Aug 30, 2010 at 5:35 PM, Norman Barker wrote:
>>>>>> Hi,
>>>>>>
>>>>>> are there any more comments on this, if not can you describe the
>>>>>> process (in particular how to obtain a wiki and jira account for
>>>>>> couchdb which I have been unable to do) and I will start documenting
>>>>>> this so we can put this into the trunk.
>>>>>>
>>>>>> Bob, were you able to do any more testing with large views, are there
>>>>>> any suggestions on how to speed up the document id inclusion test as
>>>>>> described below?
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Norman
>>>>>>
>>>>>> On Mon, Aug 23, 2010 at 9:22 AM, Norman Barker <norman.bar...@gmail.com> wrote:
>>>>>>> Bob,
>>>>>>>
>>>>>>> thanks for the feedback and for taking a look at the code. Guidelines
>>>>>>> on when to use a supervisor within couchdb with a gen_server would be
>>>>>>> appreciated, currently I have a supervisor and a gen_server, but if
>>>>>>> couchdb has a supervision process I could remove that layer.
>>>>>>>
>>>>>>> I think plugins is a great idea, however intersection of views is such
>>>>>>> a common request, perhaps there needs to be a plugin system, and if a
>>>>>>> plugin is rated highly enough it goes into trunk as a core feature.
>>>>>>>
>>>>>>> the four (or slightly more) summary is here
>>>>>>>
>>>>>>> http://github.com/normanb/couchdb/raw/trunk/src/couchdb/couch_query_ring.erl
>>>>>>>
>>>>>>> %
>>>>>>> % send an id from the start list to the next node in the ring, if the
>>>>>>> id is in adjacent node then this node sends to the next ring node
>>>>>>>
>>>>>>> % if the id gets all round the ring and back to the start node then it
>>>>>>> has intersected all queries and should be included. The nodes in the
>>>>>>> ring
>>>>>>> % should be sorted in size from small to large for this to be
>>>>>>> effective
>>>>>>> %
>>>>>>> % In addition send the initial id list round in parallel
>>>>>>>
>>>>>>> it really needs some eyes from the core couchdb coders to see how to
>>>>>>> speed up the inclusion testing, looping is bad even if it is done in
>>>>>>> parallel.
>>>>>>>
>>>>>>> Multiview is usable, I am using it with some pretty big mega-views (as
>>>>>>> per the raindrop model), I am also available to add features to this
>>>>>>> as this is a core part of our work and we want to give it to couch as a
>>>>>>> contribution.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Norman
>>>>>>>
>>>>>>> On Mon, Aug 23, 2010 at 5:05 AM, Robert Dionne wrote:
>>>>>>>> Hi Norman,
>>>>>>>>
>>>>>>>> I took a peek at multiview. I haven't followed this too closely on
>>>>>>>> the mailing list but this is *view intersection