Hi All,

I really like the idea of returning an error message about it being slow along with a helpful url. I don’t really like the idea of a `slow` or `developer` flag for the actual query, I think that will add some confusion.

Cheers
Garren

> On 11 Jan 2016, at 8:55 PM, Tony Sun <tony.sun...@gmail.com> wrote:
>
> Hi Robert,
>
> Building upon what others have stated above, what do you think about
> the following:
>
> 1) Let the user query without creating an index
> 2) Return an error message with a new url that has
>    "slow/no_index/developer":true appended at the end. The message
>    clearly explains that this query will be slow, and that creating an
>    index will be more efficient. However, he or she can continue. The
>    error message will then have a link to point to our documentation.
> 3) In Fauxton, there is a checkbox or button that also appends the
>    "slow/no_index/developer":true to the _find url. If the user clicks
>    it, then the same message pops up to notify the user.
>
> Tony
>
> On Mon, Jan 11, 2016 at 9:45 AM, Eli Stevens (Gmail) <wickedg...@gmail.com>
> wrote:
>
>> Just wanted to chime in here as a user - I've run into similar
>> behavior from CouchDB with the reduce-not-reducing-enough heuristic,
>> where stuff I was working on went smoothly in dev, but stopped once
>> real load was pushed through it (thankfully for me, that was in
>> testing, rather than released to customers).
>>
>> It's a frustrating experience, and I don't think that a reputation for
>> "works until you cross a threshold, and then it doesn't, but only in
>> production" is a good thing to move towards.
>>
>> Perhaps something like adding a key to the returned data along the
>> lines of "_slow_warning": "This query is going to be slow on large
>> data sets. See http://..." in addition to the ?slow_warning=true query
>> param (note that I'm calling it "slow_warning" in both places only to
>> increase discoverability; without the url param, the no-index query
>> wouldn't work at all). Bikeshed the name as needed.
>>
>> I'd like to see a lot more URLs in CouchDB error messages in general,
>> actually - I would find it very useful when trying to determine what's
>> going wrong to have a URL right there in the logs that I can get more
>> information from.
>>
>> On Sun, Jan 10, 2016 at 11:54 AM, Joan Touzet <woh...@apache.org> wrote:
>>> Hi Robert,
>>>
>>> I've been thinking about this one for a week or so, and I have a
>>> simple suggestion:
>>>
>>> Add the query parameter slow=true to enable this behaviour.
>>>
>>> This meets all the original requirements:
>>>
>>> 1. It is not default behaviour
>>> 2. You can grep the log files for the word 'slow' and find evidence
>>> 3. There is a shorthand, simple way to enable the behaviour
>>> 4. Any self-respecting developer will try to remove slow=true, find
>>>    a break, and be forced to learn about indexes
>>> 5. It's a bit cheeky, which I think is kind of fun :D
>>>
>>> All the best,
>>> Joan
>>>
>>> ----- Original Message -----
>>>> From: "William Edney" <bed...@technicalpursuit.com>
>>>> To: dev@couchdb.apache.org
>>>> Sent: Friday, January 8, 2016 10:27:29 AM
>>>> Subject: Re: [POC] Mango Catch All Selector
>>>>
>>>> Hi Robert -
>>>>
>>>> As a builder of UI, API and library code who has also done developer
>>>> training on a variety of technologies, one simple fix might be to go
>>>> ahead and not require indexes to be built, but then to put a big
>>>> NOTE at the beginning of the "Mango Getting Started" guide (I would
>>>> assume there is such a piece of documentation) that states: "Note
>>>> that the examples in this document do not require you to build an
>>>> index, but for performance reasons we HIGHLY RECOMMEND that you do
>>>> so. *Click here* for more information about how to do that" (or
>>>> some such verbiage).
>>>>
>>>> My 2 cents.
>>>>
>>>> Cheers,
>>>>
>>>> - Bill
>>>>
>>>> On Fri, Jan 8, 2016 at 9:04 AM, Robert Kowalski <r...@kowalski.gd>
>>>> wrote:
>>>>
>>>>> Hi list,
>>>>>
>>>>> At the end of the mail I would like to invite the other folks from
>>>>> the mailing list that build interfaces for humans (APIs, CLIs or
>>>>> even UIs) to chime in again with their opinions. So, to everyone on
>>>>> the ML: the mail is not just a response to Paul, feedback is
>>>>> welcome :)
>>>>>
>>>>> Hi Paul, I agree with you on the timeout. It could lead to very
>>>>> unpleasant errors which are hard to debug and support.
>>>>>
>>>>> I added some thoughts to the other points you made:
>>>>>
>>>>>> a) know that the slow queries logs exist,
>>>>>
>>>>> Hmm... If I take a look at the 1.x logging it was very
>>>>> straightforward. As a developer you would spin up a CouchDB and you
>>>>> get all the log messages into your terminal. It was quite handy in
>>>>> general for all kinds of debugging. That the logs are not displayed
>>>>> directly on stdout/stderr is in my opinion a general 2.x problem.
>>>>> The problem occurs with all kinds of log messages we produce in
>>>>> CouchDB for 2.x and is not specific to the slow-query-logging.
>>>>>
>>>>>> Ie, "You can try queries with testing:true, when you're ready to
>>>>>> move to production you can POST your selector to _index to create
>>>>>> the index which allows you to remove testing:true".
>>>>>
>>>>> I really like the migration path you mentioned here with the API to
>>>>> create indexes. I am worried about setting too high an entry
>>>>> barrier for absolute newcomers, people who want to play around
>>>>> before they are ready to think about indexes, e.g. by coupling the
>>>>> index topic to querying from the very beginning.
>>>>>
>>>>> When I throw too many things to learn at people (who may not have
>>>>> used a database before), most people get discouraged and do not
>>>>> take a look.
>>>>> The usual things they feel or say are: "too complicated", "I
>>>>> have not enough time", "product XY is easier to use".
>>>>>
>>>>> I would argue that newcomers to a database will not launch a high
>>>>> traffic, multi-gigabyte product with the database from day one.
>>>>> Day one is the day where they learn how to query the data and put
>>>>> data into the database. Even for scenarios where people have a
>>>>> running high traffic system, and have used other databases at a
>>>>> medium to large scale, I would expect, given they migrate to Couch,
>>>>> that they run both systems in parallel at first in order to fix the
>>>>> issues that occur during a migration.
>>>>>
>>>>> I think we share the same goal (getting beginners started quickly)
>>>>> and the cool thing about your suggestion is that everyone gets the
>>>>> required knowledge to run a production system right from the very
>>>>> start. My suggestion leaves some parts out, but reduces the
>>>>> cognitive load required to get the very first basic results, e.g.
>>>>> in a university class setting - or junior developers on their
>>>>> "casual friday 20% time". My big hope is, once those folks build
>>>>> high traffic systems, they remember how easy the usage of CouchDB
>>>>> was and that they start to learn more about CouchDB in order to run
>>>>> it in a system with more than a few thousand documents.
>>>>>
>>>>> For us both I think the "what" is clear, but the "how" is a bit
>>>>> different. I also think this discussion still makes progress, but I
>>>>> am afraid it could stall.
>>>>> I see that we both have very good rudiments and I would like to
>>>>> invite the other folks from the mailing list that build interfaces
>>>>> for humans (APIs, CLIs or even UIs) to chime in again with their
>>>>> opinions - of course I'm also looking forward to your answer :)
>>>>>
>>>>> Best,
>>>>> Robert :)
>>>>>
>>>>> On Wed, Jan 6, 2016 at 6:21 PM, Paul Davis
>>>>> <paul.joseph.da...@gmail.com> wrote:
>>>>>>>> - is a timeout solving the root cause or the symptoms? Could it
>>>>>>>> be a temporary or additional step as in conjunction with query
>>>>>>>> optimisation tooling?
>>>>>>>
>>>>>>> It really depends. From my CouchDB admin and user perspective,
>>>>>>> this doesn't seem so important to me right now. However, I
>>>>>>> recognize that there are different usage scenarios with different
>>>>>>> requirements (e.g. the ones at Cloudant).
>>>>>>
>>>>>> I don't think there's anything special about Cloudant in this
>>>>>> discussion. It's just a question of how do we allow new users the
>>>>>> ability to easily test and learn the selector/query API while also
>>>>>> preventing them from going too far without creating indexes for
>>>>>> their queries. The slow queries messages are fine, but just as in
>>>>>> any other database they don't really prompt the developer to make
>>>>>> the correct change. Ie, the developer has to be savvy enough to
>>>>>> a) know that the slow queries logs exist, b) understand that
>>>>>> creating an index would speed things up, and then c) know which
>>>>>> index to create based on the logged query.
>>>>>>
>>>>>> In my experience, the group of users that we're concerned about in
>>>>>> this discussion most likely don't know about any of those three
>>>>>> things, hence why the current API is designed to force them to
>>>>>> learn about and understand indexes as part of learning the API.
>>>>>> Granted the `_id > null` trick muddies that learning process. I
>>>>>> would think that replacing the _id trick with `"testing": true` or
>>>>>> similar would be an obvious indication to users that this is a
>>>>>> dev/debug type feature and when they went to production they would
>>>>>> still be pushed to using an index. If we add the "create index
>>>>>> from selector" API then I think this would be a relatively
>>>>>> straightforward method for on-ramping to both the query and index
>>>>>> sides of the API. Ie, "You can try queries with testing:true, when
>>>>>> you're ready to move to production you can POST your selector to
>>>>>> _index to create the index which allows you to remove
>>>>>> testing:true".
>>>>>>
>>>>>> That's also why I don't particularly care for the timeout
>>>>>> approach. It's a binary threshold that a user would (maybe) meet
>>>>>> after some unknown amount of time after they falsely believe their
>>>>>> app is working correctly. The feedback is "Everything is fine
>>>>>> until it isn't". Consider an app that's been working for a week or
>>>>>> a month or more that suddenly starts throwing timeouts for a
>>>>>> query. From the user's perspective the database broke because the
>>>>>> query that used to work fine no longer does. And then there's the
>>>>>> follow-on question of how that timeout might instruct the user
>>>>>> that they need an index, and that the fix may be as easy as
>>>>>> POSTing their selector to the _index endpoint. Sure, Google would
>>>>>> most likely have the answer if our docs are good enough, but by
>>>>>> that point the developer is probably already experiencing downtime
>>>>>> if their app is live, which means they're frantically trying to
>>>>>> fix the thing.
>>>>>> From my point of view, a few roadblocks that guide developers
>>>>>> towards the correct usage early on would be better than letting
>>>>>> them get to the adrenaline-fueled expletive fountain of downtime.
>>>>>
>>>>
>>
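
For anyone skimming the thread, the behaviour being discussed (refuse a no-index query by default, allow it when the caller opts in, and attach a warning with a documentation link) could be sketched roughly as below. This is only a toy model in plain Python, not CouchDB or Mango code. The function names `find` and `matches`, the `no_index` error name, and the `indexed_fields` argument are all made up for illustration. The `slow_warning` parameter and the `_slow_warning` key follow Eli's suggestion, and the URL is left as the placeholder used in the thread.

```python
# Placeholder only: the thread elides the real documentation link.
DOCS_URL = "http://..."

SLOW_TEXT = "This query is going to be slow on large data sets. See " + DOCS_URL


def matches(doc, selector):
    """Equality-only matching (hypothetical helper); real selectors do far more."""
    return all(doc.get(field) == value for field, value in selector.items())


def find(docs, selector, indexed_fields, slow_warning=False):
    """Toy model of the proposed behaviour, not CouchDB's implementation.

    docs           -- list of dicts standing in for a database
    selector       -- dict of field -> required value
    indexed_fields -- set of field names that have an index (assumption)
    slow_warning   -- opt-in flag, named after Eli's ?slow_warning=true idea
    """
    has_index = any(field in indexed_fields for field in selector)

    # Default behaviour: no usable index and no opt-in means an error
    # that points the developer at the documentation.
    if not has_index and not slow_warning:
        return {"error": "no_index", "reason": SLOW_TEXT}

    result = {"docs": [doc for doc in docs if matches(doc, selector)]}

    # Opted-in full scan: return the data, but keep the warning visible.
    if not has_index:
        result["_slow_warning"] = SLOW_TEXT
    return result


if __name__ == "__main__":
    docs = [{"_id": "a", "type": "user"}, {"_id": "b", "type": "post"}]
    print(find(docs, {"type": "user"}, indexed_fields=set()))
    print(find(docs, {"type": "user"}, indexed_fields=set(), slow_warning=True))
    print(find(docs, {"type": "user"}, indexed_fields={"type"}))
```

The point of the sketch is the shape of the feedback: the developer hits the warning on the very first query rather than after crossing a size or time threshold in production.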