Thomas' idea is a good one. From the EXPLAIN plan ResultSet, you can directly get an estimate of the number of bytes that will be scanned. Take a look at this [1] documentation. We need to implement PHOENIX-4735 too (so that things are set up well out-of-the-box). We could have a guardrail config property that defines the maximum number of bytes a query is allowed to read, and fail any query that goes over this limit. That would cover 80% of the issues IMHO. Other guardrail config properties could cover other corner cases.
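To make the guardrail idea concrete, here is a rough client-side sketch in Python against a generic DB-API cursor. It is not how Phoenix would implement it server-side. The column name EST_BYTES_READ comes from the explain plan documentation [1]; everything else (the function names, the exception, the `max_bytes` parameter standing in for the proposed config property) is made up for illustration. It also assumes the driver exposes the estimate columns in the EXPLAIN result and that the value can be read from the first row.

```python
from typing import Optional

# Column name as documented on the Phoenix explain plan page.
EST_BYTES_COLUMN = "EST_BYTES_READ"


class QueryTooExpensiveError(Exception):
    """Hypothetical error raised when the estimate exceeds the guardrail."""


def estimated_bytes(cursor, sql: str) -> Optional[int]:
    """Run EXPLAIN for `sql` and return the estimated bytes scanned.

    Returns None when no estimate is available (for example, stats are
    not enabled or the driver does not expose the column).
    """
    cursor.execute("EXPLAIN " + sql)
    names = [col[0].upper() for col in cursor.description]
    if EST_BYTES_COLUMN not in names:
        return None
    row = cursor.fetchone()
    if row is None:
        return None
    value = row[names.index(EST_BYTES_COLUMN)]
    return None if value is None else int(value)


def check_guardrail(cursor, sql: str, max_bytes: int) -> Optional[int]:
    """Raise if the estimate for `sql` is above `max_bytes`.

    `max_bytes` stands in for the proposed (not yet existing) config
    property. A missing estimate is let through rather than rejected.
    """
    estimate = estimated_bytes(cursor, sql)
    if estimate is not None and estimate > max_bytes:
        raise QueryTooExpensiveError(
            f"estimated {estimate} bytes exceeds limit of {max_bytes}"
        )
    return estimate
```

A caller would run `check_guardrail(cursor, sql, max_bytes)` before executing the real query. Letting queries through when there is no estimate is a design choice in this sketch; a stricter deployment might prefer to reject them.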
[1] http://phoenix.apache.org/explainplan.html

On Mon, Aug 27, 2018 at 3:01 PM Josh Elser <els...@apache.org> wrote:

> On 8/27/18 5:03 PM, Thomas D'Silva wrote:
> >> 3. Better recommendations to users to not attempt certain queries.
> >>
> >> We definitively know that there are certain types of queries that Phoenix
> >> cannot support well (compared to optimal Phoenix use-cases). Users very
> >> commonly fall into such pitfalls on their own and this leaves a bad taste
> >> in their mouth (thinking that the product "stinks").
> >>
> >> Can we do a better job of telling the user when and why it happened? What
> >> would such a user-interaction model look like? Can we supplement the "why"
> >> with instructions of what to do differently (even if in the abstract)?
> >>
> > Providing relevant feedback before/after a query is run in general is very
> > hard to do. If stats are enabled we have an estimate of how many rows/bytes
> > will be scanned.
> > We could have an optional feature that prevent users from running queries
> > if the rows/bytes scanned are above a certain threshold. We should also
> > enhance our explain plan documentation
> > http://phoenix.apache.org/explainplan.html with example
> > of queries so users know what kinds of queries Phoenix handles well.
>
> Breaking this out..
>
> Totally agree -- this is by no means "easy". I struggle very often
> trying to express just _why_ a query that someone is running in Phoenix
> doesn't run as well as they think it should.
>
> Centralizing on the EXPLAIN plan is good. Making sure it's
> consumable/thorough is probably the lowest hanging fruit. If we can give
> concrete examples to the kinds of explain plans a user might see, I
> think that might get use from users/admins.
>
> Throwing a random idea out there: with stats and the query plan, can we
> give a thumbs-up/thumbs-down? If we can, is that useful?