[ https://issues.apache.org/jira/browse/CASSANDRA-4915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491730#comment-13491730 ]
Jonathan Ellis commented on CASSANDRA-4915:
-------------------------------------------

Short of real native paging (CASSANDRA-4415), I don't think this is really preventable. {{ALLOW FULL SCAN}} would only give you a false sense of security; consider {{SELECT * FROM users WHERE first_name='Ben' AND last_name='Higgenbotham'}}. If first_name is indexed but last_name is not, and you have millions of Bens and a handful of Higgenbothams, you have the same problem even though our simplistic heuristic of "is it indexed?" would consider it "safe."

> CQL should force limit when query samples data.
> -----------------------------------------------
>
>                 Key: CASSANDRA-4915
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4915
>             Project: Cassandra
>          Issue Type: Improvement
>    Affects Versions: 1.2.0 beta 1
>            Reporter: Edward Capriolo
>            Priority: Minor
>
> When issuing a query like:
> {noformat}
> CREATE TABLE videos (
>   videoid uuid,
>   videoname varchar,
>   username varchar,
>   description varchar,
>   tags varchar,
>   upload_date timestamp,
>   PRIMARY KEY (videoid,videoname)
> );
> SELECT * FROM videos WHERE videoname = 'My funny cat';
> {noformat}
> Cassandra samples some data using get_range_slice and then applies the query.
> This is very confusing to me, because as an end user I am not sure if the query
> is fast because Cassandra is performing an optimized query (over an index, or
> using a slicePredicate) or if Cassandra is simply sampling some random rows
> and returning me some results.
> My suggestions:
> 1) force people to supply a LIMIT clause on any query that is going to
> page over get_range_slice
> 2) having some type of explain support so I can establish if this
> query will work in the
> I will champion suggestion 1) because CQL has put itself in a rather unique
> un-SQL-like position by applying an automatic limit clause without the user
> asking for it.
> I also do not believe the CQL language should let the user
> issue queries that will not work as intended with "larger-than-auto-limit"
> size data sets.
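
To illustrate the point in the comment about the "is it indexed?" heuristic, here is a toy Python model of an index lookup followed by a filter on an unindexed column. It is not Cassandra code: {{build_index}}, {{indexed_query}}, and the generated {{users}} data are hypothetical names and values made up for this sketch, and the row counts are scaled down from "millions" so it runs quickly.

```python
from collections import defaultdict


def build_index(rows, column):
    """Map each value of `column` to the positions of the rows holding it."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[row[column]].append(pos)
    return index


def indexed_query(rows, index, indexed_value, filter_column, filter_value):
    """Resolve one predicate through the index, then filter the candidates.

    Returns (matches, rows_examined) so the cost is visible next to the result.
    """
    candidates = index.get(indexed_value, [])
    matches = [rows[p] for p in candidates if rows[p][filter_column] == filter_value]
    return matches, len(candidates)


# Hypothetical data set: 100,000 Bens, only three of them Higgenbothams,
# plus 1,000 unrelated users.
users = [
    {"first_name": "Ben",
     "last_name": "Higgenbotham" if i % 40_000 == 0 else "Smith"}
    for i in range(100_000)
]
users += [{"first_name": "Alice", "last_name": "Jones"} for _ in range(1_000)]

# first_name is indexed, last_name is not.
first_name_index = build_index(users, "first_name")

matches, examined = indexed_query(
    users, first_name_index, "Ben", "last_name", "Higgenbotham"
)
print(f"matches={len(matches)} rows_examined={examined}")
```

In this model the query is "indexed", yet every Ben has to be read to find the few matching rows, so a check that only asks whether some predicate is indexed says little about how much data the query touches.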