[ 
https://issues.apache.org/jira/browse/CASSANDRA-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352873#comment-14352873
 ] 

mck commented on CASSANDRA-8574:
--------------------------------

>  The problem with both of these so far is that a single partition key with 
> too many tombstones can make the query job fail hard.

Is the problem purely the tombstones, or could it be that tombstones increase 
the occurrence of short reads?

*If* short reads are the underlying problem, then there is a possible substantial 
improvement¹ in the code: instead of having to completely retry each short read 
with a larger page size, StorageProxy could indicate back to the pager that this 
is a short read (as opposed to the end of results) and that the pager should 
continue.

 ¹ Say, for example, a 10k-row partition is selected and each query with page 
size 100 comes up short and requires an additional read: you're going to be 
running one hundred extra queries (double the load on the cluster). Being able 
to indicate back to the pager that it isn't yet exhausted would mean only one 
extra query on the tail end.
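A back-of-envelope model of the two strategies (plain Python arithmetic, not 
Cassandra code; the function names and the assumption that every page comes up 
short exactly once are invented for illustration):

```python
def queries_with_retry(total_rows, page_size):
    """Current behaviour in this model: every short page is fully retried,
    so each page costs two queries instead of one."""
    pages = total_rows // page_size
    return pages * 2

def queries_with_continue_signal(total_rows, page_size):
    """Proposed behaviour: StorageProxy flags 'short read, not exhausted',
    the pager simply continues; the only overhead is one trailing query
    confirming exhaustion."""
    pages = total_rows // page_size
    return pages + 1

# The 10k-row / page-size-100 example from the footnote:
print(queries_with_retry(10000, 100))            # 100 extra queries
print(queries_with_continue_signal(10000, 100))  # 1 extra query
```

Under these assumptions the retry approach issues 200 queries where the 
continue-signal approach issues 101.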

> Gracefully degrade SELECT when there are lots of tombstones
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-8574
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8574
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jens Rantil
>             Fix For: 3.0
>
>
> *Background:* There's lots of tooling out there for doing big-data analysis on 
> Cassandra clusters. Examples are Spark and Hadoop, both offered by DSE. 
> The problem with both of these so far is that a single partition key with 
> too many tombstones can make the query job fail hard.
> The described scenario happens despite the user setting a rather small 
> FetchSize. I assume this is a common scenario if you have larger rows.
> *Proposal:* Allow a CQL SELECT to gracefully degrade and return only a 
> smaller batch of results if there are too many tombstones. The tombstones are 
> ordered according to the clustering key and one should be able to page through 
> them. Potentially:
>     SELECT * FROM mytable LIMIT 1000 TOMBSTONES;
> would page through a maximum of 1000 tombstones, _or_ 1000 (CQL) rows.
> I understand that this obviously would degrade performance, but it would at 
> least yield a result.
> *Additional comment:* I haven't dug into Cassandra code, but conceptually I 
> guess this would be doable. Let me know what you think.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
