[ 
https://issues.apache.org/jira/browse/CASSANDRA-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630977#comment-14630977
 ] 

Sylvain Lebresne commented on CASSANDRA-6492:
---------------------------------------------

bq.  I'm just worried about not being able to meet user expectations when we 
first expose a page size in bytes.

I understand, and it's a valid concern. But I don't know, I'm just not a fan of 
hard-coded magic constants. Even if we hide that "bytes target" from view, we 
might still be really off on our stats and miss it, which can still have 
user-visible consequences, and so I'm not sure this ultimately helps users' 
comprehension of what is going on.

The other aspect is that if we do that (just have a default mode), users for 
whom the default doesn't work are still stuck with providing the page size in 
number of rows, which still requires them to guesstimate their average row 
size, which is annoying to do when we can probably do a pretty good job of 
guesstimating it server-side automatically.
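
To make the server-side estimate concrete, here is a minimal sketch of turning 
a soft byte target into a row-count page size. It assumes a mean row size is 
available from table statistics; the function name, the fallback of 100 rows 
and the cap of 10,000 rows are hypothetical values chosen for illustration, 
not anything Cassandra defines.

```python
def rows_for_byte_target(target_bytes, mean_row_bytes,
                         fallback_rows=100, max_rows=10_000):
    """Translate a soft byte target into a page size expressed in rows.

    mean_row_bytes is assumed to come from table statistics and may be
    None or non-positive when no usable statistics exist yet.
    """
    if target_bytes <= 0:
        raise ValueError("target_bytes must be positive")
    if mean_row_bytes is None or mean_row_bytes <= 0:
        # No usable statistics: fall back to a fixed row count (assumption).
        return fallback_rows
    estimate = int(target_bytes // mean_row_bytes)
    # Always return at least one row, and never more than the cap.
    return max(1, min(estimate, max_rows))


# A 64 KiB target with rows averaging 512 bytes gives a page of 128 rows.
page_size = rows_for_byte_target(64 * 1024, 512)
```

The result is only as good as the mean row size it is given, which is why the 
byte target can be no more than a soft one.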

But I totally agree we should be very clear initially that this is "a very soft 
target". And maybe we can experiment a bit to get a better sense of how bad 
that estimate will be in practice. That is, we can try different schemas and 
workloads (even try actively to "game" the estimate), and if it proves very 
easy to get an estimate that is very off, then I can agree that exposing the 
size is probably not a good idea (though if that's the case, it will also be 
worth asking ourselves if even a default is going to help more than it hurts). 
If, however, it's quite hard to get an estimate that is far from reality, then 
we'll still warn users that it's not precise, but that's probably good enough 
in practice.
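
One way such an experiment could quantify "how far off" is to compare the 
bytes a page actually contains with the byte target. The sketch below is an 
assumption about how to measure this, not an existing tool: the function name 
and the synthetic row sizes are hypothetical, and the row sizes stand in for 
whatever a real schema and workload would produce.

```python
def page_overshoot(row_sizes, target_bytes, mean_row_bytes):
    """Return actual page bytes divided by the byte target.

    row_sizes: sizes in bytes of the rows the query would return, in order.
    mean_row_bytes: the (possibly misleading) table-wide mean row size.
    A result of 1.0 means the page hit the target exactly.
    """
    rows = max(1, int(target_bytes // mean_row_bytes))
    actual = sum(row_sizes[:rows])
    return actual / target_bytes


# Rows that match the table-wide mean: the page lands on the target.
uniform = page_overshoot([100] * 50, target_bytes=1000, mean_row_bytes=100)

# "Gaming" the estimate: the queried partition holds rows ten times larger
# than the table-wide mean, so the page is ten times the target.
skewed = page_overshoot([1000] * 50, target_bytes=1000, mean_row_bytes=100)
```

Running this over many schemas and workloads would show whether large ratios 
are easy or hard to provoke.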

> Have server pick query page size by default
> -------------------------------------------
>
>                 Key: CASSANDRA-6492
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6492
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API
>            Reporter: Jonathan Ellis
>            Assignee: Benjamin Lerer
>            Priority: Minor
>              Labels: client-impacting
>
> We're almost always going to do a better job picking a page size based on 
> sstable stats than users will by guesstimating.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
