[ 
https://issues.apache.org/jira/browse/CALCITE-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17310474#comment-17310474
 ] 

Vladimir Sitnikov commented on CALCITE-4557:
--------------------------------------------

(quote}Most likely, they all terminate in 1 or 1.1 comparisons on 
average.{quote}

If the input is already sorted, then the leading columns would not yield an 
early return from the comparisons.

For instance, if the input is already sorted on {{a, b, c}}, then {{sort by a, 
b, c, d}} would have to perform 4 comparisons always.

{code:sql}
select *
  from (
   select a, b, c, d ... inherently ordered by a, b
  )
 order by a, b, c, d
{code}

---

However, I'm not much into assessing the number of distinct values in the sort 
key fields, so I think we should be fine to assume that every next field would 
be compared with 0.1 probability decay.

In other words, "sort key comparison cost factor" should be like 
{{numberOfPreSortedFieldsInSortKey + 1 * (1 - 
0.1^numberOfNonSortedFieldsInSortKey)/(1-0.1)}





> Sort costing should account the number of sort keys: the more keys to 
> compare, the more the cost
> ------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-4557
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4557
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.26.0
>            Reporter: Vladimir Sitnikov
>            Priority: Major
>
> Current costing for sort does not account for the number of keys in the 
> comparison function which might result in worse plan selection.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to