[jira] [Updated] (IGNITE-16430) Calcite engine. Sorted index spool with sorting can't be planned
[ https://issues.apache.org/jira/browse/IGNITE-16430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-16430: --- Summary: Calcite engine. Sorted index spool with sorting can't be planned (was: Calcite engine. Sorted index spool with sorting can be planned) > Calcite engine. Sorted index spool with sorting can't be planned > > > Key: IGNITE-16430 > URL: https://issues.apache.org/jira/browse/IGNITE-16430 > Project: Ignite > Issue Type: Improvement >Reporter: Aleksey Plekhanov >Priority: Major > Labels: calcite2-required, calcite3-required > > Currently, we have code in {{FilterSpoolMergeToSortedIndexSpoolRule}} that > creates a sorted spool even if the input collation is empty. In this case, > collation is created by index condition and the new sort node is inserted > before the spool. But such a plan can never be chosen as the best plan since > when we calculate the cumulative cost for the nested loop correlated join, we > multiply left side rows count to right side commutative cost not taking into > account rewind cost. Currently, cumulative cost for filter + spool = {{{}CPU: > n + n{}}}, memory: {{{}0 + n{}}}, for sorted spool + sort = CPU: {{{}log n + > n*log n{}}}, memory: {{{}n + n{}}}. So, the cost for filter + spool will > always be better than the cost for sorted spool + sort and sorted spool + > sort never can be chosen. But for example, for sorted spool with sort rewind > CPU cost is only {{log n}} since sorting is required only once and rewind CPU > cost of filter + spool is {{{}n + n{}}}. So, starting from some iteration > count cost of {{iterations * rewind cost + cumulative cost}} will be better > than {{{}iterantions * cumulative cost{}}}, and sorted spool + sort will be > chosen in this case. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (IGNITE-16430) Calcite engine. Sorted index spool with sorting can be planned
[ https://issues.apache.org/jira/browse/IGNITE-16430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-16430: --- Labels: calcite2-required calcite3-required (was: ) > Calcite engine. Sorted index spool with sorting can be planned > -- > > Key: IGNITE-16430 > URL: https://issues.apache.org/jira/browse/IGNITE-16430 > Project: Ignite > Issue Type: Improvement >Reporter: Aleksey Plekhanov >Priority: Major > Labels: calcite2-required, calcite3-required > > Currently, we have code in {{FilterSpoolMergeToSortedIndexSpoolRule}} that > creates a sorted spool even if the input collation is empty. In this case, > collation is created by index condition and the new sort node is inserted > before the spool. But such a plan can never be chosen as the best plan since > when we calculate the cumulative cost for the nested loop correlated join, we > multiply left side rows count to right side commutative cost not taking into > account rewind cost. Currently, cumulative cost for filter + spool = {{{}CPU: > n + n{}}}, memory: {{{}0 + n{}}}, for sorted spool + sort = CPU: {{{}log n + > n*log n{}}}, memory: {{{}n + n{}}}. So, the cost for filter + spool will > always be better than the cost for sorted spool + sort and sorted spool + > sort never can be chosen. But for example, for sorted spool with sort rewind > CPU cost is only {{log n}} since sorting is required only once and rewind CPU > cost of filter + spool is {{{}n + n{}}}. So, starting from some iteration > count cost of {{iterations * rewind cost + cumulative cost}} will be better > than {{{}iterantions * cumulative cost{}}}, and sorted spool + sort will be > chosen in this case. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (IGNITE-16430) Calcite engine. Sorted index spool with sorting can be planned
[ https://issues.apache.org/jira/browse/IGNITE-16430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-16430: --- Description: Currently, we have code in {{FilterSpoolMergeToSortedIndexSpoolRule}} that creates a sorted spool even if the input collation is empty. In this case, collation is created by index condition and the new sort node is inserted before the spool. But such a plan can never be chosen as the best plan since when we calculate the cumulative cost for the nested loop correlated join, we multiply left side rows count to right side commutative cost not taking into account rewind cost. Currently, cumulative cost for filter + spool = {{{}CPU: n + n{}}}, memory: {{{}0 + n{}}}, for sorted spool + sort = CPU: {{{}log n + n*log n{}}}, memory: {{{}n + n{}}}. So, the cost for filter + spool will always be better than the cost for sorted spool + sort and sorted spool + sort never can be chosen. But for example, for sorted spool with sort rewind CPU cost is only {{log n}} since sorting is required only once and rewind CPU cost of filter + spool is {{{}n + n{}}}. So, starting from some iteration count cost of {{iterations * rewind cost + cumulative cost}} will be better than {{{}iterantions * cumulative cost{}}}, and sorted spool + sort will be chosen in this case. (was: Currently, we have code in {{FilterSpoolMergeToSortedIndexSpoolRule}} that creates a sorted spool even if the input collation is empty. In this case, collation is created by index condition and the new sort node is inserted before the spool. But such a plan can never be chosen as the best plan since when we calculate the cumulative cost for the nested loop correlated join, we multiply left side rows count to right side commutative cost not taking into account rewind cost. Currently, cumulative cost for filter + spool = {{CPU: n + n}}, memory: {{0 + n}}, for sorted spool + sort = CPU: {{log n+ n*log n}}, memory: {{n + n}}. So, the cost for filter + spool will always be better than the cost for sorted spool + sort and sorted spool + sort never can be chosen. But for example, for sorted spool with sort rewind CPU cost is only {{log n}} since sorting is required only once and rewind CPU cost of filter + spool is {{n + n}}. So, starting from some iteration count cost of {{iterations * rewind cost + cumulative cost}} will be better than {{iterantions * cumulative cost}}, and sorted spool + sort will be chosen in this case.) > Calcite engine. Sorted index spool with sorting can be planned > -- > > Key: IGNITE-16430 > URL: https://issues.apache.org/jira/browse/IGNITE-16430 > Project: Ignite > Issue Type: Improvement >Reporter: Aleksey Plekhanov >Priority: Major > > Currently, we have code in {{FilterSpoolMergeToSortedIndexSpoolRule}} that > creates a sorted spool even if the input collation is empty. In this case, > collation is created by index condition and the new sort node is inserted > before the spool. But such a plan can never be chosen as the best plan since > when we calculate the cumulative cost for the nested loop correlated join, we > multiply left side rows count to right side commutative cost not taking into > account rewind cost. Currently, cumulative cost for filter + spool = {{{}CPU: > n + n{}}}, memory: {{{}0 + n{}}}, for sorted spool + sort = CPU: {{{}log n + > n*log n{}}}, memory: {{{}n + n{}}}. So, the cost for filter + spool will > always be better than the cost for sorted spool + sort and sorted spool + > sort never can be chosen. But for example, for sorted spool with sort rewind > CPU cost is only {{log n}} since sorting is required only once and rewind CPU > cost of filter + spool is {{{}n + n{}}}. So, starting from some iteration > count cost of {{iterations * rewind cost + cumulative cost}} will be better > than {{{}iterantions * cumulative cost{}}}, and sorted spool + sort will be > chosen in this case. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (IGNITE-16430) Calcite engine. Sorted index spool with sorting can be planned
[ https://issues.apache.org/jira/browse/IGNITE-16430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-16430: --- Description: Currently, we have code in {{FilterSpoolMergeToSortedIndexSpoolRule}} that creates a sorted spool even if the input collation is empty. In this case, collation is created by index condition and the new sort node is inserted before the spool. But such a plan can never be chosen as the best plan since when we calculate the cumulative cost for the nested loop correlated join, we multiply left side rows count to right side commutative cost not taking into account rewind cost. Currently, cumulative cost for filter + spool = {{CPU: n + n}}, memory: {{0 + n}}, for sorted spool + sort = CPU: {{log n+ n*log n}}, memory: {{n + n}}. So, the cost for filter + spool will always be better than the cost for sorted spool + sort and sorted spool + sort never can be chosen. But for example, for sorted spool with sort rewind CPU cost is only {{log n}} since sorting is required only once and rewind CPU cost of filter + spool is {{n + n}}. So, starting from some iteration count cost of {{iterations * rewind cost + cumulative cost}} will be better than {{iterantions * cumulative cost}}, and sorted spool + sort will be chosen in this case. (was: Currently, we have code in {{FilterSpoolMergeToSortedIndexSpoolRule}} that creates a sorted spool even if the input collation is empty. In this case, collation is created by index condition and the new sort node is inserted before the spool. But such a plan can never be chosen as the best plan since when we calculate the cumulative cost for the nested loop correlated join, we multiply left side rows count to right side commutative cost not taking into account rewind cost. Currently, cumulative cost for filter + spool = {{{}CPU: n + n{}}}, memory: {{{}0 + n{}}}, for sorted spool + sort = CPU: {{{}log(n)+ n*log(n){}}}, memory: {{{}n + n{}}}. So, the cost for filter + spool will always be better than the cost for sorted spool + sort and sorted spool + sort never can be chosen. But for example, for sorted spool with sort rewind CPU cost is only {{log(n)}} since sorting is required only once and rewind CPU cost of filter + spool is {{{}n + n{}}}. So, starting from some iteration count cost of {{iterations * rewind cost + cumulative cost}} will be better than {{{}iterantions * cumulative cost{}}}, and sorted spool + sort will be chosen in this case.) > Calcite engine. Sorted index spool with sorting can be planned > -- > > Key: IGNITE-16430 > URL: https://issues.apache.org/jira/browse/IGNITE-16430 > Project: Ignite > Issue Type: Improvement >Reporter: Aleksey Plekhanov >Priority: Major > > Currently, we have code in {{FilterSpoolMergeToSortedIndexSpoolRule}} that > creates a sorted spool even if the input collation is empty. In this case, > collation is created by index condition and the new sort node is inserted > before the spool. But such a plan can never be chosen as the best plan since > when we calculate the cumulative cost for the nested loop correlated join, we > multiply left side rows count to right side commutative cost not taking into > account rewind cost. Currently, cumulative cost for filter + spool = {{CPU: n > + n}}, memory: {{0 + n}}, for sorted spool + sort = CPU: {{log n+ n*log n}}, > memory: {{n + n}}. So, the cost for filter + spool will always be better than > the cost for sorted spool + sort and sorted spool + sort never can be chosen. > But for example, for sorted spool with sort rewind CPU cost is only {{log n}} > since sorting is required only once and rewind CPU cost of filter + spool is > {{n + n}}. So, starting from some iteration count cost of {{iterations * > rewind cost + cumulative cost}} will be better than {{iterantions * > cumulative cost}}, and sorted spool + sort will be chosen in this case. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (IGNITE-16430) Calcite engine. Sorted index spool with sorting can be planned
[ https://issues.apache.org/jira/browse/IGNITE-16430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-16430: --- Description: Currently, we have code in {{FilterSpoolMergeToSortedIndexSpoolRule}} that creates a sorted spool even if the input collation is empty. In this case, collation is created by index condition and the new sort node is inserted before the spool. But such a plan can never be chosen as the best plan since when we calculate the cumulative cost for the nested loop correlated join, we multiply left side rows count to right side commutative cost not taking into account rewind cost. Currently, cumulative cost for filter + spool = {{{}CPU: n + n{}}}, memory: {{{}0 + n{}}}, for sorted spool + sort = CPU: {{{}log(n)+ n*log(n){}}}, memory: {{{}n + n{}}}. So, the cost for filter + spool will always be better than the cost for sorted spool + sort and sorted spool + sort never can be chosen. But for example, for sorted spool with sort rewind CPU cost is only {{log(n)}} since sorting is required only once and rewind CPU cost of filter + spool is {{{}n + n{}}}. So, starting from some iteration count cost of {{iterations * rewind cost + cumulative cost}} will be better than {{{}iterantions * cumulative cost{}}}, and sorted spool + sort will be chosen in this case. (was: Currently, we have code in {{FilterSpoolMergeToSortedIndexSpoolRule}} that creates a sorted spool even if the input collation is empty. In this case, collation is created by index condition and the new sort node is inserted before the spool. But such a plan can never be chosen as the best plan since when we calculate the cumulative cost for the nested loop correlated join, we multiply left side rows count to right side commutative cost not taking into account rewind cost. Currently, cumulative cost for filter + spool = {{{}CPU: n + n{}}}, memory: {{{}0 + n{}}}, for sorted spool + sort = CPU: {{{}log(n) + n*log(n){}}}, memory: \{{n + n}}. So, the cost for filter + spool will always be better than the cost for sorted spool + sort and sorted spool + sort never can be chosen. But for example, for sorted spool with sort rewind CPU cost is only {{log(n)}} since sorting is required only once and rewind CPU cost of filter + spool is {{{}n + n{}}}. So, starting from some iteration count cost of {{iterations * rewind cost + cumulative cost}} will be better than {{{}iterantions * cumulative cost{}}}, and sorted spool + sort will be chosen in this case.) > Calcite engine. Sorted index spool with sorting can be planned > -- > > Key: IGNITE-16430 > URL: https://issues.apache.org/jira/browse/IGNITE-16430 > Project: Ignite > Issue Type: Improvement >Reporter: Aleksey Plekhanov >Priority: Major > > Currently, we have code in {{FilterSpoolMergeToSortedIndexSpoolRule}} that > creates a sorted spool even if the input collation is empty. In this case, > collation is created by index condition and the new sort node is inserted > before the spool. But such a plan can never be chosen as the best plan since > when we calculate the cumulative cost for the nested loop correlated join, we > multiply left side rows count to right side commutative cost not taking into > account rewind cost. Currently, cumulative cost for filter + spool = {{{}CPU: > n + n{}}}, memory: {{{}0 + n{}}}, for sorted spool + sort = CPU: {{{}log(n)+ > n*log(n){}}}, memory: {{{}n + n{}}}. So, the cost for filter + spool will > always be better than the cost for sorted spool + sort and sorted spool + > sort never can be chosen. But for example, for sorted spool with sort rewind > CPU cost is only {{log(n)}} since sorting is required only once and rewind > CPU cost of filter + spool is {{{}n + n{}}}. So, starting from some iteration > count cost of {{iterations * rewind cost + cumulative cost}} will be better > than {{{}iterantions * cumulative cost{}}}, and sorted spool + sort will be > chosen in this case. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (IGNITE-16430) Calcite engine. Sorted index spool with sorting can be planned
Aleksey Plekhanov created IGNITE-16430: -- Summary: Calcite engine. Sorted index spool with sorting can be planned Key: IGNITE-16430 URL: https://issues.apache.org/jira/browse/IGNITE-16430 Project: Ignite Issue Type: Improvement Reporter: Aleksey Plekhanov Currently, we have code in {{FilterSpoolMergeToSortedIndexSpoolRule}} that creates a sorted spool even if the input collation is empty. In this case, collation is created by index condition and the new sort node is inserted before the spool. But such a plan can never be chosen as the best plan since when we calculate the cumulative cost for the nested loop correlated join, we multiply left side rows count to right side commutative cost not taking into account rewind cost. Currently, cumulative cost for filter + spool = {{{}CPU: n + n{}}}, memory: {{{}0 + n{}}}, for sorted spool + sort = CPU: {{{}log(n) + n*log(n){}}}, memory: \{{n + n}}. So, the cost for filter + spool will always be better than the cost for sorted spool + sort and sorted spool + sort never can be chosen. But for example, for sorted spool with sort rewind CPU cost is only {{log(n)}} since sorting is required only once and rewind CPU cost of filter + spool is {{{}n + n{}}}. So, starting from some iteration count cost of {{iterations * rewind cost + cumulative cost}} will be better than {{{}iterantions * cumulative cost{}}}, and sorted spool + sort will be chosen in this case. -- This message was sent by Atlassian Jira (v8.20.1#820001)