[ https://issues.apache.org/jira/browse/CALCITE-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16738551#comment-16738551 ]
Vladimir Sitnikov commented on CALCITE-2635:
--------------------------------------------

{quote}@PerformanceTest(expectedDuration = "2s", variance = "5%"){quote}

Expected duration depends on the hardware. For instance, a notebook, virtual machine, desktop, VPS, etc. could all have very different raw performance.

I think it is much better to invest time in having something like https://arewefastyet.com
In other words, we could have a set of "standard" benchmarks + a consistent machine for execution + scheduled executions, so we can track regressions.

I'm inclined to merge this fix with no extra tests. Note: the change is a clear win.

An alternative option is to implement a HashMap to speed up {{org.apache.calcite.rel.type.RelDataType#getField(String fieldName, boolean caseSensitive, boolean elideRecord)}}.
We do have {{org.apache.calcite.rel.type.RelDataTypeFactoryImpl#canonize(org.apache.calcite.rel.type.RelDataType)}}, so a lazily initialized cache of field positions might help.

However, we don't really expect a single table to have lots of collations, so we could just go with PR#891.

On top of that, we might add a hard limit like "try no more than the first 50 collations of the table", so even a table with an extreme number of collations won't create a problem for {{getMonotonicity}}.

> getMonotonocity is slow on wide tables
> --------------------------------------
>
>                 Key: CALCITE-2635
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2635
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: Gian Merlino
>            Assignee: Gian Merlino
>            Priority: Major
>              Labels: performance
>
> RelOptTableImpl's getMonotonicity does an indexOf on
> {{rowType.getFieldNames()}}, which is O(N) in the number of fields.
> IdentifierNamespace calls getMonotonicity once for every field in the table
> namespace, so it becomes O(N^2) in the number of fields.
> We observed 2-4 second query planning times with a table that had 18,000
> columns, reduced to about 150ms after patching getMonotonicity to be O(1)
> in the number of fields.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
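
For illustration only, here is a minimal sketch of the "lazily initialized cache of field positions" idea from the comment, next to the linear {{indexOf}} scan the issue describes. The class {{FieldIndexSketch}} and its methods are hypothetical stand-ins, not Calcite's {{RelDataType}} or {{RelOptTableImpl}} API. The sketch assumes case-sensitive names only and is not thread-safe.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a row type; NOT Calcite's RelDataType. */
final class FieldIndexSketch {
  private final List<String> fieldNames;

  // Built on first lookup, so row types that are never searched by name
  // pay nothing. Assumption: single-threaded use (no synchronization).
  private Map<String, Integer> positions;

  FieldIndexSketch(List<String> fieldNames) {
    this.fieldNames = fieldNames;
  }

  /** O(N) per call: scans the name list, like List.indexOf in the issue. */
  int indexOfLinear(String name) {
    return fieldNames.indexOf(name);
  }

  /** Expected O(1) per call after a one-time O(N) build of the map. */
  int indexOfCached(String name) {
    if (positions == null) {
      Map<String, Integer> map = new HashMap<>();
      for (int i = 0; i < fieldNames.size(); i++) {
        // putIfAbsent keeps the first occurrence of a duplicate name,
        // which matches what List.indexOf returns.
        map.putIfAbsent(fieldNames.get(i), i);
      }
      positions = map;
    }
    Integer pos = positions.get(name);
    return pos == null ? -1 : pos;
  }

  public static void main(String[] args) {
    FieldIndexSketch row = new FieldIndexSketch(Arrays.asList("a", "b", "c"));
    // Both lookups return the same ordinal; only the cost per call differs.
    System.out.println(row.indexOfLinear("c"));
    System.out.println(row.indexOfCached("c"));
  }
}
```

Calling {{indexOfLinear}} once per field costs O(N^2) in total for N fields, while {{indexOfCached}} costs O(N) for the one-time build plus expected O(1) per lookup. A real change would also need to handle the {{caseSensitive}} and {{elideRecord}} flags, which this sketch ignores.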