[ https://issues.apache.org/jira/browse/CALCITE-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17645458#comment-17645458 ]
Julian Hyde commented on CALCITE-5426: -------------------------------------- Thanks for logging this, [~dmsysolyatin]. It's an important correctness issue and we should fix it. {quote} Are there any performance tests that prove that this optimization significantly improves performance? {quote} No, because there are no performance tests. But if we removed common subexpression elimination - and say moved an array initializer from a static initializer to inside a for loop - I'm sure there would be significant degradation. Let's provide a way for the caller of the API a way to say that the object is mutable. If the object is of an immutable type, say {{String}} or {{Double}} or {{ImmutableList}} then it would be an error if the mutable flag is true. If the object is of a possibly-mutable type, say {{List}} or {{ArrayList}} or an array, then we have to take the caller at their word. Maybe there are different kinds of mutable. Perhaps a list is populated at the start of each execution but doesn't change from one row to the next. Keep an eye out for those cases; we might need more information than boolean mutable vs immutable, or we might be able to safely treat them as immutable. > BlockBuilder should not optimize expressions related to mutable objects to > variable > ----------------------------------------------------------------------------------- > > Key: CALCITE-5426 > URL: https://issues.apache.org/jira/browse/CALCITE-5426 > Project: Calcite > Issue Type: Bug > Components: linq4j > Affects Versions: 1.32.0 > Reporter: Dmitry Sysolyatin > Priority: Major > > Inside BlockBuilder there is an optimization that replaces an expression with > a variable if the expressions are equal and have the final modifier. > But this optimization can cause problems when used with a mutable objects > (One of the problems has been found in CALCITE-5388): > For example [1]: > {code:java} > @Test void testReuseCollectionExpression() throws NoSuchMethodException { > Method putMethod = HashMap.class.getMethod("put", Object.class, > Object.class); > Method sizeMethod = HashMap.class.getMethod("size"); > Expression multiMapParent = b.append("multiMap", > Expressions.new_(Types.of(HashMap.class))); > b.add(Expressions.statement( > Expressions.call(multiMapParent, putMethod, Expressions.box(ONE), > Expressions.box(ONE)))); > BlockBuilder nested = new BlockBuilder(true, b); > Expression multiMapNested = nested.append("multiMap", > Expressions.new_(Types.of(HashMap.class))); > nested.add(Expressions.statement( > Expressions.call(multiMapNested, putMethod, Expressions.box(TWO), > Expressions.box(TWO)))); > nested.add(Expressions.call(multiMapNested, sizeMethod)); > b.add(nested.toBlock()); > b.append(Expressions.call(multiMapParent, sizeMethod)); > // It is wrong output. Map should be reused > assertEquals( > "{\n" > + " final java.util.HashMap multiMap = new > java.util.HashMap();\n" > + " multiMap.put(Integer.valueOf(1), Integer.valueOf(1));\n" > + " {\n" > + " multiMap.put(Integer.valueOf(2), Integer.valueOf(2));\n" > + " return multiMap.size();\n" > + " }\n" > + " return multiMap.size();\n" > + "}\n", > b.toBlock().toString()); > } > {code} > Are there any performance tests that prove that this optimization > significantly improves performance? > [1] > https://github.com/dssysolyatin/calcite/commit/626a5f48ef9e69e543aeec277a4f38000a190b10 -- This message was sent by Atlassian Jira (v8.20.10#820010)