github-actions[bot] commented on code in PR #62698:
URL: https://github.com/apache/doris/pull/62698#discussion_r3239747230


##########
fe/fe-core/src/main/java/org/apache/doris/catalog/FunctionToSqlConverter.java:
##########
@@ -75,12 +77,25 @@ public static String toSql(ScalarFunction fn, boolean 
ifNotExists) {
                     .append("\"" + (fn.getLocation() == null ? "" : 
fn.getLocation().toString()) + "\"");
             boolean isReturnNull = fn.getNullableMode() == 
NullableMode.ALWAYS_NULLABLE;
             sb.append(",\n  \"ALWAYS_NULLABLE\"=").append("\"" + isReturnNull 
+ "\"");
+            sb.append(",\n  \"VOLATILITY\"=").append("\"" + 
fn.getVolatility().toSql() + "\"");

Review Comment:
   This branch also runs for Java/Python UDTFs because they are stored as 
`ScalarFunction` with `isUDTFunction() == true`. `CreateFunctionCommand` now 
rejects any `volatility` property when `isTableFunction` is true, so `SHOW 
CREATE FUNCTION` for an existing Java/Python UDTF will emit a `VOLATILITY` 
property that cannot be replayed. Please only emit volatility for scalar 
Java/Python UDFs, not UDTFs; the same guard is needed in 
`ShowFunctionsCommand.buildProperties()` where `VOLATILITY` is added for every 
Java/Python `ScalarFunction`.



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/VolatileIdentity.java:
##########
@@ -0,0 +1,91 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.nereids.trees.expressions;
+
+import com.google.common.base.Preconditions;
+
+import java.util.Optional;
+
+/** Value object for volatile per-call identity and temporary group-by binding 
comparison. */
+public class VolatileIdentity {
+    public static final VolatileIdentity NON_VOLATILE = new 
VolatileIdentity(Optional.empty(), false);
+
+    private final Optional<ExprId> uniqueId;
+    private final boolean ignoreUniqueId;
+
+    public VolatileIdentity(ExprId uniqueId, boolean ignoreUniqueId) {
+        this(Optional.of(uniqueId), ignoreUniqueId);
+    }
+
+    private VolatileIdentity(Optional<ExprId> uniqueId, boolean 
ignoreUniqueId) {
+        Preconditions.checkArgument(!ignoreUniqueId || uniqueId.isPresent(),
+                "ignoreUniqueId is meaningful only for volatile expressions");
+        this.uniqueId = uniqueId;
+        this.ignoreUniqueId = ignoreUniqueId;
+    }
+
+    public static VolatileIdentity newVolatileIdentity() {
+        return new VolatileIdentity(StatementScopeIdGenerator.newExprId(), 
false);
+    }
+
+    public static VolatileIdentity of(ExprId uniqueId, boolean ignoreUniqueId) 
{
+        return new VolatileIdentity(uniqueId, ignoreUniqueId);
+    }
+
+    public boolean isVolatile() {
+        return uniqueId.isPresent();
+    }
+
+    public ExprId getUniqueId() {
+        return uniqueId.get();
+    }
+
+    public Optional<ExprId> getUniqueIdOptional() {
+        return uniqueId;
+    }
+
+    public boolean ignoreUniqueId() {
+        return ignoreUniqueId;
+    }
+
+    public VolatileIdentity withIgnoreUniqueId(boolean ignoreUniqueId) {
+        Preconditions.checkState(isVolatile(), "Only volatile expressions can 
ignore unique id");
+        return new VolatileIdentity(uniqueId, ignoreUniqueId);
+    }
+
+    /** Compare volatile expressions by identity unless either side 
temporarily ignores it. */
+    public boolean equalsByIdentity(VolatileIdentity other, boolean 
fallbackEquals) {
+        if (!isVolatile() && !other.isVolatile()) {
+            return fallbackEquals;
+        }
+        if (!isVolatile() || !other.isVolatile()) {
+            return false;
+        }
+        if (ignoreUniqueId || other.ignoreUniqueId()) {

Review Comment:
   This makes equality one-sided while `hashCodeByIdentity()` only falls back 
for the object whose `ignoreUniqueId` flag is set. For example, 
`udf.withIgnoreUniqueId(true).equals(udf)` can return true through 
`fallbackEquals`, but the ignored instance hashes structurally and the 
non-ignored instance hashes to the volatile `ExprId`. These expressions are 
used as `HashMap`/`HashSet` keys during group-by binding and replacement, so 
this violates the Java equals/hashCode contract and can make lookups fail 
depending on which form is inserted. If the temporary comparison is only meant 
for keys that have both been rewritten to ignore identity, require both sides 
to ignore the id.
   
   ```suggestion
           if (ignoreUniqueId && other.ignoreUniqueId()) {
   ```



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AddProjectForUniqueFunction.java:
##########
@@ -274,22 +274,24 @@ public <T extends Expression> Optional<Pair<List<T>, 
LogicalProject<Plan>>> rewr
      */
     @VisibleForTesting
     public List<NamedExpression> tryGenUniqueFunctionAlias(Collection<? 
extends Expression> targets) {
-        Map<UniqueFunction, Integer> unqiueFunctionCounter = 
Maps.newLinkedHashMap();
+        Map<Expression, Integer> unqiueFunctionCounter = 
Maps.newLinkedHashMap();
         for (Expression target : targets) {
             target.foreach(e -> {
                 Expression expr = (Expression) e;
-                if (expr instanceof UniqueFunction) {
-                    unqiueFunctionCounter.merge((UniqueFunction) expr, 1, 
Integer::sum);

Review Comment:
   Broadening this from `UniqueFunction` to any volatile expression changes an 
important assumption of this rule. Built-in unique functions are 
slot-independent, but volatile Java/Python UDFs can reference columns. 
`rewriteExpressions()` always materializes the generated alias in a 
`LogicalProject` over `plan.child(0)`, and `JoinRewrite` feeds it all join 
conjuncts; if the repeated volatile UDF references right-side slots, the new 
left-child project will contain expressions whose input slots are not produced 
by the left child. That creates an invalid plan (or moves evaluation to the 
wrong side). Please either keep this aliasing limited to slot-independent 
volatile expressions or place the project on the child that contains all input 
slots, with tests for a repeated volatile UDF over the right side of a join.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to