[ 
https://issues.apache.org/jira/browse/CALCITE-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140152#comment-17140152
 ] 

Danny Chen commented on CALCITE-3786:
-------------------------------------

[~vladimirsitnikov] I didn't intend to do modify the RexNode normalization at 
first, but when i fix to the RexNode#equals instead of the pure string, i found 
that *i have to* because it seems not right that 2 RexCalls do not equals but 
have same digest. I can think of 2 more benefits to switch from pure string 
normalization to nodes:
- We can add more normalizations logic more easily(put complex rexnodes 
behind), i.e. i plan to add a new component RexNormalizaiton
- We should make the configuration with more flexibility, user/downstream can 
config through FrameworkConfig

[~hyuan] Current patch already solved 2, 3 of your questions, for 1, i would 
fix if [~laurent] and i make agreements, but remove the Digest and add 2 more 
interfaces to RelNodes(#hashCode and #equals) goes too far away for these 
reasons:
- We should be cautious for #equals #hashcode #explainTerms to have similar 
variables, it is error-prone for downstream projects to implement for their 
custom nodes
- And it is hard to clarify the semantics between strict #equals1 and 
equivalent #equals1, what if i want a normal #equals1 there
- The additional objects references should not be a bottleneck, removing it 
does not gain much.
Based on that, i would definitely give a -1 for the commit f970df.

As for 4, i'm looking forward to the RexNode memo(maybe), but it has no 
relationship with this issue.


> Add Digest interface to enable efficient hashCode(equals) for RexNode and 
> RelNode
> ---------------------------------------------------------------------------------
>
>                 Key: CALCITE-3786
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3786
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 1.21.0
>            Reporter: Vladimir Sitnikov
>            Assignee: Danny Chen
>            Priority: Major
>             Fix For: 1.24.0
>
>          Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> Current digests for RexNode, RelNode, RelType, and similar cases use String 
> concatenation.
> It is easy to implement, however, it has drawbacks:
> 1) String objects cannot be reused. For instance, RexCall has operands, 
> however, the digest is duplicated. It causes extra memory use and extra CPU 
> for string copying
> 2) There's no way to have multiple #toString() methods. RelType might need 
> multiple digests: "including field names", "excluding field names".
> A suggested resolution might be behind the lines of
> {code:java}
> class Digest { // immutable
>   final int hashCode; // speedup hashCode and equals
>   final Object[] contents; // The values are either other Digest objects or 
> Strings
>   String toString(); // e.g. for debugging purposes
>   int compareTo(Digest); // e.g. for debugging purposes.
> }
> {code}
> Note how fields in Kotlin are aligned much better, and it makes it easier to 
> read:
> {code:java}
> class Digest { // immutable
>   val hashCode: Int // speedup hashCode and equals
>   val contents: Array<Any> // The values are either other Digest objects or 
> Strings
>   fun toString(): String // e.g. for debugging purposes
>   fun compareTo(other: Digest): Int // e.g. for debugging purposes.
> }
> {code}
> Then the digest for RexCall could be the bits relevant to RexCall itself + 
> digests of the operands (which can be reused as is)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to