[jira] [Commented] (CALCITE-3786) Add Digest interface to enable efficient hashCode(equals) for RexNode and RelNode

Haisheng Yuan (Jira) Tue, 09 Jun 2020 22:45:03 -0700


    [ 
https://issues.apache.org/jira/browse/CALCITE-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17130096#comment-17130096
 ]


Haisheng Yuan commented on CALCITE-3786:
----------------------------------------

I am sorry for disappointing you, I should have explicitly -1 on the proposal 
before you started working on this.

It has nothing to do RexNode memo stuff, I am just against the idea in this 
proposal, which sounds like someone feels headache after banging his head into 
the wall, the doctor wrapped the patient's head with thick bandage, and gave 
the advice: bang your head slightly next time. The proposal can ease the issue 
to some extent by reusing digest, but doesn't solve the underlying problem 
fundamentally that RexNode can't be deduplicated inside planner.

You think the digest in current code base works well, then how do you think 
CALCITE-3784? Maybe no one would run a single 20k lines analytical query or 60 
table joins on Flink, because it is real time streaming platform. 

> Add Digest interface to enable efficient hashCode(equals) for RexNode and 
> RelNode
> ---------------------------------------------------------------------------------
>
>                 Key: CALCITE-3786
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3786
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 1.21.0
>            Reporter: Vladimir Sitnikov
>            Assignee: Danny Chen
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Current digests for RexNode, RelNode, RelType, and similar cases use String 
> concatenation.
> It is easy to implement, however, it has drawbacks:
> 1) String objects cannot be reused. For instance, RexCall has operands, 
> however, the digest is duplicated. It causes extra memory use and extra CPU 
> for string copying
> 2) There's no way to have multiple #toString() methods. RelType might need 
> multiple digests: "including field names", "excluding field names".
> A suggested resolution might be behind the lines of
> {code:java}
> class Digest { // immutable
>   final int hashCode; // speedup hashCode and equals
>   final Object[] contents; // The values are either other Digest objects or 
> Strings
>   String toString(); // e.g. for debugging purposes
>   int compareTo(Digest); // e.g. for debugging purposes.
> }
> {code}
> Note how fields in Kotlin are aligned much better, and it makes it easier to 
> read:
> {code:java}
> class Digest { // immutable
>   val hashCode: Int // speedup hashCode and equals
>   val contents: Array<Any> // The values are either other Digest objects or 
> Strings
>   fun toString(): String // e.g. for debugging purposes
>   fun compareTo(other: Digest): Int // e.g. for debugging purposes.
> }
> {code}
> Then the digest for RexCall could be the bits relevant to RexCall itself + 
> digests of the operands (which can be reused as is)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-3786) Add Digest interface to enable efficient hashCode(equals) for RexNode and RelNode

Reply via email to