[ 
https://issues.apache.org/jira/browse/CALCITE-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17139268#comment-17139268
 ] 

Danny Chen commented on CALCITE-3786:
-------------------------------------

[~vladimirsitnikov] I ran the benchmark with the GC profiler; here are the results:


{code:xml}
Benchmark               (digestType)  (disjunctions)  (joins)  Mode  Cnt   Score   Error  Units
DigestBenchmark.getRel        OBJECT               1        1  avgt    5   0.082 ± 0.010  us/op
DigestBenchmark.getRel        OBJECT               1       10  avgt    5   0.380 ± 0.025  us/op
DigestBenchmark.getRel        OBJECT               1       20  avgt    5   0.732 ± 0.077  us/op
DigestBenchmark.getRel        OBJECT              10        1  avgt    5   0.081 ± 0.010  us/op
DigestBenchmark.getRel        OBJECT              10       10  avgt    5   0.364 ± 0.022  us/op
DigestBenchmark.getRel        OBJECT              10       20  avgt    5   0.697 ± 0.046  us/op
DigestBenchmark.getRel        OBJECT             100        1  avgt    5   0.081 ± 0.008  us/op
DigestBenchmark.getRel        OBJECT             100       10  avgt    5   0.359 ± 0.025  us/op
DigestBenchmark.getRel        OBJECT             100       20  avgt    5   0.726 ± 0.090  us/op

DigestBenchmark.getRel        STRING               1        1  avgt    5   1.269 ± 0.035  us/op
DigestBenchmark.getRel        STRING               1       10  avgt    5  10.609 ± 0.146  us/op
DigestBenchmark.getRel        STRING               1       20  avgt    5  28.708 ± 0.810  us/op
DigestBenchmark.getRel        STRING              10        1  avgt    5   1.365 ± 0.073  us/op
DigestBenchmark.getRel        STRING              10       10  avgt    5  10.640 ± 0.107  us/op
DigestBenchmark.getRel        STRING              10       20  avgt    5  28.171 ± 0.612  us/op
DigestBenchmark.getRel        STRING             100        1  avgt    5   1.354 ± 0.083  us/op
DigestBenchmark.getRel        STRING             100       10  avgt    5  11.583 ± 5.685  us/op
DigestBenchmark.getRel        STRING             100       20  avgt    5  27.828 ± 0.343  us/op


Benchmark                                   (digestType)  (disjunctions)  (joins)  Mode  Cnt      Score   Error  Units
DigestBenchmark.getRel:·gc.alloc.rate.norm        OBJECT               1        1  avgt    5     ≈ 10⁻⁴          B/op
DigestBenchmark.getRel:·gc.alloc.rate.norm        OBJECT               1       10  avgt    5      0.005 ± 0.001  B/op
DigestBenchmark.getRel:·gc.alloc.rate.norm        OBJECT               1       20  avgt    5      0.020 ± 0.002  B/op
DigestBenchmark.getRel:·gc.alloc.rate.norm        OBJECT              10        1  avgt    5      0.001 ± 0.001  B/op
DigestBenchmark.getRel:·gc.alloc.rate.norm        OBJECT              10       10  avgt    5      0.006 ± 0.001  B/op
DigestBenchmark.getRel:·gc.alloc.rate.norm        OBJECT              10       20  avgt    5      0.022 ± 0.001  B/op
DigestBenchmark.getRel:·gc.alloc.rate.norm        OBJECT             100        1  avgt    5      0.003 ± 0.001  B/op
DigestBenchmark.getRel:·gc.alloc.rate.norm        OBJECT             100       10  avgt    5      0.018 ± 0.001  B/op
DigestBenchmark.getRel:·gc.alloc.rate.norm        OBJECT             100       20  avgt    5      0.047 ± 0.006  B/op

DigestBenchmark.getRel:·gc.alloc.rate.norm        STRING               1        1  avgt    5   1840.004 ± 0.001  B/op
DigestBenchmark.getRel:·gc.alloc.rate.norm        STRING               1       10  avgt    5   8568.145 ± 0.002  B/op
DigestBenchmark.getRel:·gc.alloc.rate.norm        STRING               1       20  avgt    5  16008.839 ± 0.022  B/op
DigestBenchmark.getRel:·gc.alloc.rate.norm        STRING              10        1  avgt    5   1960.009 ± 0.001  B/op
DigestBenchmark.getRel:·gc.alloc.rate.norm        STRING              10       10  avgt    5   8568.180 ± 0.003  B/op
DigestBenchmark.getRel:·gc.alloc.rate.norm        STRING              10       20  avgt    5  16008.913 ± 0.018  B/op
DigestBenchmark.getRel:·gc.alloc.rate.norm        STRING             100        1  avgt    5   1960.054 ± 0.003  B/op
DigestBenchmark.getRel:·gc.alloc.rate.norm        STRING             100       10  avgt    5   8568.577 ± 0.284  B/op
DigestBenchmark.getRel:·gc.alloc.rate.norm        STRING             100       20  avgt    5  16009.819 ± 0.024  B/op
{code}


> Add Digest interface to enable efficient hashCode(equals) for RexNode and 
> RelNode
> ---------------------------------------------------------------------------------
>
>                 Key: CALCITE-3786
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3786
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 1.21.0
>            Reporter: Vladimir Sitnikov
>            Assignee: Danny Chen
>            Priority: Major
>             Fix For: 1.24.0
>
>          Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Current digests for RexNode, RelNode, RelType, and similar cases use String 
> concatenation.
> This is easy to implement, but it has drawbacks:
> 1) String objects cannot be reused. For instance, a RexCall has operands, 
> yet each operand's digest is copied into the call's digest. This costs extra 
> memory and extra CPU for string copying
> 2) There's no way to have multiple #toString() methods. RelType might need 
> multiple digests: "including field names", "excluding field names".
> A suggested resolution might be along the lines of
> {code:java}
> class Digest { // immutable
>   final int hashCode; // speedup hashCode and equals
>   final Object[] contents; // The values are either other Digest objects or Strings
>   String toString(); // e.g. for debugging purposes
>   int compareTo(Digest); // e.g. for debugging purposes.
> }
> {code}
> Note how fields in Kotlin are aligned much better, and it makes it easier to 
> read:
> {code:java}
> class Digest { // immutable
>   val hashCode: Int // speedup hashCode and equals
>   val contents: Array<Any> // The values are either other Digest objects or Strings
>   fun toString(): String // e.g. for debugging purposes
>   fun compareTo(other: Digest): Int // e.g. for debugging purposes.
> }
> {code}
> Then the digest for RexCall could be the bits relevant to RexCall itself + 
> digests of the operands (which can be reused as is)
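
The composition described in the issue can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the code that was committed to Calcite: the factory name `Digest.of`, the eager hash computation via `Arrays.hashCode`, and the string-based `compareTo` are all choices made for this example.

```java
import java.util.Arrays;

/** Hypothetical immutable digest: a cached hash plus String/Digest parts. */
final class Digest implements Comparable<Digest> {
  private final int hashCode;       // computed once, so hashCode() is O(1)
  private final Object[] contents;  // each element is a String or a Digest

  private Digest(Object[] contents) {
    this.contents = contents;
    // Nested Digest elements return their cached hash, so this only
    // walks the direct children, not the whole tree.
    this.hashCode = Arrays.hashCode(contents);
  }

  /** Assumed factory method; operand digests are stored by reference. */
  static Digest of(Object... parts) {
    for (Object p : parts) {
      if (!(p instanceof String) && !(p instanceof Digest)) {
        throw new IllegalArgumentException("expected String or Digest: " + p);
      }
    }
    return new Digest(parts.clone()); // defensive copy keeps it immutable
  }

  @Override public int hashCode() {
    return hashCode;
  }

  @Override public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof Digest)) {
      return false;
    }
    Digest that = (Digest) o;
    // The cached hash rejects most unequal digests before the deep compare.
    return hashCode == that.hashCode && Arrays.equals(contents, that.contents);
  }

  /** Builds the full string only on demand, e.g. for debugging. */
  @Override public String toString() {
    StringBuilder sb = new StringBuilder();
    for (Object c : contents) {
      sb.append(c);
    }
    return sb.toString();
  }

  /** Ordering by rendered text; intended for debugging output only. */
  @Override public int compareTo(Digest other) {
    return toString().compareTo(other.toString());
  }
}
```

With `x = Digest.of("$0")` and `y = Digest.of("$1")`, the call `Digest.of("+(", x, ", ", y, ")")` holds references to `x` and `y` instead of copying their text, and its `toString()` yields `+($0, $1)`. Note that this `compareTo` is not consistent with `equals` (two digests with different structure can render to the same string), which is acceptable only because it is meant for debugging.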



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
