[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable
[ https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091078#comment-17091078 ] Haisheng Yuan commented on CALCITE-3932: Yes, that is feasible. But what if down stream projects already use \{{==}} for data type comparison? I don't know if there are. > Make data type cache thread local, non-evictable > > > Key: CALCITE-3932 > URL: https://issues.apache.org/jira/browse/CALCITE-3932 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Haisheng Yuan >Priority: Major > Fix For: 1.23.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Data type can be evicted out of cache, which is global, thread-safe. > It seems not necessary to cache them globally, because most of them are > RelRecordType, which is query dependent, not sharable between different > queries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable
[ https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091057#comment-17091057 ] Julian Hyde commented on CALCITE-3932: -- I agree, won't-fix is the right remedy here. I wonder whether we should reconsider (as a different Jira case) our interning of types. Change from MUST intern to SHOULD intern, and always compare types using {{equals}}. We clearly want to do *some* interning, especially within a query, so that there aren't hundreds of copies of the same record type all over the place. But if people don't intern, or intern in different query-specific caches, then the logic will still work. If {{equals}} is written using the standard template {code:java} return this == o || o instanceof TheType && field1 == o.field1 and field2 == o.field2 {code} (that is, avoiding deep comparison if possible) then the performance will be pretty much the same. > Make data type cache thread local, non-evictable > > > Key: CALCITE-3932 > URL: https://issues.apache.org/jira/browse/CALCITE-3932 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Haisheng Yuan >Priority: Major > Fix For: 1.23.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Data type can be evicted out of cache, which is global, thread-safe. > It seems not necessary to cache them globally, because most of them are > RelRecordType, which is query dependent, not sharable between different > queries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable
[ https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090927#comment-17090927 ] Haisheng Yuan commented on CALCITE-3932: For thread local, even it is long-lived, as long as the cache is evictable, we are fine. But for the case of materialized views shared among connections, that's indeed a problem. I will close it with won't fix. > Make data type cache thread local, non-evictable > > > Key: CALCITE-3932 > URL: https://issues.apache.org/jira/browse/CALCITE-3932 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Haisheng Yuan >Priority: Major > Fix For: 1.23.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Data type can be evicted out of cache, which is global, thread-safe. > It seems not necessary to cache them globally, because most of them are > RelRecordType, which is query dependent, not sharable between different > queries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable
[ https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090839#comment-17090839 ] Julian Hyde commented on CALCITE-3932: -- It's possible that one thread will create a type and another thread will later use it. Especially in the scenario of materialized views, which are shared among connections and statements. So a thread-local cache doesn't seem to be a good fit. Also, threads might be long-lived and fairly numerous (because people use thread pools) so the cache clutter may build up. A global (static) cache with a WeakInterner sounds more promising. > Make data type cache thread local, non-evictable > > > Key: CALCITE-3932 > URL: https://issues.apache.org/jira/browse/CALCITE-3932 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Haisheng Yuan >Priority: Major > Fix For: 1.23.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Data type can be evicted out of cache, which is global, thread-safe. > It seems not necessary to cache them globally, because most of them are > RelRecordType, which is query dependent, not sharable between different > queries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable
[ https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090165#comment-17090165 ] Haisheng Yuan commented on CALCITE-3932: Thanks for your advice. [~Junzheng Zheng] I opened a PR. [~julianhyde] [~danny0405] [~vladimirsitnikov] Can you take a look? > Make data type cache thread local, non-evictable > > > Key: CALCITE-3932 > URL: https://issues.apache.org/jira/browse/CALCITE-3932 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Haisheng Yuan >Priority: Major > Fix For: 1.23.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Data type can be evicted out of cache, which is global, thread-safe. > It seems not necessary to cache them globally, because most of them are > RelRecordType, which is query dependent, not sharable between different > queries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable
[ https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090161#comment-17090161 ] Junzheng Zheng commented on CALCITE-3932: - I think use a private attribute of RelDataTypeFactory to hold the cache is OK since every query has its own RelDataTypeFactory . If we use threadLocal, a thread may be not correspoding to a query, e.q. if use a single thread to optimize all queries then the cache will be a global cache. > Make data type cache thread local, non-evictable > > > Key: CALCITE-3932 > URL: https://issues.apache.org/jira/browse/CALCITE-3932 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Haisheng Yuan >Priority: Major > Fix For: 1.23.0 > > > Data type can be evicted out of cache, which is global, thread-safe. > It seems not necessary to cache them globally, because most of them are > RelRecordType, which is query dependent, not sharable between different > queries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable
[ https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089712#comment-17089712 ] Haisheng Yuan commented on CALCITE-3932: If we consider the use of ThreadPool, we probably still need evictable cache, e.g. weak interner. > Make data type cache thread local, non-evictable > > > Key: CALCITE-3932 > URL: https://issues.apache.org/jira/browse/CALCITE-3932 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Haisheng Yuan >Priority: Major > Fix For: 1.23.0 > > > Data type can be evicted out of cache, which is global, thread-safe. > It seems not necessary to cache them globally, because most of them are > RelRecordType, which is query dependent, not sharable between different > queries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable
[ https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17086242#comment-17086242 ] Haisheng Yuan commented on CALCITE-3932: [~julianhyde] What do you think? Are the two caches good candidates to use ThreadLocal instead of global? > Make data type cache thread local, non-evictable > > > Key: CALCITE-3932 > URL: https://issues.apache.org/jira/browse/CALCITE-3932 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Haisheng Yuan >Priority: Major > > Data type can be evicted out of cache, which is global, thread-safe. > It seems not necessary to cache them globally, because most of them are > RelRecordType, which is query dependent, not sharable between different > queries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable
[ https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085406#comment-17085406 ] Haisheng Yuan commented on CALCITE-3932: How can it be garbage collected while still being referenced? Even for weak interner? In this issue, I want it non-evictable because guava strong interner is better performanced given that we want the cache short lived (suppose there is no thread-pool). > Make data type cache thread local, non-evictable > > > Key: CALCITE-3932 > URL: https://issues.apache.org/jira/browse/CALCITE-3932 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Haisheng Yuan >Priority: Major > > Data type can be evicted out of cache, which is global, thread-safe. > It seems not necessary to cache them globally, because most of them are > RelRecordType, which is query dependent, not sharable between different > queries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable
[ https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085404#comment-17085404 ] neoremind commented on CALCITE-3932: I was thinking if we use ThreadLocal, are there any places to clean the ThreadLocal? If not intend to clean, then making them soft or weak would be mandatory, because GC will evict them. > Make data type cache thread local, non-evictable > > > Key: CALCITE-3932 > URL: https://issues.apache.org/jira/browse/CALCITE-3932 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Haisheng Yuan >Priority: Major > > Data type can be evicted out of cache, which is global, thread-safe. > It seems not necessary to cache them globally, because most of them are > RelRecordType, which is query dependent, not sharable between different > queries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable
[ https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085402#comment-17085402 ] Danny Chen commented on CALCITE-3932: - We compare our types in many places with `==` in our code, so the cached types should never be weak or soft references [1] [1] https://github.com/apache/calcite/blob/2798d90a8082adb294c3aadbad6d22d9228abf3e/core/src/main/java/org/apache/calcite/plan/RelOptUtil.java#L2184 > Make data type cache thread local, non-evictable > > > Key: CALCITE-3932 > URL: https://issues.apache.org/jira/browse/CALCITE-3932 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Haisheng Yuan >Priority: Major > > Data type can be evicted out of cache, which is global, thread-safe. > It seems not necessary to cache them globally, because most of them are > RelRecordType, which is query dependent, not sharable between different > queries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable
[ https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085322#comment-17085322 ] Haisheng Yuan commented on CALCITE-3932: RelDataTypeFactoryImpl KEY2TYPE_CACHE and DATATYPE_CACHE. [https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/type/RelDataTypeFactoryImpl.java#L53-L65] > Make data type cache thread local, non-evictable > > > Key: CALCITE-3932 > URL: https://issues.apache.org/jira/browse/CALCITE-3932 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Haisheng Yuan >Priority: Major > > Data type can be evicted out of cache, which is global, thread-safe. > It seems not necessary to cache them globally, because most of them are > RelRecordType, which is query dependent, not sharable between different > queries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable
[ https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085317#comment-17085317 ] Julian Hyde commented on CALCITE-3932: -- What data type cache are you referring to? Each query has a RelDataTypeFactory, and that acts as cache. Tying caches to threads seems arbitrary to me. > Make data type cache thread local, non-evictable > > > Key: CALCITE-3932 > URL: https://issues.apache.org/jira/browse/CALCITE-3932 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Haisheng Yuan >Priority: Major > > Data type can be evicted out of cache, which is global, thread-safe. > It seems not necessary to cache them globally, because most of them are > RelRecordType, which is query dependent, not sharable between different > queries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable
[ https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085290#comment-17085290 ] Haisheng Yuan commented on CALCITE-3932: Most of the data type among different queries are not reusable except basic sql type. > Make data type cache thread local, non-evictable > > > Key: CALCITE-3932 > URL: https://issues.apache.org/jira/browse/CALCITE-3932 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Haisheng Yuan >Priority: Major > > Data type can be evicted out of cache, which is global, thread-safe. > It seems not necessary to cache them globally, because most of them are > RelRecordType, which is query dependent, not sharable between different > queries. -- This message was sent by Atlassian Jira (v8.3.4#803005)