[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-23 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091078#comment-17091078
 ] 

Haisheng Yuan commented on CALCITE-3932:


Yes, that is feasible. But what if down stream projects already use \{{==}} for 
data type comparison? I don't know if there are.

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.23.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Data type can be evicted out of cache, which is global, thread-safe. 
> It seems not necessary to cache them globally, because most of them are 
> RelRecordType, which is query dependent, not sharable between different 
> queries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-23 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091057#comment-17091057
 ] 

Julian Hyde commented on CALCITE-3932:
--

I agree, won't-fix is the right remedy here.

I wonder whether we should reconsider (as a different Jira case) our interning 
of types. Change from MUST intern to SHOULD intern, and always compare types 
using {{equals}}.

We clearly want to do *some* interning, especially within a query, so that 
there aren't hundreds of copies of the same record type all over the place. But 
if people don't intern, or intern in different query-specific caches, then the 
logic will still work.

If {{equals}} is written using the standard template
{code:java}
return this == o
  || o instanceof TheType && field1 == o.field1 and field2 == o.field2 {code}
(that is, avoiding deep comparison if possible) then the performance will be 
pretty much the same.

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.23.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Data type can be evicted out of cache, which is global, thread-safe. 
> It seems not necessary to cache them globally, because most of them are 
> RelRecordType, which is query dependent, not sharable between different 
> queries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-23 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090927#comment-17090927
 ] 

Haisheng Yuan commented on CALCITE-3932:


For thread local, even it is long-lived, as long as the cache is evictable, we 
are fine. But for the case of materialized views shared among connections, 
that's indeed a problem.  I will close it with won't fix.

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.23.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Data type can be evicted out of cache, which is global, thread-safe. 
> It seems not necessary to cache them globally, because most of them are 
> RelRecordType, which is query dependent, not sharable between different 
> queries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-23 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090839#comment-17090839
 ] 

Julian Hyde commented on CALCITE-3932:
--

It's possible that one thread will create a type and another thread will later 
use it. Especially in the scenario of materialized views, which are shared 
among connections and statements.

So a thread-local cache doesn't seem to be a good fit.

Also, threads might be long-lived and fairly numerous (because people use 
thread pools) so the cache clutter may build up.

A global (static) cache with a WeakInterner sounds more promising.

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.23.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Data type can be evicted out of cache, which is global, thread-safe. 
> It seems not necessary to cache them globally, because most of them are 
> RelRecordType, which is query dependent, not sharable between different 
> queries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-22 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090165#comment-17090165
 ] 

Haisheng Yuan commented on CALCITE-3932:


Thanks for your advice. [~Junzheng Zheng] I opened a PR.

[~julianhyde] [~danny0405] [~vladimirsitnikov] Can you take a look?

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.23.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Data type can be evicted out of cache, which is global, thread-safe. 
> It seems not necessary to cache them globally, because most of them are 
> RelRecordType, which is query dependent, not sharable between different 
> queries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-22 Thread Junzheng Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090161#comment-17090161
 ] 

Junzheng Zheng commented on CALCITE-3932:
-

I think use a private attribute of RelDataTypeFactory to hold the cache is OK 
since every query has its own RelDataTypeFactory . If we use threadLocal, a 
thread may be not correspoding to a query, e.q. if use a single thread to 
optimize all queries then the cache will be a global cache.

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.23.0
>
>
> Data type can be evicted out of cache, which is global, thread-safe. 
> It seems not necessary to cache them globally, because most of them are 
> RelRecordType, which is query dependent, not sharable between different 
> queries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-22 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089712#comment-17089712
 ] 

Haisheng Yuan commented on CALCITE-3932:


If we consider the use of ThreadPool, we probably still need evictable cache, 
e.g. weak interner. 

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.23.0
>
>
> Data type can be evicted out of cache, which is global, thread-safe. 
> It seems not necessary to cache them globally, because most of them are 
> RelRecordType, which is query dependent, not sharable between different 
> queries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-17 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17086242#comment-17086242
 ] 

Haisheng Yuan commented on CALCITE-3932:


[~julianhyde] What do you think? Are the two caches good candidates to use 
ThreadLocal instead of global?

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> Data type can be evicted out of cache, which is global, thread-safe. 
> It seems not necessary to cache them globally, because most of them are 
> RelRecordType, which is query dependent, not sharable between different 
> queries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-16 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085406#comment-17085406
 ] 

Haisheng Yuan commented on CALCITE-3932:


How can it be garbage collected while still being referenced? Even for weak 
interner?

 

In this issue, I want it non-evictable because guava strong interner is better 
performanced given that we want the cache short lived (suppose there is no 
thread-pool).

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> Data type can be evicted out of cache, which is global, thread-safe. 
> It seems not necessary to cache them globally, because most of them are 
> RelRecordType, which is query dependent, not sharable between different 
> queries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-16 Thread neoremind (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085404#comment-17085404
 ] 

neoremind commented on CALCITE-3932:


I was thinking if we use ThreadLocal, are there any places to clean the 
ThreadLocal? If not intend to clean, then making them soft or weak would be 
mandatory, because GC will evict them.

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> Data type can be evicted out of cache, which is global, thread-safe. 
> It seems not necessary to cache them globally, because most of them are 
> RelRecordType, which is query dependent, not sharable between different 
> queries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-16 Thread Danny Chen (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085402#comment-17085402
 ] 

Danny Chen commented on CALCITE-3932:
-

We compare our types in many places with `==` in our code, so the cached types 
should never be weak or soft references [1]

[1] 
https://github.com/apache/calcite/blob/2798d90a8082adb294c3aadbad6d22d9228abf3e/core/src/main/java/org/apache/calcite/plan/RelOptUtil.java#L2184

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> Data type can be evicted out of cache, which is global, thread-safe. 
> It seems not necessary to cache them globally, because most of them are 
> RelRecordType, which is query dependent, not sharable between different 
> queries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-16 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085322#comment-17085322
 ] 

Haisheng Yuan commented on CALCITE-3932:


RelDataTypeFactoryImpl KEY2TYPE_CACHE and DATATYPE_CACHE.

[https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/type/RelDataTypeFactoryImpl.java#L53-L65]

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> Data type can be evicted out of cache, which is global, thread-safe. 
> It seems not necessary to cache them globally, because most of them are 
> RelRecordType, which is query dependent, not sharable between different 
> queries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-16 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085317#comment-17085317
 ] 

Julian Hyde commented on CALCITE-3932:
--

What data type cache are you referring to? Each query has a RelDataTypeFactory, 
and that acts as cache.

Tying caches to threads seems arbitrary to me.

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> Data type can be evicted out of cache, which is global, thread-safe. 
> It seems not necessary to cache them globally, because most of them are 
> RelRecordType, which is query dependent, not sharable between different 
> queries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-16 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085290#comment-17085290
 ] 

Haisheng Yuan commented on CALCITE-3932:


Most of the data type among different queries are not reusable except basic sql 
type. 

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> Data type can be evicted out of cache, which is global, thread-safe. 
> It seems not necessary to cache them globally, because most of them are 
> RelRecordType, which is query dependent, not sharable between different 
> queries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)