[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation

2020-04-18 Thread GitBox
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for 
ReflectiveVisitDispatcher implementation
URL: https://github.com/apache/calcite/pull/1875#issuecomment-615893085
 
 
   I have copied conclusions to JIRA case. Many thanks to Vladimir, Haisheng, 
Danny, Chunwei and Stamatis to take time reviewing my code kindly 😃 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation

2020-04-10 Thread GitBox
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for 
ReflectiveVisitDispatcher implementation
URL: https://github.com/apache/calcite/pull/1875#issuecomment-611911050
 
 
   @danny0405 Many thanks for your advice. Can I close this PR and open a new 
one only to add unit test and benchmark? 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation

2020-04-09 Thread GitBox
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for 
ReflectiveVisitDispatcher implementation
URL: https://github.com/apache/calcite/pull/1875#issuecomment-611891530
 
 
   @hsyuan Thanks for your advice about softValues. Actually now I am a little 
reluctant to the change, because according to benchmark, by using Guava global 
cache, the overhead is a little bigger than just looking up method by 
Reflection every time (<20 ns/op gap), and the most performant option is to use 
JDK ConcurrentHashMap as global cache but without size limitation and 
expiration for GC. The master branch version may call Reflection the first 
time, and then will use JDK HashMap as cache in instance level which does good. 
   So my opinion is, if we can not use JDK ConcurrentHashMap as global cache, 
then the update may no longer necessary. Hopefully the unit test and benchmark 
could be useful after all. Sorry for making the thread so long.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation

2020-04-08 Thread GitBox
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for 
ReflectiveVisitDispatcher implementation
URL: https://github.com/apache/calcite/pull/1875#issuecomment-611042597
 
 
   @vlsi Thank you for your quick response. I have added new benchmark test 
case for baseline to disable guava caching. The results shows as below (code is 
in [my 
branch](https://github.com/neoremind/calcite/blob/refvisit_benchmark/ubenchmark/src/jmh/java/org/apache/calcite/benchmarks/ReflectiveVisitDispatcherTest2.java)).
   
   ```
   Benchmark   
(guavaCacheMaxSize)  Mode  CntScoreError  Units
   ReflectiveVisitDispatcherTest102.testGlobalCachingUsingGuavaCache
 0  avgt5  116.446 ±  8.957  ns/op
   ReflectiveVisitDispatcherTest102.testGlobalCachingUsingGuavaCache
   128  avgt5  133.678 ±  8.251  ns/op
   ReflectiveVisitDispatcherTest102.testGlobalCachingUsingHashMap   
 0  avgt5   73.213 ±  7.614  ns/op
   ReflectiveVisitDispatcherTest102.testInstanceCachingWithReflection   
 0  avgt5  227.130 ±  9.033  ns/op
   ```
   
   Table title are operations performed by each solution.
   
| | lookup method | get by key from map | put key value to map | time cost |
-|-|-|-|-|
   `testGlobalCachingUsingGuavaCache with guavaCacheMaxSize = 0` (Disable 
cache, looking up method every time for each call) | Y |  |  | 116.446 ns/op |
   `testGlobalCachingUsingGuavaCache with guavaCacheMaxSize = 128` (Global 
cache with guava cache) |  | Y |  | 133.678 ns/op |
   `testGlobalCachingUsingHashMap` (Global cache with JDK ConcurrentHashMap) |  
| Y |  | 73.213 ns/op |
   `testInstanceCachingWithReflection` (Original implementation to cache per 
instance for each call) | Y | Y | Y | 227.130 ns/op |
   
   
   The conclusion we could get is 
   1. Guava cache overhead is a little too big, compared to JDK 
ConcurrentHashMap. But JDK ConcurrentHashMap is not limit sized, so in the 
previous discussion, we do not choose JDK ConcurrentHashMap.
   2. Guava cache overhead is even bigger than looking up method for each call. 
This is what surprised me.
   3. The original implementation is the slowest one because it has to get from 
map, look up method then put method to map, it combines all the operations, the 
solution works well in the same instance, but for every Calcite invocation it 
has to sacrifice performance to look up method every time, that is where the PR 
tries to improve. 
   
   The current version does improve performance and balance memory usage. But 
it is slower than 1) JDK ConcurrentHashMap for caching (but not limit size), 2) 
looking up method direct in each call without map operations. 
   
   PS. Correct me if I am wrong, there is no `softKeys` api in guava, actually 
I find places in Calcite to hold classes as key in HashMap as global caching, 
like `org.apache.calcite.sql2rel.ReflectiveConvertletTable`, so I was wondering 
if global caching with JDK ConcurrentHashMap could work here as well. Looking 
forward to your advice.
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation

2020-04-08 Thread GitBox
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for 
ReflectiveVisitDispatcher implementation
URL: https://github.com/apache/calcite/pull/1875#issuecomment-610838697
 
 
   @vlsi sorry to bother you, would you take a look of my update again? I'd 
like to move forward this PR, many thanks 😄 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation

2020-04-01 Thread GitBox
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for 
ReflectiveVisitDispatcher implementation
URL: https://github.com/apache/calcite/pull/1875#issuecomment-607168132
 
 
   @vlsi I have addressed your comments.
   1) Set max cache to 0 to disable global caching.
   2) Use `CacheBuilder.newBuilder().softValues()`, does not use weak values, 
because I think soft references are most commonly used to implement 
memory-sensitive caches, they lives longer until before an OOM is thrown.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation

2020-04-01 Thread GitBox
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for 
ReflectiveVisitDispatcher implementation
URL: https://github.com/apache/calcite/pull/1875#issuecomment-607135659
 
 
   I see, how about using weak or soft values?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation

2020-04-01 Thread GitBox
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for 
ReflectiveVisitDispatcher implementation
URL: https://github.com/apache/calcite/pull/1875#issuecomment-607112581
 
 
   @vlsi I replied my your comment in the code review section above. 
   PS. There are global caches existing like below, all of them reachable, 
except that the`Integer.MAX_VALUE` limit might exhaust memory.
   ```
   public static final CalciteSystemProperty 
METADATA_HANDLER_CACHE_MAXIMUM_SIZE =
 intProperty("calcite.metadata.handler.cache.maximum.size", 1000);
   public static final CalciteSystemProperty BINDABLE_CACHE_MAX_SIZE =
 intProperty("calcite.bindable.cache.maxSize", 0, v -> v >= 0 && v <= 
Integer.MAX_VALUE);
   ```



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation

2020-03-31 Thread GitBox
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for 
ReflectiveVisitDispatcher implementation
URL: https://github.com/apache/calcite/pull/1875#issuecomment-606988861
 
 
   @vlsi Would you help me review the PR again? Many thanks.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation

2020-03-27 Thread GitBox
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for 
ReflectiveVisitDispatcher implementation
URL: https://github.com/apache/calcite/pull/1875#issuecomment-605405488
 
 
   @vlsi I refined the benchmark in [my 
branch](https://github.com/neoremind/calcite/blob/refvisit_benchmark/ubenchmark/src/jmh/java/org/apache/calcite/benchmarks/ReflectiveVisitDispatcherTest.java).
   
   The result is as below.
   ```
   Benchmark
Mode  Cnt  Score   Error  Units
   ReflectiveVisitDispatcherTest.testGlobalCaching  
avgt3137.270 ±32.275  ns/op
   ReflectiveVisitDispatcherTest.testInstanceCachingWithLambdaFactory   
avgt3  92459.660 ± 56777.698  ns/op
   
ReflectiveVisitDispatcherTest.testInstanceCachingWithLambdaFactoryThreadLocalInitialize
  avgt3199.687 ±24.755  ns/op
   ReflectiveVisitDispatcherTest.testInstanceCachingWithMethodHandle
avgt3   2421.131 ±   633.756  ns/op
   
ReflectiveVisitDispatcherTest.testInstanceCachingWithMethodHandleThreadLocalInitialize
   avgt3218.616 ±41.754  ns/op
   ReflectiveVisitDispatcherTest.testInstanceCachingWithReflection  
avgt3224.603 ±22.770  ns/op
   
ReflectiveVisitDispatcherTest.testInstanceCachingWithReflectionThreadLocalInitialize
 avgt3218.406 ± 9.620  ns/op
   ```
   
   From the result, it seems the pre-work is time consuming for both 
`MethodHandle` and `LambdaFactory` (built upon MethodHandle).  To eliminate the 
initialization effect, I try to use ThreadLocal to initialize once, the result 
do verifies that, in terms of time cost, LambdaFactory < MethodHandle = 
reflection, but the gap is very small. 
   
   Overall, back to what the PR brings, the global caching stragety with 
reflection invocation can satisfy our requirement during the whole process, 
instead of caching per instance or per thread.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation

2020-03-27 Thread GitBox
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for 
ReflectiveVisitDispatcher implementation
URL: https://github.com/apache/calcite/pull/1875#issuecomment-605105348
 
 
   @vlsi good idea, let me try later.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation

2020-03-27 Thread GitBox
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for 
ReflectiveVisitDispatcher implementation
URL: https://github.com/apache/calcite/pull/1875#issuecomment-605091418
 
 
   @vlsi  , as @hsyuan suggested, I make global caching by default. In order to 
benchmark against current instance level caching using `Reflection` and 
`MethodHandle`, especially for `MethodHandle`, classes like `ReflectUtil` and 
`ReflectiveVisitDispatcher` need to be updated much, so I make that change in 
my [fork 
branch](https://github.com/neoremind/calcite/blob/refvisit_benchmark/ubenchmark/src/jmh/java/org/apache/calcite/benchmarks/ReflectiveVisitDispatcherTest.java).
 
   
   The test result is a little out of expectation. 
   ```
   Benchmark
  Mode  Cnt Score  Error  Units
   
ReflectiveVisitDispatcherTest.testReflectiveVisitorDispatcherInvokeGlobalCaching
   avgt3   143.483 ±   32.596  ns/op
   
ReflectiveVisitDispatcherTest.testReflectiveVisitorDispatcherInvokeInstanceCaching
 avgt3   290.380 ±  857.199  ns/op
   
ReflectiveVisitDispatcherTest.testReflectiveVisitorDispatcherInvokeInstanceCachingUseMethodHandle
  avgt3  2487.406 ± 1083.180  ns/op
   ```
   PS. I refine test to return generated result from the benchmark methods, 
update benchmark parameters like fork and iteration times.
   
   MethodHandle is way slow than reflection, the below part is performance 
killer. If I make candidateMethod to be feteched from ThreadLocal which I 
initialized explicitly, the average time can drop down to about 208 ns/op.
   ```
   MethodType mt = MethodType.methodType(returnType, paramTypes);
   MethodHandle candidateMethod = 
MethodHandles.lookup().findVirtual(visitorClass, visitMethodName, mt);
   ```
   
   I will look into the reason if I have more bandwidth. 
   My benchmark result is pretty much the same with 
[JavaReflectionButMuchFaster.html](http://www.jboss.org/optaplanner/blog/2018/01/09/JavaReflectionButMuchFaster.html).
   ```
   DirectAccessavgt   60  2.590 ± 0.014  ns/op
   Reflection  avgt   60  5.275 ± 0.053  ns/op // 104% slower
   MethodHandleavgt   60  6.100 ± 0.079  ns/op // 136% slower
   ```
   
   We need the reflection agility to be able to find methods at runtime, 
`MethodHanlde` seems not a good choice.
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation

2020-03-27 Thread GitBox
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for 
ReflectiveVisitDispatcher implementation
URL: https://github.com/apache/calcite/pull/1875#issuecomment-604957529
 
 
   nice, it works 👏 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation

2020-03-27 Thread GitBox
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for 
ReflectiveVisitDispatcher implementation
URL: https://github.com/apache/calcite/pull/1875#issuecomment-604889135
 
 
   @vlsi Have you encountered the following exception when running test in 
`ubenchmark` module?
   ```
   Exception in thread "main" java.lang.RuntimeException: ERROR: Unable to find 
the resource: /META-INF/BenchmarkList
at 
org.openjdk.jmh.runner.AbstractResourceReader.getReaders(AbstractResourceReader.java:98)
at org.openjdk.jmh.runner.BenchmarkList.find(BenchmarkList.java:122)
at org.openjdk.jmh.runner.Runner.internalRun(Runner.java:263)
at org.openjdk.jmh.runner.Runner.run(Runner.java:209)
at 
org.apache.calcite.benchmarks.PreconditionTest.main(PreconditionTest.java:51)
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation

2020-03-26 Thread GitBox
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for 
ReflectiveVisitDispatcher implementation
URL: https://github.com/apache/calcite/pull/1875#issuecomment-604520763
 
 
   @vlsi I will try to add the benchmark. `MethodHandle` would be a good 
alternative, since it requires both returned and parameter types, it might take 
some time to update the code to make it work. I will look into this solution.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation

2020-03-26 Thread GitBox
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for 
ReflectiveVisitDispatcher implementation
URL: https://github.com/apache/calcite/pull/1875#issuecomment-604479873
 
 
   @danny0405 jmh test result is as below. It shows by using reflection 
invocation to look up methods in every call, the overhead is not small compared 
with global caching. In real case, reflection invocation can happen between 
invocations and different `ReflectVisitorDispatcher` instance. The solution 
might be trivial, but I think it goes toward the right way.
   
   ```
   Benchmark  Mode  CntScore
 Error  Units
   ReflectVisitorDispatcherTest.testGlobalLevelCachingavgt   15  108.398 ±  
15.280  ns/op
   ReflectVisitorDispatcherTest.testInstanceLevelCaching  avgt   15  730.208 ± 
117.693  ns/op
   ```
   
   Test source code is attached in 
[JIRA](https://issues.apache.org/jira/secure/attachment/12997903/ReflectVisitorDispatcherTest.java),
 the full test report can be found 
[here](https://issues.apache.org/jira/secure/attachment/12997904/jmh_result.txt)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services