[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation URL: https://github.com/apache/calcite/pull/1875#issuecomment-615893085 I have copied conclusions to JIRA case. Many thanks to Vladimir, Haisheng, Danny, Chunwei and Stamatis to take time reviewing my code kindly 😃 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation URL: https://github.com/apache/calcite/pull/1875#issuecomment-611911050 @danny0405 Many thanks for your advice. Can I close this PR and open a new one only to add unit test and benchmark? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation URL: https://github.com/apache/calcite/pull/1875#issuecomment-611891530 @hsyuan Thanks for your advice about softValues. Actually now I am a little reluctant to the change, because according to benchmark, by using Guava global cache, the overhead is a little bigger than just looking up method by Reflection every time (<20 ns/op gap), and the most performant option is to use JDK ConcurrentHashMap as global cache but without size limitation and expiration for GC. The master branch version may call Reflection the first time, and then will use JDK HashMap as cache in instance level which does good. So my opinion is, if we can not use JDK ConcurrentHashMap as global cache, then the update may no longer necessary. Hopefully the unit test and benchmark could be useful after all. Sorry for making the thread so long. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation URL: https://github.com/apache/calcite/pull/1875#issuecomment-611042597 @vlsi Thank you for your quick response. I have added new benchmark test case for baseline to disable guava caching. The results shows as below (code is in [my branch](https://github.com/neoremind/calcite/blob/refvisit_benchmark/ubenchmark/src/jmh/java/org/apache/calcite/benchmarks/ReflectiveVisitDispatcherTest2.java)). ``` Benchmark (guavaCacheMaxSize) Mode CntScoreError Units ReflectiveVisitDispatcherTest102.testGlobalCachingUsingGuavaCache 0 avgt5 116.446 ± 8.957 ns/op ReflectiveVisitDispatcherTest102.testGlobalCachingUsingGuavaCache 128 avgt5 133.678 ± 8.251 ns/op ReflectiveVisitDispatcherTest102.testGlobalCachingUsingHashMap 0 avgt5 73.213 ± 7.614 ns/op ReflectiveVisitDispatcherTest102.testInstanceCachingWithReflection 0 avgt5 227.130 ± 9.033 ns/op ``` Table title are operations performed by each solution. | | lookup method | get by key from map | put key value to map | time cost | -|-|-|-|-| `testGlobalCachingUsingGuavaCache with guavaCacheMaxSize = 0` (Disable cache, looking up method every time for each call) | Y | | | 116.446 ns/op | `testGlobalCachingUsingGuavaCache with guavaCacheMaxSize = 128` (Global cache with guava cache) | | Y | | 133.678 ns/op | `testGlobalCachingUsingHashMap` (Global cache with JDK ConcurrentHashMap) | | Y | | 73.213 ns/op | `testInstanceCachingWithReflection` (Original implementation to cache per instance for each call) | Y | Y | Y | 227.130 ns/op | The conclusion we could get is 1. Guava cache overhead is a little too big, compared to JDK ConcurrentHashMap. But JDK ConcurrentHashMap is not limit sized, so in the previous discussion, we do not choose JDK ConcurrentHashMap. 2. Guava cache overhead is even bigger than looking up method for each call. This is what surprised me. 3. The original implementation is the slowest one because it has to get from map, look up method then put method to map, it combines all the operations, the solution works well in the same instance, but for every Calcite invocation it has to sacrifice performance to look up method every time, that is where the PR tries to improve. The current version does improve performance and balance memory usage. But it is slower than 1) JDK ConcurrentHashMap for caching (but not limit size), 2) looking up method direct in each call without map operations. PS. Correct me if I am wrong, there is no `softKeys` api in guava, actually I find places in Calcite to hold classes as key in HashMap as global caching, like `org.apache.calcite.sql2rel.ReflectiveConvertletTable`, so I was wondering if global caching with JDK ConcurrentHashMap could work here as well. Looking forward to your advice. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation URL: https://github.com/apache/calcite/pull/1875#issuecomment-610838697 @vlsi sorry to bother you, would you take a look of my update again? I'd like to move forward this PR, many thanks 😄 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation URL: https://github.com/apache/calcite/pull/1875#issuecomment-607168132 @vlsi I have addressed your comments. 1) Set max cache to 0 to disable global caching. 2) Use `CacheBuilder.newBuilder().softValues()`, does not use weak values, because I think soft references are most commonly used to implement memory-sensitive caches, they lives longer until before an OOM is thrown. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation URL: https://github.com/apache/calcite/pull/1875#issuecomment-607135659 I see, how about using weak or soft values? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation URL: https://github.com/apache/calcite/pull/1875#issuecomment-607112581 @vlsi I replied my your comment in the code review section above. PS. There are global caches existing like below, all of them reachable, except that the`Integer.MAX_VALUE` limit might exhaust memory. ``` public static final CalciteSystemProperty METADATA_HANDLER_CACHE_MAXIMUM_SIZE = intProperty("calcite.metadata.handler.cache.maximum.size", 1000); public static final CalciteSystemProperty BINDABLE_CACHE_MAX_SIZE = intProperty("calcite.bindable.cache.maxSize", 0, v -> v >= 0 && v <= Integer.MAX_VALUE); ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation URL: https://github.com/apache/calcite/pull/1875#issuecomment-606988861 @vlsi Would you help me review the PR again? Many thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation URL: https://github.com/apache/calcite/pull/1875#issuecomment-605405488 @vlsi I refined the benchmark in [my branch](https://github.com/neoremind/calcite/blob/refvisit_benchmark/ubenchmark/src/jmh/java/org/apache/calcite/benchmarks/ReflectiveVisitDispatcherTest.java). The result is as below. ``` Benchmark Mode Cnt Score Error Units ReflectiveVisitDispatcherTest.testGlobalCaching avgt3137.270 ±32.275 ns/op ReflectiveVisitDispatcherTest.testInstanceCachingWithLambdaFactory avgt3 92459.660 ± 56777.698 ns/op ReflectiveVisitDispatcherTest.testInstanceCachingWithLambdaFactoryThreadLocalInitialize avgt3199.687 ±24.755 ns/op ReflectiveVisitDispatcherTest.testInstanceCachingWithMethodHandle avgt3 2421.131 ± 633.756 ns/op ReflectiveVisitDispatcherTest.testInstanceCachingWithMethodHandleThreadLocalInitialize avgt3218.616 ±41.754 ns/op ReflectiveVisitDispatcherTest.testInstanceCachingWithReflection avgt3224.603 ±22.770 ns/op ReflectiveVisitDispatcherTest.testInstanceCachingWithReflectionThreadLocalInitialize avgt3218.406 ± 9.620 ns/op ``` From the result, it seems the pre-work is time consuming for both `MethodHandle` and `LambdaFactory` (built upon MethodHandle). To eliminate the initialization effect, I try to use ThreadLocal to initialize once, the result do verifies that, in terms of time cost, LambdaFactory < MethodHandle = reflection, but the gap is very small. Overall, back to what the PR brings, the global caching stragety with reflection invocation can satisfy our requirement during the whole process, instead of caching per instance or per thread. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation URL: https://github.com/apache/calcite/pull/1875#issuecomment-605105348 @vlsi good idea, let me try later. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation URL: https://github.com/apache/calcite/pull/1875#issuecomment-605091418 @vlsi , as @hsyuan suggested, I make global caching by default. In order to benchmark against current instance level caching using `Reflection` and `MethodHandle`, especially for `MethodHandle`, classes like `ReflectUtil` and `ReflectiveVisitDispatcher` need to be updated much, so I make that change in my [fork branch](https://github.com/neoremind/calcite/blob/refvisit_benchmark/ubenchmark/src/jmh/java/org/apache/calcite/benchmarks/ReflectiveVisitDispatcherTest.java). The test result is a little out of expectation. ``` Benchmark Mode Cnt Score Error Units ReflectiveVisitDispatcherTest.testReflectiveVisitorDispatcherInvokeGlobalCaching avgt3 143.483 ± 32.596 ns/op ReflectiveVisitDispatcherTest.testReflectiveVisitorDispatcherInvokeInstanceCaching avgt3 290.380 ± 857.199 ns/op ReflectiveVisitDispatcherTest.testReflectiveVisitorDispatcherInvokeInstanceCachingUseMethodHandle avgt3 2487.406 ± 1083.180 ns/op ``` PS. I refine test to return generated result from the benchmark methods, update benchmark parameters like fork and iteration times. MethodHandle is way slow than reflection, the below part is performance killer. If I make candidateMethod to be feteched from ThreadLocal which I initialized explicitly, the average time can drop down to about 208 ns/op. ``` MethodType mt = MethodType.methodType(returnType, paramTypes); MethodHandle candidateMethod = MethodHandles.lookup().findVirtual(visitorClass, visitMethodName, mt); ``` I will look into the reason if I have more bandwidth. My benchmark result is pretty much the same with [JavaReflectionButMuchFaster.html](http://www.jboss.org/optaplanner/blog/2018/01/09/JavaReflectionButMuchFaster.html). ``` DirectAccessavgt 60 2.590 ± 0.014 ns/op Reflection avgt 60 5.275 ± 0.053 ns/op // 104% slower MethodHandleavgt 60 6.100 ± 0.079 ns/op // 136% slower ``` We need the reflection agility to be able to find methods at runtime, `MethodHanlde` seems not a good choice. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation URL: https://github.com/apache/calcite/pull/1875#issuecomment-604957529 nice, it works 👏 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation URL: https://github.com/apache/calcite/pull/1875#issuecomment-604889135 @vlsi Have you encountered the following exception when running test in `ubenchmark` module? ``` Exception in thread "main" java.lang.RuntimeException: ERROR: Unable to find the resource: /META-INF/BenchmarkList at org.openjdk.jmh.runner.AbstractResourceReader.getReaders(AbstractResourceReader.java:98) at org.openjdk.jmh.runner.BenchmarkList.find(BenchmarkList.java:122) at org.openjdk.jmh.runner.Runner.internalRun(Runner.java:263) at org.openjdk.jmh.runner.Runner.run(Runner.java:209) at org.apache.calcite.benchmarks.PreconditionTest.main(PreconditionTest.java:51) ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation URL: https://github.com/apache/calcite/pull/1875#issuecomment-604520763 @vlsi I will try to add the benchmark. `MethodHandle` would be a good alternative, since it requires both returned and parameter types, it might take some time to update the code to make it work. I will look into this solution. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [calcite] neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation
neoremind commented on issue #1875: [CALCITE-3873] Use global caching for ReflectiveVisitDispatcher implementation URL: https://github.com/apache/calcite/pull/1875#issuecomment-604479873 @danny0405 jmh test result is as below. It shows by using reflection invocation to look up methods in every call, the overhead is not small compared with global caching. In real case, reflection invocation can happen between invocations and different `ReflectVisitorDispatcher` instance. The solution might be trivial, but I think it goes toward the right way. ``` Benchmark Mode CntScore Error Units ReflectVisitorDispatcherTest.testGlobalLevelCachingavgt 15 108.398 ± 15.280 ns/op ReflectVisitorDispatcherTest.testInstanceLevelCaching avgt 15 730.208 ± 117.693 ns/op ``` Test source code is attached in [JIRA](https://issues.apache.org/jira/secure/attachment/12997903/ReflectVisitorDispatcherTest.java), the full test report can be found [here](https://issues.apache.org/jira/secure/attachment/12997904/jmh_result.txt) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services