[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1714 @arunmahadevan Yes ideally we really want to have it. I just said that ExprCompiler seems not having a value to keep and maintain. Agreed with you. Thanks for reviewing! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user arunmahadevan commented on the issue: https://github.com/apache/storm/pull/1714 +1 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user arunmahadevan commented on the issue: https://github.com/apache/storm/pull/1714 @HeartSaVioR I suggested to keep an option to plug our code generator. If you think calcites code generator (linq4j) addresses all the current cases then lets go with that. We can revisit having our own code generator within storm-sql once we have more contributors/focussed effort around sql. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1714 @arunmahadevan I'm not sure we can have our own expression compiler which supports features competitive to Calcite. It reminds me why this change is necessary. We have been leaving expression compiler which has lots of unimplemented things and also bugs. It was pushed more than 6 months before and only few of contributors are taking care of. We have lots of milestones (not a single task) to do even without Storm SQL, like Unified API, extending Window feature (session, etc.), Beam integration, Java porting, Metrics, and so on. Even I have been focusing to work on Storm SQL, I give up keeping expression compiler because of having less eyes to watch, and error-prone way of building code block (string concatenation). We could save amount of time if we replace this faster, and it will save another amount of time. Btw, normally we're often worry about dependency if dependency's activity is low or release interval is too long. But Calcite's release interval for minor release is even faster than Storm's bugfix release. Calcite released 1.7.0 from Mar 22, 1.8.0 from Jun 13, and 1.9.0 from Sep 22, and 1.10.0 today. I have already fixed two bugs in unit tests and will be shipped to 1.10.1 or 1.11.0. (Yes in this case we need to wait on that.) I'm not on the fence of making it pluggable, but just wondering we have comparable plugs on that, and we can make an effort for that. Would you want to be a sponsor on current expression compiler? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user arunmahadevan commented on the issue: https://github.com/apache/storm/pull/1714 @HeartSaVioR with this change we are going to be heavily dependent on the calcite's code generator to translate expressions correct?. Just thinking if we should have an option to make the code generation pluggable with an option to plug in our code generator as well. The calcite's code generator can be the default. Just thinking if in future we want to do some optimizations around the generated code or have our own implementation for a built in function or want to fix some bugs in the generated code without waiting for a calcite release etc, it may not be possible with the current patch. Let me know what you think. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1714 Calcite 1.10.0 RC1 vote passed and artifact is available to Maven. Applied HeartSaVioR@b2bcf9b with changing Calcite version from 1.10.0-SNAPSHOT to 1.10.0. Also rebased and squashed the commits. @arunmahadevan I guess this would be the last review for this patch. Please take a look again. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1714 Just found and fixed bugs regarding aggregate calls and timezone issue in current_date. Now aggregate also needs new Trident map API - STORM-2072 (#1663) to reduce the steps. I'll address this after STORM-2072 is merged to master. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1714 FYI: The vote for 1.10.0 RC1 is just started so we can expect Calcite 1.10.0 to next week. Submitted patches for recently found bugs to Calcite: CALCITE-1418 and CALCITE-1419. Above issues are marked for 1.11.0, but it's not a thing to rush since at least we are documenting this for `known bug` from STORM-2135. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1714 Addressed documenting covered feature from other PR: #1725 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1714 Added full of scalar function tests which Calcite provides. Calcite 1.9.0 has some bugs so not all functions are working well, but it can be fixed from Calcite side. (I left comments to test codes about bugs I found.) Once we track and document the bugs to our doc. page, I think it would be OK. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1714 Filed [STORM-2135](https://issues.apache.org/jira/browse/STORM-2135) and [STORM-2136](https://issues.apache.org/jira/browse/STORM-2136). I'd like to get some helps on these issues, but don't mind to work on these by myself. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1714 @arunmahadevan Regarding Calcite release, Calcite community is waiting for one issue to get resolved before starting RC. I guess RC will be available in this week or so. I'll watch Calcite community and do some change (Calcite version) and let you know again. (I didn't think about this but I need to modify this PR slightly because of Calcite version. ;) ) Regarding documenting available features, actually I didn't test all of SQL reference on Calcite. :) We can add test cases to see what features Storm SQL is able to provide, and adopt SQL reference page with some modifications. I'll file a follow up JIRA. Regarding examples, yes, we don't have any useful documents or examples so it should be next work of Storm SQL. I'll file a follow up JIRA separately. Thanks for reviewing! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user arunmahadevan commented on the issue: https://github.com/apache/storm/pull/1714 @HeartSaVioR +1 on this change. If the minor version of the calcite with your fixes can get released soon, it makes sense to wait for that so we can keep the current semantics for querying nested arrays/maps. Can you also file a follow up JIRA to document the functions that are currently available within storm-sql and may be some examples for users to get started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1714 Another head up: my patch got merged to Calcite master, and Calcite dev@ is discussing about releasing new minor version soon. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1714 Just a head up: my patch for Calcite (https://github.com/apache/calcite/pull/283) passes the review and in progress of merging. (handling some comment fixes.) So I'm expecting that we can get runtime safety with next version of Calcite. Do we want to hold on waiting next version of Calcite? Or just ship this as it is for now, and I will submit another pull request following up the change of Calcite? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1714 I'm trying to address our concern of nested map / array since Calcite community (Julian) says it sounds like a bug. Addressing it from CALCITE-1386 : https://github.com/apache/calcite/pull/283. With applying CALCITE-1386, we could make tests back to origin. (It still need to wrap CAST when detailed type information is not available - I mean ANY) Below is the commit which works with CALCITE-1386 patch. https://github.com/HeartSaVioR/storm/commit/b2bcf9baebb8d9bd878c3d8a74c0d366ded7d399 @arunmahadevan Please take a look into above commit. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1714 @arunmahadevan Other than that I addressed your review comments. Thanks for the thoughtful review. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1714 @arunmahadevan Let me summary regarding nested map/array support. * accessing nested element in nested map/array works without modifying Calcite. * it should be defined as ANY. Specifying MAP's value type to MAP doesn't seem to work. I don't understand this behavior but it works anyway. * We need to wrap value with CAST if we need to use value in expression. * For example, `MAPFIELD['a'] = 2` doesn't work but `CAST(MAPFIELD['a'] AS INTEGER) = 2` works. * Using value in projection (fields of SELECT) doesn't need to be wrapped with CAST. * NOTE: this is NOT a safe operation, and errors are shown in runtime. * If value is null in any chance, comparing it to other will **throw RuntimeException** in runtime. * value itself could be null. * value could be null when we access array with wrong index or map with non-exist key. * If we access array with index out of bound, it will **throw ArrayIndexOutOfBoundException** in runtime. So there's a way to support current feature with adding some inconveniences (CAST), but it is no longer safe operation. Since user can't handle errors manually, this can make worker continuously crashed. I asked Calcite dev. group about this and didn't get answer yet. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1714 I don't want to get blocked by non-SQL standard support unless we have clear use case or clear way to implement it. If my understanding is right, both Spark and Flink seems not support this as well, and also Samza SQL too. When this gets merged into master, I expect Storm SQL will support most of SQL reference page on Calcite. We can add tests for checking features availability and document supported features, and it could take some time. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user arunmahadevan commented on the issue: https://github.com/apache/storm/pull/1714 Can you also document the list of functions that are now available since we are directly using calicite to generate the java code? Does it support all the standard SQL functions? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1714 There's an another issue: when to initialize DataContext. Calcite utilizes the variables in DataContext. and there're datetime related variables in DataContext. The thing is, SQL standard says that functions such as CURRENT_TIMESTAMP return the same value throughout the query. Since we're querying in stream, when to initialize DataContext defines the boundary to show same datetime across rows. With joining stream, the issue is more complicated. For now I just initialize DataContext at the query compilation phase and pass it to components, so that the value of current datetime is static (don't change) across workers in topology. If we would like to make it dynamic, we need to discuss. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---