Re: Disabling JaninoRelMetadataProvider
Hi Julian, Thanks for your good suggestion. The idea of delayed freeze sounds reasonable to me, and it provides more flexibility. I also like the idea of an output report which is helpful for performance diagnostics. Best, Liya Fan On Fri, Sep 25, 2020 at 7:39 AM Julian Hyde wrote: > Would it make sense to delay the 'freeze'? E.g. receive 1,000 requests > (as measured using an AtomicInteger) and then freeze the > JaninoRelMetadataProvider. That way, tests would not be competing with > each other. > > Similarly, in production, you could create a production instance of > JaninoRelMetadataProvider with what you believe to be a good list of > classes and metadata. The first request that asks for a class or > metadata will not fail. But when you have gathered, say, 100 missing > classes (and therefore re-generated 100 times), at that point it can > output a list of the classes that were missing from the initial > config. > > Julian > > On Mon, Sep 21, 2020 at 4:53 AM Fan Liya wrote: > > > > Hi Julian, > > > > Thank you for the feedback and good suggestions. > > > > The idea of lazy load is reasonable. However, the testing of this feature > > is still tricky, because: > > 1. When we have changed the flag, all subsequent test cases are all also > > affected (many components of JaninoRelMetadataProvider are declared as > > static, which is used by all test cases in the same JVM, so it is > difficult > > to revert the changes made by a single test case). > > 2. All test cases may be run in parallel, so when we change the flag, > > other test cases running concurrently may fail. (The > > JaninoRelMetadataProvider object is a singleton, which is shared by all > > threads). > > > > I have updated the PR [1] to implement the idea of lazy load. Could > someone > > please take a look? > > > > Best, > > Liya Fan > > > > [1] https://github.com/apache/calcite/pull/1901 > > > > On Sat, Sep 19, 2020 at 1:50 AM Julian Hyde > wrote: > > > > > CALCITE-3901 looks like a reasonable approach. I like the idea that it > > > would throw and tell you the classes and metadata types that you have > > > missed. > > > > > > The packaging in 3901 isn’t great. I don’t believe that the test needs > to > > > read a system property and not even via the CalciteSystemProperty > > > mechanism. We can do better. > > > > > > Maybe we have a class name and arguments for the metadata provider, > which > > > is loaded on demand. Will work for the test and also in production. > > > > > > Julian > > > > > > > On Sep 18, 2020, at 12:57 AM, Fan Liya wrote: > > > > > > > > Hi Enrico, > > > > > > > > We have also observed the problem. > > > > > > > > Janino compilation is time consuming. For each single Java class, it > > > takes > > > > tens of milli-seconds. > > > > Moreover, the compilation will take place multiple times, because: > > > > 1. We have multiple types of metadata. > > > > 2. The class for a particular type of metadata may be regenerated and > > > > recompiled, because a rel node was not registered at first. > > > > > > > > We opened CALCITE-3901 and provided a PR [1] to address problem 2 by > > > > reducing the number of compilations. However, so far it has not been > > > > accepted yet. > > > > > > > > To solve problem 1, I think we have two potential solutions: > > > > 1. Provide a metadata provider that is not based on codegen and > Janino > > > > compilation (Obviously, this will be a long term plan). > > > > 2. Save the generated Java class, and compile it with the client > code, > > > and > > > > change the implementation of JaninoMetaDataProvider#load3 so that if > a > > > > handler can be found in the current class loader, then it skips the > > > codegen > > > > and compilation. > > > > > > > > Best, > > > > Liya Fan > > > > > > > > > > > > > > > > > > > > > > > > > > > > [1] https://github.com/apache/calcite/pull/1901 > > > > > > > > > > > > > > > >> On Fri, Sep 18, 2020 at 3:01 PM Enrico Olivelli < > eolive...@gmail.com> > > > wrote: > > > >> > > > >> Hi, > > > >> I am seeing that JaninoRelMetadataProvider is quite expensive, at > least > > > for > > > >> the "boostrap" phase of the system. > > > >> > > > >> It is a cool piece of software and it is working well, but in some > cases > > > >> probably it could be quite overkill, for instance in test cases of > > > >> applications that mostly run only one query and then end. > > > >> > > > >> You could argue that the price of the compilation is paid only once > and > > > >> there is no need to complicate things for some corner cases that > > > probably > > > >> do not affect production cases. > > > >> > > > >> In CALCITE-3901 we introduced > calcite.enable.regenerate.metadata.handler > > > >> system property in order to limit the number of times that Janino > kicks > > > in > > > >> but not to drop it at all. > > > >> > > > >> From the code I see that it is a subclass of RelMetadataProvider, > but > > > there > > > >> is no way to get rid of it in VolcanoPlanner. > > > >> >
Re: Disabling JaninoRelMetadataProvider
Would it make sense to delay the 'freeze'? E.g. receive 1,000 requests (as measured using an AtomicInteger) and then freeze the JaninoRelMetadataProvider. That way, tests would not be competing with each other. Similarly, in production, you could create a production instance of JaninoRelMetadataProvider with what you believe to be a good list of classes and metadata. The first request that asks for a class or metadata will not fail. But when you have gathered, say, 100 missing classes (and therefore re-generated 100 times), at that point it can output a list of the classes that were missing from the initial config. Julian On Mon, Sep 21, 2020 at 4:53 AM Fan Liya wrote: > > Hi Julian, > > Thank you for the feedback and good suggestions. > > The idea of lazy load is reasonable. However, the testing of this feature > is still tricky, because: > 1. When we have changed the flag, all subsequent test cases are all also > affected (many components of JaninoRelMetadataProvider are declared as > static, which is used by all test cases in the same JVM, so it is difficult > to revert the changes made by a single test case). > 2. All test cases may be run in parallel, so when we change the flag, > other test cases running concurrently may fail. (The > JaninoRelMetadataProvider object is a singleton, which is shared by all > threads). > > I have updated the PR [1] to implement the idea of lazy load. Could someone > please take a look? > > Best, > Liya Fan > > [1] https://github.com/apache/calcite/pull/1901 > > On Sat, Sep 19, 2020 at 1:50 AM Julian Hyde wrote: > > > CALCITE-3901 looks like a reasonable approach. I like the idea that it > > would throw and tell you the classes and metadata types that you have > > missed. > > > > The packaging in 3901 isn’t great. I don’t believe that the test needs to > > read a system property and not even via the CalciteSystemProperty > > mechanism. We can do better. > > > > Maybe we have a class name and arguments for the metadata provider, which > > is loaded on demand. Will work for the test and also in production. > > > > Julian > > > > > On Sep 18, 2020, at 12:57 AM, Fan Liya wrote: > > > > > > Hi Enrico, > > > > > > We have also observed the problem. > > > > > > Janino compilation is time consuming. For each single Java class, it > > takes > > > tens of milli-seconds. > > > Moreover, the compilation will take place multiple times, because: > > > 1. We have multiple types of metadata. > > > 2. The class for a particular type of metadata may be regenerated and > > > recompiled, because a rel node was not registered at first. > > > > > > We opened CALCITE-3901 and provided a PR [1] to address problem 2 by > > > reducing the number of compilations. However, so far it has not been > > > accepted yet. > > > > > > To solve problem 1, I think we have two potential solutions: > > > 1. Provide a metadata provider that is not based on codegen and Janino > > > compilation (Obviously, this will be a long term plan). > > > 2. Save the generated Java class, and compile it with the client code, > > and > > > change the implementation of JaninoMetaDataProvider#load3 so that if a > > > handler can be found in the current class loader, then it skips the > > codegen > > > and compilation. > > > > > > Best, > > > Liya Fan > > > > > > > > > > > > > > > > > > > > > [1] https://github.com/apache/calcite/pull/1901 > > > > > > > > > > > >> On Fri, Sep 18, 2020 at 3:01 PM Enrico Olivelli > > wrote: > > >> > > >> Hi, > > >> I am seeing that JaninoRelMetadataProvider is quite expensive, at least > > for > > >> the "boostrap" phase of the system. > > >> > > >> It is a cool piece of software and it is working well, but in some cases > > >> probably it could be quite overkill, for instance in test cases of > > >> applications that mostly run only one query and then end. > > >> > > >> You could argue that the price of the compilation is paid only once and > > >> there is no need to complicate things for some corner cases that > > probably > > >> do not affect production cases. > > >> > > >> In CALCITE-3901 we introduced calcite.enable.regenerate.metadata.handler > > >> system property in order to limit the number of times that Janino kicks > > in > > >> but not to drop it at all. > > >> > > >> From the code I see that it is a subclass of RelMetadataProvider, but > > there > > >> is no way to get rid of it in VolcanoPlanner. > > >> > > >> Also, Janino is a third party library, and having the ability of > > dropping > > >> it will help in having a smaller set of dependencies downstream (but > > this > > >> is not blocker at the moment, it is only a nice-to-have) > > >> > > >> Do you have any experience with this problem ? Is there any chance to > > add > > >> some configuration option to pass a RelMetadataProvider implementation > > >> that does not rely on dynamic code generation ? > > >> If the idea is valuable I will be happy to work on it. > > >> > > >> See this issue for reference, with a full stac
Re: Disabling JaninoRelMetadataProvider
Hi Julian, Thank you for the feedback and good suggestions. The idea of lazy load is reasonable. However, the testing of this feature is still tricky, because: 1. When we have changed the flag, all subsequent test cases are all also affected (many components of JaninoRelMetadataProvider are declared as static, which is used by all test cases in the same JVM, so it is difficult to revert the changes made by a single test case). 2. All test cases may be run in parallel, so when we change the flag, other test cases running concurrently may fail. (The JaninoRelMetadataProvider object is a singleton, which is shared by all threads). I have updated the PR [1] to implement the idea of lazy load. Could someone please take a look? Best, Liya Fan [1] https://github.com/apache/calcite/pull/1901 On Sat, Sep 19, 2020 at 1:50 AM Julian Hyde wrote: > CALCITE-3901 looks like a reasonable approach. I like the idea that it > would throw and tell you the classes and metadata types that you have > missed. > > The packaging in 3901 isn’t great. I don’t believe that the test needs to > read a system property and not even via the CalciteSystemProperty > mechanism. We can do better. > > Maybe we have a class name and arguments for the metadata provider, which > is loaded on demand. Will work for the test and also in production. > > Julian > > > On Sep 18, 2020, at 12:57 AM, Fan Liya wrote: > > > > Hi Enrico, > > > > We have also observed the problem. > > > > Janino compilation is time consuming. For each single Java class, it > takes > > tens of milli-seconds. > > Moreover, the compilation will take place multiple times, because: > > 1. We have multiple types of metadata. > > 2. The class for a particular type of metadata may be regenerated and > > recompiled, because a rel node was not registered at first. > > > > We opened CALCITE-3901 and provided a PR [1] to address problem 2 by > > reducing the number of compilations. However, so far it has not been > > accepted yet. > > > > To solve problem 1, I think we have two potential solutions: > > 1. Provide a metadata provider that is not based on codegen and Janino > > compilation (Obviously, this will be a long term plan). > > 2. Save the generated Java class, and compile it with the client code, > and > > change the implementation of JaninoMetaDataProvider#load3 so that if a > > handler can be found in the current class loader, then it skips the > codegen > > and compilation. > > > > Best, > > Liya Fan > > > > > > > > > > > > > > [1] https://github.com/apache/calcite/pull/1901 > > > > > > > >> On Fri, Sep 18, 2020 at 3:01 PM Enrico Olivelli > wrote: > >> > >> Hi, > >> I am seeing that JaninoRelMetadataProvider is quite expensive, at least > for > >> the "boostrap" phase of the system. > >> > >> It is a cool piece of software and it is working well, but in some cases > >> probably it could be quite overkill, for instance in test cases of > >> applications that mostly run only one query and then end. > >> > >> You could argue that the price of the compilation is paid only once and > >> there is no need to complicate things for some corner cases that > probably > >> do not affect production cases. > >> > >> In CALCITE-3901 we introduced calcite.enable.regenerate.metadata.handler > >> system property in order to limit the number of times that Janino kicks > in > >> but not to drop it at all. > >> > >> From the code I see that it is a subclass of RelMetadataProvider, but > there > >> is no way to get rid of it in VolcanoPlanner. > >> > >> Also, Janino is a third party library, and having the ability of > dropping > >> it will help in having a smaller set of dependencies downstream (but > this > >> is not blocker at the moment, it is only a nice-to-have) > >> > >> Do you have any experience with this problem ? Is there any chance to > add > >> some configuration option to pass a RelMetadataProvider implementation > >> that does not rely on dynamic code generation ? > >> If the idea is valuable I will be happy to work on it. > >> > >> See this issue for reference, with a full stacktrace of the execution > >> https://github.com/diennea/herddb/issues/517 > >> > >> Regards > >> Enrico > >> >
Re: Disabling JaninoRelMetadataProvider
CALCITE-3901 looks like a reasonable approach. I like the idea that it would throw and tell you the classes and metadata types that you have missed. The packaging in 3901 isn’t great. I don’t believe that the test needs to read a system property and not even via the CalciteSystemProperty mechanism. We can do better. Maybe we have a class name and arguments for the metadata provider, which is loaded on demand. Will work for the test and also in production. Julian > On Sep 18, 2020, at 12:57 AM, Fan Liya wrote: > > Hi Enrico, > > We have also observed the problem. > > Janino compilation is time consuming. For each single Java class, it takes > tens of milli-seconds. > Moreover, the compilation will take place multiple times, because: > 1. We have multiple types of metadata. > 2. The class for a particular type of metadata may be regenerated and > recompiled, because a rel node was not registered at first. > > We opened CALCITE-3901 and provided a PR [1] to address problem 2 by > reducing the number of compilations. However, so far it has not been > accepted yet. > > To solve problem 1, I think we have two potential solutions: > 1. Provide a metadata provider that is not based on codegen and Janino > compilation (Obviously, this will be a long term plan). > 2. Save the generated Java class, and compile it with the client code, and > change the implementation of JaninoMetaDataProvider#load3 so that if a > handler can be found in the current class loader, then it skips the codegen > and compilation. > > Best, > Liya Fan > > > > > > > [1] https://github.com/apache/calcite/pull/1901 > > > >> On Fri, Sep 18, 2020 at 3:01 PM Enrico Olivelli wrote: >> >> Hi, >> I am seeing that JaninoRelMetadataProvider is quite expensive, at least for >> the "boostrap" phase of the system. >> >> It is a cool piece of software and it is working well, but in some cases >> probably it could be quite overkill, for instance in test cases of >> applications that mostly run only one query and then end. >> >> You could argue that the price of the compilation is paid only once and >> there is no need to complicate things for some corner cases that probably >> do not affect production cases. >> >> In CALCITE-3901 we introduced calcite.enable.regenerate.metadata.handler >> system property in order to limit the number of times that Janino kicks in >> but not to drop it at all. >> >> From the code I see that it is a subclass of RelMetadataProvider, but there >> is no way to get rid of it in VolcanoPlanner. >> >> Also, Janino is a third party library, and having the ability of dropping >> it will help in having a smaller set of dependencies downstream (but this >> is not blocker at the moment, it is only a nice-to-have) >> >> Do you have any experience with this problem ? Is there any chance to add >> some configuration option to pass a RelMetadataProvider implementation >> that does not rely on dynamic code generation ? >> If the idea is valuable I will be happy to work on it. >> >> See this issue for reference, with a full stacktrace of the execution >> https://github.com/diennea/herddb/issues/517 >> >> Regards >> Enrico >>
Re: Disabling JaninoRelMetadataProvider
Hi Enrico, We have also observed the problem. Janino compilation is time consuming. For each single Java class, it takes tens of milli-seconds. Moreover, the compilation will take place multiple times, because: 1. We have multiple types of metadata. 2. The class for a particular type of metadata may be regenerated and recompiled, because a rel node was not registered at first. We opened CALCITE-3901 and provided a PR [1] to address problem 2 by reducing the number of compilations. However, so far it has not been accepted yet. To solve problem 1, I think we have two potential solutions: 1. Provide a metadata provider that is not based on codegen and Janino compilation (Obviously, this will be a long term plan). 2. Save the generated Java class, and compile it with the client code, and change the implementation of JaninoMetaDataProvider#load3 so that if a handler can be found in the current class loader, then it skips the codegen and compilation. Best, Liya Fan [1] https://github.com/apache/calcite/pull/1901 On Fri, Sep 18, 2020 at 3:01 PM Enrico Olivelli wrote: > Hi, > I am seeing that JaninoRelMetadataProvider is quite expensive, at least for > the "boostrap" phase of the system. > > It is a cool piece of software and it is working well, but in some cases > probably it could be quite overkill, for instance in test cases of > applications that mostly run only one query and then end. > > You could argue that the price of the compilation is paid only once and > there is no need to complicate things for some corner cases that probably > do not affect production cases. > > In CALCITE-3901 we introduced calcite.enable.regenerate.metadata.handler > system property in order to limit the number of times that Janino kicks in > but not to drop it at all. > > From the code I see that it is a subclass of RelMetadataProvider, but there > is no way to get rid of it in VolcanoPlanner. > > Also, Janino is a third party library, and having the ability of dropping > it will help in having a smaller set of dependencies downstream (but this > is not blocker at the moment, it is only a nice-to-have) > > Do you have any experience with this problem ? Is there any chance to add > some configuration option to pass a RelMetadataProvider implementation > that does not rely on dynamic code generation ? > If the idea is valuable I will be happy to work on it. > > See this issue for reference, with a full stacktrace of the execution > https://github.com/diennea/herddb/issues/517 > > Regards > Enrico >
Disabling JaninoRelMetadataProvider
Hi, I am seeing that JaninoRelMetadataProvider is quite expensive, at least for the "boostrap" phase of the system. It is a cool piece of software and it is working well, but in some cases probably it could be quite overkill, for instance in test cases of applications that mostly run only one query and then end. You could argue that the price of the compilation is paid only once and there is no need to complicate things for some corner cases that probably do not affect production cases. In CALCITE-3901 we introduced calcite.enable.regenerate.metadata.handler system property in order to limit the number of times that Janino kicks in but not to drop it at all. >From the code I see that it is a subclass of RelMetadataProvider, but there is no way to get rid of it in VolcanoPlanner. Also, Janino is a third party library, and having the ability of dropping it will help in having a smaller set of dependencies downstream (but this is not blocker at the moment, it is only a nice-to-have) Do you have any experience with this problem ? Is there any chance to add some configuration option to pass a RelMetadataProvider implementation that does not rely on dynamic code generation ? If the idea is valuable I will be happy to work on it. See this issue for reference, with a full stacktrace of the execution https://github.com/diennea/herddb/issues/517 Regards Enrico