Hi Ken, Thank you for the follow up. I agree with all your points above. I also think that any meaningful reason should allow us to remove the `final` declaration for any Step. I believe this rule should be followed at least till another approach to provide optimizations for Steps is in place.
P.S. I opened a follow up PR to remove `final` keyword from `LabelStep` here: https://github.com/apache/tinkerpop/pull/2097 Best regards, Oleksandr On Wed, Jul 12, 2023 at 12:10 AM Ken Hu <k...@bitquilltech.com.invalid> wrote: > Hi All, > > Recently, I've been thinking about this again and decided to quickly look > at some statistics of final/non-final Step classes. After a quick search of > the concrete Steps, the split is roughly 118 non-final Steps and 95 final > Steps. Given that more than half the Steps have already been opened, it > makes sense to allow the rest of them to become non-final. I mostly agree > with the three step approach from above but I wouldn't remove final from > every Step in a single PR as suggested in 1. I think it still makes sense > to do it on a case by case basis, however, the requirement for removing > final should be very low (that is almost any reason can be given for > wanting to remove final). > > That being said, I think this is true only for TinkerPop 3. As we move > forward, I think it makes more sense for us to revisit how Steps have been > implemented. Recently, a lot of the modifications have been made for the > sake of performance. If we could change the way we execute the steps to > allow for the optimizations in TinkerPop itself then maybe in a future > version we could still keep the Steps as being final. > > In summary, I'm suggesting that we approve most PRs that are requesting to > remove final from a Step as long as they follow some variation of the three > step approach that Oleksandr suggested before. > > Thanks, > Ken > > On Thu, Apr 20, 2023 at 9:52 AM Ken Hu <k...@bitquilltech.com> wrote: > > > I agree that there are some inconsistencies that we will need to work > > through. I've taken a quick look at your three PRs and they seem > reasonable > > to me. I will officially review them soon. > > > > On Thu, Apr 20, 2023 at 9:28 AM Oleksandr Porunov < > > alexandr.poru...@gmail.com> wrote: > > > >> Hi Ken, > >> > >> This is a great input. Thank you for the feedback. I believe you are > right > >> and we should check each case separately. > >> That said, I was just worried if Graph Providers start blindly replacing > >> steps with optimised versions of those steps then it becomes hard to > track > >> how it affects TinkerPop side optimization strategies and custom Steps. > >> For example, let's take a look at any random optimization strategy on > >> TinkerPop's side. For example EarlyLimitStrategy > >> < > >> > https://github.com/apache/tinkerpop/blob/3.6-dev/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/optimization/EarlyLimitStrategy.java > >> > > >> . > >> This strategy specifically has specific logic regarding usage of > >> `RangeGlobalStep`. > >> I.e. `if (step instanceof RangeGlobalStep)`. > >> If we check this step it's `final` (`public final class > RangeGlobalStep`). > >> In such a case nobody could actually easily replace this strategy > without > >> affecting `EarlyLimitStrategy` because we can't extend / overwrite > >> RangeGlobalStep due to being final. > >> Moreover `if (step instanceof RangeGlobalStep)` could actually be > replaced > >> by `if (RangeGlobalStep.class.equals(step.getClass()))` because we know > >> there are no instances of `RangeGlobalStep` other than `RangeGlobalStep` > >> itself, but the strategy tells that any instance of `RangeGlobalStep` is > >> valid for optimization. > >> > >> The above relation of Optimizations and Steps affects many Step classes > >> and > >> many optimization strategies. Moreover it's kind of inconsistent that > some > >> of the steps are `final` and some of them are non-final even without > >> having > >> any other classes extending them in the TinkerPop codebase. For example, > >> PropertiesStep and PropertyMapStep are non-final, but they don't have > any > >> classes extending them inside TinkerPop codebase. Logically Graph > >> Providers > >> extend and replace those steps, but then why do we keep those steps open > >> if > >> their codebase is bigger and more likely to be changed then the other > >> `final` steps which are more likely to stay as is? (i.e. I would expect > >> PropertyMap step to be changed more often then the IsStep for example, > >> even > >> so the latter one is non-final). > >> > >> I do agree with you that we should probably be curious of any methods / > >> fields which we open with `protected` and don't blindly replace > `private` > >> fields / methods to `protected` fields / methods (as I did in the PR), > but > >> I doubt any Step should be `final`. > >> Taking your input into account I think the way we could do, is: > >> 1. Open all Step classes for extension by removing `final`. (do this in > a > >> single PR for all steps) > >> 2. Move all `private` / `protected` utility methods out from Step > classes > >> into some utility classes. (do this case by case depending on the step) > >> 3. Add public getter methods and / or public constructors to be able to > >> copy necessary step fields from the original step into a new extending > >> step. (do this case by case depending on the step) > >> > >> Thus, I believe the original PR > >> <https://github.com/apache/tinkerpop/pull/2027> should be closed and > >> separate PRs should be opened. So far I opened 3 separate small PRs to > >> make > >> extensions for the specific steps. > >> https://github.com/apache/tinkerpop/pull/2029 > >> https://github.com/apache/tinkerpop/pull/2030 > >> https://github.com/apache/tinkerpop/pull/2031 > >> > >> Would be great to know what you think about it. > >> > >> Thanks, > >> Oleksandr > >> > >> > >> On Thu, Apr 20, 2023 at 4:42 PM Ken Hu <k...@bitquilltech.com.invalid> > >> wrote: > >> > >> > Hi Oleksandr, > >> > > >> > Thanks for bringing this idea to the community. However, I would be > very > >> > hesitant to include such a large change to TinkerPop. The Gremlin > >> language > >> > doesn't really have a specification which means that what a Step does > is > >> > more susceptible to change. I also feel like the current > implementation > >> of > >> > Steps wasn't designed to be a stable and open interface so I would > >> consider > >> > them to be internal TinkerPop classes. From what I've heard, some of > >> these > >> > classes have intentionally been marked as final to prevent misusage > >> which > >> > is something that has happened in the past. Another concern I have is > >> the > >> > increased potential to break other's changes during refactoring or bug > >> > fixing. It's easy to run into a situation where you make a fix for an > >> issue > >> > but don't realize that you have accidentally broken someone else's > >> > implementation because you didn't know their code had a specific > >> functional > >> > dependency on the current implementation. This scenario is much more > >> likely > >> > to occur when you open up this many classes and there is no way to > >> predict > >> > what others will do with it. > >> > > >> > I think we can revisit this idea in the future once we have a firmer > >> > specification for the Gremlin language. For now I think it is better > to > >> > take these changes on in a case by case basis. > >> > > >> > Thanks, > >> > Ken > >> > > >> > On Tue, Apr 18, 2023 at 6:27 PM Oleksandr Porunov < > >> > alexandr.poru...@gmail.com> wrote: > >> > > >> > > Hello everyone, > >> > > > >> > > I would like to get some opinion from the community regarding > relaxing > >> > > strictness of Step classes for better extensibility. > >> > > The problem I encounter while making some optimizations for > >> JanusGraph is > >> > > that many steps are not quite extensible due to being marked as > >> `final` > >> > or > >> > > simply having important fields as `private` without any `public` > >> getters. > >> > > Thus, making it complicated to replace original TinkerPop Steps with > >> > Graph > >> > > Provider's optimized Steps. > >> > > In this discussion I would like to know community opinion on > relaxing > >> > Step > >> > > classes strictness with making the following changes: > >> > > > >> > > 1. Remove `final` keyword from all Step classes, so that it would be > >> > > possible to extend / overwrite them. Without this change Graph > >> Providers > >> > > might need to duplicate some logic during their custom Step > >> > implementation > >> > > (because they won't be able to extend an already existing step and > >> > instead > >> > > will need to re-implement it from scratch). > >> > > Real example: > >> > > We want to optimize the `ProjectStep` step to be able to query all > >> > included > >> > > traversals in parallel instead of sequentially ( > >> > > https://github.com/JanusGraph/janusgraph/issues/3559 ). Current > >> > > TinkerPop's ProjectStep if `final`. Thus, if we want to replace this > >> step > >> > > with the optimized version of this step we will need to create our > >> custom > >> > > step which will have most of the logic duplicated with ProjectStep > >> (most > >> > > likely the only difference will be in the `map` step). > >> > > > >> > > 2. Convert all methods and fields from `private` to `protected` to > be > >> > able > >> > > to access them in extended Step versions. In case the extended > version > >> > > overwrites some of the original TinkerPop functionality then the > child > >> > > class may potentially need to access some of the parent's class > >> fields. > >> > If > >> > > those fields are private and don't have any getters then it might be > >> > > complicated to retrieve that information (it will require using > >> > > Reflection API). > >> > > Having fields as `protected` will allow for extended child classes > to > >> > > access parent's fields. > >> > > Another solution could be to have `protected` getters available for > >> those > >> > > fields. > >> > > Real example: > >> > > We would like to optimize PropertyMap step to pre-fetch elements > >> > properties > >> > > ( https://github.com/JanusGraph/janusgraph/issues/2444 ). > >> > > For that we have to overwrite the `map` method and re-implement it > >> (could > >> > > be considered as anti-pattern, but it's complicated to do that > >> > otherwise). > >> > > In the overwritten method we may have some duplicated logic to be > >> > > re-implemented and for that we will need to access parent fields / > >> > methods. > >> > > This was fixed for PropertyMap here ( > >> > > https://github.com/apache/tinkerpop/pull/2022 ), but there are many > >> > other > >> > > Steps which have private properties without getters. > >> > > > >> > > 3. (optional) Add public getters for fields which may be required to > >> be > >> > > copied from an original Step into an extension Step. > >> > > Real example: > >> > > Again about `PropertyMapStep`. Even though we changed the fields to > >> > > `protected` (which now allows us to set and read those fields from > the > >> > > child class), we can't easily replace the original Step with the > >> extended > >> > > version of this Step because we will need to provide all the > >> information > >> > we > >> > > have from the original Step into the new Step. Without having public > >> > > getters available for those fields we won't be able to retrieve data > >> for > >> > > protected fields (unless we place the optimization class to the same > >> > > package). > >> > > I.e. If we implement a replacement Step for PropertyMap step we will > >> need > >> > > to access `tokens` and `traversalRing` from the original step, but > we > >> > won't > >> > > be able to do that because currently PropertyMap doesn't have public > >> > > getters for those fields. Thus, it will require us to place a new > >> > > implementation class to the same package, so that we could access > >> > protected > >> > > fields. > >> > > > >> > > Overall, I think relaxing Step classes' strictness can benefit Graph > >> > > Providers as they will be able to overwrite functionality they need. > >> That > >> > > said, due to small optimization strategies experience, I might miss > >> some > >> > > development patterns regarding developing optimizations. > Potentially, > >> it > >> > > could be that replacing original Steps is a bad idea but it's hard > >> for me > >> > > to find good optimization solutions otherwise. > >> > > > >> > > These changes don't look to me as breaking changes because no code > >> > changes > >> > > will be required from current Graph Provides sides, but maybe I > missed > >> > > something. > >> > > > >> > > As for now, the JanusGraph community has plans to optimize usage of: > >> > > PropertyMap, ElementMap, Coalesce, Has, Project steps. Thus it > drived > >> > this > >> > > thinking and discussion. > >> > > > >> > > Ticket regarding relaxing step classes strictness: > >> > > https://issues.apache.org/jira/browse/TINKERPOP-2927 > >> > > PR to relax step classes strictness: > >> > > https://github.com/apache/tinkerpop/pull/2027 > >> > > > >> > > Any feedback is appreciated. > >> > > > >> > > Best regards, > >> > > Oleksandr > >> > > > >> > > >> > > >