> 1. ANTLR grammar is distributed as a separate artifact, and users are allowed to extend it with new step definitions by contract. I am not sure about predicates, but that is likely also possible.
Are you suggesting a different packaging than what is already present in gremlin-language? Isn't that a separate artifact already? Also, could you say a bit more about how developers would "extend it with new step definitions by contract"? I suppose I'm wary of there being lots of variations of the grammar out there, which might open the door to abuse, configuration challenges and other things that confuse users. I feel like call() may already allow for too much independence in the community and encourage less collaboration on more generalized (but extensible) first-class features in TinkerPop. I suppose providers would be free to modify the grammar today and distribute their own versions as things are, but I think that's a bit different than TinkerPop promoting that concept as a means of extension.

I do like that you are suggesting mechanisms for "handy" and "developer-friendly" interfaces that can simplify extension. Ideally, we'd keep the grammar as it is but figure out guardrailed approaches to offer step extension. Note that we do have something along those lines with custom strategies, where providers can register any custom TraversalStrategy in such a way that the grammar/parser will automatically recognize it, with no changes to the grammar itself. In this way, g.withoutStrategies(CustomStrategy) just works. I think Yang is exploring how the same mechanism could work for custom types (e.g. an OrientDB Record Id).

I've often wondered if the same would work for custom steps, where providers would just register custom steps and the grammar/parser would recognize them and convert them to call() somehow internally. At that point providers could process the call() directly or use strategies to convert them to concrete custom steps. I presume if that worked for steps it would work for P as well. Perhaps that's worth exploring as an option.
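To make the "register custom steps and convert them to call()" idea a little more concrete, here is a rough sketch in plain Java with no TinkerPop dependency. Everything in it is hypothetical: CustomStepRegistry, register, toCallParams and the "stepName" key are names I made up for illustration (the key simply mirrors the one in your addSchemaClass example), and none of it is existing TinkerPop API. It only shows a registered step name and its arguments being turned into the parameter map that a call() could carry.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: this type does not exist in TinkerPop.
final class CustomStepRegistry {
    // Registered step name -> ordered parameter names for that step.
    private final Map<String, List<String>> steps = new LinkedHashMap<>();

    // A provider would register each custom step once, e.g. at startup.
    void register(String stepName, String... paramNames) {
        steps.put(stepName, List.of(paramNames));
    }

    boolean isRegistered(String stepName) {
        return steps.containsKey(stepName);
    }

    // Builds the parameter map that a parser could hand to call(service, params)
    // when it meets a registered step such as addSchemaClass("Test").
    Map<String, Object> toCallParams(String stepName, Object... args) {
        List<String> names = steps.get(stepName);
        if (names == null) {
            throw new IllegalArgumentException("Unknown custom step: " + stepName);
        }
        if (names.size() != args.length) {
            throw new IllegalArgumentException(
                stepName + " expects " + names.size() + " argument(s), got " + args.length);
        }
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("stepName", stepName); // assumed key, taken from the example in this thread
        for (int i = 0; i < args.length; i++) {
            params.put(names.get(i), args[i]);
        }
        return params;
    }

    public static void main(String[] args) {
        CustomStepRegistry registry = new CustomStepRegistry();
        registry.register("addSchemaClass", "className");
        // prints {stepName=addSchemaClass, className=Test}
        System.out.println(registry.toCallParams("addSchemaClass", "Test"));
    }
}
```

In this shape the grammar itself stays untouched: an unregistered step name is rejected, and a registered one is reduced to data that a provider strategy could later turn into a concrete step.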
Perhaps one advantage of letting providers extend the grammar in the way I think you are describing is that it would let tool developers (like G.V()) offer their users better support for specific graph databases. Something like g.ytdbStep() could be recognized within the tool itself. Of course, that's a narrow group of users, but on the other hand they serve a large number of end Gremlin users themselves.

All that said, this issue is a problem that needs a solution and perhaps what you're suggesting is indeed the best way. It would be great to figure this out. Thanks for putting some thought into it.

On Wed, Aug 27, 2025 at 9:36 AM Andrii Lomakin <[email protected]> wrote:

> Good day.
>
> This question has already been raised several times, but typically, it
> was discussed in the context of a specific extension of steps.
>
> Here, I want to propose a general approach that we are going to use in
> YouTrackDB to extend the Gremlin language.
>
> Our intention in this discussion is to receive initial feedback from the
> TP team. If they agree, we will eventually contribute part of this
> approach or the complete approach in the next versions of the TP
> framework.
>
> We used the following core principle when developing this approach:
> existing Gremlin tools should be able to take advantage of all the
> functionality provided by Gremlin extensions. The only difference is
> that it will not be as handy for users as using the new features
> directly.
>
> As a result, we paid attention to the two tools already at hand for TP
> users:
>
> 1. DSL annotation processor.
> 2. The call step.
>
> So first of all, if we cannot convert our DSL using the previously
> mentioned annotation processor into the standard steps, we convert it
> into the `call` of the service. This service is actually a stub, not a
> real service.
>
> The first parameter of this `call` step is the new step name, and the
> rest are arguments specified in free form, supported by TP, and
> understandable both to humans and to the optimization strategy that will
> convert them to the real step during traversal preprocessing. This
> approach allows for the addition of new steps and types of arguments,
> such as predicates.
>
> Let us look at a fake example: If I call `g.addSchemaClass("Test")`, the
> DSL converts it into the standard step ->
> `g.call("extDSL", Map.of("stepName", "addSchemaClass", "className", "Test"))`,
> which, during the traversal preprocessing phase, is converted to the
> AddSchemaClass step by the related optimization strategy.
>
> This approach:
>
> 1. Allows the use of new steps in already existing tools.
> 2. Offers a straightforward design for extending Gremlin drivers, which
>    allows quick incorporation of those drivers by tool developers and
>    application developers.
> 3. Allows DB vendors to add new steps themselves easily.
>
> However, this does not solve the issue when the DSL is passed as text.
>
> To solve this last-mile issue, we propose the following approach:
>
> 1. ANTLR grammar is distributed as a separate artifact, and users are
>    allowed to extend it with new step definitions by contract. I am not
>    sure about predicates, but that is likely also possible.
> 2. A new Customizer is introduced that provides instances of
>    GremlinLexer, GremlinParser and, if specified, a new interface,
>    GremlinASTTransformer.
> 3. The responsibility of a new GremlinASTTransformer class would be to
>    transform the AST provided by the extended parser into the standard
>    AST generated by the standard Gremlin ANTLR grammar.
> 4. After that, the same optimization routine that converts the fake
>    service into the new Step instance is applied.
>
> This routine can also be made more developer-friendly.
>
> For example, there could be handy automations that will allow you not to
> write AST transformation code manually, but merely map it to the
> specialized TraversalStrategy interface implementation that will provide
> all information needed for both transformations:
>
> 1. Forward -> from the `call` step to the new Step instance.
> 2. Backward -> from the Gremlin step in the extended AST to the `call`
>    step in Traversal.
>
> Moreover, a new step, "stepExt," can be introduced to make the Gremlin
> extension more semantically clear. In such a case, a new type of
> TraversalStrategy can be specialized to work only with a given type of
> step.
>
> To recap, after this massive wall of text, the benefits of this approach
> are:
>
> 1. Despite being somewhat convoluted, extending the Gremlin language
>    will not require a large investment from developers with knowledge
>    of ANTLR, especially if AbstractGremlinASTTransformer and
>    StepExtensionTraversalStrategy are implemented, which will do all the
>    heavy lifting of step transformation.
> 2. Existing Gremlin tools can be used to call new steps, though less
>    conveniently than in the presence of new steps in Traversal.
>
> Looking forward to reading your feedback.
