Hi all, RepeatUnrollStrategy is designed as an optimization strategy to improve performance by unrolling repeat().times(x) traversals into repeating sequences. However, the current implementation is too aggressive and changes the output semantics of traversals, violating the fundamental principle that optimization strategies should preserve traversal behavior while only improving performance.
The strategy currently transforms `g.inject(5, 3, 7).repeat(order()).times(1)` into `g.inject(5, 3, 7).order()`. But, gremlin> g.inject(5, 3, 7).repeat(order()).times(1) ==>3 ==>5 ==>7 gremlin> g.withoutStrategies(RepeatUnrollStrategy).inject(5, 3, 7).repeat(order()).times(1) ==>5 ==>3 ==>7 shows the original and unrolled queries aren't equivalent. An example (using modern graph) with a barrier step that shows different results is: gremlin> g.V(1).repeat(both().order().tail(2)).times(2) ==>v[5] ==>v[6] gremlin> g.withoutStrategies(RepeatUnrollStrategy).V(1).repeat(both().order().tail(2)).times(2) ==>v[4] ==>v[6] The current approach tries to apply this strategy as broadly as possible. But it is also reactive - we discover semantic violations and add more invalidations like for dedup() and emit(), rather than being conservative from the start. Starting in 3.8, I'm suggesting we reduce RepeatUnrollStrategy to only handle the most basic, provably safe cases (likely some in(), out() queries) which provides the following benefits: 1. Guaranteed semantic preservation: conservative approach ensures the strategy never changes traversal output 2. Reduced maintenance burden: Fewer edge cases and semantic violations to discover and fix 3. Foundation for growth: We can gradually add back optimizations with confidence as we prove their safety Does anyone have any concerns with this change? Thanks, Ken
