[ https://issues.apache.org/jira/browse/TINKERPOP-2903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stephen Mallette updated TINKERPOP-2903:
----------------------------------------
Affects Version/s: 3.5.5
Issue Type: Improvement (was: New Feature)
> Option to include detectable Traversal Timeout
> ----------------------------------------------
>
> Key: TINKERPOP-2903
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2903
> Project: TinkerPop
> Issue Type: Improvement
> Components: language
> Affects Versions: 3.5.5
> Reporter: Martin Häusler
> Priority: Minor
>
> At the moment, it is possible to achieve a timeout with the {{timeLimit}}
> step. This step will cause further elements to not be reported by the result
> iterator once the time limit has been reached. That's good and should stay
> that way, but I've got a different use case in mind.
> What I would like to accomplish is a deadline for the traversal, and if the
> deadline is reached, a dedicated exception (e.g. a subclass of
> {{TraversalInterruptedException}}) should be thrown, indicating that the
> query has timed out. This allows the caller to detect that the timeout has
> been reached and no valid complete result is to be expected from the
> traversal. To the best of my knowledge (please correct me if I'm wrong) there
> is currently no clean way to accomplish this in Gremlin 3.6.x.
> One possibility I could envision is:
> {{graph.traversal().withDeadline(someUnixEpochTimestamp).V()...}}
> This should affect all traversers. If it doesn't impact performance too much,
> the deadline should be checked after every step. Should that prove to be too
> expensive, checking the timeout after every n-th step would also be an
> option. Another way to implement this would be by using the JDK-builtin
> timers, letting a timer run and setting a "timeout" boolean in the traversal
> context to true which is then checked by the traversers after every step (the
> cost to be paid is of course the construction of the scheduler thread, which
> should be reused).
> In theory, this could also be achieved on the database transaction level,
> such that the next access to the transaction would cause the timeout.
> However, if a gremlin query is busy processing elements it has already loaded
> (e.g. in a misconfigured {{repeat()}} which is running in cycles), doing the
> timeout on the transaction level will have no effect.
> Some vendors implement this for remote communication in vendor-specific
> extensions (e.g. AWS Neptune has the "evaluationTimeout"), but I think that
> this feature can (and should) be standardized.
>
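
The "check the deadline after every n-th step" idea above can be sketched in plain Java, without any TinkerPop dependency. This is only an illustration under stated assumptions: {{DeadlineIterator}}, {{TraversalDeadlineExceededException}} and the {{checkEvery}} parameter are hypothetical names, not part of any TinkerPop API, and the check happens per element pulled from an iterator rather than per traverser step.

```java
import java.util.Iterator;
import java.util.stream.Stream;

/** Hypothetical exception; the issue only asks for "a dedicated exception". */
class TraversalDeadlineExceededException extends RuntimeException {
    TraversalDeadlineExceededException(String message) {
        super(message);
    }
}

/** Wraps any iterator and throws once a wall-clock deadline has passed. */
final class DeadlineIterator<T> implements Iterator<T> {
    private final Iterator<T> delegate;
    private final long deadlineEpochMillis;
    private final int checkEvery; // consult the clock only on every n-th element
    private long count = 0;

    DeadlineIterator(Iterator<T> delegate, long deadlineEpochMillis, int checkEvery) {
        if (checkEvery < 1) {
            throw new IllegalArgumentException("checkEvery must be >= 1");
        }
        this.delegate = delegate;
        this.deadlineEpochMillis = deadlineEpochMillis;
        this.checkEvery = checkEvery;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public T next() {
        // The clock is read on the 1st, (n+1)-th, (2n+1)-th ... call only.
        if (count++ % checkEvery == 0
                && System.currentTimeMillis() >= deadlineEpochMillis) {
            throw new TraversalDeadlineExceededException(
                    "deadline exceeded after " + (count - 1) + " elements");
        }
        return delegate.next();
    }
}

final class DeadlineSketch {
    public static void main(String[] args) {
        // An endless source stands in for a traversal that never finishes.
        Iterator<Integer> endless = Stream.iterate(0, i -> i + 1).iterator();
        Iterator<Integer> guarded =
                new DeadlineIterator<>(endless, System.currentTimeMillis() + 100, 1024);
        try {
            while (guarded.hasNext()) {
                guarded.next();
            }
        } catch (TraversalDeadlineExceededException e) {
            System.out.println("timed out: " + e.getMessage());
        }
    }
}
```

Unlike a silently truncated result, the caller gets an exception and can tell that the result is incomplete. A larger {{checkEvery}} lowers the clock-reading overhead at the price of a later detection.
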
> Additionally, I think that a timeout should be a property of the traversal as
> a whole. Implementing it as a {{filter(...)}} step is not ideal:
> * the traversal may spend the majority of time somewhere else, so our
> timeout filter may not trigger at all or trigger way too late
> * if we insert the {{timeLimit(...)}} step near the start of the query, we
> will not be able to stop the query from executing after the initial set of
> elements has been produced. If we insert {{timeLimit(...)}} near the end,
> any barrier steps that occur in the traversal prior to {{timeLimit(...)}}
> will cause the time limit to be checked far too late (e.g.
> {{traversal().V().order().by(T.label).timeLimit(123456)}}).
> * an explicit step may interfere with optimizers
> The place to check for a timeout IMHO is when a new traverser is spawned, or
> when an existing traverser moves to a different element. This way, we do not
> interfere with optimizers (as the timeout is a property of the traversal, and
> no longer an explicit step) and at the same time it becomes more reliable as
> *any* progress may trigger the timeout. That being said, checking
> {{System.currentTimeMillis()}} every time is not an option, because this call
> itself incurs some overhead. Doing this at a high frequency (i.e. many
> traversers, many steps to take) can put unnecessary pressure on the system. A
> better approach is to check an {{isTimedOut}} boolean, which is initially
> {{false}} and then switched to {{true}} by a timer running in a second thread /
> thread pool. One thread should be enough to act as a watchdog for the timers
> of all queries.
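
The watchdog approach described above could look roughly like the following plain-Java sketch. All names here ({{TimeoutWatchdog}}, {{QueryTimeout}}, {{WatchdogTimeoutException}}, {{arm}}, {{checkTimeout}}) are hypothetical and only illustrate the idea; nothing in it is existing TinkerPop API. One shared daemon thread flips a per-query flag, and the hot path reads only that flag instead of the system clock.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical exception type used to signal the timeout to the caller. */
class WatchdogTimeoutException extends RuntimeException {
    WatchdogTimeoutException(String message) {
        super(message);
    }
}

/** Hypothetical per-query handle; checking it costs one volatile read. */
final class QueryTimeout implements AutoCloseable {
    private final AtomicBoolean timedOut;
    private final ScheduledFuture<?> pending;

    QueryTimeout(AtomicBoolean timedOut, ScheduledFuture<?> pending) {
        this.timedOut = timedOut;
        this.pending = pending;
    }

    boolean isTimedOut() {
        return timedOut.get();
    }

    /** Would be called whenever a traverser is spawned or moves on. */
    void checkTimeout() {
        if (timedOut.get()) {
            throw new WatchdogTimeoutException("traversal timed out");
        }
    }

    /** Cancels the pending timer task when the query finishes early. */
    @Override
    public void close() {
        pending.cancel(false);
    }
}

final class TimeoutWatchdog {
    // A single daemon thread serves the timers of all queries.
    private static final ScheduledThreadPoolExecutor TIMER = createTimer();

    private static ScheduledThreadPoolExecutor createTimer() {
        ScheduledThreadPoolExecutor timer = new ScheduledThreadPoolExecutor(1, r -> {
            Thread t = new Thread(r, "traversal-timeout-watchdog");
            t.setDaemon(true);
            return t;
        });
        // Drop cancelled tasks from the queue right away.
        timer.setRemoveOnCancelPolicy(true);
        return timer;
    }

    static QueryTimeout arm(long timeout, TimeUnit unit) {
        AtomicBoolean flag = new AtomicBoolean(false);
        ScheduledFuture<?> pending = TIMER.schedule(() -> flag.set(true), timeout, unit);
        return new QueryTimeout(flag, pending);
    }

    public static void main(String[] args) {
        // The endless loop stands in for a repeat() that runs in cycles.
        try (QueryTimeout timeout = arm(50, TimeUnit.MILLISECONDS)) {
            while (true) {
                timeout.checkTimeout();
            }
        } catch (WatchdogTimeoutException e) {
            System.out.println("detected: " + e.getMessage());
        }
    }
}
```

Because the loop itself never touches the database or the transaction, this also covers the case where a transaction-level timeout would have no effect.
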
--
This message was sent by Atlassian Jira
(v8.20.10#820010)