[jira] [Updated] (TINKERPOP-2903) Option to include detectable Traversal Timeout

Jira Tue, 14 Mar 2023 01:35:04 -0700


     [ https://issues.apache.org/jira/browse/TINKERPOP-2903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Martin Häusler updated TINKERPOP-2903:
--------------------------------------
    Description: 
At the moment, it is possible to achieve a timeout with the {{timeLimit}} step. 
This step will cause further elements to not be reported by the result iterator 
once the time limit has been reached. That's good and should stay that way, but 
I've got a different use case in mind.

What I would like to accomplish is a deadline for the traversal, and if the 
deadline is reached, a dedicated exception (e.g. a subclass of 
{{TraversalInterruptedException}}) should be thrown, indicating that the 
query has timed out. This allows the caller to detect that the timeout has been 
reached and that no valid, complete result is to be expected from the traversal. 
To the best of my knowledge (please correct me if I'm wrong) there is currently 
no clean way to accomplish this in Gremlin 3.6.x.

One possibility I could envision is:

{{graph.traversal().withDeadline(someUnixEpochTimestamp).V()...}}

This should affect all traversers. If it doesn't impact performance too much, 
the deadline should be checked after every step. Should that prove to be too 
expensive, checking the timeout after every n-th step would also be an option. 
Another way to implement this would be to use the JDK's built-in timers: a timer 
sets a "timeout" boolean in the traversal context to true once the deadline has 
passed, and the traversers check that flag after every step (the cost to be paid 
is of course the scheduler thread, which should be created once and reused).
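
To illustrate the "every n-th step" option, here is a minimal sketch in plain Java. It is not TinkerPop code: {{DeadlineIterator}}, {{DeadlineExceededException}} and the {{checkInterval}} parameter are hypothetical names made up for this example, and a plain {{Iterator}} stands in for the stream of traversers. The clock is read only once per {{checkInterval}} elements, so the deadline may be overshot by up to {{checkInterval - 1}} elements.

```java
import java.util.Iterator;

/** Hypothetical exception for this sketch; not a TinkerPop class. */
class DeadlineExceededException extends RuntimeException {
    DeadlineExceededException(String message) {
        super(message);
    }
}

/**
 * Hypothetical wrapper that enforces a deadline on an iterator but reads
 * the clock only once every checkInterval calls to next().
 */
class DeadlineIterator<T> implements Iterator<T> {
    private final Iterator<T> delegate;
    private final long deadlineEpochMillis;
    private final int checkInterval;
    private int sinceLastCheck = 0;

    DeadlineIterator(Iterator<T> delegate, long deadlineEpochMillis, int checkInterval) {
        if (checkInterval < 1) {
            throw new IllegalArgumentException("checkInterval must be >= 1");
        }
        this.delegate = delegate;
        this.deadlineEpochMillis = deadlineEpochMillis;
        this.checkInterval = checkInterval;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public T next() {
        // Amortize the cost of the clock read over checkInterval elements.
        if (++sinceLastCheck >= checkInterval) {
            sinceLastCheck = 0;
            if (System.currentTimeMillis() >= deadlineEpochMillis) {
                // Fail loudly instead of silently truncating the result.
                throw new DeadlineExceededException("traversal deadline exceeded");
            }
        }
        return delegate.next();
    }
}
```

Unlike {{timeLimit}}, the caller sees an exception rather than a shortened result, which is the detectability asked for above.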

In theory, this could also be achieved on the database transaction level, such 
that the next access to the transaction would cause the timeout. However, if a 
Gremlin query is busy processing elements it has already loaded (e.g. in a 
misconfigured {{repeat()}} which is running in cycles), doing the timeout on 
the transaction level will have no effect.

Some vendors implement this for remote communication in vendor-specific 
extensions (e.g. AWS Neptune has the "evaluationTimeout"), but I think that 
this feature can (and should) be standardized.

 

Additionally, I think that a timeout should be a property of the traversal as a 
whole. Implementing it as a `filter(...)` step is not ideal:
 * the traversal may spend the majority of time somewhere else, so our timeout 
filter may not trigger at all or trigger way too late
 * an explicit step may interfere with optimizers

The place to check for a timeout IMHO is when a new traverser is spawned, or 
when an existing traverser moves to a different element. This way, we do not 
interfere with optimizers (as the timeout is a property of the traversal, and 
no longer an explicit step) and at the same time it becomes more reliable as 
*any* progress may trigger the timeout. That being said, calling 
`System.currentTimeMillis()` on every check is not an option, because the call 
itself incurs some overhead. Doing this at a high frequency (i.e. many 
traversers, many steps to take) can put unnecessary pressure on the system. A 
better approach is to check for an `isTimedOut` boolean, which is initially 
`false` and then switched to `true` by a timer running in a second thread / 
thread pool. One thread should be enough to act as a watchdog for the timers of 
all queries.
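
A rough sketch of the watchdog idea in plain Java, to make the proposal concrete. None of these names exist in TinkerPop: {{TimeoutWatchdog}}, {{TimeoutGuard}} and {{TraversalTimeoutException}} are hypothetical, and where exactly {{check()}} would be called inside the traversal machinery is left open. The sketch assumes one shared daemon thread that flips a volatile flag per query, so the hot path only reads a boolean.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical exception for this sketch; not a TinkerPop class. */
class TraversalTimeoutException extends RuntimeException {
    TraversalTimeoutException(String message) {
        super(message);
    }
}

/** Per-query flag that the watchdog thread switches to true on timeout. */
final class TimeoutGuard implements AutoCloseable {
    private volatile boolean timedOut = false;
    private ScheduledFuture<?> task;

    void attach(ScheduledFuture<?> scheduledTask) {
        this.task = scheduledTask;
    }

    /** Called by the watchdog thread when the timeout elapses. */
    void expire() {
        timedOut = true;
    }

    boolean isTimedOut() {
        return timedOut;
    }

    /** Cheap check for the hot path: a volatile read, no clock access. */
    void check() {
        if (timedOut) {
            throw new TraversalTimeoutException("traversal timed out");
        }
    }

    /** Cancels the pending timer once the query has finished. */
    @Override
    public void close() {
        if (task != null) {
            task.cancel(false);
        }
    }
}

/** One daemon thread serves the timers of all queries. */
final class TimeoutWatchdog {
    private static final ScheduledExecutorService SCHEDULER =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "traversal-timeout-watchdog");
                thread.setDaemon(true);
                return thread;
            });

    private TimeoutWatchdog() {
    }

    static TimeoutGuard arm(long timeout, TimeUnit unit) {
        TimeoutGuard guard = new TimeoutGuard();
        guard.attach(SCHEDULER.schedule(guard::expire, timeout, unit));
        return guard;
    }
}
```

A query would obtain a guard via {{TimeoutWatchdog.arm(...)}} in a try-with-resources block and call {{guard.check()}} whenever a traverser is spawned or moves to another element. Closing the guard cancels the timer, so finished queries do not leave stale tasks in the scheduler.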

  was:
At the moment, it is possible to achieve a timeout with the {{timeLimit}} step. 
This step will cause further elements to not be reported by the result iterator 
once the time limit has been reached. That's good and should stay that way, but 
I've got a different use case in mind.

What I would like to accomplish is a deadline for the traversal, and if the 
deadline is reached, a dedicated exception (e.g. a subclass of 
{{TraversalInterruptedException}}) should be thrown, indicating that the 
query has timed out. This allows the caller to detect that the timeout has been 
reached and no valid complete result is to be expected from the traversal. To 
the best of my knowledge (please correct me if I'm wrong) there is currently no 
clean way to accomplish this in gremlin 3.6.x.

One possibility I could envision is:

{{graph.traversal().withDeadline(someUnixEpochTimestamp).V()...}}

This should affect all traversers. If it doesn't impact performance too much, 
the deadline should be checked after every step. Should that prove to be too 
expensive, checking the timeout after every n-th step would also be an option. 
Another way to implement this would be by using the JDK-builtin timers, letting 
a timer run and setting a "timeout" boolean in the traversal context to true 
which is then checked by the traversers after every step (the cost to be paid 
is of course the construction of the scheduler thread, which should be reused).

In theory, this could also be achieved on the database transaction level, such 
that the next access to the transaction would cause the timeout. However, if a 
gremlin query is busy processing elements it has already loaded (e.g. in a 
misconfigured {{repeat()}} which is running in cycles), doing the timeout on 
the transaction level will have no effect.

Some vendors implement this for remote communication in vendor-specific 
extensions (e.g. AWS Neptune has the "evaluationTimeout"), but I think that 
this feature can (and should) be standardized.

 

Additionally, I think that a timeout should be a property of the traversal as a 
whole. Implementing it as a `filter(...)` step is not ideal:
 * the traversal may spend the majority of time somewhere else
 * an explicit step may interfere with optimizers

The place to check for a timeout IMHO is when a new traverser is spawned, or 
when an existing traverser moves to a different element. This way, we do not 
interfere with optimizers (as the timeout is a property of the traversal, and 
no longer an explicit step) and at the same time it becomes more reliable as 
*any* progress may trigger the timeout. That being said, checking 
`System.currentTimeMillis()` every time is not an option, because this call 
itself incurs some overhead. Doing this on a high frequency (i.e. many 
traversers, many steps to take) can put unnecessary pressure on the system. A 
better approach is to check for an `isTimedOut` boolean, which is initially 
`false` and then switched to `true` by a timer running in a second thread / 
thread pool. One thread should be enough to act as a watchdog for the timers of 
all queries.


> Option to include detectable Traversal Timeout
> ----------------------------------------------
>
>                 Key: TINKERPOP-2903
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2903
>             Project: TinkerPop
>          Issue Type: New Feature
>          Components: language
>            Reporter: Martin Häusler
>            Priority: Minor



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
