Re: [DISCUSS] Interrupting Traversals

Stephen Mallette Mon, 18 Apr 2016 08:57:03 -0700

Well - I don't think this code is highly specialized. It's good general
practice to respect Thread.interrupted(). I think you'd find that sentiment
in just about any java concurrency programming book.  So, it's not as
though we'd be asking folks to learn something that was new or a difficult
prospect to understand. I can't think of a way to make it a generalized
feature of a Traversal. If you have a while(true) or other long run loop,
checking for thread interruption in there is just a line of code that needs
to be present. Perhaps it is something where "forgetting" can only be
mitigated through code review.


> 1. In OLAP, where there can be multiple threads how does this work?
> 2. In Giraph/Spark, how does this effect job execution and failure
responses?

I ended my initial post in this thread by deleting the last paragraph i
wrote about OLAP.  :)  I guess there's still some question there as to how
that will work. If I interrupt the thread that was executing the OLAP
traversal, it's only going to kill it waiting for the result from Spark or
wherever.  The traversal will still be executing in the context of spark.
I assume the way to deal with this is on an implementation specific basis
where I assume there is a way to cancel a running spark job (or running
giraph job or whatever). If the Traversal that waits for interrupt could
signal that cancellation somehow, i guess that would be the way to
implement that. I don't know enough about the specifics of spark for how
that would work but it sounds plausible, no?

> 3. When we move into threaded OLTP, how will this be triggered/effected?

I'm not sure how that feature will be implemented - so i'm not sure how to
comment on that.

> 4. This doesn't work for "infinite loop" lambdas or "hung databases."

If someone did this:

g.V().sideEffect{while(true){}}

we can't break out of that, but I think that goes back to my initial point
that respecting Thread.interrupted() is a general practice for writing good
code, so if someone wrote a traversal like that, well, they need to accept
the consequences of it.  I guess in the case of a "hung database", it will
be the up to the nature of the "driver" connected to that database to
determine how it responds to interruption.




On Mon, Apr 18, 2016 at 11:28 AM, Marko Rodriguez <okramma...@gmail.com>
wrote:

> Hi,
>
> I think a problem with this is that it requires every step implementation
> to have this construct in it -- though many steps simply extend the base
> FlatMapStep, MapStep, FilterStep, etc. However, not all and thus, this
> requires all providers to know what this about and write their code
> accordingly.
>
> A few questions:
>
>         1. In OLAP, where there can be multiple threads how does this work?
>         2. In Giraph/Spark, how does this effect job execution and failure
> responses?
>         3. When we move into threaded OLTP, how will this be
> triggered/effected?
>         4. This doesn't work for "infinite loop" lambdas or "hung
> databases."
>
> I know this is the oldest ticket in the books and a million solutions have
> been proposed, but it would be nice if this didn't require specialized code
> in all the steps. We are bound to "forget."
>
> Thanks,
> Marko.
>
> http://markorodriguez.com
>
> On Apr 18, 2016, at 8:56 AM, Stephen Mallette <spmalle...@gmail.com>
> wrote:
>
> > Do you mean:
> >
> > if (Thread.currentThread().isInterrupted()) throw new
> > TraversalInterruptedException();
> >
> > If so, Thread.interrupted() basically does that under the covers
> >
> > On Mon, Apr 18, 2016 at 10:51 AM, Ted Wilmes <twil...@gmail.com> wrote:
> >
> >> Yeah, looks like benchmark-wise it's a wash, which is good.  I wasn't
> aware
> >> of the difference between the static interrupted() and non-static
> >> isInterrupted().  I was wondering if in this case it should be
> >> isInterrupted(), but I think how you did it is good because it'll be
> >> evaluated within the traversal thread regardless.
> >>
> >> --Ted
> >>
> >> On Mon, Apr 18, 2016 at 6:11 AM, Stephen Mallette <spmalle...@gmail.com
> >
> >> wrote:
> >>
> >>> A while back, I brought up the issue of being able to interrupt
> >> traversals:
> >>>
> >>>
> >>>
> >>
> https://pony-poc.apache.org/thread.html/e6477fc9c58d37a5bdcb5938a0eaa285456ad15aa39e16446290e2ff@1444993523@%3Cdev.tinkerpop.apache.org%3E
> >>> https://issues.apache.org/jira/browse/TINKERPOP-946
> >>>
> >>> As a quick refresher, making Traversal respect Thread.interrupted() is
> >>> important as you otherwise can quite easily lock up applications like
> >>> Gremlin Server with a few poorly conceived or errant queries. We'd left
> >>> that last thread with liking the idea, but there were concerns about
> the
> >>> complexity of the changes and performance hits.
> >>>
> >>> Given that we now have gremlin-benchmark, I decided to see what the
> >>> performance hit would be for making this change. I took a rough stab at
> >> it
> >>> introducing Thread.interrupted() in all steps where it seemed to make
> >> sense
> >>> to do so and then ran the benchmark before and after the change.
> >>>
> >>> https://gist.github.com/spmallette/ed21267f2e7e17bb3fbd5a8d1a568d2b
> >>>
> >>> I'm not seeing a whole lot of difference between supporting this
> feature
> >>> and not supporting this feature.  Here's the branch I implemented this
> in
> >>> in case you want to look around:
> >>>
> >>> https://github.com/apache/incubator-tinkerpop/tree/TINKERPOP-946
> >>>
> >>> I'm not sure that my changes are completely bulletproof at this point,
> >> but
> >>> I'm reasonably sure that these changes would handle a good majority of
> >>> calls for thread interruption. I expect to re-target my branch at tp31
> >>> (currently from master so that i could use the benchmark suite) if this
> >>> becomes a pull request.
> >>>
> >>> Any thoughts on the benchmark, the implementation, etc?
> >>>
> >>
>
>

Re: [DISCUSS] Interrupting Traversals

Reply via email to