Re: Async capabilities to TinkerPop

Oleksandr Porunov Thu, 28 Jul 2022 19:06:01 -0700

Hmm, that's interesting! Thank you Joshua for the idea!
So, I guess the general idea here could be:
we can start small and start implementing async functionality for some
parts instead of implement async functionality for everything straightaway.


Oleksandr

On Fri, Jul 29, 2022, 00:38 Joshua Shinavier <j...@fortytwo.net> wrote:

> Well, the wrapper I mentioned before did not require a full rewrite of
> TinkerPop :-) Rather, it provided async interfaces for vertices and edges,
> on which operations like subgraph and shortest paths queries were evaluated
> in an asynchronous fashion (using a special language, as it happened, but
> limited Gremlin queries would have been an option). So I think a basic
> async API might be a useful starting point even if it doesn't go very deep.
>
> Josh
>
>
> On Thu, Jul 28, 2022 at 4:21 PM Oleksandr Porunov <
> alexandr.poru...@gmail.com> wrote:
>
>> Hi Joshua and Pieter,
>>
>> Thank you for joining the conversation!
>>
>> I didn't actually look into the implementation details yet but quickly
>> checking Traversal.java code I think Pieter is right here.
>> For some reason I thought we could simply wrap synchronous method in
>> asynchronous, basically something like:
>>
>> // the method which should be implemented by a graph provider
>>
>> Future<E> executeAsync(Callable<E> func);
>>
>> public default Future<E> asyncNext(){
>>     return executeAsync(this::next);
>> }
>>
>> but checking that code I think I was wrong about it. Different steps may
>> execute different logic (i.e. different underlying storage queries) for
>> different graph providers.
>> Thus, wrapping only terminal steps into async functions won't solve the
>> problem most likely.
>>
>> I guess it will require re-writing or extending all steps to be able to
>> pass an async state instead of a sync state.
>>
>> I'm not familiar enough with the TinkerPop code yet to claim that, so
>> probably I could be wrong.
>> I will need to research it a bit more to find out but I think that Pieter
>> is most likely right about a massive re-write.
>>
>> Nevertheless, even if that requires massive re-write, I would be eager to
>> start the ball rolling.
>> I think we either need to try to implement async execution in TinkerPop 3
>> or start making some concrete decisions regarding TinkerPop 4.
>>
>> I see Marko A. Rodriguez started to work on RxJava back in 2019 here
>> https://github.com/apache/tinkerpop/tree/4.0-dev/java/machine/processor/rxjava/src/main/java/org/apache/tinkerpop/machine/processor/rxjava
>>
>> but the process didn't go as far as I understand. I guess it would be
>> good to know if we want to completely rewrite TinkerPop in version 4 or not.
>>
>> If we want to completely rewrite TinkerPop in version 4 then I assume it
>> may take quite some time to do so. In this case I would be more likely to
>> say that it's better to implement async functionality in TinkerPop 3 even
>> if it requires rewriting all steps.
>>
>> In case TinkerPop 4 is a redevelopment with breaking changes but without
>> starting to rewrite the whole functionality then I guess we could try to
>> work on TinkerPop 4 by introducing async functionality and maybe applying
>> more breaking changes in places where it's better to re-work some parts.
>>
>> Best regards,
>> Oleksandr
>>
>>
>> On Thu, Jul 28, 2022 at 7:47 PM pieter gmail <pieter.mar...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Does this not imply a massive rewrite of TinkerPop? In particular the
>>> iterator chaining pattern of steps should follow a reactive style of
>>> coding?
>>>
>>> Cheers
>>> Pieter
>>>
>>>
>>> On Thu, 2022-07-28 at 15:18 +0100, Oleksandr Porunov wrote:
>>> > I'm interested in adding async capabilities to TinkerPop.
>>> >
>>> > There were many discussions about async capabilities for TinkerPop
>>> > but
>>> > there was no clear consensus on how and when it should be developed.
>>> >
>>> > The benefit for async capabilities is that the user calling a query
>>> > shouldn't need its thread to be blocked to simply wait for the result
>>> > of
>>> > the query execution. Instead of that a graph provider should take
>>> > care
>>> > about implementation of async queries execution.
>>> > If that's the case then many graph providers will be able to optimize
>>> > their
>>> > execution of async queries by handling less resources for the query
>>> > execution.
>>> > As a real example of potential benefit we could get I would like to
>>> > point
>>> > on how JanusGraph executes CQL queries to process Gremlin queries.
>>> > CQL result retrieval:
>>> >
>>> https://github.com/JanusGraph/janusgraph/blob/15a00b7938052274fe15cf26025168299a311224/janusgraph-cql/src/main/java/org/janusgraph/diskstorage/cql/function/slice/CQLSimpleSliceFunction.java#L45
>>> >
>>> > As seen from the code above, JanusGraph already leverages async
>>> > functionality for CQL queries under the hood but JanusGraph is
>>> > required to
>>> > process those queries in synced manner, so what JanusGraph does - it
>>> > simply
>>> > blocks the whole executing thread until result is returned instead of
>>> > using
>>> > async execution.
>>> >
>>> > Of course, that's just a case when we can benefit from async
>>> > execution
>>> > because the underneath storage backend can process async queries. If
>>> > a
>>> > storage backend can't process async queries then we won't get any
>>> > benefit
>>> > from implementing a fake async executor.
>>> >
>>> > That said, I believe quite a few graph providers may benefit from
>>> > having a
>>> > possibility to execute queries in async fashion because they can
>>> > optimize
>>> > their resource utilization.
>>> > I believe that we could have a feature flag for storage providers
>>> > which
>>> > want to implement async execution. Those who can't implement it or
>>> > don't
>>> > want to implement it may simply disable async capabilities which will
>>> > result in throwing an exception anytime an async function is called.
>>> > I
>>> > think it should be fine because we already have some feature flags
>>> > like
>>> > that for graph providers. For example "Null Semantics" was added in
>>> > TinkerPop 3.5.0 but `null` is not supported for all graph providers.
>>> > Thus,
>>> > a feature flag for Null Semantics exists like
>>> > "g.getGraph().features().vertex().supportsNullPropertyValues()".
>>> > I believe we can enable async in TinkerPop 3 by providing async as a
>>> > feature flag and letting graph providers implement it at their will.
>>> > Moreover if a graph provider wants to have async capabilities but
>>> > their
>>> > storage backends don't support async capabilities then it should be
>>> > easy to
>>> > hide async execution under an ExecutorService which mimics async
>>> > execution.
>>> > I believe we could do that for TinkerGraph so that users could
>>> > experiment
>>> > with async API at least. I believe we could simply have a default
>>> > "async"
>>> > function implementation for TinkerGraph which wraps all sync
>>> > executions in
>>> > a function and sends it to that ExecutorService (we can discuss which
>>> > one).
>>> > In such a case TinkerGraph will support async execution even without
>>> > real
>>> > async functionality. We could also potentially provide some
>>> > configuration
>>> > options to TinkerGraph to configure thread pool size, executor
>>> > service
>>> > implementation, etc.
>>> >
>>> > I didn't think about how it is better to implement those async
>>> > capabilities
>>> > for TinkerPop yet but I think reusing a similar approach like in
>>> > Node.js
>>> > which returns Promise when calling Terminal steps could be good. For
>>> > example, we could have a method called `async` which accepts a
>>> > termination
>>> > step and returns a necessary Future object.
>>> > I.e.:
>>> > g.V(123).async(Traversal.next())
>>> > g.V().async(Traversal.toList())
>>> > g.E().async(Traversal.toSet())
>>> > g.E().async(Traversal.iterate())
>>> >
>>> > I know that there were discussions about adding async functionality
>>> > to
>>> > TinkerPop 4 eventually, but I don't see strong reasons why we
>>> > couldn't add
>>> > async functionality to TinkerPop 3 with a feature flag.
>>> > It would be really great to hear some thoughts and concerns about it.
>>> >
>>> > If there are no concerns, I'd like to develop a proposal for further
>>> > discussion.
>>> >
>>> > Best regards,
>>> > Oleksandr Porunov
>>>
>>>

Re: Async capabilities to TinkerPop

Reply via email to