Hi,
> The Pipes implementation will no doubt be
> faster for execution of a single traversal but I'm wondering if the
> Flowable RxJava would beat the Pipes processor by providing higher
> throughput in a scenario when many, many users are executing queries
> concurrently.
Why would you think that? When a query comes in, you can just do new
Thread(pipes).start() and thread it, no?
*** My knowledge of server architectures is pretty weak, but…
I think the model for dealing with many concurrent users will be up to the
server implementation. If you are using LocalMachine, then it's one query at a
time. But if you are using RemoteMachine to, let's say, the TP4 MachineServer,
traversals are executed in parallel. And most providers who have their own
server implementation (one that RemoteMachine can communicate with) will
handle it as they see fit (e.g., doing complex stuff like routing queries to
different machines in the cluster for load balancing).
One thing I’m trying to stress in TP4 is “no more complex server
infrastructure.” You can see our MachineServer implementation. It's ~100 lines
of code and does parallel execution of queries. It's pretty brain-dead simple,
but with some modern thread/server techniques you all might have, we can make
it a solid little server that meets most providers’ needs; otherwise, they can
just roll those requirements into their own server system.
https://github.com/apache/tinkerpop/blob/tp4/java/machine/machine-core/src/main/java/org/apache/tinkerpop/machine/species/remote/MachineServer.java
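To make the one-thread-per-query idea concrete, here is a minimal sketch in the same spirit; it is my own illustration, not the actual MachineServer code, and the class and method names (TinyQueryServer, submit) are made up:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical mini-server: every submitted traversal runs on its own
// pooled thread, so many users' queries execute in parallel.
public final class TinyQueryServer {
    private final ExecutorService pool = Executors.newCachedThreadPool();

    // Returns immediately; the traversal executes concurrently with others.
    public <R> Future<R> submit(final Callable<R> traversal) {
        return pool.submit(traversal);
    }

    public void shutdown() {
        pool.shutdown();
    }
}
```

Here a “traversal” is just a Callable; in TP4 it would be a Compilation handed off to a processor such as Pipes.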
> Regardless, providers having multiple processor options is a
> good thing and I don't mean to suggest any premature optimization. At this
> point, I think I'll put together some simple benchmarks just out of
> curiosity but will report back.
Yeah, I’m trying to cover all the “semantic bases”:
Actor model: Akka
Map/Reduce model: Spark
Push-based model: RxJava
Pull-based model: Pipes
If the Compilation/Processor/Bytecode/Traverser/etc. classes are sufficiently
abstract to naturally enable all these different execution models, then we are
happy. So far so good…
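For intuition on the push-based vs. pull-based distinction above, here is a toy sketch of a map operator in each style; this is my own illustration under simplified assumptions, not the actual Pipes or RxJava code:

```java
import java.util.Iterator;
import java.util.function.Consumer;
import java.util.function.Function;

// Pull-based map (Pipes style): the downstream drives execution by
// calling next(), pulling one element at a time through the chain.
final class PullMap<S, E> implements Iterator<E> {
    private final Iterator<S> upstream;
    private final Function<S, E> fn;

    PullMap(final Iterator<S> upstream, final Function<S, E> fn) {
        this.upstream = upstream;
        this.fn = fn;
    }

    public boolean hasNext() {
        return upstream.hasNext();
    }

    public E next() {
        return fn.apply(upstream.next());
    }
}

// Push-based map (RxJava style, heavily simplified): the upstream drives
// execution by invoking the downstream consumer for each element.
final class PushMap<S, E> implements Consumer<S> {
    private final Function<S, E> fn;
    private final Consumer<E> downstream;

    PushMap(final Function<S, E> fn, final Consumer<E> downstream) {
        this.fn = fn;
        this.downstream = downstream;
    }

    public void accept(final S s) {
        downstream.accept(fn.apply(s));
    }
}
```

The pull style is what makes a single traversal cheap (no scheduler, no backpressure machinery), while the push style is what lets a reactive runtime interleave work across threads.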
Marko.
http://rredux.com
>
> --Ted
>
> On Thu, Apr 4, 2019 at 12:36 PM Marko Rodriguez <[email protected]>
> wrote:
>
>> Hi,
>>
>> This is a pretty neat explanation of why Pipes will be faster than RxJava
>> single-threaded.
>>
>> The map-operator for Pipes:
>>
>> https://github.com/apache/tinkerpop/blob/tp4/java/machine/processor/pipes/src/main/java/org/apache/tinkerpop/machine/processor/pipes/MapStep.java
>>
>> The map-operator for RxJava:
>>
>> https://github.com/ReactiveX/RxJava/blob/2.x/src/main/java/io/reactivex/internal/operators/flowable/FlowableMap.java
>>
>> RxJava has a lot of overhead. Pipes is as bare bones as you can get.
>>
>> Marko.
>>
>> http://rredux.com
>>
>>
>>
>>
>>> On Apr 4, 2019, at 11:07 AM, Marko Rodriguez <[email protected]>
>> wrote:
>>>
>>> Hello,
>>>
>>> Thank you for the response.
>>>
>>>> Excellent progress on the RxJava processor. I was wondering if
>>>> categories 1 and 2 can be combined where Pipes becomes the Flowable
>> version
>>>> of the RxJava processor?
>>>
>>> I don’t quite understand your question. Are you saying:
>>>
>>> Flowable.of().flatMap(pipesProcessor)
>>>
>>> or are you saying:
>>>
>>> “Get rid of Pipes altogether and just use single-threaded RxJava
>> instead."
>>>
>>> For the first, I don’t see the benefit of that. For the second, Pipes4
>> is really fast! — much faster than Pipes3. (more on this next)
>>>
>>>
>>>> In this case, though single threaded, we'd still
>>>> get the benefit of asynchronous execution of traversal steps versus
>>>> blocking execution on thread pools like the current TP3 model.
>>>
>>> Again, I’m confused. Apologies. I believe that perhaps you think that
>> the Step-model of Pipes is what Bytecode gets compiled to in the TP4 VM. If
>> so, note that this is not the case. The concept of Steps (chained
>> iterators) is completely within the pipes/ package. The machine-core/
>> package compiles Bytecode to a nested List of stateless, unconnected
>> functions (called a Compilation). It is this intermediate representation
>> that ultimately is used by Pipes, RxJava, and Beam to create their
>> respective execution plan (where Pipes does the whole chained iterator step
>> thing).
>>>
>>> Compilation:
>> https://github.com/apache/tinkerpop/blob/tp4/java/machine/machine-core/src/main/java/org/apache/tinkerpop/machine/bytecode/compiler/Compilation.java#L43
>>>
>>> Pipes:
>> https://github.com/apache/tinkerpop/blob/tp4/java/machine/processor/pipes/src/main/java/org/apache/tinkerpop/machine/processor/pipes/Pipes.java#L47
>>> Beam:
>> https://github.com/apache/tinkerpop/blob/tp4/java/machine/processor/beam/src/main/java/org/apache/tinkerpop/machine/processor/beam/Beam.java#L132
>>> RxJava:
>> https://github.com/apache/tinkerpop/blob/tp4/java/machine/processor/rxjava/src/main/java/org/apache/tinkerpop/machine/processor/rxjava/RxJava.java#L103
>>>
>>>> I would
>>>> imagine Pipes would beat the Flowable performance on a single traversal
>>>> side-by-side basis (though perhaps not by much), but the Flowable
>> version
>>>> would likely scale up to higher throughput and better CPU utilization
>> when
>>>> under concurrent load.
>>>
>>>
>>> Pipes is definitely faster than RxJava (single-threaded). While I only
>> learned RxJava 36 hours ago, I don’t believe it will ever beat Pipes
>> because Pipes4 is brain-dead simple — much simpler than in TP3, where a
>> bunch of extra data structures were needed to account for GraphComputer
>> semantics (e.g. ExpandableIterator).
>>>
>>> I believe, given the CPU utilization/etc. points you make, that RxJava
>> will come into its own in multi-threaded mode (called ParallelFlowable)
>> when trying to get real-time performance from a query that
>> touches/generates lots of data (traversers). This is the reason for
>> Category 2 — real-time, multi-threaded, single machine. I only gave a quick
>> pass last night at making ParallelFlowable work, but gave up when various
>> test cases were failing (I now believe I know the reason why). I hope to
>> have ParallelFlowable working by mid-week next week and then we can
>> benchmark its performance.
>>>
>>> I hope I answered your questions or at least explained my confusion.
>>>
>>> Thanks,
>>> Marko.
>>>
>>> http://rredux.com
>>>
>>>
>>>
>>>
>>>> On Apr 4, 2019, at 10:33 AM, Ted Wilmes <[email protected]> wrote:
>>>>
>>>> Hello,
>>>>
>>>>
>>>> --Ted
>>>>
>>>> On Tue, Apr 2, 2019 at 7:31 AM Marko Rodriguez <[email protected]> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> TP4 will not make a distinction between STANDARD (OLTP) and COMPUTER
>>>>> (OLAP) execution models. In TP4, if a processing engine can convert a
>>>>> bytecode Compilation into a working execution plan then that is all
>> that
>>>>> matters. TinkerPop does not need to concern itself with whether that
>>>>> execution plan is “OLTP" or “OLAP" or with the semantics of its
>> execution
>>>>> (function oriented, iterator oriented, RDD-based, etc.). With that,
>> here
>>>>> are 4 categories of processors that I believe define the full spectrum
>> of
>>>>> what we will be dealing with:
>>>>>
>>>>> 1. Real-time single-threaded single-machine.
>>>>> * This is STANDARD (OLTP) in TP3.
>>>>> * This is the Pipes processor in TP4.
>>>>>
>>>>> 2. Real-time multi-threaded single-machine.
>>>>> * This does not exist in TP3.
>>>>> * We should provide an RxJava processor in TP4.
>>>>>
>>>>> 3. Near-time distributed multi-machine.
>>>>> * This does not exist in TP3.
>>>>> * We should provide an Akka processor in TP4.
>>>>>
>>>>> 4. Batch-time distributed multi-machine.
>>>>> * This is COMPUTER (OLAP) in TP3 (Spark or Giraph).
>>>>> * We should provide a Spark processor in TP4.
>>>>>
>>>>> I’m not familiar with the specifics of the Flink, Apex, DataFlow,
>> Samza,
>>>>> etc. stream-based processors. However, I believe they can be made to
>> work
>>>>> in near-time or batch-time depending on the amount of data pulled from
>> the
>>>>> database. Once we understand these technologies better, though, I
>> believe
>>>>> we should be able to fit them into the categories above.
>>>>>
>>>>> In conclusion: Do these categories make sense to people?
>> Terminology-wise
>>>>> -- Near-time? Batch-time? Are these distinctions valid?
>>>>>
>>>>> Thank you,
>>>>> Marko.
>>>>>
>>>>> http://rredux.com
>>>
>>
>>