Re: A collection of examples that map a query language query to provider bytecode.

Marko Rodriguez Thu, 09 May 2019 15:09:28 -0700

Hello Dmitry,

> In TP3 compilation to Bytecode can happen on Gremlin Client side or Gremlin 
> Server side:
> 
> 1. If compilation is simple, it is possible to implement it for all Gremlin 
> Clients: Java, Python, JavaScript, .NET...
> 2. If compilation is complex, it is possible to create a plugin for Gremlin 
> Server. Clients send query string, and server does the compilation.


Yes, but not for the reasons you state. Every TP3-compliant language must be 
able to compile to TP3 bytecode. That bytecode is then submitted, evaluated by 
the TP3 VM, and a traverser iterator is returned.

However, TP3’s GremlinServer also supports JSR223 ScriptEngine which can 
compile query language Strings server side and then return a traverser 
iterator. This exists so people can submit complex Groovy/Python/JS scripts to 
GremlinServer. The problem with this access point is that arbitrary code can be 
submitted and thus while(true) { } can hang the system! dar.

> For example, in Cypher for Gremlin it is possible to use compilation to 
> Bytecode in JVM client, or on the server when using [other language 
> clients][1].

I’m not too familiar with GremlinServer plugin stuff, so I don’t know. I would 
say that all TP3-compliant query languages must be able to compile to TP3 
bytecode.

> My current understanding is that TP4 Server would serve only for I/O purposes.

This is still up in the air, but I believe that we should:

        1. Only support one data access point.
                TP4 bytecode in and traversers out.
        2. The TP4 server should have two components.
                (1) One (or many) bytecode input locations (IP/port) that pass
                    the bytecode to the TP4 VM.
                (2) Multiple traverser output locations where distributed
                    processors can directly send halted traversers back to the
                    client.

For me, that’s it. However, I’m not a network-server guy, so I don’t have a clear 
understanding of what is absolutely necessary.

> Where do you see "Query language -> Universal Bytecode" part in TP4 
> architecture? Will it be in the VM? Or in middleware? How will clients look 
> like in TP4?

TP4 will publish a binary serialization specification.
It will be dead simple compared to TP3’s binary specification.
The only types of objects are: Bytecode, Instruction, Traverser, Tuple, and 
Primitive.
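
To make those five types concrete, here is a rough sketch in plain Java. It is 
not what is in the tp4/ branch: every class name, field, and method below is an 
assumption for illustration only, and Primitive is simply represented by 
ordinary Java values (String, Long, etc.).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: these are NOT the classes from the tp4/ branch.
// All names and fields below are assumptions for illustration.

// An instruction: an operation name plus its arguments
// (primitives or nested Bytecode).
record Instruction(String op, List<Object> args) {
    static Instruction of(String op, Object... args) {
        return new Instruction(op, List.of(args));
    }
}

// Bytecode: an ordered list of instructions.
final class Bytecode {
    private final List<Instruction> instructions = new ArrayList<>();

    // Appends one instruction and returns this object so calls can be chained.
    Bytecode add(String op, Object... args) {
        instructions.add(Instruction.of(op, args));
        return this;
    }

    // Returns an immutable snapshot of the instructions added so far.
    List<Instruction> instructions() {
        return List.copyOf(instructions);
    }
}

// A traverser wraps the object it currently references.
record Traverser(Object object) {}

// A tuple: named fields whose values are primitives (or other tuples).
record Tuple(Map<String, Object> fields) {}
```

With this sketch, new Bytecode().add("V").add("has", "name", "marko") would 
hold two instructions, and a serializer would only need to know how to write 
these few shapes.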

Every query language designer that wants to have their query language execute 
on the TP4 VM (and thus, against all supporting processing engines and data 
storage systems) will need to have a compiler from their language to TP4 
bytecode.

We will provide 2 tools in all the popular programming languages (Java, Python, 
JS, …).
        1. A TP4 serializer and deserializer.
        2. A lightweight network client to submit serialized bytecode and
           deserialize Iterator<Traverser> into objects in that language.

Thus, if the Cypher-TP4 compiler is written in Scala, you would:
        1. build up an org.apache.tinkerpop.machine.bytecode.Bytecode object
           during your compilation process.
        2. use our org.apache.tinkerpop.machine.io.RemoteMachine object to send
           the Bytecode and get back Iterator<Traverser> objects.
                - RemoteMachine does the serialization and deserialization for
                  you.

I originally wrote out how it currently looks in the tp4/ branch, but realized 
that it asks you to write one too many classes. Thus, I think we will probably 
go with something like this:

Machine machine = RemoteMachine.
                    withStructure(NeptuneStructure.class, config1).
                    withProcessor(AkkaProcessor.class, config2).
                    open(config0);

Iterator<Traverser> results =
    machine.submit(CypherCompiler.compile("MATCH (x)-[knows]->(y)"));

Thus, you would only have to provide a single CypherCompiler class.

If you have any better ideas, please say so. I don’t like that you would have 
to create a CypherCompiler class (even if it’s just a wrapper) for all popular 
programming languages. :(

Perhaps TP4 has a Compiler interface and compilation happens server side….? But 
then that requires language designers to write their compiler in Java … hmm…..
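
To show what I mean by the server-side option, here is a minimal sketch. 
Nothing like this exists in TP4 today: LanguageCompiler, CompilerRegistry, and 
their methods are made-up names, and List<String> is only a stand-in for a real 
Bytecode type. The idea is that the server keeps one compiler per language name 
and the client sends just (language, query string).

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface: a language designer would implement this once, in Java.
// List<String> is a stand-in for a real Bytecode type.
interface LanguageCompiler {
    List<String> compile(String query);
}

// Hypothetical server-side registry mapping a language name to its compiler.
final class CompilerRegistry {
    private final Map<String, LanguageCompiler> compilers = new ConcurrentHashMap<>();

    // Registers (or replaces) the compiler for the given language name.
    void register(String language, LanguageCompiler compiler) {
        compilers.put(language, compiler);
    }

    // Compiles the query with the compiler registered for the language,
    // or fails loudly if no such compiler has been registered.
    List<String> compile(String language, String query) {
        LanguageCompiler compiler = compilers.get(language);
        if (compiler == null) {
            throw new IllegalArgumentException("No compiler registered for: " + language);
        }
        return compiler.compile(query);
    }
}
```

The trade-off is the one above: clients in every language stay trivial, but 
each compiler has to run on the JVM.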

Hope I’m clear,
Marko.

http://rredux.com
