On Fri, Jul 22, 2016 at 7:39 PM, Jonathan Ellithorpe
<[email protected]> wrote:
> Sorry for the delay here, been away for a while...
>
> Thanks for clarifying things. That makes sense... it's a bit different from
> the MySQL storage engine model, where one simply implements the API for a
> new storage engine,

Check.

> plugs it in,

Check.

> and the "top-half" doesn't change (e.g.
> the query planner / storage engine driver).

Not entirely true. The query planner/optimizer has to get hints from
the storage engine on how to optimize. Otherwise the two models seem
equivalent. As Marko said:

>> Each provider will have their own set of Strategies and custom steps that
>> compile Gremlin (beyond TinkerPop’s optimizers) to the most efficient
>> representation given their system’s unique capabilities.

This is true of a MySQL storage engine as well. It could implement
just the basic API, which doesn't include indexing, and would then do
full table scans, which would not perform well.
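
To make that concrete, here is a minimal Python sketch of the idea. It is
a toy model, not the MySQL storage engine API or anything from TinkerPop:
the class names, supports_index(), lookup() and select_where_equals() are
all made up for illustration. The "top half" asks the engine for a hint and
picks a plan; both engines return the same answer, but the basic one has to
touch every row.

```python
from dataclasses import dataclass, field


@dataclass
class BasicEngine:
    """Hypothetical engine that implements only the basic scan API."""
    rows: list
    scanned: int = 0  # counts rows touched, to compare plans

    def supports_index(self, column):
        # Hint to the planner: no index on any column.
        return False

    def scan(self):
        for row in self.rows:
            self.scanned += 1
            yield row


@dataclass
class IndexedEngine(BasicEngine):
    """Same data, plus an equality index on one column (an assumption)."""
    indexed_column: str = "id"
    index: dict = field(default_factory=dict)

    def __post_init__(self):
        # Build the index once, up front.
        for row in self.rows:
            self.index.setdefault(row[self.indexed_column], []).append(row)

    def supports_index(self, column):
        return column == self.indexed_column

    def lookup(self, column, value):
        hits = self.index.get(value, [])
        self.scanned += len(hits)
        return list(hits)


def select_where_equals(engine, column, value):
    """Toy 'top half': ask the engine for a hint, then pick a plan."""
    if engine.supports_index(column):
        return engine.lookup(column, value)
    # Fallback plan: full scan with a filter.
    return [row for row in engine.scan() if row[column] == value]


rows = [{"id": i, "name": f"person-{i}"} for i in range(1000)]
basic = BasicEngine(rows)
indexed = IndexedEngine(rows)

# Same query, same result, very different amount of work.
same = select_where_equals(basic, "id", 42) == select_where_equals(indexed, "id", 42)
print(same, basic.scanned, indexed.scanned)
```

The query itself never changes; only the hint from the engine does, which
is the part the planner depends on.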

> Any changes that do need to be
> made to the behavior of MySQL are communicated up via the API (e.g. to
> disable certain kinds of locking in the top-half that are handled more
> performantly by the storage engine itself).
>
> One interesting benefit of this approach is that it offers the user storage
> engine independence. An SQL query can "just execute" against a MySQL server
> and be agnostic of what storage engine it's using underneath.

Check and check.

> It *seems* like Gremlin doesn't really achieve this (maybe it wasn't a goal
> of the project to begin with).

Version compatibility aside, Gremlin does achieve this. You should be
able to execute the same Gremlin query against any database that
supports Gremlin, e.g. Titan, OrientDB, Neo4j, TinkerGraph, etc. As
far as vendor extensions are concerned, you would have the same issue
in the RDBMS world.

> For instance, to create an index, user code
> needs to interface with the storage engine directly. Or to support queries
> like the following:
>
> g.V().outE().has('creationDate', gt(sevenDaysAgo)).inV()
>
> with high performance, the user needs to tell the storage engine to keep
> the edges sorted by the key 'creationDate'.

Right, Gremlin does not have an API similar to SQL's for *tuning* the
underlying database. Typically, the database already has a language
(e.g. SQL, Cypher), a Java API, or some other means to do this. It would
be convenient to have a tuning API at the Gremlin level, but it is
certainly not necessary.

However, as with an RDBMS, you do have to create indexes manually for
performance. I'm not aware of any RDBMS, including MySQL, that
automatically creates indexes for you.
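
As a rough illustration of why keeping edges sorted by 'creationDate'
matters for the query above, here is a small Python sketch. It is not how
any particular graph database stores edges; SortedEdgeList, targets_after()
and targets_after_scan() are hypothetical names, and dates are plain
integers for simplicity. With sorted edges, the "greater than" filter
becomes a binary search plus a slice instead of a check on every edge.

```python
from bisect import bisect_right


class SortedEdgeList:
    """Hypothetical per-vertex edge list kept sorted by creation date."""

    def __init__(self):
        self._dates = []    # creation dates, ascending
        self._targets = []  # target vertices, parallel to _dates

    def add(self, creation_date, target):
        # Insert at the position that keeps both lists ordered by date.
        i = bisect_right(self._dates, creation_date)
        self._dates.insert(i, creation_date)
        self._targets.insert(i, target)

    def targets_after(self, cutoff):
        # Binary search for the first edge strictly newer than cutoff.
        start = bisect_right(self._dates, cutoff)
        return self._targets[start:]


def targets_after_scan(edges, cutoff):
    """Unsorted baseline: every edge has to be checked."""
    return [target for date, target in edges if date > cutoff]


edges = [(10, "a"), (3, "b"), (8, "c"), (1, "d"), (7, "e")]
sorted_edges = SortedEdgeList()
for date, target in edges:
    sorted_edges.add(date, target)

seven_days_ago = 7
print(sorted_edges.targets_after(seven_days_ago))
print(targets_after_scan(edges, seven_days_ago))
```

Both calls return the same set of target vertices; the sorted version
returns them in date order and skips the edges at or before the cutoff.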

> Maybe this is changing in Gremlin (or has already changed)?

I'm not sure what version you have been using, but it seems to have
been supported this way since the 3.x versions.

-- 
Robert Dale
