Re: [TinkerPop] gremlin-x

Robert Dale Mon, 05 Dec 2016 07:08:39 -0800

Clearly we have different use cases.

You prefer your model to be that of the underlying graph (following that
logic, you would use Hibernate to map to Table objects?) and I prefer using
application domain models.


You prefer your query to return the underlying graph model and I prefer it
to return any data.

You prefer your query to always return all properties and I prefer it to
always return only selected properties.

You prefer your objects to be proxies to the underlying datastore (I think
this blurs the lines between being a graph provider and gremlin client) and
I prefer my objects to be detached with load/store being explicit.

In the end, it sounds like you want gremlin to be an object-graph mapper in
the graph model, and I prefer a layered approach where gremlin is a simple
query language on top of which an object-graph mapper, in any domain model,
could be built (like so many other query languages).
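
A minimal sketch of that layered approach, assuming a hypothetical Person
domain class and a hypothetical PersonMapper (neither is part of TinkerPop).
The row passed to load() is assumed to be shaped like a valueMap() result,
i.e. property key -> list of values, and could come from something like
g.V(id).valueMap("name"). The object is detached: setters touch memory only,
and writing back is an explicit, separate step.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical domain class: a plain, detached object with no link to the graph.
final class Person {
    private final Object id;
    private String name;

    Person(Object id, String name) {
        this.id = id;
        this.name = name;
    }

    Object getId() { return id; }
    String getName() { return name; }

    // Changes in-memory state only; nothing is persisted here.
    void setName(String name) { this.name = name; }
}

// Hypothetical mapper: the layer built on top of the query language.
final class PersonMapper {

    // Load: turn one query result row into a domain object.
    // Assumes the row maps each selected property key to a list of values.
    static Person load(Object id, Map<String, List<Object>> row) {
        List<Object> names = row.get("name");
        String name = (names == null || names.isEmpty()) ? null : (String) names.get(0);
        return new Person(id, name);
    }

    // Store: collect the property values the caller should write back explicitly.
    static Map<String, Object> store(Person person) {
        Map<String, Object> changes = new HashMap<>();
        changes.put("name", person.getName());
        return changes;
    }
}
```

Only the properties the mapper asks for ever leave the database, and the
application code never sees a Vertex or a Map.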

So I guess we'll just have to agree to disagree.


Robert Dale

On Fri, Dec 2, 2016 at 10:30 AM, pieter-gmail <[email protected]>
wrote:

> Hi,
>
> Let me disagree with your disagreement ;-)
>
> Regarding Neo4j
>
> I am talking about Neo4j embedded. The node/vertex is pretty much the
> database already, being a direct pointer to the node on disc with its
> properties right next to it on disc. I would be surprised if all the
> properties are not also already in its hot cache. I am speculating about
> the internals, but when coding in Neo4j embedded you don't care about
> pre-loading all or some properties for performance reasons; just load
> the node and all is well. It's the beauty of embedded Neo4j: latency is
> just not a concern and the node represents an instance of a label.
>
> It would be interesting to execute TinkerPop Neo4j's structure and
> process test suites via gremlin server and compare the performance to
> embedded. I don't really have a clue what to expect. If every property
> access is to be a call via GremlinServer I reckon things will slow down
> significantly. The suite is composed with the implicit assumption that
> property access is not something to think about.
>
> Regarding Hibernate. I have not worked with Hibernate for some time so
> I ran a test to make sure.
>
>         // Persist 100 Person entities in a single transaction.
>         EntityManager entityManager = entityManagerFactory.createEntityManager();
>         entityManager.getTransaction().begin();
>         int count = 100;
>         for (int i = 1; i < count + 1; i++) {
>             Person person = new Person("person_" + i);
>             entityManager.persist(person);
>         }
>         entityManager.getTransaction().commit();
>         entityManager.close();
>
>         // Load one entity by id with a fresh EntityManager; no properties are named.
>         entityManager = entityManagerFactory.createEntityManager();
>         Person person = entityManager.find(Person.class, 1L);
>         assertNotNull(person);
>         assertEquals("person_1", person.getName());
>
> The entityManager.find(Person.class, 1L) call resulted in the following SQL:
>
> "select person0_.id as id1_5_0_, person0_.name as name2_5_0_ from Person
> person0_ where person0_.id=?"
>
> I did not ask for the name property; it returned it anyway, as well it
> should. If every property needs to be fetched separately then latency
> will kill the app.
> If the user has to ask for every property individually, well, then part
> of the point of Hibernate disappears.
>
> RE: "Vertex is just a map wrapper"
> But it's not just any map, it's a Vertex, a core notion of the property
> graph model.
>
> RE: "I don't know anyone who wants to deal with Vertex/Edges"
> We probably live in our own bubbles, but I don't know anyone who would
> not want to deal with the core abstractions of the property graph model
> and would rather deal with Maps, except perhaps JSON/JavaScript folks :-)
>
> The property graph model and graph traversals are all about vertices and
> edge traversals; having that right there as a first-class citizen in
> code is great.
>
> RE: in hibernate "If I set a property, it does not automatically persist
> it to the database."
> True, but it's also the cause of pain with Hibernate, altogether bypassing
> the database's concurrency model with its optimistic locking. And voilà,
> you are stuck with "let's just ignore the exception, retry, and hope we
> get lucky this time round" logic. For what it's worth, setting a property
> on Sqlg runs an update statement. Alas, a very good reason why Hibernate
> does what it does is that its way reduces latency, being able to run
> batch updates on commit or flush. Sqlg supports batch updates but it's
> not the default.
>
> RE: "In your model, there is no difference between transient, in-memory
> state (e.g. workflow) and database state."
> Not sure what you mean here. If you mean application writers keeping
> their own cache of persistent data then you are right. Rule #1 of
> caching is don't cache. Rule #2 is don't cache the cache. Caching is a
> solution to a weakness elsewhere. I am not saying don't ever cache but
> that if you can avoid it do so. Writing transactional caches is also a
> rather specialized and difficult exercise and precisely what databases
> are all about.
>
> Lastly, to make sure we are talking about the same change, are you
> proposing that all gremlins like
>
> GraphTraversal<Vertex, Vertex> vertices =
> this.sqlgGraph.traversal().V().out();
>
> should become
>
> GraphTraversal<Vertex, Map<String, Object>> vertexProperties =
> this.sqlgGraph.traversal().V().out().valueMap();
>
> or worse
>
> GraphTraversal<Vertex, Object> vertexPropertyValues =
> this.sqlgGraph.traversal().V().out().values("property1", "property2",
> "property3");
>
> Cheers
> Pieter
>
>
>
>
>
> On 02/12/2016 14:57, Robert Dale wrote:
> > Pieter, while I think Marko may be onto something, I just want to
> > completely disagree with you as a Java dev. ;-)
> >
> > First, in Neo4j's impl, from what I can tell the elements are not
> > fully loaded. Every get (getProperty, edges, etc) does a query to the
> > database. This is more round trips to the database. So this is why I
> > made the statement that implementations are different. In your Sqlg
> > case, you are basically arguing that the default behavior is the SQL
> > equivalent of SELECT *. This is not a good practice. Then you go on
> > to say that if the dev is aware that this is a 'fat' element, they
> > should ask for exact properties. I think what we're arguing is that
> > the default behavior should be 'always ask for exact properties'. This
> > is the most accepted practice in querying any database: SQL, NoSQL,
> > MongoDB, Cassandra, etc.
> >
> > That leads us to your Hibernate comment.  In the abstract sense,
> > Vertex is just a map wrapper. I think you're just splitting hairs
> > trying to distinguish a Dog Vertex and a Dog Map. In either case, you
> > would have to query the label.  In any case, I don't know anyone who
> > wants to deal with Vertex/Edges.  What most devs deal with, in my
> > experience, is a domain-specific model.  So whether I get back a
> > Vertex or a Map, either way, I'm going to translate that to my domain
> > model. Also, in Hibernate, when I get a property I didn't query for,
> > I will get a null. If I set a property, it does not automatically
> > persist it to the database. In your model, there is no difference
> > between transient, in-memory state (e.g. workflow) and database state.
> > BTW, this would also be lots of round trips to the database in your
> > case. Finally, believe it or not, Hibernate attempts to do smart
> > querying where it will actually retrieve only the IDs, then look for
> > them in its second-level cache and, if not found, go back to the database
> > to get them. This is a very common pattern across SQL/NoSQL datastores.
> >
> > So it's not just about becoming more like JDBC but more about a
> > low-level paradigm. On that, I agree with you on one thing: the first
> > thing you should do is create a 'baby Hibernate', because I don't think
> > gremlin should be an ORM (OGM?).
> >
> >
> >
> > Robert Dale
> >
> > On Thu, Dec 1, 2016 at 2:28 PM, pieter-gmail <[email protected]> wrote:
> >
> >     Hi,
> >
> >     "So with ReferenceElements, latency will be less too because it takes
> >     less time to construct the ReferenceVertex than it does to construct a
> >     DetachedVertex. Imagine a vertex with 100 properties and meta
> >     properties. ?!"
> >
> >     But ReferenceElement does not have the properties, so more round trips
> >     are needed, increasing latency. One of the first things needed to make
> >     Sqlg at all usable was to make sure that a Vertex contains all of its
> >     properties. Otherwise at least one more call is needed per Vertex. It's
> >     a latency killer. For those few cases where the Vertex is so fat that
> >     it is slow to load and only a few properties are needed,
> >     g.V().hasLabel("label").values("property1", "property2") is used. So to
> >     my mind ReferenceElement increases latency rather than decreasing it.
> >
> >     Using ReferenceElement, which is hardly an Element at all (after all,
> >     it throws exceptions on almost all of its own interface), the user has
> >     to get the properties manually and is then back in a world of Maps and
> >     Lists of Maps.
> >
> >     A refactor of much existing code will need to toss the Vertex notion
> >     altogether and replace it with Maps and Lists of Maps. Almost like
> >     writing an application in pure JDBC code, with thousands of lines
> >     iterating through ResultSets, mapping things back and forth. Unless I
> >     am missing something, this change seems huge.
> >
> >     I get that all this is important for non-Java devs, but it would be a
> >     pity if their problems become Java devs' problems.
> >
> >     Cheers
> >     Pieter
> >
> >
> >     On 01/12/2016 20:38, Marko Rodriguez wrote:
> >     > Hi,
> >     >
> >     > *PIETER REPLIES:*
> >     >
> >     >> One of the first reasons I came to graphs, Neo4j and then TinkerPop
> >     >> way back was precisely because of the direct access to Node/Vertex.
> >     >> The user treats it like any other object, not a remote connection.
> >     >> It is the embedded nature that makes life so easy. In a way it was
> >     >> like having a simplistic Hibernate as the core API. 99% of the
> >     >> queries we write retrieve vertices, not Maps and Lists of something.
> >     >> TinkerPop's own test suite applies this type of thinking:
> >     >> querying/modifying Elements and asserting them. Vertex and Edge
> >     >> abound as first-class citizens.
> >     >
> >     > So Graph/Vertex/Edge/VertexProperty/Property will still exist for
> >     > users as objects in the respective GLV language; it is just that they
> >     > are not “attached” and “rich.”
> >     >
> >     > For instance, in Gremlin-Python, you have:
> >     >
> >     >     v = g.V().next()
> >     >     v.id
> >     >
> >     > A ReferenceVertex contains the id of the vertex so you can always
> >     > “re-attach” it to the source.
> >     >
> >     >     g.V(v).out()
> >     >
> >     >
> >     >> Graph, Vertex and Edge are the primary abstractions that users deal
> >     >> with. Having the direct representation of this is very, very nice.
> >     >> It makes user code easy and readable. You know you are dealing with
> >     >> the "Person/Address/Dog/This/That" entity/label as opposed to just a
> >     >> decontextualized bunch of data, Maps and Lists. If
> >     >> Vertex/Edge/Property were to disappear I'd say it would be the first
> >     >> call of duty to write a baby Hibernate to bring the property model
> >     >> back into userspace.
> >     >
> >     > Again, the abstraction is still there, but just ALWAYS in a detached
> >     > form.
> >     >
> >     >>
> >     >> Regarding JDBC, this kinda makes the point. Sqlg and Hibernate and
> >     >> many, many other tools exist precisely so that users do not need to
> >     >> use JDBC with endless hardcoded strings guiding the application.
> >     >> Making TinkerPop more like JDBC is not an obvious plus point.
> >     >
> >     > So, RemoteConnection differs from JDBC in that it's not a fat string,
> >     > but RemoteConnection.submit(Bytecode). Thus, you still work at the
> >     > GraphTraversal level in every GLV.
> >     >
> >     >> A ReferenceElement is also no good, as the problem I experience is
> >     >> latency, not bandwidth.
> >     >
> >     > So with ReferenceElements, latency will be less too because it takes
> >     > less time to construct the ReferenceVertex than it does to construct a
> >     > DetachedVertex. Imagine a vertex with 100 properties and meta
> >     > properties. ?!
> >     >
> >     >> I reckon the experience and usage of TinkerPop is rather different
> >     >> for Java and non-Java people, and perhaps even among Java folks.
> >     >> Hopefully I am not the only one who has made such heavy, happy use of
> >     >> the TinkerPop property meta model and would be sad to see it go.
> >     >>
> >     >> Cheers
> >     >> Pieter
> >     >>
> >     >
> >     >
> >     > *ROBERT REPLIES:*
> >     >
> >     >> I agree the focus should be on the Connection (being separate from
> >     >> Graph) and Traversal. I wouldn't constrain it to "RemoteConnection",
> >     >> just Connection or GraphConnection. Perhaps there's an
> >     >> EmbeddedConnection and a RemoteConnection, or maybe it's URI-oriented
> >     >> similar to how JDBC does it. In either case, the behavior of Remote
> >     >> and Embedded is the same, which is what I think we're striving for.
> >     >
> >     > Yes. Good point. Just Connection.
> >     >
> >     >> I would also like to see Transactions be Connection-oriented. With
> >     >> the right API, it could hook into JTA and be able to take advantage
> >     >> of various annotations for marking transaction boundaries.
> >     >
> >     >     g = g.openTx()
> >     >     g.V().out().out()
> >     >     g.addV()
> >     >     g.V(1).addE().to(2)
> >     >     g.closeTx();
> >     >
> >     >
> >     > ??? This way, it's all about GraphTraversalSource/GraphTraversal. That
> >     > is truly the “connection”, where the Connection implementation is just
> >     > provider/machine-specific shuffling of Bytecode in and Traversers out.
> >     >
> >     >> Are there features of a lambda that couldn't be replaced by a more
> >     >> feature-rich gremlin?
> >     >> g.V().out('knows').map{it.get().value('name') + ' is the friend name'}
> >     >> g.V().out('knows').map(lambda(concat(__.it.get().value('name'), ' is the friend name')))
> >     >
> >     > So we currently have the concept of g:Lambda and this allows for
> >     > lambdas to be used remotely.
> >     >
> >     >     // Gremlin-Java traversal with a Gremlin-Groovy lambda.
> >     >     g.V().map(function("it.get().label()"))
> >     >
> >     >
> >     > The crappy thing is that the lambda is always a String.
> >     >
> >     >> Reference-only makes total sense. This works really well, especially
> >     >> with a local cache or for use cases where most of the data is stored
> >     >> in a separate database. I think it would lend itself nicely to lazy
> >     >> loading. When you need values there are options for that as well
> >     >> (properties/values/valueMap). One of the problems with 'attached'
> >     >> elements is you don't know what the implementation does. So
> >     >> potentially every get or set property call is going to the database
> >     >> and you don't realize it. That can hurt performance and have
> >     >> unintended consequences.
> >     >
> >     > Dude, I’ve been saying this forever. DetachedXXX is a bad idea for the
> >     > reasons you have stipulated. Just imagine:
> >     >
> >     >     g.V(1).out('knows')
> >     >
> >     >
> >     > The GraphSON return is every vertex 1 knows and all its properties and
> >     > meta properties?!?! If you wanted that data too you would have queried
> >     > for it.
> >     >
> >     > Marko.
> >     > --
> >     > You received this message because you are subscribed to the Google
> >     > Groups "Gremlin-users" group.
> >     > To unsubscribe from this group and stop receiving emails from it,
> >     > send an email to [email protected].
> >     > To view this discussion on the web visit
> >     > https://groups.google.com/d/msgid/gremlin-users/7CBD403D-4EC3-4B4B-AFF9-9A54B4D3C4EF%40gmail.com.
> >     > For more options, visit https://groups.google.com/d/optout.
> >
> >
> >
>
>
