Clearly we have different use cases.

You prefer your model to be that of the underlying graph (following that
logic, you would use Hibernate to map to Table objects?) and I prefer using
application domain models.

You prefer your query to return the underlying graph model and I prefer it
to return any data.

You prefer your query to always return all properties and I prefer it to
always return only selected properties.

You prefer your objects to be proxies to the underlying datastore (I think
this blurs the lines between being a graph provider and gremlin client) and
I prefer my objects to be detached with load/store being explicit.

In the end, it sounds like you want Gremlin to be an object-graph mapper in
the graph model, and I prefer a layered approach where Gremlin is a simple
query language on top of which an object-graph mapper, for any domain
model, could be built (as with so many other query languages).
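
To make the layering concrete, here is a minimal sketch in plain Java. It is
an illustration under assumptions, not TinkerPop or Hibernate API:
InMemoryStore stands in for a query layer that returns only the selected
properties as plain data, and Person and PersonMapper are hypothetical names
for a detached domain object and a mapper built on top of that layer, with
load and store as explicit calls.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: every type here is hypothetical; none of it is TinkerPop or Hibernate API.
final class LayeredMappingSketch {

    // Stand-in for the query layer: hands back plain data, and only the keys asked for.
    static final class InMemoryStore {
        private final Map<Long, Map<String, Object>> rows = new HashMap<>();

        void put(long id, Map<String, Object> properties) {
            rows.put(id, new HashMap<>(properties));
        }

        // Returns a fresh map holding only the selected properties that exist.
        Map<String, Object> select(long id, List<String> keys) {
            Map<String, Object> result = new LinkedHashMap<>();
            Map<String, Object> row = rows.get(id);
            if (row == null) {
                return result;
            }
            for (String key : keys) {
                if (row.containsKey(key)) {
                    result.put(key, row.get(key));
                }
            }
            return result;
        }

        // Writes only the given properties; everything else stays as stored.
        void update(long id, Map<String, Object> changes) {
            rows.computeIfAbsent(id, k -> new HashMap<>()).putAll(changes);
        }
    }

    // Detached domain object: setters touch memory only, never the store.
    static final class Person {
        private final long id;
        private String name;

        Person(long id, String name) {
            this.id = id;
            this.name = name;
        }

        long getId() { return id; }
        String getName() { return name; }
        void setName(String name) { this.name = name; }
    }

    // The mapper layer, built on top of the query layer.
    static final class PersonMapper {
        private final InMemoryStore backend;

        PersonMapper(InMemoryStore backend) {
            this.backend = backend;
        }

        // Explicit load: asks the query layer for the "name" property only.
        Person load(long id) {
            Map<String, Object> data = backend.select(id, Arrays.asList("name"));
            return new Person(id, (String) data.get("name"));
        }

        // Explicit store: the only place where state is written back.
        void store(Person person) {
            Map<String, Object> changes = new HashMap<>();
            changes.put("name", person.getName());
            backend.update(person.getId(), changes);
        }
    }

    public static void main(String[] args) {
        InMemoryStore backend = new InMemoryStore();
        Map<String, Object> props = new HashMap<>();
        props.put("name", "person_1");
        props.put("bio", "a large property the mapper never asks for");
        backend.put(1L, props);

        PersonMapper mapper = new PersonMapper(backend);
        Person person = mapper.load(1L);
        person.setName("renamed");
        // Nothing has been written yet, so the store still holds the old name.
        System.out.println(backend.select(1L, Arrays.asList("name"))); // {name=person_1}
        mapper.store(person);
        System.out.println(backend.select(1L, Arrays.asList("name"))); // {name=renamed}
    }
}
```

Setting the name changes only the in-memory object; nothing reaches the store
until store() is called, and the "bio" property is never fetched at all.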

So I guess we'll just have to agree to disagree.


Robert Dale

On Fri, Dec 2, 2016 at 10:30 AM, pieter-gmail <[email protected]>
wrote:

> Hi,
>
> Let me disagree with your disagreement ;-)
>
> Regarding Neo4j
>
> I am talking about Neo4j embedded. The node/vertex is pretty much the
> database already, being a direct pointer to the node on disk with its
> properties right next to it on disk. I would be surprised if all the
> properties are not also already in its hot cache. I am speculating about
> the internals, but when coding in Neo4j embedded you don't care about
> pre-loading all or some properties for performance reasons; just load
> the node and all is well. That is the beauty of embedded Neo4j: latency
> is just not a concern, and the node represents an instance of a label.
>
> It would be interesting to execute TinkerPop Neo4j's structure and
> process test suites via gremlin server and compare the performance to
> embedded. I don't really have a clue what to expect. If every property
> access is to be a call via GremlinServer I reckon things will slow down
> significantly. The suite is composed with the implicit assumption that
> property access is not something to think about.
>
> Regarding Hibernate. I have not worked with Hibernate for some time, so
> I ran a test to make sure.
>
>         EntityManager entityManager =
>                 entityManagerFactory.createEntityManager();
>         entityManager.getTransaction().begin();
>         int count = 100;
>         for (int i = 1; i < count + 1; i++) {
>             Person person = new Person("person_" + i);
>             entityManager.persist(person);
>         }
>         entityManager.getTransaction().commit();
>         entityManager.close();
>
>         entityManager = entityManagerFactory.createEntityManager();
>         Person person = entityManager.find(Person.class, 1L);
>         assertNotNull(person);
>         assertEquals("person_1", person.getName());
>
> The entityManager.find(Person.class, 1L) call resulted in the following
> SQL:
>
> "select person0_.id as id1_5_0_, person0_.name as name2_5_0_ from Person
> person0_ where person0_.id=?"
>
> I did not ask for the name property; it returned it anyway, as well it
> should. If every property had to be fetched separately, latency would
> kill the app.
> If the user has to ask for every property individually, well, then part
> of the point of Hibernate disappears.
>
> RE: "Vertex is just a map wrapper"
> But it's not just any map, it's a Vertex, a core notion of the property
> graph model.
>
> RE: "I don't know anyone who wants to deal with Vertex/Edges"
> We probably live in our own bubbles, but I don't know anyone who would
> not want to deal with the core abstractions of the property graph model
> and would rather deal with Maps, except perhaps JSON/JavaScript folks :-)
>
> The property graph model and graph traversals are all about vertices and
> edge traversals; having that right there as a first-class citizen in
> code is great.
>
> RE: in Hibernate "If I set a property, it does not automatically persist
> it to the database."
> True, but it is also the cause of pain with Hibernate: it bypasses the
> database's concurrency model altogether with its optimistic locking. And
> voilà, you are stuck with "let's just ignore the exception, retry, and
> hope we get lucky this time round" logic. For what it's worth, setting a
> property in Sqlg runs an update statement. Alas, a very good reason why
> Hibernate does what it does is that its way reduces latency by being
> able to run batch updates on commit or flush. Sqlg supports batch
> updates, but it is not the default.
>
> RE: "In your model, there is no difference between transient, in-memory
> state (e.g. workflow) and database state."
> Not sure what you mean here. If you mean application writers keeping
> their own cache of persistent data then you are right. Rule #1 of
> caching is don't cache. Rule #2 is don't cache the cache. Caching is a
> solution to a weakness elsewhere. I am not saying don't ever cache but
> that if you can avoid it do so. Writing transactional caches is also a
> rather specialized and difficult exercise and precisely what databases
> are all about.
>
> Lastly, to make sure we are talking about the same change, are you
> proposing that all gremlins like
>
> GraphTraversal<Vertex, Vertex> vertices =
>         this.sqlgGraph.traversal().V().out();
>
> should become
>
> GraphTraversal<Vertex, Map<String, Object>> vertexProperties =
>         this.sqlgGraph.traversal().V().out().valueMap();
>
> or worse
>
> GraphTraversal<Vertex, Object> propertyValues =
>         this.sqlgGraph.traversal().V().out().values("property1",
>                 "property2", "property3");
>
> Cheers
> Pieter
>
>
>
>
>
> On 02/12/2016 14:57, Robert Dale wrote:
> > Pieter, while I think Marko may be onto something, I just want to
> > completely disagree with you as a Java dev. ;-)
> >
> > First, in Neo4j's impl, from what I can tell the elements are not
> > fully loaded. Every get (getProperty, edges, etc) does a query to the
> > database. This is more round trips to the database. So this is why I
> > made the statement that implementations are different.  In your sqlg
> > case, you are basically arguing that the default behavior is the sql
> > equivalent of SELECT *.  This is not a good practice. Then you go on
> > to say that if the dev is aware that this is a 'fat' element, they
> > should ask for exact properties.  I think what we're arguing is that
> > the default behavior should be 'always ask for exact properties'. This
> > is the most accepted practice in querying any database, sql, nosql,
> > mongodb, cassandra, etc.
> >
> > That leads us to your Hibernate comment.  In the abstract sense,
> > Vertex is just a map wrapper. I think you're just splitting hairs
> > trying to distinguish a Dog Vertex and a Dog Map. In either case, you
> > would have to query the label.  In any case, I don't know anyone who
> > wants to deal with Vertex/Edges.  What most devs deal with, in my
> > experience, is a domain-specific model.  So whether I get back a
> > Vertex or a Map, either way, I'm going to translate that to my domain
> > model.  Also, in hibernate, when I get a property I didn't query for,
> > I will get a null.  If I set a property, it does not automatically
> > persist it to the database. In your model, there is no difference
> > between transient, in-memory state (e.g. workflow) and database state.
> > BTW, this would also be lots of round trips to the database in your
> > case. Finally, believe it or not, Hibernate attempts to do smart
> > querying where it will actually retrieve only the IDs, then look for
> > them in its second-level cache, if not found, go back to the database
> > to get them.  This is a very common pattern across sql/nosql datastores.
> >
> > So it's not just about becoming more like jdbc but more about a
> > low-level paradigm. To that I agree with you on one thing, the first
> > thing you should do is create a 'baby hibernate' because I don't think
> > gremlin should be an ORM (OGM?).
> >
> >
> >
> > Robert Dale
> >
> > On Thu, Dec 1, 2016 at 2:28 PM, pieter-gmail <[email protected]> wrote:
> >
> >     Hi,
> >
> >     "So with ReferenceElements, latency will be less too because it
> >     takes less time to construct the ReferenceVertex than it does to
> >     construct a DetachedVertex. Imagine a vertex with 100 properties
> >     and meta properties. ?!"
> >
> >     But ReferenceElement does not have the properties, so more round
> >     trips are needed, increasing latency. One of the first things done
> >     to make Sqlg at all usable was to make sure that a Vertex contains
> >     all of its properties. Otherwise at least one more call is needed
> >     per Vertex. It's a latency killer. For those few cases where the
> >     Vertex is so fat that it is slow to load and only a few properties
> >     are needed,
> >     g.V().hasLabel("label").values("property1", "property2") is used.
> >     So to my mind ReferenceElement increases latency and does not
> >     decrease it.
> >
> >     With ReferenceElement, which is hardly an Element at all (after
> >     all, it throws exceptions on almost all of its own interface), the
> >     user has to get the properties manually and is then back in a world
> >     of Maps and Lists of Maps.
> >
> >     A refactor of much existing code will need to toss the Vertex
> >     notion altogether and replace it with Maps and Lists of Maps.
> >     Almost like writing an application in pure JDBC code with thousands
> >     of lines iterating through ResultSets, mapping things back and
> >     forth. Unless I am missing something, this change seems huge.
> >
> >     I get that all this is important for non-Java devs, but it would be
> >     a pity if their problems became Java devs' problems.
> >
> >     Cheers
> >     Pieter
> >
> >
> >     On 01/12/2016 20:38, Marko Rodriguez wrote:
> >     > Hi,
> >     >
> >     > *PIETER REPLIES:*
> >     >
> >     >> One of the first reasons I came to graphs, Neo4j and then
> >     >> TinkerPop way back was precisely because of the direct access to
> >     >> Node/Vertex. The user treats it like any other object, not a
> >     >> remote connection. It is the embedded nature that makes life so
> >     >> easy. In a way it was like having a simplistic Hibernate as the
> >     >> core API. 99% of the queries we write are to retrieve vertices,
> >     >> not Maps and Lists of something. TinkerPop's own test suite
> >     >> applies this type of thinking: querying/modifying Elements and
> >     >> asserting them. Vertex and Edge abound as first-class citizens.
> >     >
> >     > So Graph/Vertex/Edge/VertexProperty/Property will still exist for
> >     > users as objects in the respective GLV language; it is just that
> >     > they are not “attached” and “rich.”
> >     >
> >     > For instance, in Gremlin-Python, you have:
> >     >
> >     >     v = g.V().next()
> >     >     v.id
> >     >
> >     > A ReferenceVertex contains the id of the vertex so you can always
> >     > “re-attach” it to the source.
> >     >
> >     >     g.V(v).out()
> >     >
> >     >
> >     >> Graph, Vertex and Edge are the primary abstractions that users
> >     >> deal with. Having a direct representation of them is very, very
> >     >> nice. It makes user code easy and readable. You know you are
> >     >> dealing with the "Person/Address/Dog/This/That" entity/label as
> >     >> opposed to just a decontextualized bunch of data, Maps and Lists.
> >     >> If Vertex/Edge/Property were to disappear I'd say the first call
> >     >> of duty would be to write a baby Hibernate to bring the property
> >     >> model back into userspace.
> >     >
> >     > Again, the abstraction is still there, but just ALWAYS in a
> >     > detached form.
> >     >
> >     >>
> >     >> Regarding JDBC, this kinda makes the point. Sqlg, Hibernate and
> >     >> many, many other tools exist precisely so that users do not need
> >     >> to use JDBC with endless hardcoded strings guiding the
> >     >> application. Making TinkerPop more like JDBC is not an obvious
> >     >> plus point.
> >     >
> >     > So, RemoteConnection differs from JDBC in that it's not a fat
> >     > string, but RemoteConnection.submit(Bytecode). Thus, you still
> >     > work at the GraphTraversal level in every GLV.
> >     >
> >     >> A ReferenceElement is also no good, as the problem I experience is
> >     >> latency, not bandwidth.
> >     >
> >     > So with ReferenceElements, latency will be less too because it
> >     > takes less time to construct the ReferenceVertex than it does to
> >     > construct a DetachedVertex. Imagine a vertex with 100 properties
> >     > and meta properties. ?!
> >     >
> >     >> I reckon the experience and usage of TinkerPop is rather
> >     >> different for Java and non-Java people, and perhaps even among
> >     >> Java folks. Hopefully I am not the only one who has made such
> >     >> heavy, happy use of the TinkerPop property meta model and would
> >     >> be sad to see it go.
> >     >>
> >     >> Cheers
> >     >> Pieter
> >     >>
> >     >
> >     >
> >     > *ROBERT REPLIES:*
> >     >
> >     >> I agree the focus should be on the Connection (being separate from
> >     >> Graph) and Traversal. I wouldn't constrain it to
> >     >> "RemoteConnection", just Connection or GraphConnection. Perhaps
> >     >> there's an EmbeddedConnection and a RemoteConnection, or maybe
> >     >> it's URI-oriented similar to how JDBC does it. In either case,
> >     >> the behavior of Remote and Embedded is the same, which is what I
> >     >> think we're striving for.
> >     >
> >     > Yes. Good point. Just Connection.
> >     >
> >     >> I would also like to see Transactions be Connection-oriented. With
> >     >> the right API, it could hook into JTA and be able to take
> >     >> advantage of various annotations for marking transaction
> >     >> boundaries.
> >     >
> >     >     g = g.openTx()
> >     >     g.V().out().out()
> >     >     g.addV()
> >     >     g.V(1).addE().to(2)
> >     >     g.closeTx();
> >     >
> >     >
> >     > ??? This way, it's all about GraphTraversalSource/GraphTraversal.
> >     > That is truly the “connection”, where the Connection
> >     > implementation is just provider/machine-specific shuffling of
> >     > Bytecode in and Traversers out.
> >     >
> >     >> Are there features of a lambda that couldn't be replaced by a more
> >     >> feature-rich gremlin?
> >     >> g.V().out('knows').map{it.get().value('name') +
> >     >>     ' is the friend name'}
> >     >> g.V().out('knows').map(lambda(concat(
> >     >>     __.it.get().value('name'), ' is the friend name')))
> >     >
> >     > So we currently have the concept of g:Lambda and this allows for
> >     > lambdas to be used remotely.
> >     >
> >     >     // Gremlin-Java traversal with a Gremlin-Groovy lambda.
> >     >     g.V().map(function("it.get().label()"))
> >     >
> >     >
> >     > The crappy thing is that the lambda is always a String.
> >     >
> >     >> Reference-only makes total sense. This works really well,
> >     >> especially with a local cache or for use cases where most of the
> >     >> data is stored in a separate database. I think it would lend
> >     >> itself nicely to lazy loading. When you need values there are
> >     >> options for that as well (properties/values/valueMap). One of the
> >     >> problems with 'attached' elements is you don't know what the
> >     >> implementation does. So potentially every get or set property
> >     >> call is going to the database and you don't realize it. That can
> >     >> hurt performance and have unintended consequences.
> >     >
> >     > Dude, I’ve been saying this forever. DetachedXXX is a bad idea
> >     > for the reasons you have stipulated. Just imagine:
> >     >
> >     >     g.V(1).out('knows')
> >     >
> >     >
> >     > The GraphSON return is every vertex 1 knows and all its
> >     > properties and meta properties?!?! If you wanted that data too you
> >     > would have queried for it.
> >     >
> >     > Marko.
> >     > --
> >     > You received this message because you are subscribed to the Google
> >     > Groups "Gremlin-users" group.
> >     > To unsubscribe from this group and stop receiving emails from
> >     it, send
> >     > an email to [email protected]
> >     <mailto:gremlin-users%[email protected]>
> >     > <mailto:[email protected]
> >     <mailto:gremlin-users%[email protected]>>.
> >     > To view this discussion on the web visit
> >     >
> >     https://groups.google.com/d/msgid/gremlin-users/7CBD403D-
> 4EC3-4B4B-AFF9-9A54B4D3C4EF%40gmail.com
> >     <https://groups.google.com/d/msgid/gremlin-users/7CBD403D-
> 4EC3-4B4B-AFF9-9A54B4D3C4EF%40gmail.com>
> >     >
> >     <https://groups.google.com/d/msgid/gremlin-users/7CBD403D-
> 4EC3-4B4B-AFF9-9A54B4D3C4EF%40gmail.com?utm_medium=email&utm_source=footer
> >     <https://groups.google.com/d/msgid/gremlin-users/7CBD403D-
> 4EC3-4B4B-AFF9-9A54B4D3C4EF%40gmail.com?utm_medium=email&utm_source=footer
> >>.
> >     > For more options, visit https://groups.google.com/d/optout
> >     <https://groups.google.com/d/optout>.
> >
> >
> >
>
>
