Hi all, PR for Properties on Elements is ready for community review https://github.com/apache/tinkerpop/pull/1843
Regards, Valentyn On Fri, Feb 3, 2023 at 3:54 PM Valentyn Kahamlyk <valent...@bitquilltech.com> wrote: > Hi all, > > I would like to share an update about implementation of Properties on > Elements. > > 1. For OLTP queries: > Properties serialization can be configured with > g.with('materializeProperties', 'tokens').V() > g.with('materializeProperties', 'all').V() > By default all properties will be returned. > > Some technical details about implementation: > - default server configuration will not contain ReferenceElementStrategy for > TraversalSource's by default; > - haltedTraverserStrategy removed from TraverserIterator; > - property detachment handled in AbstractOpProcessor; > > 2. For OLAP queries: > The current behavior is saved, no properties returned. > To get the properties you can use as before > g.withComputer().withStrategies(HaltedTraverserFactoryStrategy.detached()) > > Request will fail with error when try to use 'materializeProperties' with > OLAP query. > > Regards, Valentyn > > > On Tue, Dec 13, 2022 at 3:36 AM Stephen Mallette <spmalle...@gmail.com> > wrote: > >> I'm ok with all of Dave's suggestions. There was also a consideration for >> backward compatibility to getting a ReferenceElement with this mechanism, >> but given use of Token, perhaps we drop that too. Folks would need to >> continue to use withStrategies(ReferenceElementStrategy) to keep getting >> references if they wanted that. >> >> On Mon, Dec 12, 2022 at 8:38 PM David Bechberger <d...@bechberger.com> >> wrote: >> >> > I like using the with() approach over a strategy, as I think it will be >> > more user-friendly and discoverable for customers. >> > >> > I think my main comments are: >> > >> > 1) Let's keep it simple to start with and just allow "All" or "Tokens". >> > Custom will lead us down a rabbit hole of how far we want to go with >> allow >> > different options per label etc. We can release with a simple v1 and >> > gather feedback from users on how/if they need more granularity. >> > Meanwhile, customers can still get this behavior through the use of the >> > current steps such as valueMap() >> > 2) I think we should leverage the same tokens as we do with valueMap() >> to >> > help show that they are connected. >> > 3) "detach" is not a word that most people would associate with >> returning >> > properties. I suggest something like "materializeProperties" >> > >> > Combining these together we get the following as the proposed syntax: >> > g.with(‘materializeProperties’, T.tokens).V() >> > g.with(‘materializeProperties’, T.all).V() >> > >> > As far as the default goes, no matter which way we choose, someone will >> not >> > be happy. If we choose to default to "tokens", then we are in the same >> > behavior as currently, which benefits existing applications while making >> > new applications unhappy. If we choose to default to "all", then this >> may >> > be a breaking change for applications but will smooth the on ramp for >> new >> > customers as the current behavior is unexpected. Between these two, I >> > would vote for going with the breaking behavior to simplify the new >> > developer experience. While it may break some users, I have not often >> seen >> > users returning the elements, as it is usually a combination of just an >> id >> > or label explicitly or those tokens with all or a subset of properties. >> > >> > Another advantage to defaulting to returning all the materialized >> values is >> > that this will benefit many visualization tools that need to manually >> > request this information today to render correctly. >> > >> > Thoughts? >> > >> > Dave >> > >> > On Thu, Dec 8, 2022 at 7:41 AM Stephen Mallette <spmalle...@gmail.com> >> > wrote: >> > >> > > i was trying to summarize the decisions we're trying to make on this >> > issue >> > > and it occurred to me that the problems we're facing here raise some >> > > fundamental questions about the original direction i'd proposed. i >> long >> > ago >> > > chose a strategy approach to doing this, mostly because we have some >> > > history of doing that with HaltedTraverserStrategy and >> > > ReferenceElementStrategy, but I can't help wondering now if that >> pattern >> > > should be followed. It's troubling me that DetachStrategy, >> > > ReferenceElementStrategy , etc. really don't have any relevance to >> > embedded >> > > use cases (aside from the very specific case of OLAP with >> > > HaltedTraverserStrategy). They are really just mechanisms to help >> control >> > > serialization costs and as such are more request/response oriented as >> > > opposed to traversal oriented. If we treated it that way maybe some of >> > > these hard decisions would go away. >> > > >> > > I was thinking that maybe we could add “detachment” to RequestOptions >> and >> > > use the same syntax already established for DetachStrategy in there. >> > Hooked >> > > up properly this would allow something like: >> > > >> > > g.with(‘detach’, Detach.custom("name", "age")).V() >> > > >> > > On the server, we’d then just have to extract that Detach from the >> > > RequestMessage and apply it on serialization as needed. >> > > >> > > I see several advantages with this revised design: >> > > >> > > 1. No more worries about the extra step in profile() - which is even a >> > > problem now in ReferenceElementStrategy >> > > 2. This works as easily for scripts as it does for bytecode >> > > 3. We don’t mix this capability up with embedded use cases, so >> there’s no >> > > confusion about when/why you use this. >> > > >> > > Going further with the above Gremlin syntax, maybe the argument to >> with() >> > > should be like: >> > > >> > > g.with(‘detach’, "ALL").V() >> > > g.with(‘detach’, "NONE").V() >> > > g.with(‘detach’, ["name", "age"]).V() // a list implies CUSTOM >> > > >> > > The advantage here is that we don’t have to change the grammar in >> > > gremlin-language and the serializers don’t need to change at all. We >> > could >> > > retain the Detach.custom("name", "age") or similar syntax for >> Java/Groovy >> > > and come up with similar sugar for the other languages. >> > > >> > > I think that still leaves the question of what the default should be >> on >> > the >> > > server for 3.7.0 - perhaps we come back to that once there is >> feedback on >> > > this design revision. >> > > >> > > >> > > >> > > On Tue, Nov 22, 2022 at 7:06 AM Stephen Mallette < >> spmalle...@gmail.com> >> > > wrote: >> > > >> > > > > My proposal is to always have some detachment strategy in server >> > > > configuration and add `DetachedStrategy` to allow users the >> flexibility >> > > to >> > > > customize the result. >> > > > >> > > > I think default detachment on the server should remain "reference" >> so >> > if >> > > > you can do that with your new DetachStrategy then that is preferred >> > with >> > > > the idea that HaltedTraverserStrategy is no longer useful and can be >> > > > deprecated. >> > > > >> > > > > How to handle more than one DetachedStrategy? >> > > > >> > > > I agree that they shouldn't stack - DetachedStrategy should not be >> used >> > > > for some sort of security use case. A client-side one should replace >> > the >> > > > server-side one. >> > > > >> > > > > Another question is whether or not to show `DetachElementStep` in >> > the >> > > > response for profiling requests like `g.V().profile()`. >> > > > >> > > > This is the hardest question and goes back to the first because you >> > will >> > > > be using DetachStrategy by default on the server side and, as a >> result, >> > > > when the user does their g.V().profile() without using >> DetachStrategy >> > > they >> > > > will see your DetachElementStep even though they didn't add one on >> the >> > > > client side. Right now we hide the detachment behind >> > > > HaltedTraverserStrategy (for bytecode based requests) and in the >> > default >> > > > configuration of ReferenceElementStrategy in the startup script. The >> > > latter >> > > > makes the former a bit redundant actually. >> > > > >> > > > It seems that for script based requests we see ReferenceElementStep >> now >> > > in >> > > > the profile(): >> > > > >> > > > gremlin> :> g.V().profile() >> > > > ==>Traversal Metrics >> > > > Step >> > Count >> > > > Traversers Time (ms) % Dur >> > > > >> > > > >> > > >> > >> ============================================================================================================= >> > > > TinkerGraphStep(vertex,[]) >> > 6 >> > > > 6 0.623 37.21 >> > > > ReferenceElementStep >> > 6 >> > > > 6 1.052 62.79 >> > > > >TOTAL >> > - >> > > > - 1.676 - >> > > > >> > > > I don't particularly like that but it might be helpful to see that >> the >> > > > detachment is explicitly happening by way of a step.This line of >> > thinking >> > > > has me wondering if we should remove the default detachment all >> > together >> > > > from Gremlin Server configuration: >> > > > >> > > > 1. Most folks should not be relying on ReferenceElementStrategy to >> > > protect >> > > > them from fat elements, because we've spent years pushing them >> toward >> > > > valueMap(), project(), etc. If they suddenly started getting >> properties >> > > on >> > > > elements I sense that wouldn't affect many people negatively. >> > > > 2. With ease of use and intuitive behavior in mind, new users will >> > expect >> > > > properties to be on their elements. >> > > > 3. Removing default detachment would make it so that the >> > > DetachElementStep >> > > > only appears when the user explicitly sets it themselves. There >> would >> > be >> > > no >> > > > surprise that it is there in the profile(). >> > > > 4. The notion of default detachment would then fall to the >> serializers >> > > > which would just include all properties in a DetachedElement. >> > > > >> > > > If TinkerPop took this route then we'd have to be sure to warn >> users of >> > > > (1) possible increased serialization times if they were relying on >> > > > elements having no properties and (2) the fact that if they wrote >> their >> > > > code to explicitly use a ReferenceVertex they will need to add >> > > "reference" >> > > > detachment or generalize their code to the Element interfaces. Maybe >> > some >> > > > other stuff too? >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > On Fri, Nov 18, 2022 at 5:30 PM Valentyn Kahamlyk >> > > > <valent...@bitquilltech.com.invalid> wrote: >> > > > >> > > >> Hi everyone, >> > > >> >> > > >> I'm working on properties on elements ( >> > > >> https://issues.apache.org/jira/browse/TINKERPOP-2824). >> > > >> I found HaltedTraverserStrategy` in `TraverserIterator` that does >> > about >> > > >> the >> > > >> same, converting Elements to detached or references. Tests use >> > > reference >> > > >> Elements, and detached is default for production. GLV's possibly >> can >> > > >> override strategy to get different results. >> > > >> My implementation of `DetachedStrategy` is to add the last step to >> the >> > > >> traversal which modifies the result, so, technically g.V() becomes >> > > >> g.V().[DetachStep]. >> > > >> Also default server configuration adds ReferenceElementStrategy >> > > >> >> > > >> >> > > >> > >> https://github.com/apache/tinkerpop/blob/master/gremlin-server/scripts/empty-sample.groovy#L45 >> > > >> to each traversal. >> > > >> >> > > >> 1. So I end up with 3 ways to cut properties and I'm not sure how >> to >> > > >> handle >> > > >> it. What options would you suggest? >> > > >> My proposal is to always have some detachment strategy in server >> > > >> configuration and add `DetachedStrategy` to allow users the >> > flexibility >> > > to >> > > >> customize the result. `HaltedTraverserStrategy` can be marked >> > deprecated >> > > >> as >> > > >> it handles only bytecode traversal, but not used for scripts. >> > > >> Another side of the problem is the need to detach custom vertex >> > > >> implementations like `TinkerVertex`. >> > > >> >> > > >> 2. How to handle more than one `DetachedStrategy`? For example one >> can >> > > be >> > > >> set in Gremlin Server configuration and other in user traversal. >> > > >> Options: >> > > >> a) apply all. Each successive DetachStrategy becomes more >> restrictive. >> > > >> b) apply only last one. Use case: there are default server >> > > configurations >> > > >> to return only some properties and the user can override it for >> each >> > > >> request. >> > > >> I think both options make sense, but I prefer the second. >> > > >> >> > > >> 3. Another question is whether or not to show `DetachElementStep` >> in >> > the >> > > >> response for profiling requests like `g.V().profile()`. Users may >> be >> > > >> confused at the appearance of a step that they did not add, but on >> the >> > > >> other hand it will be an explanation why there are no properties in >> > the >> > > >> response. >> > > >> Is it better to hide `DetachElementStep` or vice versa? >> > > >> >> > > >> Thanks, Valentyn >> > > >> >> > > > >> > > >> > >> >