Hi all,

PR for Properties on Elements is ready for community review
https://github.com/apache/tinkerpop/pull/1843

Regards, Valentyn


On Fri, Feb 3, 2023 at 3:54 PM Valentyn Kahamlyk <valent...@bitquilltech.com>
wrote:

> Hi all,
>
> I would like to share an update about implementation of Properties on 
> Elements.
>
> 1. For OLTP queries:
> Properties serialization can be configured with
> g.with('materializeProperties', 'tokens').V()
> g.with('materializeProperties', 'all').V()
> By default all properties will be returned.
>
> Some technical details about implementation:
> - default server configuration will not contain ReferenceElementStrategy for 
> TraversalSource's by default;
> - haltedTraverserStrategy removed from TraverserIterator;
> - property detachment handled in AbstractOpProcessor;
>
> 2. For OLAP queries:
> The current behavior is saved, no properties returned.
> To get the properties you can use as before
> g.withComputer().withStrategies(HaltedTraverserFactoryStrategy.detached())
>
> Request will fail with error when try to use 'materializeProperties' with 
> OLAP query.
>
> Regards, Valentyn
>
>
> On Tue, Dec 13, 2022 at 3:36 AM Stephen Mallette <spmalle...@gmail.com>
> wrote:
>
>> I'm ok with all of Dave's suggestions. There was also a consideration for
>> backward compatibility to getting a ReferenceElement with this mechanism,
>> but given use of Token, perhaps we drop that too. Folks would need to
>> continue to use withStrategies(ReferenceElementStrategy) to keep getting
>> references if they wanted that.
>>
>> On Mon, Dec 12, 2022 at 8:38 PM David Bechberger <d...@bechberger.com>
>> wrote:
>>
>> > I like using the with() approach over a strategy, as I think it will be
>> > more user-friendly and discoverable for customers.
>> >
>> > I think my main comments are:
>> >
>> > 1) Let's keep it simple to start with and just allow "All" or "Tokens".
>> > Custom will lead us down a rabbit hole of how far we want to go with
>> allow
>> > different options per label etc.  We can release with a simple v1 and
>> > gather feedback from users on how/if they need more granularity.
>> > Meanwhile, customers can still get this behavior through the use of the
>> > current steps such as valueMap()
>> > 2) I think we should leverage the same tokens as we do with valueMap()
>> to
>> > help show that they are connected.
>> > 3) "detach" is not a word that most people would associate with
>> returning
>> > properties.  I suggest something like "materializeProperties"
>> >
>> > Combining these together we get the following as the proposed syntax:
>> > g.with(‘materializeProperties’, T.tokens).V()
>> > g.with(‘materializeProperties’, T.all).V()
>> >
>> > As far as the default goes, no matter which way we choose, someone will
>> not
>> > be happy.  If we choose to default to "tokens", then we are in the same
>> > behavior as currently, which benefits existing applications while making
>> > new applications unhappy.  If we choose to default to "all", then this
>> may
>> > be a breaking change for applications but will smooth the on ramp for
>> new
>> > customers as the current behavior is unexpected.  Between these two, I
>> > would vote for going with the breaking behavior to simplify the new
>> > developer experience.  While it may break some users, I have not often
>> seen
>> > users returning the elements, as it is usually a combination of just an
>> id
>> > or label explicitly or those tokens with all or a subset of properties.
>> >
>> > Another advantage to defaulting to returning all the materialized
>> values is
>> > that this will benefit many visualization tools that need to manually
>> > request this information today to render correctly.
>> >
>> > Thoughts?
>> >
>> > Dave
>> >
>> > On Thu, Dec 8, 2022 at 7:41 AM Stephen Mallette <spmalle...@gmail.com>
>> > wrote:
>> >
>> > > i was trying to summarize the decisions we're trying to make on this
>> > issue
>> > > and it occurred to me that the problems we're facing here raise some
>> > > fundamental questions about the original direction i'd proposed. i
>> long
>> > ago
>> > > chose a strategy approach to doing this, mostly because we have some
>> > > history of doing that with HaltedTraverserStrategy and
>> > > ReferenceElementStrategy, but I can't help wondering now if that
>> pattern
>> > > should be followed. It's troubling me that DetachStrategy,
>> > > ReferenceElementStrategy , etc. really don't have any relevance to
>> > embedded
>> > > use cases (aside from the very specific case of OLAP with
>> > > HaltedTraverserStrategy). They are really just mechanisms to help
>> control
>> > > serialization costs and as such are more request/response oriented as
>> > > opposed to traversal oriented. If we treated it that way maybe some of
>> > > these hard decisions would go away.
>> > >
>> > > I was thinking that maybe we could add “detachment” to RequestOptions
>> and
>> > > use the same syntax already established for DetachStrategy in there.
>> > Hooked
>> > > up properly this would allow something like:
>> > >
>> > > g.with(‘detach’, Detach.custom("name", "age")).V()
>> > >
>> > > On the server, we’d then just have to extract that Detach from the
>> > > RequestMessage and apply it on serialization as needed.
>> > >
>> > > I see several advantages with this revised design:
>> > >
>> > > 1. No more worries about the extra step in profile() - which is even a
>> > > problem now in ReferenceElementStrategy
>> > > 2. This works as easily for scripts as it does for bytecode
>> > > 3. We don’t mix this capability up with embedded use cases, so
>> there’s no
>> > > confusion about when/why you use this.
>> > >
>> > > Going further with the above Gremlin syntax, maybe the argument to
>> with()
>> > > should be like:
>> > >
>> > > g.with(‘detach’, "ALL").V()
>> > > g.with(‘detach’, "NONE").V()
>> > > g.with(‘detach’, ["name", "age"]).V() // a list implies CUSTOM
>> > >
>> > > The advantage here is that we don’t have to change the grammar in
>> > > gremlin-language and the serializers don’t need to change at all. We
>> > could
>> > > retain the Detach.custom("name", "age") or similar syntax for
>> Java/Groovy
>> > > and come up with similar sugar for the other languages.
>> > >
>> > > I think that still leaves the question of what the default should be
>> on
>> > the
>> > > server for 3.7.0 - perhaps we come back to that once there is
>> feedback on
>> > > this design revision.
>> > >
>> > >
>> > >
>> > > On Tue, Nov 22, 2022 at 7:06 AM Stephen Mallette <
>> spmalle...@gmail.com>
>> > > wrote:
>> > >
>> > > > > My proposal is to always have some detachment strategy in server
>> > > > configuration and add `DetachedStrategy` to allow users the
>> flexibility
>> > > to
>> > > > customize the result.
>> > > >
>> > > > I think default detachment on the server should remain "reference"
>> so
>> > if
>> > > > you can do that with your new DetachStrategy then that is preferred
>> > with
>> > > > the idea that HaltedTraverserStrategy is no longer useful and can be
>> > > > deprecated.
>> > > >
>> > > > > How to handle more than one DetachedStrategy?
>> > > >
>> > > > I agree that they shouldn't stack - DetachedStrategy should not be
>> used
>> > > > for some sort of security use case. A client-side one should replace
>> > the
>> > > > server-side one.
>> > > >
>> > > > >  Another question is whether or not to show `DetachElementStep` in
>> > the
>> > > > response for  profiling requests like `g.V().profile()`.
>> > > >
>> > > > This is the hardest question and goes back to the first because you
>> > will
>> > > > be using DetachStrategy by default on the server side and, as a
>> result,
>> > > > when the user does their g.V().profile() without using
>> DetachStrategy
>> > > they
>> > > > will see your DetachElementStep even though they didn't add one on
>> the
>> > > > client side. Right now we hide the detachment behind
>> > > > HaltedTraverserStrategy (for bytecode based requests) and in the
>> > default
>> > > > configuration of ReferenceElementStrategy in the startup script. The
>> > > latter
>> > > > makes the former a bit redundant actually.
>> > > >
>> > > > It seems that for script based requests we see ReferenceElementStep
>> now
>> > > in
>> > > > the profile():
>> > > >
>> > > > gremlin> :> g.V().profile()
>> > > > ==>Traversal Metrics
>> > > > Step
>> >  Count
>> > > >  Traversers       Time (ms)    % Dur
>> > > >
>> > > >
>> > >
>> >
>> =============================================================================================================
>> > > > TinkerGraphStep(vertex,[])
>> >  6
>> > > >         6           0.623    37.21
>> > > > ReferenceElementStep
>> >  6
>> > > >         6           1.052    62.79
>> > > >                                             >TOTAL
>> >  -
>> > > >         -           1.676        -
>> > > >
>> > > > I don't particularly like that but it might be helpful to see that
>> the
>> > > > detachment is explicitly happening by way of a step.This line of
>> > thinking
>> > > > has me wondering if we should remove the default detachment all
>> > together
>> > > > from Gremlin Server configuration:
>> > > >
>> > > > 1. Most folks should not be relying on ReferenceElementStrategy to
>> > > protect
>> > > > them from fat elements, because we've spent years pushing them
>> toward
>> > > > valueMap(), project(), etc. If they suddenly started getting
>> properties
>> > > on
>> > > > elements I sense that wouldn't affect many people negatively.
>> > > > 2. With ease of use and intuitive behavior in mind, new users will
>> > expect
>> > > > properties to be on their elements.
>> > > > 3. Removing default detachment would make it so that the
>> > > DetachElementStep
>> > > > only appears when the user explicitly sets it themselves. There
>> would
>> > be
>> > > no
>> > > > surprise that it is there in the profile().
>> > > > 4. The notion of default detachment would then fall to the
>> serializers
>> > > > which would just include all properties in a DetachedElement.
>> > > >
>> > > > If TinkerPop took this route then we'd have to be sure to warn
>> users of
>> > > > (1) possible increased serialization times  if they were relying on
>> > > > elements having no properties and (2) the fact that if they wrote
>> their
>> > > > code to explicitly use a ReferenceVertex they will need to add
>> > > "reference"
>> > > > detachment or generalize their code to the Element interfaces. Maybe
>> > some
>> > > > other stuff too?
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > On Fri, Nov 18, 2022 at 5:30 PM Valentyn Kahamlyk
>> > > > <valent...@bitquilltech.com.invalid> wrote:
>> > > >
>> > > >> Hi everyone,
>> > > >>
>> > > >> I'm working on properties on elements (
>> > > >> https://issues.apache.org/jira/browse/TINKERPOP-2824).
>> > > >> I found HaltedTraverserStrategy` in `TraverserIterator` that does
>> > about
>> > > >> the
>> > > >> same, converting Elements  to detached or references. Tests use
>> > > reference
>> > > >> Elements, and detached is default for production. GLV's possibly
>> can
>> > > >> override strategy to get different results.
>> > > >> My implementation of `DetachedStrategy` is to add the last step to
>> the
>> > > >> traversal which modifies the result,  so, technically g.V() becomes
>> > > >> g.V().[DetachStep].
>> > > >> Also default server configuration adds ReferenceElementStrategy
>> > > >>
>> > > >>
>> > >
>> >
>> https://github.com/apache/tinkerpop/blob/master/gremlin-server/scripts/empty-sample.groovy#L45
>> > > >> to each traversal.
>> > > >>
>> > > >> 1. So I end up with 3 ways to cut properties and I'm not sure how
>> to
>> > > >> handle
>> > > >> it. What options would you suggest?
>> > > >> My proposal is to always have some detachment strategy in server
>> > > >> configuration and add `DetachedStrategy` to allow users the
>> > flexibility
>> > > to
>> > > >> customize the result. `HaltedTraverserStrategy` can be marked
>> > deprecated
>> > > >> as
>> > > >> it handles only bytecode traversal, but not used for scripts.
>> > > >> Another side of the problem is the need to detach custom vertex
>> > > >> implementations like `TinkerVertex`.
>> > > >>
>> > > >> 2. How to handle more than one `DetachedStrategy`? For example one
>> can
>> > > be
>> > > >> set in Gremlin Server configuration and other in user traversal.
>> > > >> Options:
>> > > >> a) apply all. Each successive DetachStrategy becomes more
>> restrictive.
>> > > >> b) apply only last one. Use case: there are default server
>> > > configurations
>> > > >> to return only some properties and the user can override it for
>> each
>> > > >> request.
>> > > >> I think both options make sense, but I prefer the second.
>> > > >>
>> > > >> 3. Another question is whether or not to show `DetachElementStep`
>> in
>> > the
>> > > >> response for  profiling requests like `g.V().profile()`. Users may
>> be
>> > > >> confused at the appearance of a step that they did not add, but on
>> the
>> > > >> other hand it will be an explanation why there are no properties in
>> > the
>> > > >> response.
>> > > >> Is it better to hide `DetachElementStep` or vice versa?
>> > > >>
>> > > >> Thanks, Valentyn
>> > > >>
>> > > >
>> > >
>> >
>>
>

Reply via email to