Hi Ken,

Thanks for bringing this up. I believe this topic warrants some further discussion. 
My understanding of the intent of the current system is that it aims to provide 
a consistent and predictable set of rules for comparisons between any 
datatypes. Prior to 3.6, comparisons between different types in Gremlin 
generally produced undefined behaviour (in practice, this usually meant an 
exception). The current system resolved much of this issue, although it has 
introduced certain semantic consistency issues (see 
https://issues.apache.org/jira/browse/TINKERPOP-2940). Further, while the docs 
(https://tinkerpop.apache.org/docs/3.7.0/dev/provider/#_ternary_boolean_logics) 
are quite clear regarding the propagation/reduction behaviour in many cases, 
the picture becomes muddier as you probe the edge cases.

Consider the following example. The docs quite clearly define the expected 
behaviour of the first traversal, but the expected behaviour is not clear 
outside of basic combinations of AND, OR, and NOT:

gremlin> g.inject(1).not(is(gt("one")))
// Produces no output
gremlin> g.inject(1).not(union(is(gt("one")), is(eq("zero"))))
==>1
// Error is reduced to false prior to the Union Step, and thus is not
// propagated into the Not Step.
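
To make the difference between the two traversals concrete, here is a minimal Python sketch of how I read the ternary model. It is not TinkerPop code: the `Ternary` enum and every function name are hypothetical, and it assumes that an ordering comparison across types yields ERROR, that equality across types yields FALSE, that NOT propagates ERROR, and that each filter reduces ERROR to false at its own boundary.

```python
from enum import Enum


class Ternary(Enum):
    TRUE = "true"
    FALSE = "false"
    ERROR = "error"


def ternary_not(x):
    # NOT propagates ERROR unchanged; otherwise it flips TRUE and FALSE.
    if x is Ternary.ERROR:
        return Ternary.ERROR
    return Ternary.FALSE if x is Ternary.TRUE else Ternary.TRUE


def reduce_to_bool(x):
    # Reduction at a filter boundary: only TRUE passes, ERROR becomes false.
    return x is Ternary.TRUE


def gt(a, b):
    # Assumption: ordering comparisons across different types are an ERROR.
    if type(a) is not type(b):
        return Ternary.ERROR
    return Ternary.TRUE if a > b else Ternary.FALSE


def eq(a, b):
    # Assumption: equality across different types is simply FALSE.
    return Ternary.TRUE if type(a) is type(b) and a == b else Ternary.FALSE


def passes_not_is_gt(value, other):
    # Models not(is(gt(other))): ERROR survives the NOT and is reduced last.
    return reduce_to_bool(ternary_not(gt(value, other)))


def passes_not_union(value, branches):
    # Models not(union(...)): each branch is a filter that reduces its own
    # result, so the NOT only sees whether the union emitted anything.
    emitted = [value for branch in branches if reduce_to_bool(branch(value))]
    return len(emitted) == 0


first = passes_not_is_gt(1, "one")
second = passes_not_union(1, [lambda v: gt(v, "one"), lambda v: eq(v, "zero")])
print(first, second)
```

Under these assumptions the first traversal filters out the traverser while the second lets it through, which matches the console output above.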

This is a good example of how we are currently in a bit of a weird place, where 
some of the language semantics are formally defined in documentation, while the 
rest are defined by the implementation. It is currently impossible to determine 
whether the behaviour in the above example is expected or a bug. I believe it 
is important that we resolve this, either by expanding our formally defined 
semantics or by changing the implementation (when a breaking change is 
permissible).

As for the short-term question posed by ANY and ALL, my only concern with your 
suggestion is that it would be subject to the following inconsistency, 
although, as shown above, there is existing precedent for this sort of thing:

gremlin> g.inject(1).not(is(lt("one")))
// Produces no output
gremlin> g.inject([1]).not(any(is(lt("one"))))
==>[1]

In my opinion, the most neutral direction would be for ANY to behave the same 
as a chain of ORs and for ALL to act as a chain of ANDs. However, it makes sense 
for this short-term decision to align with our long-term direction regarding 
comparability semantics. I wouldn’t be opposed to your proposed implementation 
if the long-term plan is to move all steps towards this immediate reduction 
behaviour.
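
As a rough illustration of the two options, here is a small Python sketch rather than TinkerPop code. All names are hypothetical. It assumes OR and AND follow the ternary truth tables (TRUE dominates OR, FALSE dominates AND, otherwise ERROR wins), and that an ordering comparison across types yields ERROR.

```python
from enum import Enum
from functools import reduce


class Ternary(Enum):
    TRUE = "true"
    FALSE = "false"
    ERROR = "error"


def ternary_or(a, b):
    # TRUE dominates; otherwise any ERROR makes the result ERROR.
    if Ternary.TRUE in (a, b):
        return Ternary.TRUE
    if Ternary.ERROR in (a, b):
        return Ternary.ERROR
    return Ternary.FALSE


def ternary_and(a, b):
    # FALSE dominates; otherwise any ERROR makes the result ERROR.
    if Ternary.FALSE in (a, b):
        return Ternary.FALSE
    if Ternary.ERROR in (a, b):
        return Ternary.ERROR
    return Ternary.TRUE


def ternary_not(x):
    if x is Ternary.ERROR:
        return Ternary.ERROR
    return Ternary.FALSE if x is Ternary.TRUE else Ternary.TRUE


def lt(a, b):
    # Assumption: ordering comparisons across different types are an ERROR.
    if type(a) is not type(b):
        return Ternary.ERROR
    return Ternary.TRUE if a < b else Ternary.FALSE


def any_chained(items, pred):
    # ANY as a chain of ORs: ERROR can propagate out of the step.
    return reduce(ternary_or, (pred(x) for x in items), Ternary.FALSE)


def all_chained(items, pred):
    # ALL as a chain of ANDs: ERROR can propagate out of the step.
    return reduce(ternary_and, (pred(x) for x in items), Ternary.TRUE)


def any_reduced(items, pred):
    # ANY with immediate reduction: ERROR is treated as false inside the step.
    hit = any(pred(x) is Ternary.TRUE for x in items)
    return Ternary.TRUE if hit else Ternary.FALSE


def passes_not(result):
    # Models an enclosing not(...) filter, with reduction at its boundary.
    return ternary_not(result) is Ternary.TRUE


def pred(x):
    return lt(x, "one")


chained = passes_not(any_chained([1], pred))
reduced = passes_not(any_reduced([1], pred))
print(chained, reduced)
```

With the chained version, `not(any(...))` filters out `[1]`, consistent with `not(is(lt("one")))`. With immediate reduction, `[1]` passes, which is the inconsistency shown above.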

Thanks,

Cole Greer


From: Ken Hu <k...@bitquilltech.com.INVALID>
Date: Monday, September 11, 2023 at 4:16 PM
To: dev@tinkerpop.apache.org <dev@tinkerpop.apache.org>
Subject: [DISCUSS] Ternary Boolean Handling in New Steps

Hi All,

Starting in version 3.6, the ternary boolean system was introduced to
handle comparison/equality tests within Gremlin. Recently, I've been
implementing some list functions from Proposal 3 which make heavy use of
the GremlinValueComparator to determine if values satisfy a specific
condition. However, I'm finding it a bit tricky to understand how I should
handle the GremlinTypeErrorException. For any() and all(), it seems like it
would make sense to immediately reduce any ERROR state to false as it's a
filter step. In the case of all(), if a GremlinTypeErrorException is
caught, it would mean there was a comparison error so the traverser should
be removed from the stream. However, doing this seemingly clashes with the
original intention of ternary boolean which is to allow a provider-specific
response on how to handle an ERROR state.

My current thoughts are that we should rework the ternary boolean system in
the future to make it easier to incorporate it into new steps. One of the
trickiest parts is that it uses unchecked exceptions as a means to
implement the ERROR state which can get easily missed or accidentally
leaked to the user (which has happened before). For now, I'm planning to go
ahead and immediately reduce ERROR states as I think that is what makes the
most sense for list functions.

Does anyone have any thoughts about this?

Thanks,
Ken
