Re: [DISCUSS] Null Handling 3.5.x

Jorge Bay Gondra Mon, 03 Jun 2019 03:23:59 -0700

I think having a null literal makes sense. It plays well with existing SQL
providers, which have their own representations for null:
https://docs.microsoft.com/en-us/dotnet/api/system.dbnull.value


I would propose another literal: UNSET. Some db providers distinguish
between NULL, which causes a write to occur (overwriting the existing
value), and UNSET, which is ignored at write time.

See UNSET ticket in Cassandra:
https://issues.apache.org/jira/browse/CASSANDRA-7304
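
To make the NULL vs. UNSET distinction concrete, here is a minimal sketch in
plain Java. Everything in it is an assumption for illustration: the class
name, the `Marker` enum and the `write` helper are hypothetical and are not
part of TinkerPop or of any driver API, and a plain map stands in for the
stored properties.

```java
import java.util.HashMap;
import java.util.Map;

class UnsetSketch {
    // Hypothetical markers for illustration only; not a TinkerPop API.
    enum Marker { NULL, UNSET }

    // Applies a single write against a map standing in for stored properties.
    static void write(Map<String, Object> props, String key, Object value) {
        if (value == Marker.UNSET) {
            return;                // ignored at write time, existing value is kept
        }
        if (value == Marker.NULL) {
            props.put(key, null);  // a write occurs, existing value is overwritten
            return;
        }
        props.put(key, value);     // ordinary value
    }

    public static void main(String[] args) {
        Map<String, Object> props = new HashMap<>();
        props.put("name", "marko");
        props.put("age", 29);

        write(props, "name", Marker.UNSET);
        write(props, "age", Marker.NULL);

        System.out.println(props.get("name")); // marko
        System.out.println(props.get("age"));  // null
    }
}
```

Whether a NULL write should store a null or remove the property altogether
would be up to the provider; the sketch only shows that NULL changes the
stored state while UNSET does not.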

On Mon, Jun 3, 2019 at 10:49 AM Dmitry Novikov <dmitry.novi...@neueda.com>
wrote:

> Hello Stephen,
>
> Sounds like a great idea!
>
> One more use case is returning a `null` object when a property does not exist:
>
> g.V().limit(1).coalesce(values('notSureIfExists'),
> constant(Null.instance()))
>
> This would be very useful when working with steps that may fail on a
> nonexistent value, for example the `project` step:
>
> gremlin> g.V().limit(1).project('a',
> 'b').by(values('name')).by(values('notSureIfExists'))
> The provided traverser does not map to a value:
> v[1]->[PropertiesStep([notSureIfExists],value)]
>
> This could be improved:
>
> gremlin> g.V().limit(1).project('a',
> 'b').by(values('name')).by(coalesce(values('notSureIfExists'),
> constant(Null.instance())))
> ==>[a:marko,b:null]
>
> `null` is better than any custom constant here, because it clearly
> represents a missing value.
>
> To avoid the need for `null` guards, a rule could be defined: "steps that
> take `null` as input will produce `null` as output":
>
> gremlin> g.inject(Null.instance()).id()
> ==> null
> gremlin> g.inject(Null.instance()).math("_ + 1")
> ==> null
> gremlin> g.inject(Null.instance()).properties().as('a').key()
> ==> null
>
> I see two approaches to handling `null` in aggregation steps:
>
> Steps like `max`, `count`... could either fail on a `null` object, requiring
> the use of a predicate:
>
> gremlin> g.inject(1).inject(Null.instance()).inject(3).max()
> Max step does not work with `null` values
> gremlin> g.inject(1).inject(Null.instance()).inject(3).is(neq(Null.instance())).max()
> ==>3
>
> Alternatively, they could exclude `null` values from the calculation:
>
> gremlin> g.inject(1).inject(Null.instance()).inject(3).max()
> ==>3
> gremlin> g.inject(1).inject(Null.instance()).inject(3).count()
> ==>2
>
> On 2019/05/31 17:01:34, Stephen Mallette <spmalle...@gmail.com> wrote:
> > I just spent some time fixing:
> >
> > https://issues.apache.org/jira/browse/TINKERPOP-2099
> >
> > which dealt with inconsistencies in null handling for property() step when
> > there is a null value. That's all nice now, but null handling still isn't
> > so good overall. It's generally inconsistent in how it behaves in a variety
> > of uses in Gremlin - here's a couple examples:
> >
> > gremlin> g.inject(null)
> > java.lang.NullPointerException
> > Type ':help' or ':h' for help.
> > Display stack trace? [yN]n
> > gremlin> g.V().constant(null)
> > gremlin>
> >
> > I've also heard the concern on several occasions that mutation traversals
> > are often difficult to write when you want to remove a property and update
> > others at the same time, because it forces you into conditional logic where
> > you have to somehow work in a side effect of properties("name").drop() as
> > opposed to just inlining property('name',null).
> >
> > I think we should be a bit more respectful of the concept of null with
> > Gremlin and while we probably shouldn't allow a literal null into the
> > traversal stream, it seems like we could provide for our own Null class
> > that could be used in its place where users/providers needed it, so that
> > we could do:
> >
> > gremlin> g.inject(Null.instance())
> > ==> null
> > gremlin> g.V(1).property("x", 1).property("y",
> > Null.instance()).property("z", 2)
> > ==> v[1]
> >
> > Perhaps we'd add a new Graph.Feature to allow providers to specify how they
> > handle such things. Taking this approach creates a position where we aren't
> > really changing core engine behavior. Instead, we're just adding a marker
> > that can be used by providers/Gremlin to identify the notion of null and
> > then updating serialization/GLVs to support it.
> >
> > Haven't thought much past that point. Any other implications of taking this
> > direction?
> >
>
