Hi Stephen,

Responses inline.

On Wed, Jun 9, 2021 at 4:04 AM Stephen Mallette <spmalle...@gmail.com>
wrote:

> Thanks for the update Josh
>
> [...]
> >    reasonably clear how to make that transition. In the beginning, you
> > either
> >    have a schema or you don't.
> >
>
> Could you clarify who is making that choice? Is it the provider saying
> their graph supports schema or not? or did you mean the user is making that
> choice somehow and TinkerPop would thus enforce the schema?
>


At first, I don't think we need native schema support in graph providers.
There will definitely be advantages to such support (e.g. better indexing,
better query planning) where available, but there is a lot you can do with
a schema at the application level, like validation, object-graph mapping
(like Frames, but with no code other than the schema), and Gremlin
traversal optimizations. tl;dr yes, it's the user who determines the
schema, although every provider will come with a set of constraints
(explicit or implicit) on what kinds of schemas can be supported. E.g. most
providers do not support record-valued properties, so a schema with a
record type for a property would be an illegal schema w.r.t. that provider
(or at least, you'd need a mapping to turn the schema into one which is
supported, e.g. by encoding records as strings).
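
To make the provider-constraint point concrete, here is a minimal Python sketch. All names in it (PropertyType, Provider, illegal_properties, and so on) are hypothetical and are not part of TinkerPop or Dragon. It assumes a schema is just a list of property types, each tagged as "primitive" or "record", and that a provider declares which kinds it can store. It is an illustration, not a proposed API.

```python
import json
from dataclasses import dataclass

# Hypothetical names throughout: none of this is TinkerPop or Dragon API.

@dataclass(frozen=True)
class PropertyType:
    key: str
    kind: str  # assumed to be either "primitive" or "record"

@dataclass(frozen=True)
class Provider:
    name: str
    supported_kinds: frozenset  # kinds of property value the provider can store

def illegal_properties(schema, provider):
    """Return the keys of properties whose kind the provider cannot store."""
    return [p.key for p in schema if p.kind not in provider.supported_kinds]

def encode_records_as_strings(schema):
    """Map a schema to one using only primitives: record types become strings."""
    return [PropertyType(p.key, "primitive") if p.kind == "record" else p
            for p in schema]

def encode_value(prop, value):
    """Serialize a record value to a JSON string; pass primitives through."""
    return json.dumps(value, sort_keys=True) if prop.kind == "record" else value

schema = [PropertyType("name", "primitive"), PropertyType("address", "record")]
provider = Provider("example-provider", frozenset({"primitive"}))

# The record-typed property makes the schema illegal for this provider...
print(illegal_properties(schema, provider))

# ...unless the schema is first mapped to a supported one.
mapped = encode_records_as_strings(schema)
print(illegal_properties(mapped, provider))
```

JSON is used here only as one possible string encoding for records; any reversible encoding would serve the same purpose.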


> >    - *Atomic types*. As part of the basic type system for property graphs,
> [...]
> >    However, all of this is to be discussed in detail on the dev list.
> >
>
> I'm pretty interested in the direction this goes as numbers have always
> been troublesome to our various language variants and it often doesn't make
> Gremlin look smart to those users of language off the JVM.
>


Below is the schema for Dragon's primitive types. Booleans and binary
strings have no parameters, while integer and floating-point types do have
some parameters. The string type happens to have a maximum-length parameter
(other commonly asked-for features being minimum length, regex, etc.). This
is not necessarily the schema we will use for TP4, but it might be close.
Algebraic Property Graphs does not prescribe any particular set of
primitive types; Dragon's types represent a pragmatic choice which has been
appropriate for applications in a particular company. The question to be
answered for TinkerPop is where we should draw the line between features
which are built into the framework and extensions/ornamentation which are
best left to individual graph providers. The PGSWG approach, at the moment,
is more like APG in that there are no prescribed type parameters, and we're
still deciding whether there should be built-in atomic types at all
(leaning toward "yes").

It might be worthwhile if you can summarize the problems we have had with
numeric types, here or in a separate thread, and then we can talk about how
we might be able to address them with schemas and a data model
specification.

Josh


- name: PrimitiveType
  description: "A primitive data type, such as a string or boolean type"
  type:
    union:
      - name: binary
        description: "The type of a binary value, consisting of a sequence of bytes"
        type: BinaryType

      - name: boolean
        description: "The type of a boolean value, consisting of true or false"
        type: BooleanType

      - name: float
        description: "The type of a floating-point value"
        type: FloatType

      - name: integer
        description: "The type of an integer value"
        type: IntegerType

      - name: string
        description: "The type of a string value"
        type: StringType

- name: BinaryType
  description: "The type of a binary value, consisting of a sequence of bytes"

- name: BooleanType
  description: "The type of a boolean value (either true or false)"

- name: FloatType
  description: "A floating-point data type with a given bit precision"
  type:
    record:
      - name: precision
        description: "The floating-point precision of the type, in bits. Common precision values are 32 and 64."
        type: NumericPrecision
  default:
    precision:
      bits: 32

- name: IntegerType
  description: "An integer data type with a given bit precision, signedness, and optional width encoding"
  type:
    record:
      - name: precision
        description: "The integer precision of the type, in bits. Common precision values are 32 and 64."
        type: NumericPrecision

      - name: signed
        description: "Whether the type represents signed or unsigned integers"
        type: boolean

      - name: fixedWidth
        description: "Whether a fixed-width integer or varint encoding is preferred"
        type:
          optional: boolean
  default:
    precision:
      bits: 32
    signed: true

- name: StringType
  description: "A string data type with an optional maximum length. The encoding scheme is unspecified."
  type:
    record:
      - name: maximumLength
        description: >
          If provided, an upper bound (inclusive) on the length of the string.
          If not provided, then there is no such constraint.
        type:
          optional: integer


Josh