[GitHub] tinkerpop issue #351: TINKERPOP-1274: GraphSON 2.0.

robertdale Thu, 07 Jul 2016 10:57:55 -0700

Github user robertdale commented on the issue:

    https://github.com/apache/tinkerpop/pull/351
  
    So I've caught up on the discussion and I'll offer some more food for 
thought since I haven't seen any other ideas. Embedding metadata is neither 
easy nor fun (not for me anyway). For any serious integration-type work, it's 
always best to have a well-defined schema up front.
    
    On types:
    > @spmallette
    > In fact we don't always know the types ahead of time (like Titan's 
GeoPoint), so using the java class name is pretty convenient
    
    Being convenient doesn't mean the format has to use Java types. By "not 
using Java types", we mean:
    - not using java package names
    - not using types specific to Java
    - using primitives and other common types that are concise and portable
    - should include domain-specific types, e.g. Vertex, Edge, etc.
    - may include other standards, e.g. GeoJSON
    
    Defining primitives, common types:
    - http://swagger.io/specification/#dataTypeFormat
    - http://bsonspec.org/spec.html
    - http://geojson.org/geojson-spec.html
    - http://ubjson.org/type-reference/
    
    So if your Java implementation conveniently shares the same name as the 
type, then that's wonderful. But if you are to be truly language-agnostic, then 
at some point the types must be known ahead of time in order to be consumed. 
For instance, how can my X parser know how to handle a Titan GeoPoint if it's 
all dynamic? It can't. It must be able to handle this type ahead of time. And 
I can't imagine anyone would want to read a GraphSON file by hand to discover 
all the types that must be handled. Maybe I'm getting out of scope, as this goes 
beyond being language-agnostic and into being database-agnostic. @newkek, please 
correct me if I'm wrong, but it doesn't look like the code does any dynamic 
serialization. It looks like all types are registered anyway. So I'll argue 
again: if you know your types ahead of time, then you may as well have a schema.
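    
    To make the "types must be known ahead of time" point concrete, here is a 
minimal Python sketch of a consumer with a fixed registry. Everything in it is 
an assumption for illustration: the `DECODERS` table, the `decode_typed` helper, 
and the `lat`/`lon` shape of a GeoPoint are hypothetical and not taken from 
TinkerPop or Titan.

```python
# Hypothetical registry mapping a type name to a decoder function.
# The names and the GeoPoint layout are illustrative assumptions only.
DECODERS = {
    "int32": int,
    "int64": int,
    "GeoPoint": lambda raw: (float(raw["lat"]), float(raw["lon"])),
}


def decode_typed(type_name, raw):
    """Decode raw with the decoder registered for type_name."""
    try:
        decoder = DECODERS[type_name]
    except KeyError:
        # A type that was never registered cannot be handled at read time.
        raise ValueError(f"no decoder registered for type {type_name!r}") from None
    return decoder(raw)


point = decode_typed("GeoPoint", {"lat": 37.97, "lon": 23.72})
```

    A type name missing from the registry fails immediately, which is the 
situation a non-Java parser is in when it meets a class name it has never seen.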
    
    But let's continue with embedded metadata...
    
    In JSON, the only unambiguous types are:
    - array (unless you want to distinguish it from a list, which may be perfectly valid)
    - string
    - boolean (true, false)
    - null
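    
    The ambiguity of bare numbers is easy to see from a consumer's side. The 
sketch below uses only Python's standard `json` module; the variable names are 
made up for illustration.

```python
import json

# Both literals decode to the same Python type; nothing in the JSON text says
# whether the writer meant a 32-bit or a 64-bit integer.
skill = json.loads("5")
vertex_id = json.loads("12345")
same_type = type(skill) is type(vertex_id)

# Integer versus floating point depends only on how the writer spelled it.
as_int = json.loads("1")
as_float = json.loads("1.0")
```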
    
    To avoid confusion, all other types, including numbers, should be 
explicitly typed. Thus they are objects (and not lists of things). Putting the 
metadata at the same level as the object's own fields alleviates these 
concerns: @newkek "a List in which the first element is a Map in which the 
first entry's key" and @PommeVerte "can be a pain in systems that do not 
necessarily order lists". Metadata can be differentiated from member fields by 
a prefix (e.g. '@'). Primitive types (or objects) having only a single value 
would have a "value" key which maps to the actual value.
    ```json
    [
       {
          "@type":"Vertex",
          "id":{
             "@type":"int64",
             "value":12345
          },
          "label":"person",
          "properties":{
             "@type":"VertexProperty",
             "skill":{
                "id":{
                   "@type":"int64",
                   "value":8723
                },
                "@type":"int32",
                "value":5
             },
             "secrets":[
                {
                   "id":{
                      "@type":"int64",
                      "value":8723
                   },
                   "@type":"uuid",
                   "value":"1de7bedf-f9ba-4e94-bde9-f1be28bef239"
                },
                {
                   "id":{
                      "@type":"int64",
                      "value":8724
                   },
                   "@type":"uuid",
                   "value":"34523adf-f9ba-4e94-bde9-f2345bcd3f45"
                }
             ]
          },
          "inE":[
             {
                "@type":"Edge",
                "label":"knows",
                "id":{
                   "@type":"int64",
                   "value":987234
                },
                "properties":{ },
                "outV":[ { } ]
             }
          ]
       }
    ]
    ```
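    
    As a rough illustration of how a consumer might read this layout, here is 
a small Python sketch. It is an assumption-laden example rather than a proposed 
reader: the `SCALARS` table and the `decode` function are hypothetical, and it 
only unwraps the simplest case, an object whose keys are exactly "@type" and 
"value". Objects that carry other members alongside the metadata are left as 
dictionaries.

```python
import json
import uuid

# Assumed scalar decoders keyed by the "@type" name (illustrative only).
SCALARS = {"int32": int, "int64": int, "uuid": uuid.UUID}


def decode(node):
    """Recursively unwrap {"@type": T, "value": V} objects for known scalars."""
    if isinstance(node, list):
        return [decode(item) for item in node]
    if isinstance(node, dict):
        if set(node) == {"@type", "value"} and node["@type"] in SCALARS:
            return SCALARS[node["@type"]](node["value"])
        # Anything else keeps its members, metadata included.
        return {key: decode(child) for key, child in node.items()}
    return node


doc = json.loads(
    '{"@type":"Vertex","id":{"@type":"int64","value":12345},"label":"person"}'
)
vertex = decode(doc)
```

    Because the metadata is a sibling key and not a positional list element, the 
decoder never depends on key or element order.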
    I wouldn't concern myself with the additional payload size for metadata. I 
wouldn't sacrifice conciseness for size. One could always compress the file if 
size is a concern. Also, the reader/writer could be easily enhanced to support 
zip. I would take the pragmatic approach and address it when it's no longer 
working for people.
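    
    A quick way to check the compression claim is to compress some typed values 
and compare sizes. This Python sketch uses gzip from the standard library as a 
stand-in for whatever compression a reader/writer would support; the payload of 
1,000 typed ids is made up for the test.

```python
import gzip
import json

# 1,000 typed ids in the "@type"/"value" shape; the count is arbitrary.
typed_ids = [{"@type": "int64", "value": n} for n in range(1000)]
raw = json.dumps(typed_ids).encode("utf-8")
packed = gzip.compress(raw)

# The round trip is lossless, so the metadata survives compression intact.
restored = json.loads(gzip.decompress(packed).decode("utf-8"))
print(len(raw), len(packed))
```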
    
    Anyway, maybe this is all GraphSON 3.0 stuff. HTH.



