In Python, converting from JSON to YAML and back will be easy. I am not
100% sure about the Java part.
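
To illustrate the Python side, here is a minimal sketch of the JSON to YAML
direction using only the standard library. The names `scalar`,
`to_yaml_lines` and `json_to_yaml` are made up for this example, and it only
covers what JSON itself can express; in practice we would more likely rely
on an existing YAML library than on a hand-written emitter like this one.

```python
import json
import re

# Strings matching this pattern (and not in _RESERVED) are written unquoted.
_PLAIN = re.compile(r"[A-Za-z_][A-Za-z0-9_./-]*")
# Words that some YAML parsers read as booleans or null when left unquoted.
_RESERVED = {"true", "false", "null", "yes", "no", "on", "off", "y", "n"}


def scalar(value):
    """Render a JSON scalar (or an empty container) on a single line."""
    if (
        isinstance(value, str)
        and _PLAIN.fullmatch(value)
        and value.lower() not in _RESERVED
    ):
        return value
    # JSON literals (quoted strings, numbers, true/false/null, {} and [])
    # are also valid YAML, so json.dumps is a safe fallback.
    return json.dumps(value, ensure_ascii=False)


def to_yaml_lines(node, indent=0):
    """Return the block-style YAML lines for a parsed JSON value."""
    pad = "  " * indent
    lines = []
    if isinstance(node, dict) and node:
        for key, value in node.items():
            if isinstance(value, (dict, list)) and value:
                lines.append(f"{pad}{scalar(key)}:")
                lines.extend(to_yaml_lines(value, indent + 1))
            else:
                lines.append(f"{pad}{scalar(key)}: {scalar(value)}")
        return lines
    if isinstance(node, list) and node:
        for item in node:
            if isinstance(item, (dict, list)) and item:
                # A nested container goes on the lines below a bare dash.
                lines.append(f"{pad}-")
                lines.extend(to_yaml_lines(item, indent + 1))
            else:
                lines.append(f"{pad}- {scalar(item)}")
        return lines
    return [pad + scalar(node)]


def json_to_yaml(text):
    """Convert a JSON document (as a string) to a YAML document."""
    return "\n".join(to_yaml_lines(json.loads(text))) + "\n"


sample = """
{
  "name": "ldbc_sample",
  "vertices": ["person.vertex.yml"],
  "edges": ["person_knows_person.edge.yml"],
  "version": "gar/v1",
  "extra_metadata": {}
}
"""
print(json_to_yaml(sample))
```

The sample is the ldbc_sample graph info quoted further down in this thread.
The reverse direction (YAML to JSON) needs a real YAML parser, which the
Python standard library does not provide.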

On Sat, 2024-05-11 at 10:06 +0800, weibin wrote:
> Thanks lx, I agree that YAML is still part of our current workflow.
> Maybe we should not be so aggressive about changing YAML to JSON
> immediately. If we decide to use proto, we could add json2yaml in the
> final stage as a temporary solution until the CLI is available. What do
> you guys think?
> 
> BTW, do you have any comments on the discussion about using protobuf
> to define the GraphAr format, Lixue?
> 
> Best
> weibin
> 
> 李雪(有理) <[email protected]> wrote on Sat, May 11, 2024 at 09:41:
> 
> > Thank you for your thoughtful feedback and insights. Regarding the
> > concerns:
> > 1. Implementing a CLI is a good idea. However, manually viewing or
> > reviewing configuration files is still necessary in our current
> > workflow.
> > 2. YAML's syntax allows braces, quotes, and commas to be omitted,
> > which makes the entire block easier to read and write, especially
> > for multi-level nested structures.
> > ------------------------------------------------------------------
> > From: Weibin Zeng <[email protected]>
> > Sent: Friday, May 10, 2024, 16:05
> > To: dev <[email protected]>
> > Subject: Re: Re: [DISCUSS][format] Using an Interface Definition
> > Language to define GraphAr format
> > Hi, Lixue, thanks for the reply.
> > For
> > > 1. YAML's format is more human-readable and easier to edit, which
> > > is a significant advantage in scenarios where we frequently need
> > > to view or modify configuration files. For example, to define a
> > > subgraph from an existing graph.
> > I do not agree that we should let users edit the YAML/JSON files
> > directly. Manual modification of schema files is unreliable and
> > unpredictable, and would probably introduce errors whose cause users
> > cannot even identify. That is why we are going to provide a CLI to
> > restrict the operations on graph data, including projecting a
> > subgraph.
> > As for human readability, here is ldbc-sample.graph.yml in YAML and
> > in JSON:
> > ```
> > name: ldbc_sample
> > vertices:
> >  - person.vertex.yml
> > edges:
> >  - person_knows_person.edge.yml
> > version: gar/v1
> > extra_metadata: {}
> > ```
> > ```
> > {
> >   "name": "ldbc_sample",
> >   "vertices": [
> >     "person.vertex.yml"
> >   ],
> >   "edges": [
> >     "person_knows_person.edge.yml"
> >   ],
> >   "version": "gar/v1",
> >   "extra_metadata": {}
> > }
> > ```
> > I think JSON is readable enough, though not as configurable as YAML.
> > But since the files are not supposed to be modified directly, I
> > think JSON is OK.
> > > 2. YAML often provides a more concise representation of the same
> > > data.
> > Can you give an example showing why YAML provides a more concise
> > representation of the data?
> > > 3. YAML natively supports comments and extensions, making it more
> > > flexible.
> > I agree that YAML supports more features and is more flexible. But
> > it is so flexible that it cannot provide much template validation
> > support. For the GraphAr format, we should consider whether the
> > format is sufficient to express the schema and configuration of
> > GraphAr. On this point, JSON is good enough for me.
> > On 2024/05/10 01:47:47 "李雪(有理)" wrote:
> > > Thank you for the information and links provided. While I
> > > understand the application of JSON in GraphScope Flex and its
> > > advantages when integrated with GraphAr, considering our specific
> > > use case, I still think that YAML might be a more suitable choice
> > > for us. Here are the primary reasons:
> > > 1. YAML's format is more human-readable and easier to edit, which
> > > is a significant advantage in scenarios where we frequently need
> > > to view or modify configuration files. For example, to define a
> > > subgraph from an existing graph.
> > > 2. YAML often provides a more concise representation of the same
> > > data.
> > > 3. YAML natively supports comments and extensions, making it more
> > > flexible.
> > > Therefore, we initially favored YAML over JSON. I hope we can
> > > further discuss this topic to find the solution that best fits our
> > > project requirements.
> > > ------------------------------------------------------------------
> > > From: Weibin Zeng <[email protected]>
> > > Sent: Thursday, May 9, 2024, 18:52
> > > To: dev <[email protected]>
> > > Subject: Re: [DISCUSS][format] Using an Interface Definition
> > > Language to define GraphAr format
> > > Sorry, I missed the link:
> > > > GraphScope Flex now uses JSON as the communication format for
> > > > graph schema and checks it with the REST API[1]
> > > [1]
> > > https://github.com/alibaba/GraphScope/tree/main/python/graphscope/flex/rest/models
> > > 
> > > On 2024/05/09 10:49:24 Weibin Zeng wrote:
> > > > JSON is OK for me. GraphScope Flex now uses JSON as the
> > > > communication format for graph schema and checks it with the
> > > > REST API[1], so I think switching to JSON is good for GraphAr,
> > > > since GraphAr has been integrated into GraphScope.
> > > > 
> > > > On 2024/05/09 08:50:45 Sem wrote:
> > > > > I did a little research on this, and it seems to me that
> > > > > classes generated from protobuf are not serializable into
> > > > > other formats like YAML/JSON.
> > > > > 
> > > > > There is a third-party project that provides such a utility,
> > > > > but it does not look well maintained:
> > > > > https://github.com/krzko/proto2yaml
> > > > > 
> > > > > I see that there is a utility provided by Google. It allows
> > > > > conversion to and from JSON in Java/Python (most probably C++
> > > > > too):
> > > > > 1.
> > > > > https://cloud.google.com/java/docs/reference/protobuf/latest/com.google.protobuf.util.JsonFormat
> > > > > 2.
> > > > > https://googleapis.dev/python/protobuf/latest/google/protobuf/json_format.html
> > > > > 
> > > > > But not to/from YAML.
> > > > > 
> > > > > For me this question is important, because we need not only to
> > > > > generate the code but also to settle the question of
> > > > > serialization/deserialization.
> > > > > 
> > > > > 
> > > > > What do you think about using proto (binary messages) as the
> > > > > underlying communication format in the code and JSON as the
> > > > > human-readable representation on disk? It looks like only by
> > > > > switching from YAML to JSON do we get all the benefits of
> > > > > using protobuf.
> > > > > 
> > > > > With JSON, I see it like this: we call `fromJson` once (via
> > > > > the Google protobuf util) to read the data and create proto
> > > > > classes from the JSON info files, and then work with them. At
> > > > > the end we call `toJson` to serialize the messages back.
> > > > > 
> > > > > To achieve the same with YAML, we would need to use a
> > > > > third-party library that is not well maintained, or support
> > > > > our own serialization/deserialization of proto messages
> > > > > (classes) to/from YAML for three languages (Python, Java,
> > > > > C++).
> > > > > 
> > > > > On Thu, 2024-05-09 at 15:27 +0800, weibin.zen wrote:
> > > > > > Hi, everyone,
> > > > > > 
> > > > > > I would like to propose that we consider using an Interface
> > > > > > Definition Language (IDL) like Protobuf[1] for the GraphAr
> > > > > > format definition.
> > > > > > Currently we use YAML to describe the schema and metadata of
> > > > > > a graph, and store the data in common formats like
> > > > > > CSV/Parquet. YAML is human-readable, but it cannot provide
> > > > > > much validation or versioning support, and each programming
> > > > > > language has to parse the files and validate them itself.
> > > > > > 
> > > > > > Using an IDL to describe the format would bring benefits
> > > > > > like:
> > > > > > 
> > > > > > • It provides a clear, standardized, language-agnostic
> > > > > > format definition that can be version-controlled and shared
> > > > > > by libraries, making the format consistent between
> > > > > > implementations.
> > > > > > • The validation done by protobuf can be used directly to
> > > > > > validate the schema, so the libraries do not need to
> > > > > > implement validation themselves.
> > > > > > • Cross-language support: libraries can use the generated
> > > > > > structures as graph info directly.
> > > > > > 
> > > > > > 
> > > > > > This proposal does not replace YAML with Protobuf. We would
> > > > > > still use YAML as the final schema and metadata file for
> > > > > > user readability, but use the IDL to maintain a robust and
> > > > > > precise schema definition. It is a kind of hybrid strategy
> > > > > > that accommodates both human and machine needs.
> > > > > > 
> > > > > > But using an IDL does bring some disadvantages. Sem has
> > > > > > listed some in a comment on the PR[2]:
> > > > > > 
> > > > > > • The generated code is huge and unreadable.
> > > > > > • The generated code may need to be stored in git.
> > > > > > • Debugging is very hard.
> > > > > > 
> > > > > > 
> > > > > > Since this would be a huge change, I want to hear your
> > > > > > thoughts about the proposal.
> > > > > > 
> > > > > > 
> > > > > > [1] https://protobuf.dev/
> > > > > > [2] https://github.com/apache/incubator-graphar/pull/475
> > > > > > 
> > > > > > Best
> > > > > > weibin.zen
> > > > > 
> > > > > 
> > > > > -------------------------------------------------------------
> > > > > --------
> > > > > To unsubscribe, e-mail: [email protected]
> > > > > For additional commands, e-mail: [email protected]
> > > > > 
> > > > > 
> > > > 



