I did some research, and it seems that Jackson[1] can handle both YAML-to-JSON and JSON-to-YAML conversion in Java.
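For comparison with the Python side mentioned in the quoted reply, here is a minimal json2yaml sketch that uses only the Python standard library. It is not Jackson and not a full YAML emitter: `to_yaml` is a hypothetical helper written for illustration, and it only handles nested mappings, lists, and scalars, which is enough for a file like ldbc-sample.graph.yml. A real implementation would more likely rely on a maintained YAML library.

```python
import json
import re

# Strings matching this pattern are emitted unquoted; everything else is quoted.
_PLAIN = re.compile(r"^[A-Za-z_][A-Za-z0-9_./-]*$")
# Words that many YAML parsers read as booleans or null when left unquoted.
_RESERVED = {"true", "false", "null", "yes", "no", "on", "off"}


def _scalar(value):
    """Render a scalar (or an empty container) as inline YAML."""
    if isinstance(value, str) and _PLAIN.match(value) and value.lower() not in _RESERVED:
        return value
    # JSON text is valid YAML flow syntax, so json.dumps is a safe fallback
    # for quoted strings, numbers, true/false/null, {} and [].
    return json.dumps(value)


def to_yaml(value, indent=0):
    """Hypothetical helper: convert parsed JSON into block-style YAML text."""
    pad = "  " * indent
    if isinstance(value, dict) and value:
        lines = []
        for key, item in value.items():
            if isinstance(item, (dict, list)) and item:
                lines.append(f"{pad}{_scalar(key)}:")
                lines.append(to_yaml(item, indent + 1))
            else:
                lines.append(f"{pad}{_scalar(key)}: {_scalar(item)}")
        return "\n".join(lines)
    if isinstance(value, list) and value:
        lines = []
        for item in value:
            if isinstance(item, (dict, list)) and item:
                # Nested container: dash on its own line, block indented below it.
                lines.append(f"{pad}-")
                lines.append(to_yaml(item, indent + 1))
            else:
                lines.append(f"{pad}- {_scalar(item)}")
        return "\n".join(lines)
    return pad + _scalar(value)


# The JSON form of ldbc-sample.graph.yml quoted later in this thread.
GRAPH_JSON = """
{
  "name": "ldbc_sample",
  "vertices": ["person.vertex.yml"],
  "edges": ["person_knows_person.edge.yml"],
  "version": "gar/v1",
  "extra_metadata": {}
}
"""

yaml_text = to_yaml(json.loads(GRAPH_JSON))
print(yaml_text)
```

The reverse direction (yaml2json) needs a real YAML parser, so it is left out of this sketch.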
[1] https://codebeautify-org.webpkgcache.com/doc/-/s/codebeautify.org/blog/yaml-to-json-using-java/

Best
weibin

On Sat, May 11, 2024 at 20:10, Sem Sinchenko <[email protected]> wrote:

> In Python, converting from JSON to YAML and back will be easy. Not 100%
> sure about the Java part.
>
> On Sat, 2024-05-11 at 10:06 +0800, weibin wrote:
> > Thanks lx, I agree that YAML is still part of our current workflow.
> > Maybe we should not be so aggressive about changing YAML to JSON
> > immediately. If we decide to use proto, we can add json2yaml in the
> > final stage as a temporary solution before the CLI. What do you guys
> > think?
> >
> > BTW, do you have any comments on the discussion about using protobuf
> > to define the GraphAr format, Lixue?
> >
> > Best
> > weibin
> >
> > On Sat, May 11, 2024 at 09:41, 李雪(有理) <[email protected]> wrote:
> >
> > > Thank you for your thoughtful feedback and insights. Regarding the
> > > concerns:
> > > 1. The implementation of a CLI is a good idea. However, manual
> > > viewing or review of configuration files is still necessary in our
> > > current workflow.
> > > 2. YAML's syntax allows for the omission of braces, quotes, and
> > > commas, making the entire block easier to read and write, especially
> > > for multi-level nested structures.
> > > ------------------------------------------------------------------
> > > From: Weibin Zeng <[email protected]>
> > > Sent: Friday, May 10, 2024 16:05
> > > To: dev <[email protected]>
> > > Subject: Re: Re: [DISCUSS][format] Using an Interface Definition
> > > Language to define GraphAr format
> > >
> > > Hi Lixue, thanks for the reply.
> > >
> > > For
> > > > 1. YAML's format is more human-readable and easier to edit, which
> > > > is a significant advantage in scenarios where we frequently need to
> > > > view or modify configuration files. For example, to define a
> > > > subgraph from an existing graph.
> > > I do not agree that we should let users edit the YAML/JSON files
> > > directly.
> > > Manual modification of schema files is unreliable and unpredictable,
> > > and would probably introduce errors whose cause users can't even
> > > identify. That's why we are going to provide a CLI to restrict the
> > > operations on graph data, including projecting a subgraph.
> > >
> > > As for human readability, here is ldbc-sample.graph.yml in YAML and
> > > in JSON:
> > > ```
> > > name: ldbc_sample
> > > vertices:
> > >   - person.vertex.yml
> > > edges:
> > >   - person_knows_person.edge.yml
> > > version: gar/v1
> > > extra_metadata: {}
> > > ```
> > > ```
> > > {
> > >   "name": "ldbc_sample",
> > >   "vertices": [
> > >     "person.vertex.yml"
> > >   ],
> > >   "edges": [
> > >     "person_knows_person.edge.yml"
> > >   ],
> > >   "version": "gar/v1",
> > >   "extra_metadata": {}
> > > }
> > > ```
> > > JSON is readable enough, I think, though not as configurable as YAML.
> > > But since the files are not allowed to be modified directly, I think
> > > JSON is OK.
> > > > 2. YAML often provides a more concise representation of the same
> > > > data.
> > > Can you give an example showing why YAML provides a more concise
> > > representation of the data?
> > > > 3. YAML natively supports comments and extensions, making it more
> > > > flexible.
> > > I agree that YAML supports more features and is more flexible. But it
> > > is so flexible that it cannot provide much template validation
> > > support. For the GraphAr format, we should consider whether the
> > > format is sufficient to express the schema and configuration of
> > > GraphAr. On this point, JSON is good for me.
> > >
> > > On 2024/05/10 01:47:47 "李雪(有理)" wrote:
> > > > Thank you for the information and links provided. While I
> > > > understand the application of JSON in GraphScope Flex and its
> > > > advantages when integrated with GraphAr, considering our specific
> > > > use case, I still think that YAML might be a more suitable choice
> > > > for us. Here are the primary reasons:
> > > > 1.
> > > > YAML's format is more human-readable and easier to edit, which is a
> > > > significant advantage in scenarios where we frequently need to view
> > > > or modify configuration files. For example, to define a subgraph
> > > > from an existing graph.
> > > > 2. YAML often provides a more concise representation of the same
> > > > data.
> > > > 3. YAML natively supports comments and extensions, making it more
> > > > flexible.
> > > > Therefore, we initially favored YAML over JSON. I hope we can
> > > > further discuss this topic to find the solution that best fits our
> > > > project requirements.
> > > > ------------------------------------------------------------------
> > > > From: Weibin Zeng <[email protected]>
> > > > Sent: Thursday, May 9, 2024 18:52
> > > > To: dev <[email protected]>
> > > > Subject: Re: [DISCUSS][format] Using an Interface Definition
> > > > Language to define GraphAr format
> > > >
> > > > Sorry, I missed the link.
> > > > > GraphScope Flex now uses JSON as the communication format for
> > > > > graph schema and checks it with the REST API[1]
> > > > [1] https://github.com/alibaba/GraphScope/tree/main/python/graphscope/flex/rest/models
> > > >
> > > > On 2024/05/09 10:49:24 Weibin Zeng wrote:
> > > > > JSON is OK for me. GraphScope Flex now uses JSON as the
> > > > > communication format for graph schema and checks it with the REST
> > > > > API[1]. I think switching to JSON is good for GraphAr, since
> > > > > GraphAr has been integrated into GraphScope.
> > > > >
> > > > > On 2024/05/09 08:50:45 Sem wrote:
> > > > > > I did a little research on that, and it seems to me that classes
> > > > > > generated from protobuf are not serializable into other formats
> > > > > > like YAML/JSON.
> > > > > >
> > > > > > There is a third-party project,
> > > > > > https://github.com/krzko/proto2yaml, that provides such a
> > > > > > utility, but it does not look well maintained.
> > > > > >
> > > > > > I see that there is a utility provided by Google. It allows
> > > > > > conversion to and from JSON in Java/Python (most probably C++
> > > > > > too):
> > > > > > 1. https://cloud.google.com/java/docs/reference/protobuf/latest/com.google.protobuf.util.JsonFormat
> > > > > > 2. https://googleapis.dev/python/protobuf/latest/google/protobuf/json_format.html
> > > > > >
> > > > > > But not to/from YAML.
> > > > > >
> > > > > > For me that question is important, because we need not only to
> > > > > > generate the code but also to resolve the question of
> > > > > > serialization/deserialization.
> > > > > >
> > > > > > What do you think about using proto (binary messages) as the
> > > > > > underlying communication format in the code and JSON for the
> > > > > > human-readable representation on disk? It looks like only by
> > > > > > switching from YAML to JSON do we get all the benefits of using
> > > > > > protobuf.
> > > > > >
> > > > > > With JSON, I see it like this: we call `fromJson` once (via the
> > > > > > Google protobuf util) to read the data and create proto classes
> > > > > > from the JSON info files, and then work with them. At the end we
> > > > > > call `toJson` to serialize the messages back.
> > > > > >
> > > > > > To achieve the same with YAML, we would need to use a third-party,
> > > > > > not well-maintained library or maintain our own
> > > > > > serialization/deserialization of proto messages (classes) to/from
> > > > > > YAML for three languages (Python, Java, C++).
> > > > > >
> > > > > > On Thu, 2024-05-09 at 15:27 +0800, weibin.zen wrote:
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > I would like to propose that we consider using an Interface
> > > > > > > Definition Language (IDL) like Protobuf[1] for the GraphAr
> > > > > > > format definition.
> > > > > > > Currently we use YAML to describe the schema and metadata of a
> > > > > > > graph, and store data in common formats like CSV/Parquet. YAML
> > > > > > > is human-readable, but it cannot provide much validation or
> > > > > > > version control. And the various programming languages need to
> > > > > > > parse the files and validate them by themselves.
> > > > > > >
> > > > > > > Using an IDL to describe the format would bring benefits like:
> > > > > > >
> > > > > > > • It provides a clear, standardized, language-agnostic format
> > > > > > > definition that can be version-controlled and shared by
> > > > > > > libraries, and keeps the format consistent between
> > > > > > > implementations.
> > > > > > > • The validation provided by protobuf can be used directly for
> > > > > > > our schema validation, so the libraries do not need to
> > > > > > > implement validation themselves.
> > > > > > > • Cross-language support: libraries can use the generated
> > > > > > > structures as graph info directly.
> > > > > > >
> > > > > > > This proposal is not to replace YAML with Protobuf. We would
> > > > > > > still use YAML as the final, user-readable schema and metadata
> > > > > > > file, but use the IDL to maintain a robust and precise schema
> > > > > > > definition. It's kind of a hybrid strategy to accommodate both
> > > > > > > human and machine needs.
> > > > > > >
> > > > > > > But using an IDL does bring some disadvantages. Sem has listed
> > > > > > > some in the comments of the PR[2]:
> > > > > > >
> > > > > > > • The generated code is huge and unreadable.
> > > > > > > • The generated code may need to be stored in git.
> > > > > > > • Debugging is very hard.
> > > > > > >
> > > > > > > Since this would be a huge change, I want to hear your thoughts
> > > > > > > about the proposal.
> > > > > > >
> > > > > > > [1] https://protobuf.dev/
> > > > > > > [2] https://github.com/apache/incubator-graphar/pull/475
> > > > > > >
> > > > > > > Best
> > > > > > > weibin.zen
> > > > > >
> > > > > > ---------------------------------------------------------------------
> > > > > > To unsubscribe, e-mail: [email protected]
> > > > > > For additional commands, e-mail: [email protected]
