[Discuss] Proposal and feature collection for roadmap

weibin Tue, 21 May 2024 01:00:42 -0700

Hi all,
    It's a good time to update our roadmap. I am opening this thread to
collect and discuss proposals for it. I have drafted some items that we
have discussed in our community meeting.


1. Format
   - Define the format with protobuf (discussion and vote in [1][2])
   - Support multiple labels for vertices and edges
   - Standardize the format v1 specification

2. C++ Library
   - Format compatibility with v1
   - Make full use of the features of the columnar formats Parquet/ORC to
improve read/write performance
   - A simple out-of-core compute engine based on GraphAr

3. Java / Scala with Spark Library
   - Format compatibility with v1
   - Modularize the library: split it into info/reader/writer...
   - Integrate with ldbc_snb_datagen_spark[3]

4. Python with PySpark
   - A new PySpark API that works with both Spark Classic and Spark Connect

5. Other
   - ETL CLI for GraphAr data [4]
   - More language bindings
   - Construct a DataHub with the GraphAr format

I am looking forward to hearing your thoughts about the roadmap of GraphAr.

[1] https://lists.apache.org/thread/o5bqbhxvcbm6xqj1j1m2h7bhdnsvgsoy
[2] https://lists.apache.org/thread/swg5qb35qxywt6w0k7oxt2srsvqnqgnh
[3] https://github.com/ldbc/ldbc_snb_datagen_spark
[4] https://github.com/apache/incubator-graphar/issues/463
