This is an automated email from the ASF dual-hosted git repository.
weibin pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-graphar.git
The following commit(s) were added to refs/heads/main by this push:
new 71ce490 [Feat][Doc] Update the repository url and the docs url (#444)
71ce490 is described below
commit 71ce4904f5ebd5df3fd330a693b8d48865505c55
Author: Weibin Zeng <[email protected]>
AuthorDate: Wed Apr 10 11:21:53 2024 +0800
[Feat][Doc] Update the repository url and the docs url (#444)
related #432
changes include:
- repo location update: `alibaba/GraphAr` -> `apache/incubator-graphar`
- website update: `alibaba.github.io/GraphAr` -> `graphar.apache.org`
- README.rst -> README.md
Signed-off-by: acezen <[email protected]>
---
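Note for readers with an existing clone: the repository move described above can be applied to local remotes roughly as follows. This is a minimal sketch and not part of the commit. `migrate_url` and `migrate_remotes` are hypothetical helper names, and the substitution assumes the remote URL contains the exact string `alibaba/GraphAr`.

```shell
#!/bin/sh
# Hypothetical helper: map an old alibaba/GraphAr remote URL to the new
# apache/incubator-graphar location. URLs without the old path pass through.
migrate_url() {
    printf '%s\n' "$1" | sed -e 's#alibaba/GraphAr#apache/incubator-graphar#'
}

# Hypothetical helper: repoint every remote of the current clone that still
# uses the old location. Must be run from inside a git working tree.
migrate_remotes() {
    for remote in $(git remote); do
        old_url=$(git remote get-url "$remote")
        new_url=$(migrate_url "$old_url")
        if [ "$old_url" != "$new_url" ]; then
            git remote set-url "$remote" "$new_url"
        fi
    done
}
```

Calling `migrate_remotes` inside a clone rewrites only the remotes that still point at the old path; `git remote -v` shows the result. Remotes pointing at personal forks are left unchanged because their URLs do not contain `alibaba/GraphAr`.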
.github/ISSUE_TEMPLATE/bug_report.yml | 6 +-
.github/ISSUE_TEMPLATE/config.yml | 2 +-
.github/PULL_REQUEST_TEMPLATE.md | 2 +-
.github/workflows/ci-nightly.yml | 2 +-
.github/workflows/docs.yml | 2 +-
.github/workflows/java.yml | 2 +-
.github/workflows/release.yml | 6 +-
CONTRIBUTING.rst | 30 +--
README.md | 295 +++++++++++++++++++++
README.rst | 266 -------------------
cpp/README.md | 4 +-
dev/release-process.md | 6 +-
docs/conf.py | 4 +-
docs/cpp/examples/bgl.rst | 2 +-
docs/cpp/examples/out-of-core.rst | 14 +-
docs/cpp/examples/snap-to-graphar.rst | 2 +-
docs/cpp/getting-started.rst | 6 +-
docs/developers/community.rst | 8 +-
docs/java/java-lib.rst | 8 +-
docs/pyspark/how-to.rst | 4 +-
docs/spark/examples/spark.rst | 10 +-
docs/spark/spark-lib.rst | 20 +-
java/README.md | 4 +-
java/cmake/graphar-cpp.cmake | 2 +-
.../apache/graphar/graphinfo/GraphInfoTest.java | 3 +-
spark/README.md | 4 +-
26 files changed, 372 insertions(+), 342 deletions(-)
diff --git a/.github/ISSUE_TEMPLATE/bug_report.yml b/.github/ISSUE_TEMPLATE/bug_report.yml
index dc2bf68..e06d1f0 100644
--- a/.github/ISSUE_TEMPLATE/bug_report.yml
+++ b/.github/ISSUE_TEMPLATE/bug_report.yml
@@ -22,11 +22,11 @@ labels: [Bug]
body:
- type: markdown
attributes:
- value: ':stop_sign: _For support questions, please visit [GraphAr Discussions](https://github.com/alibaba/GraphAr/discussions) instead._'
+ value: ':stop_sign: _For support questions, please visit [GraphAr Discussions](https://github.com/apache/incubator-graphar/discussions) instead._'
- type: checkboxes
attributes:
label: 'Is there an existing issue for this?'
- description: 'Please [search :mag: the issues](https://github.com/alibaba/GraphAr/issues) to check if this bug has already been reported.'
+ description: 'Please [search :mag: the issues](https://github.com/apache/incubator-graphar/issues) to check if this bug has already been reported.'
options:
- label: 'I have searched the existing issues'
required: true
@@ -83,4 +83,4 @@ body:
required: false
- type: markdown
attributes:
- value: ':stop_sign: _For support questions, please visit [GraphAr Discussions](https://github.com/alibaba/GraphAr/discussions) instead._'
+ value: ':stop_sign: _For support questions, please visit [GraphAr Discussions](https://github.com/apache/incubator-graphar/discussions) instead._'
diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml
index 5c36446..1a6c4b6 100644
--- a/.github/ISSUE_TEMPLATE/config.yml
+++ b/.github/ISSUE_TEMPLATE/config.yml
@@ -18,5 +18,5 @@
blank_issues_enabled: false
contact_links:
- name: Questions
- url: https://github.com/alibaba/GraphAr/discussions
+ url: https://github.com/apache/incubator-graphar/discussions
about: Search for and ask questions on our github Discussions chat.
diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
index 23b6713..95a8f0b 100644
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -6,7 +6,7 @@ Describe the big picture of your changes here to communicate to the maintainers
_Put an `x` in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code._
-- [ ] I have read the [CONTRIBUTING](https://github.com/alibaba/GraphAr/blob/main/CONTRIBUTING.rst) doc
+- [ ] I have read the [CONTRIBUTING](https://github.com/apache/incubator-graphar/blob/main/CONTRIBUTING.rst) doc
- [ ] I have signed the CLA
- [ ] Lint and unit tests pass locally with my changes
- [ ] I have added tests that prove my fix is effective or that my feature works
diff --git a/.github/workflows/ci-nightly.yml b/.github/workflows/ci-nightly.yml
index 42efbba..e10e1e7 100644
--- a/.github/workflows/ci-nightly.yml
+++ b/.github/workflows/ci-nightly.yml
@@ -25,7 +25,7 @@ on:
- cron: '00 19 * * *'
jobs:
GraphAr-ubuntu-arrow-from-source:
- if: ${{ github.ref == 'refs/heads/main' && github.repository == 'alibaba/GraphAr' }}
+ if: ${{ github.ref == 'refs/heads/main' && github.repository == 'apache/incubator-graphar' }}
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml
index 772dbe6..1d7c1b2 100644
--- a/.github/workflows/docs.yml
+++ b/.github/workflows/docs.yml
@@ -81,7 +81,7 @@ jobs:
- name: Commit Doc
# TODO: open this when apache infrastructure is ready
- # if: ${{ github.ref == 'refs/heads/main' && github.event_name == 'push' && github.repository == 'alibaba/GraphAr' }}
+ # if: ${{ github.ref == 'refs/heads/main' && github.event_name == 'push' && github.repository == 'apache/incubator-graphar' }}
if: false
run: |
git config user.email [email protected]
diff --git a/.github/workflows/java.yml b/.github/workflows/java.yml
index d3ae678..5fc5fd4 100644
--- a/.github/workflows/java.yml
+++ b/.github/workflows/java.yml
@@ -69,7 +69,7 @@ jobs:
- name: Run test
run: |
- # Temporarily using Java 8, related issue: https://github.com/alibaba/GraphAr/issues/277
+ # Temporarily using Java 8, related issue: https://github.com/apache/incubator-graphar/issues/277
export JAVA_HOME=${JAVA_HOME_8_X64}
export LLVM11_HOME=/usr/lib/llvm-11
pushd java
diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml
index f1b6dea..ec317f6 100644
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release.yml
@@ -31,7 +31,7 @@ concurrency:
jobs:
release:
runs-on: ${{ matrix.os }}
- if: ${{ github.repository == 'alibaba/GraphAr' }}
+ if: ${{ github.repository == 'apache/incubator-graphar' }}
strategy:
matrix:
os: [ubuntu-latest]
@@ -39,12 +39,12 @@ jobs:
steps:
- name: Extract tag name
id: tag
- if: ${{ github.event_name == 'push' && startsWith(github.ref, 'refs/tags/v') && github.repository == 'alibaba/GraphAr' }}
+ if: ${{ github.event_name == 'push' && startsWith(github.ref, 'refs/tags/v') && github.repository == 'apache/incubator-graphar' }}
run: echo "TAG=${GITHUB_REF#refs/tags/}" >> $GITHUB_OUTPUT
- name: Cut a versioned release
uses: "marvinpinto/action-automatic-releases@latest"
- if: ${{ github.event_name == 'push' && startsWith(github.ref, 'refs/tags/v') && github.repository == 'alibaba/GraphAr' }}
+ if: ${{ github.event_name == 'push' && startsWith(github.ref, 'refs/tags/v') && github.repository == 'apache/incubator-graphar' }}
with:
repo_token: "${{ secrets.GITHUB_TOKEN }}"
automatic_release_tag: ${{ steps.tag.outputs.TAG }}
diff --git a/CONTRIBUTING.rst b/CONTRIBUTING.rst
index dbbcac4..b01e80e 100644
--- a/CONTRIBUTING.rst
+++ b/CONTRIBUTING.rst
@@ -13,8 +13,8 @@ Reporting bug
-------------------
If you've noticed a bug in GraphAr, first make sure that you are testing against
-the `latest version of GraphAr <https://github.com/alibaba/GraphAr/tree/main>`_ -
-your issue may already have been fixed. If not, search our `issues list <https://github.com/alibaba/GraphAr/issues>`_
+the `latest version of GraphAr <https://github.com/apache/incubator-graphar/tree/main>`_ -
+your issue may already have been fixed. If not, search our `issues list <https://github.com/apache/incubator-graphar/issues>`_
on GitHub in case a similar issue has already been opened.
If you get confirmation of your bug, `file a bug issue`_ before starting to code.
@@ -194,7 +194,7 @@ up to date with GraphAr's main branch:
.. code:: shell
- $ git remote add upstream https://github.com/alibaba/GraphAr.git
+ $ git remote add upstream https://github.com/apache/incubator-graphar.git
$ git checkout main
$ git pull upstream main
@@ -267,7 +267,7 @@ Maintainers need to do the following to push out a release:
$ git tag -a v0.1.0 -m "GraphAr v0.1.0"
$ git push upstream v0.1.0
-3. The release draft will be automatically built to GitHub by GitHub Actions. You can edit the release notes draft on `GitHub <https://github.com/alibaba/GraphAr/releases>`_ to add more details.
+3. The release draft will be automatically built to GitHub by GitHub Actions. You can edit the release notes draft on `GitHub <https://github.com/apache/incubator-graphar/releases>`_ to add more details.
4. Publish the release.
.. the reviewing part document is referred and derived from
@@ -365,7 +365,7 @@ Approving a change
^^^^^^^^^^^^^^^^^^^
Any GraphAr core collaborator (any GitHub user with commit rights in the
-:code:`alibaba/GraphAr` repository) is authorized to approve any other contributor's
+:code:`apache/incubator-graphar` repository) is authorized to approve any other contributor's
work. Collaborators are not permitted to approve their own pull requests.
Collaborators indicate that they have reviewed and approve of the changes in
@@ -413,7 +413,7 @@ Continuous integration testing
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
All pull requests that contain changes to code must be run through
-continuous integration (CI) testing at `Github Actions <https://github.com/alibaba/GraphAr/actions>`_.
+continuous integration (CI) testing at `Github Actions <https://github.com/apache/incubator-graphar/actions>`_.
The pull request change will trigger a CI testing run. Ideally, the code change
will pass ("be green") on all platform configurations supported by GraphAr.
@@ -433,30 +433,30 @@ you can submit a pull request to the related libraries implementation to impleme
.. _pre-commit: https://pre-commit.com/
-.. _Code of Conduct: https://github.com/alibaba/GraphAr/blob/main/CODE_OF_CONDUCT.md
+.. _Code of Conduct: https://github.com/apache/incubator-graphar/blob/main/CODE_OF_CONDUCT.md
-.. _file a bug issue: https://github.com/alibaba/GraphAr/issues/new?assignees=&labels=Bug&template=bug_report.yml&title=%5BBug%5D%3A+%3Ctitle%3E
+.. _file a bug issue: https://github.com/apache/incubator-graphar/issues/new?assignees=&labels=Bug&template=bug_report.yml&title=%5BBug%5D%3A+%3Ctitle%3E
-.. _Open a feature request issue: https://github.com/alibaba/GraphAr/issues/new?assignees=&labels=enhancement&template=feature_request.md&title=%5BFeat%5D
+.. _Open a feature request issue: https://github.com/apache/incubator-graphar/issues/new?assignees=&labels=enhancement&template=feature_request.md&title=%5BFeat%5D
.. _fork GraphAr: https://help.github.com/articles/fork-a-repo
.. _make a Pull Request: https://help.github.com/articles/creating-a-pull-request
-.. _Github Discussions: https://github.com/alibaba/GraphAr/discussions
+.. _Github Discussions: https://github.com/apache/incubator-graphar/discussions
.. _git rebasing: http://git-scm.com/book/en/Git-Branching-Rebasing
.. _interactive rebase: https://help.github.com/en/github/using-git/about-git-rebase
-.. _GraphAr C++ Dependencies: https://github.com/alibaba/GraphAr/tree/main/cpp#system-setup
+.. _GraphAr C++ Dependencies: https://github.com/apache/incubator-graphar/tree/main/cpp#system-setup
-.. _GraphAr Spark Dependencies: https://github.com/alibaba/GraphAr/tree/main/spark#system-setup
+.. _GraphAr Spark Dependencies: https://github.com/apache/incubator-graphar/tree/main/spark#system-setup
-.. _Contributor License Agreement: https://cla-assistant.io/alibaba/GraphAr
+.. _Contributor License Agreement: https://cla-assistant.io/apache/incubator-graphar
.. _glossary: https://chromium.googlesource.com/chromiumos/docs/+/HEAD/glossary.md
-.. _format specification design: https://github.com/alibaba/GraphAr/tree/main/docs/format/file-format.rst
+.. _format specification design: https://github.com/apache/incubator-graphar/tree/main/docs/format/file-format.rst
-.. _implementation status: https://github.com/alibaba/GraphAr/tree/main/docs/format/status.rst
+.. _implementation status: https://github.com/apache/incubator-graphar/tree/main/docs/format/status.rst
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..5c0d93e
--- /dev/null
+++ b/README.md
@@ -0,0 +1,295 @@
+<h1 align="center" style="clear: both;">
+ <img src="docs/images/graphar-logo.svg" width="350" alt="GraphAr">
+</h1>
+<p align="center">
+ An open source, standard data file format for graph data storage and retrieval
+</p>
+
+[](https://github.com/apache/incubator-graphar/actions)
+[](https://github.com/apache/incubator-graphar/actions)
+[](https://graphar.apache.org/docs/)
+[](https://github.com/apache/incubator-graphar/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)
+
+📢 Join our [Weekly Community
+Meeting](https://github.com/apache/incubator-graphar/wiki/GraphAr-Weekly-Community-Meeting)
+to learn more about GraphAr and get involved!
+
+# What is GraphAr?
+
+<img src="docs/images/overview.png" class="align-center" width="770"
+alt="Overview" />
+
+Graph processing serves as the essential building block for a diverse
+variety of real-world applications such as social network analytics,
+data mining, network routing, and scientific computing.
+
+GraphAr (short for "Graph Archive") is a project that aims to make it
+easier for diverse applications and systems (in-memory and out-of-core
+storages, databases, graph computing systems, and interactive graph
+query frameworks) to build and access graph data conveniently and
+efficiently.
+
+It can be used for importing/exporting and persistent storage of graph
+data, thereby reducing the burden on systems when working together.
+Additionally, it can serve as a direct data source for graph processing
+applications.
+
+To achieve this, GraphAr provides:
+
+- The Graph Archive(GAR) file format: a standardized system-independent
+ file format for storing graph data
+- Libraries: a set of libraries for reading, writing and transforming
+ GAR files
+
+By using GraphAr, you can:
+
+- Store and persist your graph data in a system-independent way with the
+ GAR file format
+- Easily access and generate GAR files using the libraries
+- Utilize Apache Spark to quickly manipulate and transform your GAR
+ files
+
+# The GAR File Format
+
+The GAR file format is designed for storing property graphs. It uses
+metadata to record all the necessary information of a graph, and
+maintains the actual data in a chunked way.
+
+A property graph consists of vertices and edges, with each vertex
+contains a unique identifier and:
+
+- A text label that describes the vertex type.
+- A collection of properties, with each property can be represented by a
+ key-value pair.
+
+Each edge contains a unique identifier and:
+
+- The outgoing vertex (source).
+- The incoming vertex (destination).
+- A text label that describes the relationship between the two vertices.
+- A collection of properties.
+
+The following is an example property graph containing two types of
+vertices ("person" and "comment") and three types of edges.
+
+<img src="docs/images/property_graph.png" class="align-center"
+width="700" alt="property graph" />
+
+## Vertices in GraphAr
+
+### Logical table of vertices
+
+Each type of vertices (with the same label) constructs a logical vertex
+table, with each vertex assigned with a global index inside this type
+(called internal vertex id) starting from 0, corresponding to the row
+number of the vertex in the logical vertex table. An example layout for
+a logical table of vertices under the label "person" is provided for
+reference.
+
+Given an internal vertex id and the vertex label, a vertex is uniquely
+identifiable and its respective properties can be accessed from this
+table. The internal vertex id is further used to identify the source and
+destination vertices when maintaining the topology of the graph.
+
+<img src="docs/images/vertex_logical_table.png" class="align-center"
+width="650" alt="vertex logical table" />
+
+### Physical table of vertices
+
+The logical vertex table will be partitioned into multiple continuous
+vertex chunks for enhancing the reading/writing efficiency. To maintain
+the ability of random access, the size of vertex chunks for the same
+label is fixed. To support to access required properties avoiding
+reading all properties from the files, and to add properties for
+vertices without modifying the existing files, the columns of the
+logical table will be divided into several column groups.
+
+Take the `person` vertex table as an example, if the chunk size is set
+to be 500, the logical table will be separated into sub-logical-tables
+of 500 rows with the exception of the last one, which may have less than
+500 rows. The columns for maintaining properties will also be divided
+into distinct groups (e.g., 2 for our example). As a result, a total of
+4 physical vertex tables are created for storing the example logical
+table, which can be seen from the following figure.
+
+<img src="docs/images/vertex_physical_table.png" class="align-center"
+width="650" alt="vertex physical table" />
+
+**Note**: For efficiently utilize the filter push-down of the payload
+file format like Parquet, the internal vertex id is stored in the
+payload file as a column. And since the internal vertex id is
+continuous, the payload file format can use the delta encoding for the
+internal vertex id column, which would not bring too much overhead for
+the storage.
+
+## Edges in GraphAr
+
+### Logical table of edges
+
+For maintaining a type of edges (that with the same triplet of the
+source label, edge label, and destination label), a logical edge table
+is established. And in order to support quickly creating a graph from
+the graph storage file, the logical edge table could maintain the
+topology information in a way similar to CSR/CSC (learn more about
+[CSR/CSC](https://en.wikipedia.org/wiki/Sparse_matrix)), that is, the
+edges are ordered by the internal vertex id of either source or
+destination. In this way, an offset table is required to store the start
+offset for each vertex's edges, and the edges with the same
+source/destination will be stored continuously in the logical table.
+
+Take the logical table for `person knows person` edges as an example,
+the logical edge table looks like:
+
+<img src="docs/images/edge_logical_table.png" class="align-center"
+width="650" alt="edge logical table" />
+
+### Physical table of edges
+
+As same with the vertex table, the logical edge table is also
+partitioned into some sub-logical-tables, with each sub-logical-table
+contains edges that the source (or destination) vertices are in the same
+vertex chunk. According to the partition strategy and the order of the
+edges, edges can be stored in GraphAr following one of the four types:
+
+- **ordered_by_source**: all the edges in the logical table are ordered
+ and further partitioned by the internal vertex id of the source, which
+ can be seen as the CSR format.
+- **ordered_by_dest**: all the edges in the logical table are ordered
+ and further partitioned by the internal vertex id of the destination,
+ which can be seen as the CSC format.
+- **unordered_by_source**: the internal id of the source vertex is used
+ as the partition key to divide the edges into different
+ sub-logical-tables, and the edges in each sub-logical-table are
+ unordered, which can be seen as the COO format.
+- **unordered_by_dest**: the internal id of the destination vertex is
+ used as the partition key to divide the edges into different
+ sub-logical-tables, and the edges in each sub-logical-table are
+ unordered, which can also be seen as the COO format.
+
+After that, a sub-logical-table is further divided into edge chunks of a
+predefined, fixed number of rows (referred to as edge chunk size).
+Finally, an edge chunk is separated into physical tables in the
+following way:
+
+- an adjList table (which contains only two columns: the internal vertex
+ id of the source and the destination).
+- 0 or more edge property tables, with each table contains a group of
+ properties.
+
+Additionally, there would be an offset table for **ordered_by_source**
+or **ordered_by_dest** edges. The offset table is used to record the
+starting point of the edges for each vertex. The partition of the offset
+table should be in alignment with the partition of the corresponding
+vertex table. The first row of each offset chunk is always 0, indicating
+the starting point for the corresponding sub-logical-table for edges.
+
+Take the `person knows person` edges to illustrate. Suppose the vertex
+chunk size is set to 500 and the edge chunk size is 1024, and the edges
+are **ordered_by_source**, then the edges could be saved in the
+following physical tables:
+
+<img src="docs/images/edge_physical_table1.png" class="align-center"
+width="650" alt="edge logical table1" />
+
+<img src="docs/images/edge_physical_table2.png" class="align-center"
+width="650" alt="edge logical table2" />
+
+# Building Libraries
+
+GraphAr offers a collection of libraries for the purpose of reading,
+writing and transforming files. Currently, the following libraries are
+available, and plans are in place to expand support to additional
+programming language.
+
+## The C++ Library
+
+See [GraphAr C++
+Library](https://github.com/apache/incubator-graphar/tree/main/cpp) for
+details about the building of the C++ library.
+
+## The Java Library
+
+The GraphAr Java library is created with bindings to the C++ library
+(currently at version v0.10.0), utilizing
+[Alibaba-FastFFI](https://github.com/alibaba/fastFFI) for
+implementation. See [GraphAr Java
+Library](https://github.com/apache/incubator-graphar/tree/main/java) for
+details about the building of the Java library.
+
+## The Spark Library
+
+See [GraphAr Spark
+Library](https://github.com/apache/incubator-graphar/tree/main/spark)
+for details about the Spark library.
+
+## The PySpark Library
+
+The GraphAr PySpark library is developed as bindings to the GraphAr
+Spark library. See [GraphAr PySpark
+Library](https://github.com/apache/incubator-graphar/tree/main/pyspark)
+for details about the PySpark library.
+
+# Contributing
+
+## Contributing Guidelines
+
+Read through our [contribution
+guidelines](https://github.com/apache/incubator-graphar/tree/main/CONTRIBUTING.rst)
+to learn about our submission process, coding rules, and more.
+
+## Code of Conduct
+
+Help us keep GraphAr open and inclusive. Please read and follow our
+[Code of
+Conduct](https://github.com/apache/incubator-graphar/blob/main/CODE_OF_CONDUCT.md).
+
+# Getting Involved
+
+Join the conversation and help the community. Even if you do not plan to
+contribute to GraphAr itself or GraphAr integrations in other projects,
+we'd be happy to have you involved.
+
+- Ask questions on [GitHub
+ Discussions](https://github.com/apache/incubator-graphar/discussions).
+ We welcome all kinds of questions, from beginner to advanced!
+- Follow our activity and ask for feature requests on [GitHub
+ Issues](https://github.com/apache/incubator-graphar/issues/new).
+- Join our [Weekly Community
+ Meeting](https://github.com/apache/incubator-graphar/wiki/GraphAr-Weekly-Community-Meeting).
+
+Read through our [community
+introduction](https://github.com/apache/incubator-graphar/tree/main/docs/developers/community.rst)
+to learn about our communication channels, governance, and more.
+
+# License
+
+**GraphAr** is distributed under [Apache License
+2.0](https://github.com/apache/incubator-graphar/blob/main/LICENSE).
+Please note that third-party libraries may not have the same license as
+GraphAr.
+
+# Publication
+
+- Xue Li, Weibin Zeng, Zhibin Wang, Diwen Zhu, Jingbo Xu, Wenyuan Yu,
+ Jingren Zhou. [Enhancing Data Lakes with GraphAr: Efficient Graph Data
+ Management with a Specialized Storage
+ Scheme\[J\]](https://arxiv.org/abs/2312.09577). arXiv preprint
+ arXiv:2312.09577, 2023.
+
+``` bibtex
+@article{li2023enhancing,
+ author = {Xue Li and Weibin Zeng and Zhibin Wang and Diwen Zhu and Jingbo Xu and Wenyuan Yu and Jingren Zhou},
+ title = {Enhancing Data Lakes with GraphAr: Efficient Graph Data Management with a Specialized Storage Scheme},
+ year = {2023},
+ url = {https://doi.org/10.48550/arXiv.2312.09577},
+ doi = {10.48550/ARXIV.2312.09577},
+ eprinttype = {arXiv},
+ eprint = {2312.09577},
+ biburl = {https://dblp.org/rec/journals/corr/abs-2312-09577.bib},
+ bibsource = {dblp computer science bibliography, https://dblp.org}
+}
+```
diff --git a/README.rst b/README.rst
deleted file mode 100644
index ffc8164..0000000
--- a/README.rst
+++ /dev/null
@@ -1,266 +0,0 @@
-.. raw:: html
-
- <h1 align="center" style="clear: both;">
- <img src="docs/images/graphar-logo.svg" width="350" alt="GraphAr">
- </h1>
- <p align="center">
- An open source, standard data file format for graph data storage and retrieval
- </p>
-
-|GraphAr CI| |Docs CI| |GraphAr Docs| |Good First Issue|
-
-📢 Join our `Weekly Community Meeting`_ to learn more about GraphAr and get involved!
-
-What is GraphAr?
------------------
-
-|
-
-.. image:: docs/images/overview.png
- :width: 770
- :align: center
- :alt: Overview
-
-|
-
-Graph processing serves as the essential building block for a diverse variety of
-real-world applications such as social network analytics, data mining, network routing,
-and scientific computing.
-
-GraphAr (short for "Graph Archive") is a project that aims to make it easier for diverse applications and
-systems (in-memory and out-of-core storages, databases, graph computing systems, and interactive graph query frameworks)
-to build and access graph data conveniently and efficiently.
-
-It can be used for importing/exporting and persistent storage of graph data,
-thereby reducing the burden on systems when working together. Additionally, it can
-serve as a direct data source for graph processing applications.
-
-To achieve this, GraphAr provides:
-
-- The Graph Archive(GAR) file format: a standardized system-independent file format for storing graph data
-- Libraries: a set of libraries for reading, writing and transforming GAR files
-
-By using GraphAr, you can:
-
-- Store and persist your graph data in a system-independent way with the GAR file format
-- Easily access and generate GAR files using the libraries
-- Utilize Apache Spark to quickly manipulate and transform your GAR files
-
-The GAR File Format
--------------------
-The GAR file format is designed for storing property graphs. It uses metadata to
-record all the necessary information of a graph, and maintains the actual data in
-a chunked way.
-
-A property graph consists of vertices and edges, with each vertex contains a unique identifier and:
-
-- A text label that describes the vertex type.
-- A collection of properties, with each property can be represented by a key-value pair.
-
-Each edge contains a unique identifier and:
-
-- The outgoing vertex (source).
-- The incoming vertex (destination).
-- A text label that describes the relationship between the two vertices.
-- A collection of properties.
-
-The following is an example property graph containing two types of vertices ("person" and "comment") and three types of edges.
-
-.. image:: docs/images/property_graph.png
- :width: 700
- :align: center
- :alt: property graph
-
-Vertices in GraphAr
-^^^^^^^^^^^^^^^^^^^
-
-Logical table of vertices
-""""""""""""""""""""""""""
-
-Each type of vertices (with the same label) constructs a logical vertex table, with each vertex assigned with a global index inside this type (called internal vertex id) starting from 0, corresponding to the row number of the vertex in the logical vertex table. An example layout for a logical table of vertices under the label "person" is provided for reference.
-
-Given an internal vertex id and the vertex label, a vertex is uniquely identifiable and its respective properties can be accessed from this table. The internal vertex id is further used to identify the source and destination vertices when maintaining the topology of the graph.
-
-.. image:: docs/images/vertex_logical_table.png
- :width: 650
- :align: center
- :alt: vertex logical table
-
-Physical table of vertices
-""""""""""""""""""""""""""
-
-The logical vertex table will be partitioned into multiple continuous vertex chunks for enhancing the reading/writing efficiency. To maintain the ability of random access, the size of vertex chunks for the same label is fixed. To support to access required properties avoiding reading all properties from the files, and to add properties for vertices without modifying the existing files, the columns of the logical table will be divided into several column groups.
-
-Take the "person" vertex table as an example, if the chunk size is set to be 500, the logical table will be separated into sub-logical-tables of 500 rows with the exception of the last one, which may have less than 500 rows. The columns for maintaining properties will also be divided into distinct groups (e.g., 2 for our example). As a result, a total of 4 physical vertex tables are created for storing the example logical table, which can be seen from the following figure.
-
-.. image:: docs/images/vertex_physical_table.png
- :width: 650
- :align: center
- :alt: vertex physical table
-
-**Note**: For efficiently utilize the filter push-down of the payload file format like Parquet, the internal vertex id is stored in the payload file as a column. And since the internal vertex id is continuous, the payload file format can use the delta encoding for the internal vertex id column, which would not bring too much overhead for the storage.
-
-Edges in GraphAr
-^^^^^^^^^^^^^^^^
-
-Logical table of edges
-""""""""""""""""""""""""""
-
-For maintaining a type of edges (that with the same triplet of the source label, edge label, and destination label), a logical edge table is established. And in order to support quickly creating a graph from the graph storage file, the logical edge table could maintain the topology information in a way similar to CSR/CSC (learn more about `CSR/CSC <https://en.wikipedia.org/wiki/Sparse_matrix>`_), that is, the edges are ordered by the internal vertex id of either source or destination. I [...]
-
-Take the logical table for "person knows person" edges as an example, the logical edge table looks like:
-
-.. image:: docs/images/edge_logical_table.png
- :width: 650
- :align: center
- :alt: edge logical table
-
-Physical table of edges
-""""""""""""""""""""""""""
-
-As same with the vertex table, the logical edge table is also partitioned into some sub-logical-tables, with each sub-logical-table contains edges that the source (or destination) vertices are in the same vertex chunk. According to the partition strategy and the order of the edges, edges can be stored in GraphAr following one of the four types:
-
-- **ordered_by_source**: all the edges in the logical table are ordered and further partitioned by the internal vertex id of the source, which can be seen as the CSR format.
-- **ordered_by_dest**: all the edges in the logical table are ordered and further partitioned by the internal vertex id of the destination, which can be seen as the CSC format.
-- **unordered_by_source**: the internal id of the source vertex is used as the partition key to divide the edges into different sub-logical-tables, and the edges in each sub-logical-table are unordered, which can be seen as the COO format.
-- **unordered_by_dest**: the internal id of the destination vertex is used as the partition key to divide the edges into different sub-logical-tables, and the edges in each sub-logical-table are unordered, which can also be seen as the COO format.
-
-After that, a sub-logical-table is further divided into edge chunks of a predefined, fixed number of rows (referred to as edge chunk size). Finally, an edge chunk is separated into physical tables in the following way:
-
-- an adjList table (which contains only two columns: the internal vertex id of the source and the destination).
-- 0 or more edge property tables, with each table contains a group of properties.
-
-Additionally, there would be an offset table for **ordered_by_source** or **ordered_by_dest** edges. The offset table is used to record the starting point of the edges for each vertex. The partition of the offset table should be in alignment with the partition of the corresponding vertex table. The first row of each offset chunk is always 0, indicating the starting point for the corresponding sub-logical-table for edges.
-
-Take the "person knows person" edges to illustrate. Suppose the vertex chunk size is set to 500 and the edge chunk size is 1024, and the edges are **ordered_by_source**, then the edges could be saved in the following physical tables:
-
-.. image:: docs/images/edge_physical_table1.png
- :width: 650
- :align: center
- :alt: edge logical table1
-
-.. image:: docs/images/edge_physical_table2.png
- :width: 650
- :align: center
- :alt: edge logical table2
-
-Building Libraries
-------------------
-
-GraphAr offers a collection of libraries for the purpose of reading, writing and transforming files.
-Currently, the following libraries are available, and plans are in place to expand support to additional programming language.
-
-The C++ Library
-^^^^^^^^^^^^^^^
-See `GraphAr C++ Library`_ for details about the building of the C++ library.
-
-The Java Library
-^^^^^^^^^^^^^^^^
-The GraphAr Java library is created with bindings to the C++ library (currently at version v0.10.0), utilizing `Alibaba-FastFFI`_ for implementation.
-See `GraphAr Java Library`_ for details about the building of the Java library.
-
-The Spark Library
-^^^^^^^^^^^^^^^^^
-See `GraphAr Spark Library`_ for details about the Spark library.
-
-The PySpark Library
-^^^^^^^^^^^^^^^^^^^
-The GraphAr PySpark library is developed as bindings to the GraphAr Spark library.
-See `GraphAr PySpark Library`_ for details about the PySpark library.
-
-
-Contributing
--------------
-
-Contributing Guidelines
-^^^^^^^^^^^^^^^^^^^^^^^^
-
-Read through our `contribution guidelines`_ to learn about our submission process, coding rules, and more.
-
-Code of Conduct
-^^^^^^^^^^^^^^^^
-
-Help us keep GraphAr open and inclusive. Please read and follow our `Code of Conduct`_.
-
-Getting Involved
-----------------
-
-Join the conversation and help the community. Even if you do not plan to contribute
-to GraphAr itself or GraphAr integrations in other projects, we'd be happy to have you involved.
-
-- Ask questions on `GitHub Discussions`_. We welcome all kinds of questions, from beginner to advanced!
-- Follow our activity and ask for feature requests on `GitHub Issues`_.
-- Join our `Weekly Community Meeting`_.
-
-Read through our `community introduction`_ to learn about our communication channels, governance, and more.
-
-
-License
--------
-
-**GraphAr** is distributed under `Apache License 2.0`_. Please note that
-third-party libraries may not have the same license as GraphAr.
-
-Publication
------------
-
-- Xue Li, Weibin Zeng, Zhibin Wang, Diwen Zhu, Jingbo Xu, Wenyuan Yu, Jingren Zhou.
- `Enhancing Data Lakes with GraphAr: Efficient Graph Data Management with a Specialized Storage Scheme[J] <https://arxiv.org/abs/2312.09577>`_.
- arXiv preprint arXiv:2312.09577, 2023.
-
-.. code:: bibtex
-
- @article{li2023enhancing,
- author = {Xue Li and Weibin Zeng and Zhibin Wang and Diwen Zhu and Jingbo Xu and Wenyuan Yu and Jingren Zhou},
- title = {Enhancing Data Lakes with GraphAr: Efficient Graph Data Management with a Specialized Storage Scheme},
- year = {2023},
- url = {https://doi.org/10.48550/arXiv.2312.09577},
- doi = {10.48550/ARXIV.2312.09577},
- eprinttype = {arXiv},
- eprint = {2312.09577},
- biburl = {https://dblp.org/rec/journals/corr/abs-2312-09577.bib},
- bibsource = {dblp computer science bibliography, https://dblp.org}
- }
-
-
-.. _Apache License 2.0: https://github.com/alibaba/GraphAr/blob/main/LICENSE
-
-.. |GraphAr CI| image:: https://github.com/alibaba/GraphAr/actions/workflows/ci.yml/badge.svg
- :target: https://github.com/alibaba/GraphAr/actions
-
-.. |Docs CI| image:: https://github.com/alibaba/GraphAr/actions/workflows/docs.yml/badge.svg
- :target: https://github.com/alibaba/GraphAr/actions
-
-.. |GraphAr Docs| image:: https://img.shields.io/badge/docs-latest-brightgreen.svg
- :target: https://alibaba.github.io/GraphAr/
-
-.. |Good First Issue| image:: https://img.shields.io/github/labels/alibaba/GraphAr/Good%20First%20Issue?color=green&label=Contribute%20&style=plastic
- :target: https://github.com/alibaba/GraphAr/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22
-
-.. _GraphAr File Format: https://alibaba.github.io/GraphAr/user-guide/file-format.html
-
-.. _GraphAr Spark Library: https://github.com/alibaba/GraphAr/tree/main/spark
-
-.. _GraphAr PySpark Library: https://github.com/alibaba/GraphAr/tree/main/pyspark
-
-.. _GraphAr C++ Library: https://github.com/alibaba/GraphAr/tree/main/cpp
-
-.. _GraphAr Java Library: https://github.com/alibaba/GraphAr/tree/main/java
-
-.. _example files: https://github.com/GraphScope/gar-test/blob/main/ldbc_sample/
-
-.. _contribution guidelines: https://github.com/alibaba/GraphAr/tree/main/CONTRIBUTING.rst
-
-.. _Code of Conduct: https://github.com/alibaba/GraphAr/blob/main/CODE_OF_CONDUCT.md
-
-.. _GraphAr Slack: https://join.slack.com/t/grapharworkspace/shared_invite/zt-1wh5vo828-yxs0MlXYBPBBNvjOGhL4kQ
-
-.. _Weekly Community Meeting: https://github.com/alibaba/GraphAr/wiki/GraphAr-Weekly-Community-Meeting
-
-.. _community introduction: https://github.com/alibaba/GraphAr/tree/main/docs/developers/community.rst
-
-.. _GitHub Issues: https://github.com/alibaba/GraphAr/issues/new
-
-.. _Github Discussions: https://github.com/alibaba/GraphAr/discussions
-
-.. _Alibaba-FastFFI: https://github.com/alibaba/fastFFI
diff --git a/cpp/README.md b/cpp/README.md
index 6babe55..eb83a35 100644
--- a/cpp/README.md
+++ b/cpp/README.md
@@ -41,7 +41,7 @@ All the instructions below assume that you have cloned the GraphAr git
repository and navigated to the ``cpp`` subdirectory:
```bash
- $ git clone https://github.com/alibaba/GraphAr.git
+ $ git clone https://github.com/apache/incubator-graphar.git
$ cd GraphAr
$ git submodule update --init
$ cd cpp
@@ -123,5 +123,5 @@ The API document is generated in the directory ``cpp/apidoc/html``.
## How to use
-Please refer to our [GraphAr C++ API Reference](https://alibaba.github.io/GraphAr/reference/api-reference-cpp.html).
+Please refer to our [GraphAr C++ API Reference](https://graphar.apache.org/docs/libraries/cpp).
diff --git a/dev/release-process.md b/dev/release-process.md
index 92af2a7..e0e2c6b 100644
--- a/dev/release-process.md
+++ b/dev/release-process.md
@@ -8,12 +8,12 @@ GraphAr releases are each assigned an incremental version number in the format `
- `v0.1.2` - Second patch release upon the `v0.1.0` feature release.
Patch releases are generally fairly minor, primarily intended for fixes and therefore are fairly unlikely to cause breakages upon update.
-Feature releases are generally larger, bringing new features in addition to fixes and enhancements. These releases have a greater chance of introducing breaking changes upon update, so it's worth checking for any notes in the [release notes](https://github.com/alibaba/GraphAr/releases).
+Feature releases are generally larger, bringing new features in addition to fixes and enhancements. These releases have a greater chance of introducing breaking changes upon update, so it's worth checking for any notes in the [release notes](https://github.com/apache/incubator-graphar/releases).
### Release Planning Process
-Each GraphAr release will have a [milestone](https://github.com/alibaba/GraphAr/milestones) created with issues & pull requests assigned to it to define what will be in that release. Milestones are built up then worked through until complete at which point, after some testing and documentation updates, the release will be deployed.
+Each GraphAr release will have a [milestone](https://github.com/apache/incubator-graphar/milestones) created with issues & pull requests assigned to it to define what will be in that release. Milestones are built up then worked through until complete at which point, after some testing and documentation updates, the release will be deployed.
### Release Announcements
-Feature releases, and some patch releases, will be accompanied by a release note on the [GitHub release page](https://github.com/alibaba/GraphAr/releases) which will provide additional detail on features, changes & updates.
+Feature releases, and some patch releases, will be accompanied by a release note on the [GitHub release page](https://github.com/apache/incubator-graphar/releases) which will provide additional detail on features, changes & updates.
diff --git a/docs/conf.py b/docs/conf.py
index 8bff85f..be7688a 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -79,13 +79,13 @@ html_theme = "furo"
html_theme_options = {
"sidebar_hide_name": False, # we use the logo
"navigation_with_keys": True,
- "source_repository": "https://github.com/alibaba/GraphAr/",
+ "source_repository": "https://github.com/apache/incubator-graphar/",
"source_branch": "main",
"source_directory": "docs/",
"footer_icons": [
{
"name": "GitHub",
- "url": "https://github.com/alibaba/GraphAr",
+ "url": "https://github.com/apache/incubator-graphar",
"html": "",
"class": "fa fa-solid fa-github fa-2x",
},
diff --git a/docs/cpp/examples/bgl.rst b/docs/cpp/examples/bgl.rst
index 763b7fa..8c0cfe5 100644
--- a/docs/cpp/examples/bgl.rst
+++ b/docs/cpp/examples/bgl.rst
@@ -90,4 +90,4 @@ Finally, we could use a **VerticesBuilder** of GraphAr to write the results to n
builder.Dump();
-.. _bgl_example.cc: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/bgl_example.cc
+.. _bgl_example.cc: https://github.com/apache/incubator-graphar/blob/main/cpp/examples/bgl_example.cc
diff --git a/docs/cpp/examples/out-of-core.rst b/docs/cpp/examples/out-of-core.rst
index 9c395e8..066afc2 100644
--- a/docs/cpp/examples/out-of-core.rst
+++ b/docs/cpp/examples/out-of-core.rst
@@ -107,16 +107,16 @@ Meanwhile, BFS could be implemented in a **push**-style which only traverses the
In some cases, it is required to record the path of BFS, that is, to maintain each vertex's predecessor (also called *father*) in the traversing tree rather than only recording the distance. The implementation of BFS with recording fathers can be found at `bfs_father_example.cc`_.
-.. _pagerank_example.cc: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/pagerank_example.cc
+.. _pagerank_example.cc: https://github.com/apache/incubator-graphar/blob/main/cpp/examples/pagerank_example.cc
-.. _cc_stream_example.cc: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/cc_stream_example.cc
+.. _cc_stream_example.cc: https://github.com/apache/incubator-graphar/blob/main/cpp/examples/cc_stream_example.cc
-.. _cc_push_example.cc: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/cc_push_example.cc
+.. _cc_push_example.cc: https://github.com/apache/incubator-graphar/blob/main/cpp/examples/cc_push_example.cc
-.. _bfs_stream_example.cc: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/bfs_stream_example.cc
+.. _bfs_stream_example.cc: https://github.com/apache/incubator-graphar/blob/main/cpp/examples/bfs_stream_example.cc
-.. _bfs_push_example.cc: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/bfs_push_example.cc
+.. _bfs_push_example.cc: https://github.com/apache/incubator-graphar/blob/main/cpp/examples/bfs_push_example.cc
-.. _bfs_pull_example.cc: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/bfs_pull_example.cc
+.. _bfs_pull_example.cc: https://github.com/apache/incubator-graphar/blob/main/cpp/examples/bfs_pull_example.cc
-.. _bfs_father_example.cc: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/bfs_father_example.cc
+.. _bfs_father_example.cc: https://github.com/apache/incubator-graphar/blob/main/cpp/examples/bfs_father_example.cc
diff --git a/docs/cpp/examples/snap-to-graphar.rst b/docs/cpp/examples/snap-to-graphar.rst
index 3e56918..852bc4d 100644
--- a/docs/cpp/examples/snap-to-graphar.rst
+++ b/docs/cpp/examples/snap-to-graphar.rst
@@ -78,4 +78,4 @@ The code snippet that follows demonstrates the generation and preservation of th
ASSERT(e_builder->Dump().ok());
e_builder->Clear();
-For comprehensive insights into this example, please consult the accompanying `source code <https://github.com/alibaba/GraphAr/tree/main/docs/cpp/examples/snap_dataset_to_graphar.cc>`_ .
+For comprehensive insights into this example, please consult the accompanying `source code <https://github.com/apache/incubator-graphar/tree/main/docs/cpp/examples/snap_dataset_to_graphar.cc>`_ .
diff --git a/docs/cpp/getting-started.rst b/docs/cpp/getting-started.rst
index 0c337a3..6c0a341 100644
--- a/docs/cpp/getting-started.rst
+++ b/docs/cpp/getting-started.rst
@@ -178,7 +178,7 @@ This program first reads in the graph information file to obtain the metadata; t
Please refer to `more examples <examples/out-of-core.html>`_ to learn about the other available case studies utilizing GraphAr.
-.. _Building Steps: https://github.com/alibaba/GraphAr/blob/main/README.rst#building-libraries
+.. _Building Steps: https://github.com/apache/incubator-graphar/blob/main/README.rst#building-libraries
.. _person.vertex.yml: https://github.com/GraphScope/gar-test/blob/main/ldbc_sample/csv/person.vertex.yml
@@ -194,6 +194,6 @@ Please refer to `more examples <examples/out-of-core.html>`_ to learn about the
.. _./edge/person_knows_person/ordered_by_source/offset/chunk0: https://github.com/GraphScope/gar-test/blob/main/ldbc_sample/csv/edge/person_knows_person/ordered_by_source/offset/chunk0
-.. _example program: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/construct_info_example.cc
+.. _example program: https://github.com/apache/incubator-graphar/blob/main/cpp/examples/construct_info_example.cc
-.. _pagerank_example.cc: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/pagerank_example.cc
+.. _pagerank_example.cc: https://github.com/apache/incubator-graphar/blob/main/cpp/examples/pagerank_example.cc
diff --git a/docs/developers/community.rst b/docs/developers/community.rst
index 6659c10..16f1c97 100644
--- a/docs/developers/community.rst
+++ b/docs/developers/community.rst
@@ -59,13 +59,13 @@ See the `community meeting notes`_ for the next meeting.
Contributing
------------
-As mentioned above, we use `GitHub <https://github.com/alibaba/GraphAr>`_ for our issue tracker and for source control.
-See the `contribution guidelines <https://github.com/alibaba/GraphAr/tree/main/CONTRIBUTING.rst>`_ for more.
+As mentioned above, we use `GitHub <https://github.com/apache/incubator-graphar>`_ for our issue tracker and for source control.
+See the `contribution guidelines <https://github.com/apache/incubator-graphar/tree/main/CONTRIBUTING.rst>`_ for more.
-.. _GraphAr Code of Conduct: https://github.com/alibaba/GraphAr/blob/main/CODE_OF_CONDUCT.md
+.. _GraphAr Code of Conduct: https://github.com/apache/incubator-graphar/blob/main/CODE_OF_CONDUCT.md
.. _GraphAr mailing list: https://groups.google.com/g/graphar
.. _GraphAr Slack: https://join.slack.com/t/grapharworkspace/shared_invite/zt-1wh5vo828-yxs0MlXYBPBBNvjOGhL4kQ
-.. _community meeting notes: https://github.com/alibaba/GraphAr/wiki/Community-Meeting-Agenda
+.. _community meeting notes: https://github.com/apache/incubator-graphar/wiki/Community-Meeting-Agenda
diff --git a/docs/java/java-lib.rst b/docs/java/java-lib.rst
index e823b6c..039e558 100644
--- a/docs/java/java-lib.rst
+++ b/docs/java/java-lib.rst
@@ -70,7 +70,7 @@ directory:
.. code-block:: bash
- $ git clone https://github.com/alibaba/GraphAr.git
+ $ git clone https://github.com/apache/incubator-graphar.git
$ cd GraphAr
$ git submodule update --init
$ cd java
@@ -124,7 +124,7 @@ example code.
}
See `test for
-graphinfo <https://github.com/alibaba/GraphAr/tree/main/java/src/test/java/com/alibaba/graphar/graphinfo>`__
+graphinfo <https://github.com/apache/incubator-graphar/tree/main/java/src/test/java/com/apache/incubator-graphar/graphinfo>`__
for the complete example.
Writers
@@ -182,7 +182,7 @@ code.
writer.sortAndWriteAdjListTable(table, 0, 0); // Write adj list of vertex chunk 0 to files
See `test for
-writers <https://github.com/alibaba/GraphAr/tree/main/java/src/test/java/com/alibaba/graphar/writers>`__
+writers <https://github.com/apache/incubator-graphar/tree/main/java/src/test/java/com/apache/incubator-graphar/writers>`__
for the complete example.
Readers
@@ -217,5 +217,5 @@ code.
StdPair<Long, Long> range = reader.getRange().value();
See `test for
-readers <https://github.com/alibaba/GraphAr/tree/main/java/src/test/java/com/alibaba/graphar/readers>`__
+readers <https://github.com/apache/incubator-graphar/tree/main/java/src/test/java/com/apache/incubator-graphar/readers>`__
for the complete example.
diff --git a/docs/pyspark/how-to.rst b/docs/pyspark/how-to.rst
index c8ebd36..8f16241 100644
--- a/docs/pyspark/how-to.rst
+++ b/docs/pyspark/how-to.rst
@@ -57,7 +57,7 @@ How to use GraphAr PySpark package
Now you can import, create and modify all the classes you can work
call from `scala API of
- GraphAr <https://alibaba.github.io/GraphAr/spark/reference/index.html>`__.
+ GraphAr <https://graphar.apache.org/docs/libraries/cpp>`__.
For simplify using of graphar from python constants, like GAR-types,
supported file-types, etc. are placed in ``graphar_pyspark.enums``.
@@ -79,7 +79,7 @@ How to use GraphAr PySpark package
- EdgeInfo
You can check `Scala library
- documentation <https://alibaba.github.io/GraphAr/spark/spark-lib.html#information-classes>`__
+ documentation <https://graphar.apache.org/GraphAr/spark/spark-lib.html#information-classes>`__
for the more detailed information.
.. container:: cell markdown
diff --git a/docs/spark/examples/spark.rst b/docs/spark/examples/spark.rst
index 9b98382..4def182 100644
--- a/docs/spark/examples/spark.rst
+++ b/docs/spark/examples/spark.rst
@@ -200,12 +200,12 @@ See `GraphAr2Neo4j.scala`_ for the complete example.
- The Neo4j Spark Connector supports to use `Spark structured streaming API <https://neo4j.com/docs/spark/current/streaming/>`_, which works differently from Spark batching. One can utilize this API to read/write a stream from/to Neo4j, avoiding to maintain all data in the memory.
-.. _TestGraphTransformer.scala: https://github.com/alibaba/GraphAr/blob/main/spark/src/test/scala/com/alibaba/graphar/TestGraphTransformer.scala
+.. _TestGraphTransformer.scala: https://github.com/apache/incubator-graphar/blob/main/spark/src/test/scala/com/apache/incubator-graphar/TestGraphTransformer.scala
-.. _TransformExample.scala: https://github.com/alibaba/GraphAr/blob/main/spark/src/test/scala/com/alibaba/graphar/TransformExample.scala
+.. _TransformExample.scala: https://github.com/apache/incubator-graphar/blob/main/spark/src/test/scala/com/apache/incubator-graphar/TransformExample.scala
-.. _ComputeExample.scala: https://github.com/alibaba/GraphAr/blob/main/spark/src/test/scala/com/alibaba/graphar/ComputeExample.scala
+.. _ComputeExample.scala: https://github.com/apache/incubator-graphar/blob/main/spark/src/test/scala/com/apache/incubator-graphar/ComputeExample.scala
-.. _Neo4j2GraphAr.scala: https://github.com/alibaba/GraphAr/blob/main/spark/src/main/scala/com/alibaba/graphar/example/Neo4j2GraphAr.scala
+.. _Neo4j2GraphAr.scala: https://github.com/apache/incubator-graphar/blob/main/spark/src/main/scala/com/apache/incubator-graphar/example/Neo4j2GraphAr.scala
-.. _GraphAr2Neo4j.scala: https://github.com/alibaba/GraphAr/blob/main/spark/src/main/scala/com/alibaba/graphar/example/GraphAr2Neo4j.scala
+.. _GraphAr2Neo4j.scala: https://github.com/apache/incubator-graphar/blob/main/spark/src/main/scala/com/apache/incubator-graphar/example/GraphAr2Neo4j.scala
diff --git a/docs/spark/spark-lib.rst b/docs/spark/spark-lib.rst
index 05d3527..79bad54 100644
--- a/docs/spark/spark-lib.rst
+++ b/docs/spark/spark-lib.rst
@@ -51,7 +51,7 @@ GraphAr supports two Apache Spark versions for now and uses Maven Profiles to wo
After compilation, a similar file *graphar-x.x.x-SNAPSHOT-shaded.jar* is generated in the directory *spark/graphar/target/*.
-Please refer to the `building steps <https://github.com/alibaba/GraphAr/tree/main/spark>`_ for more details.
+Please refer to the `building steps <https://github.com/apache/incubator-graphar/tree/main/spark>`_ for more details.
How to Use
@@ -228,20 +228,20 @@ For more information on usage, please refer to the examples:
- `Neo4j2GraphAr.scala`_ and `GraphAr2Neo4j.scala`_ are examples to conduct data importing/exporting for Neo4j.
-.. _TestGraphInfo.scala: https://github.com/alibaba/GraphAr/blob/main/spark/src/test/scala/com/alibaba/graphar/TestGraphInfo.scala
+.. _TestGraphInfo.scala: https://github.com/apache/incubator-graphar/blob/main/spark/src/test/scala/com/apache/incubator-graphar/TestGraphInfo.scala
-.. _TestIndexGenerator.scala: https://github.com/alibaba/GraphAr/blob/main/spark/src/test/scala/com/alibaba/graphar/TestIndexGenerator.scala
+.. _TestIndexGenerator.scala: https://github.com/apache/incubator-graphar/blob/main/spark/src/test/scala/com/apache/incubator-graphar/TestIndexGenerator.scala
-.. _TestWriter.scala: https://github.com/alibaba/GraphAr/blob/main/spark/src/test/scala/com/alibaba/graphar/TestWriter.scala
+.. _TestWriter.scala: https://github.com/apache/incubator-graphar/blob/main/spark/src/test/scala/com/apache/incubator-graphar/TestWriter.scala
-.. _TestReader.scala: https://github.com/alibaba/GraphAr/blob/main/spark/src/test/scala/com/alibaba/graphar/TestReader.scala
+.. _TestReader.scala: https://github.com/apache/incubator-graphar/blob/main/spark/src/test/scala/com/apache/incubator-graphar/TestReader.scala
-.. _TestGraphTransformer.scala: https://github.com/alibaba/GraphAr/blob/main/spark/src/test/scala/com/alibaba/graphar/TestGraphTransformer.scala
+.. _TestGraphTransformer.scala: https://github.com/apache/incubator-graphar/blob/main/spark/src/test/scala/com/apache/incubator-graphar/TestGraphTransformer.scala
-.. _ComputeExample.scala: https://github.com/alibaba/GraphAr/blob/main/spark/src/test/scala/com/alibaba/graphar/ComputeExample.scala
+.. _ComputeExample.scala: https://github.com/apache/incubator-graphar/blob/main/spark/src/test/scala/com/apache/incubator-graphar/ComputeExample.scala
-.. _TransformExample.scala: https://github.com/alibaba/GraphAr/blob/main/spark/src/test/scala/com/alibaba/graphar/TransformExample.scala
+.. _TransformExample.scala: https://github.com/apache/incubator-graphar/blob/main/spark/src/test/scala/com/apache/incubator-graphar/TransformExample.scala
-.. _Neo4j2GraphAr.scala: https://github.com/alibaba/GraphAr/blob/main/spark/src/main/scala/com/alibaba/graphar/example/Neo4j2GraphAr.scala
+.. _Neo4j2GraphAr.scala: https://github.com/apache/incubator-graphar/blob/main/spark/src/main/scala/com/apache/incubator-graphar/example/Neo4j2GraphAr.scala
-.. _GraphAr2Neo4j.scala: https://github.com/alibaba/GraphAr/blob/main/spark/src/main/scala/com/alibaba/graphar/example/GraphAr2Neo4j.scala
+.. _GraphAr2Neo4j.scala: https://github.com/apache/incubator-graphar/blob/main/spark/src/main/scala/com/apache/incubator-graphar/example/GraphAr2Neo4j.scala
diff --git a/java/README.md b/java/README.md
index 2ee0b4c..8168291 100644
--- a/java/README.md
+++ b/java/README.md
@@ -43,7 +43,7 @@ Tips:
Make the graphar-java-library directory as the current working directory:
```bash
- $ git clone https://github.com/alibaba/GraphAr.git
+ $ git clone https://github.com/apache/incubator-graphar.git
$ cd GraphAr
$ git submodule update --init
$ cd java
@@ -72,4 +72,4 @@ Then set GraphAr as a dependency in maven project:
## How to use
-Please refer to [GraphAr Java Library Documentation](https://alibaba.github.io/GraphAr/user-guide/java-lib.html).
\ No newline at end of file
+Please refer to [GraphAr Java Library Documentation](https://graphar.apache.org/GraphAr/user-guide/java-lib.html).
\ No newline at end of file
diff --git a/java/cmake/graphar-cpp.cmake b/java/cmake/graphar-cpp.cmake
index 9c7356d..8865d7e 100644
--- a/java/cmake/graphar-cpp.cmake
+++ b/java/cmake/graphar-cpp.cmake
@@ -69,7 +69,7 @@ function(build_graphar_cpp)
include(ExternalProject)
ExternalProject_Add(graphar_ep
- GIT_REPOSITORY https://github.com/alibaba/GraphAr.git
+ GIT_REPOSITORY https://github.com/apache/incubator-graphar.git
GIT_TAG ${GAR_VERSION_TO_BUILD}
GIT_SHALLOW TRUE
GIT_SUBMODULES ""
diff --git a/java/src/test/java/org/apache/graphar/graphinfo/GraphInfoTest.java b/java/src/test/java/org/apache/graphar/graphinfo/GraphInfoTest.java
index 837829f..35f7177 100644
--- a/java/src/test/java/org/apache/graphar/graphinfo/GraphInfoTest.java
+++ b/java/src/test/java/org/apache/graphar/graphinfo/GraphInfoTest.java
@@ -111,7 +111,8 @@ public class GraphInfoTest {
Assert.assertEquals(1, edgeInfos.size());
}
- @Ignore("Problem about arrow 12.0.0 with S3, see https://github.com/alibaba/GraphAr/issues/187")
+ @Ignore(
+ "Problem about arrow 12.0.0 with S3, see https://github.com/apache/incubator-graphar/issues/187")
public void testGraphInfoLoadFromS3() {
// arrow::fs::Fi
// nalizeS3 was not called even though S3 was initialized. This could lead to a
diff --git a/spark/README.md b/spark/README.md
index abf9ad7..7469906 100644
--- a/spark/README.md
+++ b/spark/README.md
@@ -19,7 +19,7 @@ All the instructions below assume that you have cloned the GraphAr git
repository and navigated to the ``spark`` subdirectory:
```bash
- $ git clone https://github.com/alibaba/GraphAr.git
+ $ git clone https://github.com/apache/incubator-graphar.git
$ cd GraphAr
$ git submodule update --init
$ cd spark
@@ -231,4 +231,4 @@ The example will import the basketballplayer graph from GraphAr to NebulaGraph a
## How to use
-Please refer to our [GraphAr Spark Library Documentation](https://alibaba.github.io/GraphAr/spark/spark-lib.html).
+Please refer to our [GraphAr Spark Library Documentation](https://graphar.apache.org/GraphAr/spark/spark-lib.html).